Monte Carlo-based Spaced Repetition Simulator

This document outlines the architecture of a discrete-time, event-driven simulator designed to forecast learning metrics. The simulator uses Monte Carlo methods to model the stochastic nature of human memory and user behavior within the FSRS framework.

1. Problem Formulation

The primary goal of the simulator is to predict the evolution of a user's learning process over a finite time horizon, $$T_{max}$$. It answers questions such as:

  • What will the daily review count and new card count be?
  • What is the expected daily workload (cost)?
  • How many items will be in a "memorized" state on any given day?

The simulation models a single user interacting with a deck of cards, where each interaction is a stochastic event.

2. Core Concepts

2.1. State Representation

  • Card State: Each card is an independent agent in the simulation. Its state is defined by:

    • Stability ($$S$$): The current stability of the memory.
    • Difficulty ($$D$$): The intrinsic difficulty of the card.
    • Due Date ($$t_{due}$$): The discrete day on which the next review is scheduled.
    • Last Review Date ($$t_{last}$$): The day of the last interaction.
  • System State: The state of the entire system at any time $$t$$ is the collection of all individual card states and the cumulative daily statistics (e.g., review counts, costs).
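
For concreteness, here is a minimal Rust sketch of this state representation; the field names and types are illustrative assumptions, not the actual fsrs-rs definitions.

```rust
/// Per-card state tracked by the simulator (illustrative field names,
/// not the actual fsrs-rs types).
#[derive(Debug, Clone)]
struct Card {
    stability: f64,   // S: current memory stability, in days
    difficulty: f64,  // D: intrinsic difficulty of the item
    due: usize,       // t_due: discrete day of the next scheduled review
    last_date: usize, // t_last: day of the last interaction
}

/// System-wide state: the whole deck plus cumulative daily statistics.
/// A new, unlearned card could be flagged e.g. with an `Option<f64>`
/// stability; that detail is omitted here.
#[derive(Debug, Default)]
struct SystemState {
    cards: Vec<Card>,
    review_cnt_per_day: Vec<usize>,
    learn_cnt_per_day: Vec<usize>,
    cost_per_day: Vec<f64>,
    memorized_cnt_per_day: Vec<f64>,
}
```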

2.2. Event-Driven Simulation

The simulation does not simply iterate day by day. Instead, it processes events in chronological order. The primary event is a "card review".

  • Event Queue: A priority queue is used to manage all scheduled events. Each card is an item in the queue.
  • Prioritization: The priority of a card in the queue is determined primarily by its due date ($$t_{due}$$). Cards with earlier due dates have higher priority. Ties are broken by prioritizing new (learning) cards over review cards, followed by a customizable priority score.
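
A minimal sketch of this queue using the standard library's `BinaryHeap`: because `BinaryHeap` is a max-heap, keys are wrapped in `Reverse` so that the smallest key (earliest due date) pops first. The exact key layout (due date, card kind, score, card index) is an illustrative assumption.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Queue key: earlier due date wins; on ties, new (learning) cards (kind = 0)
/// come before review cards (kind = 1), then a custom priority score decides.
/// The final element is the card's index into the deck.
type QueueKey = (usize /* due */, u8 /* kind */, u64 /* score */, usize /* card idx */);

fn main() {
    let mut queue: BinaryHeap<Reverse<QueueKey>> = BinaryHeap::new();

    // Two cards due on day 3: the new card (kind 0) is popped before the review card.
    queue.push(Reverse((3, 1, 0, 7))); // review card, index 7
    queue.push(Reverse((3, 0, 0, 2))); // new card, index 2
    queue.push(Reverse((1, 1, 0, 5))); // review card due earlier, index 5

    while let Some(Reverse((due, kind, _score, idx))) = queue.pop() {
        println!("day {due}: card {idx} (kind {kind})");
    }
    // Prints card 5 (day 1), then card 2, then card 7 (both day 3).
}
```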

2.3. Stochastic Elements

The simulation's Monte Carlo nature comes from modeling random events using a pseudorandom number generator (RNG):

  1. Recall Success: When a card is due, its probability of recall ($$\mathcal{R}$$) is calculated. The simulator then performs a Bernoulli trial with success probability $$\mathcal{R}$$ to determine whether the review is a success or a failure (a lapse).
  2. User Rating:
    • For a new card, the initial rating is sampled from a predefined probability distribution $$p_{init}(k)$$.
    • For a successful review, the rating ($$k \in \{2, 3, 4\}$$) is sampled from a separate distribution $$p_{review}(k \mid \text{success})$$.
    • For a failed review, the rating is deterministically set to 1 (Again).
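
A sketch of these random draws, assuming the `rand` crate (the 0.8-era `thread_rng`/`Rng::gen` API); the distributions used below are placeholders, not the calibrated fsrs-rs defaults.

```rust
use rand::Rng;

/// Bernoulli trial: returns true if the (simulated) review succeeds.
fn recall_succeeds<R: Rng>(rng: &mut R, retrievability: f64) -> bool {
    rng.gen::<f64>() < retrievability
}

/// Sample a rating from a discrete distribution by inverse CDF.
/// `ratings` and `probs` must have the same length; `probs` should sum to ~1.
fn sample_rating<R: Rng>(rng: &mut R, ratings: &[u8], probs: &[f64]) -> u8 {
    let u: f64 = rng.gen();
    let mut acc = 0.0;
    for (&k, &p) in ratings.iter().zip(probs) {
        acc += p;
        if u < acc {
            return k;
        }
    }
    *ratings.last().unwrap()
}

fn main() {
    let mut rng = rand::thread_rng();
    // Placeholder distributions (not the fsrs-rs defaults).
    let first_rating = sample_rating(&mut rng, &[1, 2, 3, 4], &[0.25, 0.10, 0.55, 0.10]);
    let rating = if recall_succeeds(&mut rng, 0.9) {
        sample_rating(&mut rng, &[2, 3, 4], &[0.1, 0.8, 0.1]) // p_review(k | success)
    } else {
        1 // failed review: rating is deterministically "Again"
    };
    println!("first rating: {first_rating}, review rating: {rating}");
}
```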

3. The Simulation Algorithm

3.1. Initialization

  1. Card Deck Creation: The simulation initializes a deck of cards. This can be a mix of pre-existing cards (with defined $$S$$ and $$D$$) and new, unlearned cards. New cards are assigned sequential due dates starting from day 0, spaced according to the daily new card limit.
  2. Queue Population: All cards are inserted into the event priority queue based on their initial priority (primarily their due date).
  3. Statistics Initialization: Arrays to hold daily statistics (e.g., review_count_per_day, cost_per_day) are initialized to zero for the entire simulation period.
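
A sketch of this initialization, assuming a deck consisting only of new cards and reusing the queue-key layout from section 2.2; `deck_size` and the limits are arbitrary example values.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

fn main() {
    let t_max = 365usize;       // simulation horizon in days
    let deck_size = 2_000usize; // number of new, unlearned cards
    let learn_limit = 10usize;  // new cards introduced per day

    // New cards get sequential due dates from day 0, `learn_limit` per day.
    // (S and D are not yet defined for unlearned cards, so only scheduling
    // data is stored in the queue key.)
    let mut queue: BinaryHeap<Reverse<(usize, u8, u64, usize)>> = BinaryHeap::new();
    for idx in 0..deck_size {
        let due = idx / learn_limit;
        queue.push(Reverse((due, 0 /* new card */, 0, idx)));
    }

    // Daily statistics are zero-initialized for the whole horizon.
    let review_cnt_per_day = vec![0usize; t_max];
    let learn_cnt_per_day = vec![0usize; t_max];
    let cost_per_day = vec![0.0f64; t_max];

    assert_eq!(review_cnt_per_day.len(), t_max);
    assert!(learn_cnt_per_day.len() == t_max && cost_per_day.len() == t_max);
    println!("queued {} cards, {} tracked days", queue.len(), t_max);
}
```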

3.2. Main Simulation Loop

The simulation proceeds by repeatedly processing the highest-priority card from the event queue. The loop continues until the queue is empty or all remaining events are scheduled after the simulation horizon $$T_{max}$$.

For each card popped from the queue:

  1. Get Current Time: The card's due date becomes the current time of the simulation: $$t \leftarrow t_{due}$$.

  2. Check Constraints:

    • Time Horizon: If $$t \ge T_{max}$$, the card is retired from the simulation; its contribution to the "memorized" count is accumulated only up to $$T_{max}$$.
    • Daily Limits: The simulator checks if processing this card would exceed the user-defined daily limits for time $$t$$ (e.g., review_limit, learn_limit, max_cost_per_day).
    • Postponement: If a daily limit is reached, the card's review is postponed. Its due date is incremented ($$t_{due} \leftarrow t + 1$$), and it is re-inserted into the priority queue with its new, lower priority. The simulation then proceeds to the next card in the queue.
  3. Process Event (Card Review): If no constraints are violated, the review event is executed.

    • If the card is new (learning phase):
      1. An initial rating $$k_{init}$$ is sampled.
      2. The initial state ($$S_{init}$$, $$D_{init}$$) is calculated based on $$k_{init}$$.
      3. Optionally, a sub-process simulating the "learning steps" (e.g., pressing "Again" multiple times) is executed to determine the final state ($$S$$, $$D$$) and cost after the learning session.
      4. Daily learning statistics for day $$t$$ are updated.
    • If the card is old (review phase):
      1. The probability of recall $$\mathcal{R}$$ is calculated based on the card's current $$S$$ and the interval since its last review.
      2. A random trial determines if the recall was a success or failure.
      3. A rating $$k$$ is determined based on the outcome.
      4. The card's state is updated to ($$S'$$, $$D'$$) using the FSRS state transition functions $$f_S(S, D, k; \theta)$$ and $$f_D(D, k; \theta)$$. For lapses, this may also include a "relearning steps" sub-process.
      5. Daily review statistics for day $$t$$ are updated.
      6. The "memorized" count is updated for each day in the interval $$[t_{last}, t)$$.
  4. Schedule Next Review:

    1. A new interval, $$ivl$$, is calculated using the updated stability $$S'$$ and the global desired retention $$R$$: $$ivl \leftarrow f_{ivl}(S', R; \theta)$$.
    2. The card's new due date is set: $$t'_{due} \leftarrow t + ivl$$.
    3. The card is re-inserted into the priority queue with its new priority corresponding to $$t'_{due}$$.
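
Putting the pieces together, the following is a compressed, self-contained sketch of the main loop (again assuming the `rand` crate). The forgetting-curve and interval formulas use the FSRS-4.5 power curve (DECAY = -0.5, FACTOR = 19/81) purely as an illustrative stand-in, and the FSRS transitions $$f_S$$ and $$f_D$$ are reduced to a placeholder update; the real fsrs-rs simulator derives all of these from the trained parameters $$\theta$$ and also handles new cards, learning steps, and cost tracking, which are omitted here.

```rust
use rand::Rng;
use std::cmp::Reverse;
use std::collections::BinaryHeap;

const DECAY: f64 = -0.5;         // forgetting-curve exponent (illustrative, FSRS-4.5)
const FACTOR: f64 = 19.0 / 81.0; // chosen so that R(S, S) = 0.9

/// R(t, S): probability of recall t days after the last review.
fn retrievability(elapsed: f64, stability: f64) -> f64 {
    (1.0 + FACTOR * elapsed / stability).powf(DECAY)
}

/// Interval that brings retrievability down to the desired retention r.
fn next_interval(stability: f64, desired_retention: f64) -> usize {
    let ivl = stability / FACTOR * (desired_retention.powf(1.0 / DECAY) - 1.0);
    ivl.round().max(1.0) as usize
}

struct Card { stability: f64, difficulty: f64, due: usize, last_date: usize }

fn main() {
    let (t_max, review_limit, desired_retention) = (365usize, 200usize, 0.9);
    let mut rng = rand::thread_rng();

    // Toy deck: 500 already-learned cards, all due on day 0.
    let mut cards: Vec<Card> = (0..500)
        .map(|_| Card { stability: 3.0, difficulty: 5.0, due: 0, last_date: 0 })
        .collect();
    let mut queue: BinaryHeap<Reverse<(usize, usize)>> =
        cards.iter().enumerate().map(|(i, c)| Reverse((c.due, i))).collect();

    let mut review_cnt_per_day = vec![0usize; t_max];

    while let Some(Reverse((t, idx))) = queue.pop() {
        if t >= t_max {
            continue; // past the horizon: retire the card
        }
        if review_cnt_per_day[t] >= review_limit {
            // Daily limit hit: postpone by one day and re-queue.
            cards[idx].due = t + 1;
            queue.push(Reverse((t + 1, idx)));
            continue;
        }

        let card = &mut cards[idx];
        let elapsed = (t - card.last_date) as f64;
        let success = rng.gen::<f64>() < retrievability(elapsed, card.stability);

        // Placeholder for the FSRS transitions f_S and f_D: grow stability on
        // success, shrink it on a lapse. Not the real update rule.
        card.stability *= if success { 1.0 + 2.0 / card.difficulty } else { 0.3 };
        card.last_date = t;
        review_cnt_per_day[t] += 1;

        // Schedule the next review and re-insert into the queue.
        card.due = t + next_interval(card.stability, desired_retention);
        queue.push(Reverse((card.due, idx)));
    }

    println!("reviews on day 100: {}", review_cnt_per_day[100]);
}
```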

3.3. Output

Upon completion of the loop, the algorithm returns a set of time-series vectors representing the daily metrics over the simulation horizon $$T_{max}$$, such as:

  • review_cnt_per_day
  • learn_cnt_per_day
  • cost_per_day
  • memorized_cnt_per_day
  • correct_cnt_per_day
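
These vectors can be aggregated directly into summary metrics (and, since this is a Monte Carlo simulation, averaged over multiple runs). The container struct below is hypothetical; the field names simply mirror the list above.

```rust
/// Hypothetical container for the simulator's per-day output vectors.
struct SimulationResult {
    review_cnt_per_day: Vec<usize>,
    learn_cnt_per_day: Vec<usize>,
    cost_per_day: Vec<f64>,
    memorized_cnt_per_day: Vec<f64>,
    correct_cnt_per_day: Vec<usize>,
}

impl SimulationResult {
    /// Average daily workload (e.g., seconds spent) over the horizon.
    fn average_daily_cost(&self) -> f64 {
        self.cost_per_day.iter().sum::<f64>() / self.cost_per_day.len() as f64
    }

    /// Overall true retention: correct reviews divided by all reviews.
    fn true_retention(&self) -> f64 {
        let correct: usize = self.correct_cnt_per_day.iter().sum();
        let total: usize = self.review_cnt_per_day.iter().sum();
        correct as f64 / total.max(1) as f64
    }
}
```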