Anki Statistical Reports - ghrgriner/anki-stats GitHub Wiki
This document is based on v25.02 of the Anki desktop application.
Relevant Data Structures
Anki Database Structure
The data for an Anki collection is stored in a SQLite database. Relatively complete documentation of the Anki database schema (subject to the caveats listed below) is available elsewhere in the AnkiDroid repository; it is based on the work of Shawn A. Williams.
We will not duplicate that documentation here. However, it is not up to date, so we note several corrections and clarifications below. Furthermore, we have not investigated whether AnkiDroid and the Anki desktop application use identical database schemas; our comments in this document are based on the desktop application.
Lastly, note user information and (sometimes) technical information is also available in the Anki Manual.
Cards
Cards have several fields that track the state. The information stored in the
type and queue attributes is sometimes redundant. Separate fields exist
so that type always stores which of the four learning states the card
is in, while queue will change if a card is buried or suspended. When
the card is unburied or unsuspended, then queue is reset based on
the value of type.
Card Type
A card can be one of four learning types. The valid types are
listed in the enum below:
// code excerpt from Anki repository
pub enum CardType {
    New = 0,
    Learn = 1,
    Review = 2,
    Relearn = 3,
}
The general idea is that a card starts as New and is set to
Learn at the first review. When it graduates from the Learn
phase, it becomes Review. If it is answered wrong in Review,
it becomes Relearn. It can then graduate from Relearn
back to Review, with this process of switching between Review
and Relearn continuing indefinitely. Additional details
are on the Card Type page.
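The transitions described above can be sketched as a small state machine. This is only an illustration of the general idea, not Anki's scheduler: the `Event` enum and the `next_type` function are hypothetical names introduced here, and the sketch ignores the special cases covered on the Card Type page.

```rust
/// Learning types, with the same values as the excerpt above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CardType {
    New = 0,
    Learn = 1,
    Review = 2,
    Relearn = 3,
}

/// Hypothetical events used only for this illustration; they are not
/// part of the Anki code base.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    /// The card is reviewed for the first time.
    FirstReview,
    /// The card graduates from the Learn or Relearn phase.
    Graduate,
    /// The card is answered wrong while in Review.
    Lapse,
}

/// Returns the type after `event`, or the unchanged type when the
/// event does not apply to the current type.
pub fn next_type(current: CardType, event: Event) -> CardType {
    match (current, event) {
        (CardType::New, Event::FirstReview) => CardType::Learn,
        (CardType::Learn, Event::Graduate) => CardType::Review,
        (CardType::Relearn, Event::Graduate) => CardType::Review,
        (CardType::Review, Event::Lapse) => CardType::Relearn,
        (unchanged, _) => unchanged,
    }
}
```

For example, applying `FirstReview`, `Graduate`, `Lapse`, and `Graduate` in turn to a `New` card walks it through Learn, Review, Relearn, and back to Review.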
Card State
The Rust backend defines a structure called CardState that
incorporates (in most cases) the type of the card as well as other
information. Therefore, care should be used when referring to a card's
'state' to make it clear whether state is meant in a general or this
specific sense. An understanding of CardState is not needed when
creating the statistical reports. However, additional detail
is provided on the Card State page.
Queue
The queue variable is used when ordering the cards for study. Generally, cards
of New Type are in the New queue, cards of Learn or Relearn type
are in the Learn or DayLearn queue, and cards of Review type are in the
Review queue, with negative queue values used to indicate suspended
or buried cards. The queue variable is sometimes used in the standard
statistical reports to exclude suspended or buried cards. Additional details
are on the Card Queue page.
// code excerpt from Anki repository.
pub enum CardQueue {
    New = 0,
    Learn = 1,
    Review = 2,
    DayLearn = 3,
    PreviewRepeat = 4,
    Suspended = -1,
    SchedBuried = -2,
    UserBuried = -3,
}
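The general relationship between type and queue can be sketched as follows. This is a simplified illustration rather than the code Anki uses: `queue_for_type` and `is_inactive` are hypothetical helper names, and the `crosses_day` flag is an assumption standing in for whatever rule decides between the Learn and DayLearn queues.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CardType {
    New = 0,
    Learn = 1,
    Review = 2,
    Relearn = 3,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CardQueue {
    New = 0,
    Learn = 1,
    Review = 2,
    DayLearn = 3,
    PreviewRepeat = 4,
    Suspended = -1,
    SchedBuried = -2,
    UserBuried = -3,
}

/// Queue a card would generally occupy given its type, e.g. after it
/// is unburied or unsuspended. `crosses_day` is a hypothetical flag
/// (an assumption of this sketch) selecting DayLearn over Learn.
pub fn queue_for_type(card_type: CardType, crosses_day: bool) -> CardQueue {
    match card_type {
        CardType::New => CardQueue::New,
        CardType::Learn | CardType::Relearn => {
            if crosses_day {
                CardQueue::DayLearn
            } else {
                CardQueue::Learn
            }
        }
        CardType::Review => CardQueue::Review,
    }
}

/// True for the negative queue values, i.e. suspended or buried cards.
pub fn is_inactive(queue: CardQueue) -> bool {
    (queue as i8) < 0
}
```

A report that excludes suspended or buried cards could then keep only the cards for which `is_inactive` returns false.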
Card (Deck) Preset
Every non-filtered deck has an associated property called the preset.
Recall that a deck can have subdecks. For example, there can be a
deck Math and a subdeck Math::Algebra. This defines a tree
structure for all (non-filtered) decks, where Math is the parent
of Math::Algebra and Default is the parent of Math.
If the deck options for a deck or subdeck are never changed, then the
preset is the preset of the parent. The "Default" preset will still
exist even if the default deck is deleted (i.e., the deck named
Default that is created when the collection is created). (TBD:
Check previous sentence.)
If a card is moved to a filtered deck, the non-filtered deck it was
moved from is stored and called the 'original deck'. Cards can be
selected by their preset or the preset of their original deck
(whichever is applicable) in the browser using preset:[some_preset_name].
This is relevant for our purposes because, when setting the
FSRS parameters for a deck, there is a search box that defines
which cards will have their memory state reset, and the default
for the search box is preset:"preset_of_current_deck" -is:suspended.
Review Log
Information about reviews that were performed is stored in the
revlog table. This table also contains entries for reviews that
were rescheduled.
Terminology
If a review was rescheduled manually, it's possible
the user actually reviewed the material on the card before deciding
to reschedule it. It's also possible the user set the due date in
the browser without looking at the card contents. For reviews
rescheduled when FSRS parameters were initialized or changed, the
user would not have seen the card at all during the rescheduling.
Nevertheless, for brevity, we refer to all entries in the revlog
table as 'reviews' rather than 'review log entries'
in this documentation. This should not cause confusion, since the
charts that analyze data in the revlog table always exclude
these Manual and Rescheduled reviews either explicitly using
the 'review kind' variable described below, or implicitly by
limiting to reviews where the answer button was pressed.
Review Kind
In general, the review kind aligns with the type of the card
at the time of the review. However, cards with CardType::New will have
Learning as the review kind for their first review. In addition, there
are review kinds Filtered, Manual, and Rescheduled with the meanings
described in the comments below.
// code excerpt from Anki repository, our comments start with two '//'
// while repository comments start with '///'
pub enum RevlogReviewKind {
    Learning = 0,
    Review = 1,
    Relearning = 2,
    /// Old Anki versions called this "Cram" or "Early". It's assigned when
    /// reviewing cards before they're due, or when rescheduling is
    /// disabled.
    // In particular, note that the above applies even if the card is in a
    // Filtered deck.
    // If rescheduling is disabled, then the `factor` field will be set to 0.
    // Otherwise, `factor` is set to a non-zero value.
    Filtered = 3,
    // Assigned either (1) by selecting 'Set Due Date' for a card. This is
    // the entry made at the time the due date is set. Once the due date is
    // reached and the review occurs, another entry will be made in `revlog`.
    // Or (2) by selecting 'Reset card' for a card. In this case, `factor`
    // in the database (= `ease_factor` in the Rust code) will be 0.
    Manual = 4,
    // Assigned after selecting 'Reschedule cards on change' in the FSRS
    // options for a deck
    Rescheduled = 5,
}
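The explicit and implicit exclusions mentioned under Terminology can be sketched as two small predicates. The function names `is_study_review`, `answer_button_pressed`, and `kind_from_db` are hypothetical and introduced only for this illustration; the numeric values are those of the enum above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RevlogReviewKind {
    Learning = 0,
    Review = 1,
    Relearning = 2,
    Filtered = 3,
    Manual = 4,
    Rescheduled = 5,
}

/// Maps the raw value of the `type` column to a review kind.
/// Returns `None` for values outside the enum above.
pub fn kind_from_db(value: i64) -> Option<RevlogReviewKind> {
    match value {
        0 => Some(RevlogReviewKind::Learning),
        1 => Some(RevlogReviewKind::Review),
        2 => Some(RevlogReviewKind::Relearning),
        3 => Some(RevlogReviewKind::Filtered),
        4 => Some(RevlogReviewKind::Manual),
        5 => Some(RevlogReviewKind::Rescheduled),
        _ => None,
    }
}

/// Explicit exclusion: drop Manual and Rescheduled entries.
pub fn is_study_review(kind: RevlogReviewKind) -> bool {
    !matches!(
        kind,
        RevlogReviewKind::Manual | RevlogReviewKind::Rescheduled
    )
}

/// Implicit exclusion: keep only entries where an answer button was
/// pressed (`ease` in the database, `button_chosen` in Rust).
pub fn answer_button_pressed(button_chosen: u8) -> bool {
    button_chosen > 0
}
```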
Impact of Resetting or Deleting a Card on the Review Log
If a card is reset, existing records remain in the revlog table.
As noted in the comments for the enum above, an entry will be
added to the table with RevlogReviewKind::Manual. The impact
on the card is discussed on the Card Type page.
If a card is deleted, all records for the card are deleted from
revlog (as the card's record is also deleted).
Database Field Names and Rust Variable Names
Listed below are the database field names in the revlog table
and the corresponding Rust field names in the RevlogEntry structure.
Refer to the documentation linked above for additional details on the
database fields.
| Database Field Name | Rust Data Type | Rust Field Name (members of RevlogEntry) |
|---|---|---|
| id | RevlogId (i64) | id |
| cid | CardId (i64) | cid |
| usn | Usn (i32) | usn |
| ease | u8 | button_chosen |
| ivl | i32 | interval |
| lastIvl | i32 | last_interval |
| factor | u32 | ease_factor |
| time | u32 | taken_millis |
| type | enum | review_kind |
Anki Statistical Reports
The table below lists the statistical reports generated in
the Anki statistics window
(obtained by clicking the Stats button in the main window).
| Title | Population | Comments |
|---|---|---|
| Today | Reviews | Omits RevlogReviewKind::Manual and RevlogReviewKind::Rescheduled reviews. |
| Future Due | Cards | Omits all CardType::New and CardQueue::Suspended cards. Buried cards (CardQueue::SchedBuried or CardQueue::UserBuried) due on or before the current day are also omitted. |
| Calendar | Reviews | Omits RevlogReviewKind::Manual and RevlogReviewKind::Rescheduled reviews. |
| Reviews | Reviews | Omits RevlogReviewKind::Manual and RevlogReviewKind::Rescheduled reviews. Counts are stratified by RevlogReviewKind, except reviews of kind RevlogReviewKind::Review are reported as 'Young' or 'Mature' based on whether last_interval < 21. |
| Card Counts | Cards | If 'excluding inactive' is checked, suspended and buried cards are omitted (CardQueue::Suspended or CardQueue::SchedBuried or CardQueue::UserBuried) |
| Review Intervals | Cards | Limit to CardType::Review or CardType::Relearn |
| Card Ease (non-FSRS decks) | Cards | Analysis of card.factor / 10, where factor is the database field name and Python name. In Rust, this is called card.ease_factor. |
| Card Stability (FSRS decks) | Cards | Limit to cards where card.memory_state is not null. [a] This is an analysis of card.memory_state.stability (using Rust or Python name). In the database, this is stored in the card.data field with other FSRS information. Stability is rounded to the nearest integer in the back-end before binning. [b] |
| Difficulty (FSRS decks) | Cards | Limit to cards where card.memory_state is not null. [a] The code that extracts this also filters by (CardType::Review or CardType::Relearn), but this is redundant. This is an analysis of card.memory_state.difficulty (using Rust or Python name). In the database, this is stored in the card.data field with other FSRS information. |
| Retrievability (FSRS decks) | Cards | Limit to cards where the card.memory_state is not null. [a,c] |
| Hourly Breakdown | Reviews | Omits RevlogReviewKind::Filtered, RevlogReviewKind::Manual, and RevlogReviewKind::Rescheduled reviews. The hour is calculated as the time of the review (stored as epoch time; the revlog id is the number of milliseconds since 1/1/1970 in UTC) plus the current time zone offset (allowing for daylight saving time). [d] |
| Answer Buttons | Reviews | Omits RevlogReviewKind::Manual and RevlogReviewKind::Rescheduled reviews. RevlogReviewKind::Learning, RevlogReviewKind::Relearning, and RevlogReviewKind::Filtered reviews are all reported as 'Learning', while RevlogReviewKind::Review reviews are reported as 'Young' or 'Mature' based on whether last_interval < 21. |
| Added | Cards | No cards excluded |
| True Retention | Reviews | Population: button_chosen > 0 (not rescheduled) and (not RevlogReviewKind::Filtered or ease_factor != 0) and (RevlogReviewKind::Review or last_interval <= -86400 or last_interval >= 1). Note that here a 'Young' review is any review where last_interval < 21 and 'Mature' reviews are all remaining reviews. This is unlike the 'Reviews' and 'Answer Buttons' charts, which only partition the RevlogReviewKind::Review reviews into 'Young' and 'Mature'. |
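The 'True Retention' population and the 'Young' / 'Mature' split can be written out as predicates. This is a direct transcription of the conditions in the table row above, not code from the Anki repository; the `Review` struct and both function names are hypothetical, and `last_interval` is taken as signed because the condition compares it with a negative value.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RevlogReviewKind {
    Learning = 0,
    Review = 1,
    Relearning = 2,
    Filtered = 3,
    Manual = 4,
    Rescheduled = 5,
}

/// Hypothetical subset of a revlog entry, using the Rust field names.
#[derive(Debug, Clone, Copy)]
pub struct Review {
    pub kind: RevlogReviewKind,
    pub button_chosen: u8,
    pub last_interval: i32,
    pub ease_factor: u32,
}

/// Transcription of the 'True Retention' population condition.
pub fn in_true_retention_population(r: &Review) -> bool {
    let answered = r.button_chosen > 0;
    let filtered_ok =
        r.kind != RevlogReviewKind::Filtered || r.ease_factor != 0;
    let interval_ok = r.kind == RevlogReviewKind::Review
        || r.last_interval <= -86_400
        || r.last_interval >= 1;
    answered && filtered_ok && interval_ok
}

/// 'Young' as used by the 'True Retention' chart; everything else in
/// the population counts as 'Mature'.
pub fn is_young(last_interval: i32) -> bool {
    last_interval < 21
}
```

Under these conditions, a Learning entry with `last_interval` of -600 falls outside the population, while one with `last_interval` of -86400 is included and counted as 'Young'.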
[a] Once FSRS is enabled, a card will have its memory state set once the card is answered (i.e., it is no longer New type). If an existing deck is converted to FSRS, the free-text box containing the default text preset:"preset_of_current_deck" -is:suspended defines the cards which have their FSRS state assigned, except that New type cards are always excluded, as are cards with all reviews prior to the date set in the Advanced > Ignore cards last reviewed before option.
[b] This is unusual. For other histograms in the report, the variable is binned without rounding.
[c] The retrievability calculated by this package will not always match the retrievability presented in Anki. See here for details.
[d] In particular, note that the time zone or time zone offset is not stored for each review. If a user always reviews between 6 and 7 in the morning local time and then moves to a time zone 5 hours earlier, the reviews will appear in the 1:00 - 2:00 am bin of the histogram. A similar issue will occur if the local time zone offset changes due to daylight saving time.
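The effect described in note [d] can be reproduced with a short calculation. The sketch below is an illustration under stated assumptions, not the Anki implementation: `local_hour` is a hypothetical helper that takes the review time in epoch seconds (a timestamp in milliseconds would first be divided by 1000) and the current offset from UTC in seconds.

```rust
/// Hour-of-day bin (0 to 23) for a review, using the offset that is
/// current when the report is generated rather than the offset that
/// applied when the review took place.
pub fn local_hour(review_time_secs: i64, utc_offset_secs: i64) -> i64 {
    (review_time_secs + utc_offset_secs)
        .div_euclid(3600)
        .rem_euclid(24)
}
```

A review at 06:30 with an offset of 0 lands in bin 6. If the same review is later binned with an offset of -5 hours (-18000 seconds), it lands in bin 1, matching the 1:00 - 2:00 am example in note [d].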
Filtering By Last 12 Months or All History
There is a radio button at the top of the statistics window where users can switch between viewing data from the last 12 months and all history. This option only affects figures / charts where the analysis population is 'Reviews'.
Rollover Hour and Filtering Review History
When calculating the past or (scheduled) future day of a review or
filtering review history, the rollover hour is used if it was set in the application
(i.e., by setting the Next day starts at option to something other than
0 hours past midnight in the Tools > Preferences > Review menu). For example,
when filtering the 'Hourly Breakdown' or 'Answer Buttons' chart by amount of
review history (1 month, 3 months, 1 year), the start (day and hour) of the next review day is
calculated and reviews that occurred more than 30, 90, or 365 days before this are ignored. (Here a day is
defined as 86400 seconds.) Other charts that filter past or future expected
reviews by time period behave similarly, although the exact cut-offs used for the
time periods sometimes vary by chart. In particular, the 'Reviews' chart cuts off the
'3 months' and '1 year' charts at 89 days and 364 days before the next review day, respectively,
while the 'Added' chart cuts off the '1 month' chart at 31 days prior to the start
of the next review day, and the 'Future Due' chart cuts off the '3 months' chart at 89 days
after the current day.
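The cut-off arithmetic for the 'Hourly Breakdown' and 'Answer Buttons' charts can be sketched as below. The function names are hypothetical, the start of the next review day is assumed to be already known as an epoch timestamp in seconds, and the treatment of a review falling exactly on the cut-off is an assumption of this sketch.

```rust
/// A day is defined as 86400 seconds for these calculations.
pub const SECS_PER_DAY: i64 = 86_400;

/// Earliest timestamp (epoch seconds) kept when limiting history to
/// `days` days before the start of the next review day.
pub fn history_cutoff_secs(next_day_start_secs: i64, days: i64) -> i64 {
    next_day_start_secs - days * SECS_PER_DAY
}

/// True if a review is kept for the chosen period. Reviews exactly at
/// the cut-off are kept here, which is an assumption.
pub fn is_within_history(
    review_time_secs: i64,
    next_day_start_secs: i64,
    days: i64,
) -> bool {
    review_time_secs >= history_cutoff_secs(next_day_start_secs, days)
}
```

With `days` set to 30, 90, or 365 this reproduces the '1 month', '3 months', and '1 year' periods described above; other charts would pass their own values, such as 89 or 364 for the 'Reviews' chart.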