Background
I was curious as to how many WK reviews I should expect to do in total to burn every single item. I know that I could probably try to look it up somewhere, and I could definitely apply some of what I learned in my statistics class in uni to figure this out mathematically, but instead I decided to do a Monte Carlo simulation.
How many reviews you need of course depends on your accuracy, so I stepped through a range of accuracy levels and ran simulations at each one to see how many reviews you would need, on average, before every item reaches the last SRS level. I start the simulations at 61% because they get very slow below that (the number of reviews needed climbs steeply as accuracy drops), and I doubt many people have such low accuracies.
The reason I do this is that I want to find out how far I have come, in terms of the percentage of the reviews that I will need to do to burn everything.
Method
The interval 61%-100% is evenly sampled at 40 points. For each of these 40 accuracies, 100 simulations are run in which all items are reviewed, with a probability of success equal to the point's accuracy level, until they are burned. In each simulation the total number of reviews needed to burn all items is tracked, and the average over the 100 simulations is recorded.
Source Code
import random
import time


class Item:
    """Class for items to be reviewed. Only attributes are SRS level and how many cards the item has."""

    def __init__(self, multiplicity):
        """Create a new item."""
        self._SRS_level = 1  # Don't count lessons
        self._multiplicity = multiplicity  # Indicates how many "cards" you have per "note", to use Anki terminology

    def review_item(self, p):
        """Evaluates based on probability whether the item passes or fails a review."""
        # p**self._multiplicity is used here because a user has to pass both the meaning and reading review when multiplicity is 2.
        p_observed = random.random()
        if p_observed < p**self._multiplicity:
            self._SRS_level += 1  # If review is successful item goes up one SRS level
            review_count = self._multiplicity  # If it's a radical we did one review, else 2
        else:
            if p_observed < p:  # If this is true then we failed one review and passed one
                review_count = 3
            else:
                review_count = 2*self._multiplicity
            if self._SRS_level != 1:  # Don't change SRS level if item is already at the lowest
                if self._SRS_level <= 4:  # If item is an apprentice then reduce SRS by 1 level
                    self._SRS_level -= 1
                else:  # Else 2 levels
                    self._SRS_level -= 2
        return self._SRS_level, review_count

def create_items(count, double):
    """Creates a dict of items to review."""
    items = {}
    for i in range(1, double + 1):
        items[i] = Item(2)
    for i in range(double + 1, count + 1):
        items[i] = Item(1)
    return items


def review_items(count, items, max_srs, p):
    """Reviews all items until they reach the final SRS level."""
    reviews = 0
    while count > 0:
        keys = list(items)  # Copy the keys so items can be deleted while looping
        # Review all items once
        for i in keys:
            srs_level, review = items[i].review_item(p)
            # If item reaches last SRS level remove it from the queue
            if srs_level == max_srs:
                del items[i]
                count -= 1
            reviews += review
    return reviews

def repeat_run(runs, single, double, max_srs, p):
    """Repeats the same simulation a number of times and returns the total number of reviews over all runs."""
    total_reviews = 0
    for _ in range(runs):
        count = single + double
        # Create new items
        items = create_items(count, double)
        # Review items until all reach the last SRS level
        reviews = review_items(count, items, max_srs, p)
        total_reviews += reviews
    return total_reviews


def add_to_table(accuracy, total_reviews, runs):
    """Adds data to the discourse table we're making."""
    data = "| " + str(accuracy) + " | " + str(round(total_reviews / runs))
    if (accuracy-1) % 5 == 0 and accuracy:
        data += "\n"
    else:
        data += "| \\| "  # Backslash escaped so the string is the same without an invalid-escape warning
    return data

def parse_time(seconds):
    """Splits a number of seconds into hours, minutes and seconds."""
    minutes = 0
    hours = 0
    if seconds >= 60:  # >= so that exactly 60 seconds becomes 1 minute
        minutes = seconds//60
        seconds %= 60
        if minutes >= 60:
            hours = minutes//60
            minutes %= 60
    return hours, minutes, seconds

def simulate(highest_accuracy, interval_length, lowest_accuracy, runs, number_of_single_items, number_of_double_items, total_estimate, max_srs, estimate=False):
    """Starts the whole simulation."""
    # Header and separator rows of the discourse table
    table_data = "| % | Reviews | \\| | % | Reviews | \\| | % | Reviews | \\| | % | Reviews | \\| | % | Reviews |\n|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-\n"
    reviews_done = 0  # Will contain number of reviews done
    accuracy = highest_accuracy  # Rename variable
    t0 = time.time()  # Start time
    while accuracy >= lowest_accuracy:
        p = accuracy / 100  # Accuracy as a probability
        # Redo the simulation a number of times for the current accuracy level
        total_reviews = repeat_run(runs, number_of_single_items, number_of_double_items, max_srs, p)
        reviews_done += total_reviews
        # Add to the table
        table_data += add_to_table(accuracy, total_reviews, runs)
        # Print progress (skipped during the quick estimation pass)
        if not estimate:
            if total_estimate is False:
                progress = ""
                time_left = ""
            else:
                # Extrapolate the remaining time from the share of reviews done so far
                time_elapsed = time.time()-t0
                seconds = round(total_estimate / reviews_done * time_elapsed - time_elapsed)
                hours, minutes, seconds = parse_time(seconds)
                progress = str(round(reviews_done / total_estimate * 100)) + '%'
                time_left = str(hours) + "h " + str(minutes) + "m and " + str(seconds) + "s remaining."
            text = '- ' + progress + ' - ' + time_left
            print(accuracy, '-', 'Average Reviews:', total_reviews//runs, text)
        # Go to new accuracy level
        accuracy -= interval_length
    return table_data, reviews_done

def main():
    # Settings
    highest_accuracy = 100  # Percent
    interval_length = 1  # Percent
    lowest_accuracy = 61  # Percent
    runs = 100  # Per level of accuracy
    max_srs = 9  # Number of SRS levels
    number_of_double_items = 2027 + 6300  # Number of items with two cards
    number_of_single_items = 477  # Number of items with one card
    # Estimate how many total reviews we can expect by doing one run
    # The estimate is used to calculate time left
    if runs >= 10:
        table_data, reviews_done = simulate(highest_accuracy, interval_length, lowest_accuracy, 1, number_of_single_items, number_of_double_items, 0, max_srs, estimate=True)
        total_estimate = reviews_done*runs
    else:
        total_estimate = False
    # Simulate
    table_data, reviews_done = simulate(highest_accuracy, interval_length, lowest_accuracy, runs, number_of_single_items, number_of_double_items, total_estimate, max_srs)
    print('Total reviews:', reviews_done)
    print(table_data)


if __name__ == "__main__":
    main()
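One modelling detail in `review_item` is worth spelling out. For two-card items, the 3-review branch (one card failed, one passed) fires with probability p - p², whereas two independently answered cards would fail exactly one with probability 2p(1 - p). The sketch below is not part of the simulation that produced the results; it only shows what drawing each card separately could look like. `session_reviews` is a hypothetical helper name, and it keeps the assumption that a failed card is answered correctly on its retry.

```python
import random


def session_reviews(p, multiplicity, rng=random):
    """Return (passed, review_count) for one item in one review session.

    Assumption: each card (meaning, reading) is answered independently with
    probability p, and a failed card is retried exactly once in the session.
    """
    failed_cards = sum(1 for _ in range(multiplicity) if rng.random() >= p)
    passed = failed_cards == 0
    # Every card is seen once, plus one retry per failed card.
    review_count = multiplicity + failed_cards
    return passed, review_count


if __name__ == "__main__":
    rng = random.Random(0)
    p, trials = 0.9, 100_000
    one_fail = sum(session_reviews(p, 2, rng)[1] == 3 for _ in range(trials))
    print("observed share of 3-review sessions:", one_fail / trials)
    print("2p(1-p):", 2 * p * (1 - p), " p - p^2:", p - p * p)
```

Under this independence assumption a failed session costs slightly fewer reviews on average than in the source above, so the simulated totals would come out a little lower.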
Assumptions
2027 kanji,
6300 vocab words,
477 radicals.
Kanji and vocabulary reviews each have a meaning and a reading, and count as two.
There are 17131 "items" to review in total.
4 apprentice levels,
2 guru levels,
1 master level,
1 enlightened level,
1 burn level.
There are 9 SRS levels in total.
A user does not fail a review item more than once per session.
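The totals implied by these assumptions can be checked with a few lines of arithmetic. The variable names below are only illustrative; the numbers are the ones listed above. With perfect accuracy every card is reviewed once per SRS step, and going from level 1 to level 9 takes 8 steps.

```python
KANJI, VOCAB, RADICALS = 2027, 6300, 477
SRS_LEVELS = 9

# Kanji and vocab have two cards each (meaning + reading); radicals have one.
cards = 2 * (KANJI + VOCAB) + RADICALS

# An item starts at level 1, so reaching level 9 takes 8 successful sessions.
perfect_accuracy_reviews = cards * (SRS_LEVELS - 1)

print(cards)                     # 17131
print(perfect_accuracy_reviews)  # 137048
```

The second number is the lower bound on the total: nobody can burn everything in fewer reviews than that.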
Results
These are the results after a total of 20,407,018,564 simulated reviews.
The percentage columns indicate the accuracy level. The review columns indicate the average number of reviews needed to burn everything, given the adjacent average accuracy.
| % | Reviews | | % | Reviews | | % | Reviews | | % | Reviews | | % | Reviews |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 | 137,048 | | 99 | 145,906 | | 98 | 155,534 | | 97 | 166,106 | | 96 | 177,775 |
| 95 | 190,611 | | 94 | 204,773 | | 93 | 220,687 | | 92 | 238,860 | | 91 | 258,023 |
| 90 | 281,032 | | 89 | 306,655 | | 88 | 336,041 | | 87 | 369,310 | | 86 | 408,331 |
| 85 | 453,812 | | 84 | 505,797 | | 83 | 566,916 | | 82 | 641,099 | | 81 | 729,555 |
| 80 | 834,471 | | 79 | 959,757 | | 78 | 1,114,089 | | 77 | 1,302,871 | | 76 | 1,533,677 |
| 75 | 1,822,811 | | 74 | 2,182,566 | | 73 | 2,626,532 | | 72 | 3,193,191 | | 71 | 3,916,460 |
| 70 | 4,828,658 | | 69 | 5,993,856 | | 68 | 7,517,466 | | 67 | 9,468,785 | | 66 | 12,034,521 |
| 65 | 15,398,389 | | 64 | 19,811,452 | | 63 | 25,668,246 | | 62 | 33,436,116 | | 61 | 43,932,400 |
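As a cross-check on the table, the expected number of reviews can also be computed without simulating anything. This is only a sketch under the same model as the source code above, not something taken from the original run: `expected_reviews_per_item` and `expected_total` are hypothetical names. The recursion relies on the fact that an item only ever climbs one level at a time, so the expected cost of getting from level s to level s+1 depends only on the costs of the levels below it.

```python
def expected_reviews_per_item(p, multiplicity, max_srs=9):
    """Expected reviews to take one item from SRS level 1 to max_srs (p > 0).

    Mirrors the simulation's model: a session passes with probability
    p**multiplicity; a failed session drops the item 1 level (levels 2-4)
    or 2 levels (level 5 and up), and never below level 1.
    """
    q = p ** multiplicity  # chance the whole session passes
    if multiplicity == 1:
        cost = q * 1 + (1 - q) * 2
    else:
        # Same split as review_item: 2 reviews on a pass, 3 with
        # probability p - p**2, and 4 with probability 1 - p.
        cost = q * 2 + (p - q) * 3 + (1 - p) * 4
    climb = {}  # climb[s] = expected reviews to first reach level s + 1 from s
    for s in range(1, max_srs):
        if s == 1:
            redo = 0.0
        elif s <= 4:
            redo = climb[s - 1]
        else:
            redo = climb[s - 2] + climb[s - 1]
        # climb[s] = cost + (1 - q) * (redo + climb[s]), solved for climb[s]
        climb[s] = (cost + (1 - q) * redo) / q
    return sum(climb.values())


def expected_total(p, singles=477, doubles=2027 + 6300):
    """Expected reviews to burn every item at accuracy p (a probability)."""
    return (singles * expected_reviews_per_item(p, 1)
            + doubles * expected_reviews_per_item(p, 2))


if __name__ == "__main__":
    for accuracy in (100, 95, 90):
        print(accuracy, round(expected_total(accuracy / 100)))
```

At 100% accuracy this gives 137,048, the same as the table. For lower accuracies it should land close to the simulated averages, with any remaining gap down to Monte Carlo noise from using 100 runs per accuracy level.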
Conclusion
My average accuracy is 95.61%, and I have done 112,663 reviews so far. Looking at the 95% cell I see that I will need to do 190,611 reviews to burn everything. As such I can calculate that I am 59.1% (112,663/190,611) of the way to burning everything.
You can find your own (current) total number of reviews, and your total accuracy, on www.wkstats.com (see highlighted parts of screenshot), and calculate how far you've come in your journey to burn everything.
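The progress calculation is easy to script. The sketch below is one way to do it, assuming that linear interpolation between neighbouring table cells is good enough for an accuracy that falls between whole percentages; the calculation above simply reads off the 95% cell. `TABLE` holds only a few cells copied from the results table, and `reviews_needed` and `progress` are hypothetical helper names.

```python
import math

# Average reviews to burn everything, copied from the results table.
TABLE = {97: 166_106, 96: 177_775, 95: 190_611, 94: 204_773}


def reviews_needed(accuracy):
    """Linearly interpolate the table at a fractional accuracy (percent)."""
    low = math.floor(accuracy)
    if low == accuracy:
        return TABLE[low]
    high = low + 1
    fraction = accuracy - low
    return TABLE[low] + fraction * (TABLE[high] - TABLE[low])


def progress(reviews_done, accuracy):
    """Fraction of the expected total already completed."""
    return reviews_done / reviews_needed(accuracy)


if __name__ == "__main__":
    print(f"{progress(112_663, 95):.1%}")     # 59.1%, reading off the 95% cell
    print(f"{progress(112_663, 95.61):.1%}")  # interpolated between 95% and 96%
```

With interpolation the estimate for 95.61% accuracy rises to roughly 61.6%, since that accuracy sits closer to the 96% cell than to the 95% one.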