WaniKani Estimator

Howdy!

I thought I’d share a little tool that I made for myself called WaniKani Estimator. It helps me decide how many new radicals/kanji/vocab items to learn each day so that I don’t get buried by an SRS avalanche later on. In the past I completely gave up on WaniKani because I was getting around 100 reviews per day and couldn’t keep up. From what I’ve read here, at least some other people have had this burnout experience, so hopefully my “estimator” page will be helpful!

Note that it makes a few assumptions and simplifies a few things, so it won’t necessarily match the WaniKani experience 1-to-1, but I think it is pretty useful anyway.

Enjoy!

27 Likes

A lot of time has passed, and I think the 10 lessons per day approach still works for me, even though, as it turned out, there was a bug in the estimator code that prevented it from learning any vocabulary items during the simulation!

I have updated the page to fix this bug. In addition, I tweaked the controls to make review/lesson frequency more straightforward and added an Apprentice cap setting. Now the charts match exactly what I saw in my learning:

(This was generated with max 10 lessons per day and 8% mistake rate when reviewing.)
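As a rough cross-check on those settings, here is a minimal sketch of the steady-state review load. It is not the estimator's code. It assumes every item needs its own run of reviews from Apprentice 1 to Burned, that a wrong answer costs one stage below Guru and two stages from Guru up, and that level gating and the ramp-up period can be ignored. The function names are made up for illustration.

```python
import random


def reviews_until_burn(mistake_rate, rng):
    """Count the reviews one item needs to go from stage 1 to Burned (stage 9)."""
    stage, reviews = 1, 0
    while stage < 9:
        reviews += 1
        if rng.random() >= mistake_rate:
            stage += 1  # correct answer: move up one stage
        else:
            penalty = 2 if stage >= 5 else 1  # assumed: one wrong answer per review
            stage = max(1, stage - penalty)   # never drops below Apprentice 1
    return reviews


def steady_state_reviews_per_day(lessons_per_day, mistake_rate,
                                 trials=20000, seed=0):
    """Long-run daily reviews = lessons per day x mean reviews per item."""
    rng = random.Random(seed)
    total = sum(reviews_until_burn(mistake_rate, rng) for _ in range(trials))
    return lessons_per_day * total / trials


if __name__ == "__main__":
    print(steady_state_reviews_per_day(10, 0.0))   # no mistakes: 8 reviews per item
    print(steady_state_reviews_per_day(10, 0.08))  # 8% mistake rate
```

With no mistakes this gives 80 reviews per day at 10 lessons per day, and any mistake rate above zero pushes the number higher. That load only appears once the oldest items start to burn, so the first months look lighter.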

Hope y’all will find it useful! If anything, it helped me understand how to avoid getting overwhelmed by an SRS slapback!

3 Likes

Hi, interesting. I just restarted WaniKani from scratch after getting burned out at level 24. I was always doing all lessons and reviews as soon as possible. I got to a point where I couldn’t handle it and it really felt overwhelming. So after a break of about 4 months, I decided to reset my progress and do only 5 items a day. Now I’m wondering if 10 a day might be a slightly better compromise, since 5 could be very slow.
One question: when you talk about 10 a day, do you mean literally 10 items a day, no matter what kind of item it is (radicals, kanji, or vocab)? I’ve read that some people take similar approaches but with extra rules for radicals (doing them all at once, as soon as possible).
Thanks for sharing.

3 Likes

Glad that you decided to give it another go! I completely understand where you’re coming from.

I do also have a separate rule for radicals that I didn’t include in the “Estimator” web page. Since they carry about half as much information as vocab or kanji, I just learn them all at once. That said, this only matters for the first 15-16 levels, I think? The number of radicals per level drops to <= 10 after that.

1 Like

I think I will try to do the same and pay attention early on to how much it stresses me, so that I can hit the brakes early enough.

Your tool is amazing by the way. Thanks for your effort.

3 Likes

This is really cool, thanks. I have been debating whether I should cap the number of Apprentice items because I’ve also burned out in the past. I think I’m going to keep it around 50 (it’s a busy year and Japanese isn’t the only language I’m studying, so I want to keep it fairly low), which would put me around level 20 a year from now if I can do it consistently. I’m happy with that.

This seems pretty complex. How does this work? Did you scrape the assignments/subjects to figure out which items have prerequisites?

3 Likes

I think it’s fine to be a little bit flexible with this. I started at 50, but it was a bit too slow for me, so I upped it to 75. I wanted to speed up more, so I went to 100, but that’s too fast. I’ll probably go down to 75 again soon. Flexibility and adapting to your situation is a sign of a sound mind, I think.

5 Likes

This seems pretty complex. How does this work? Did you scrape the assignments/subjects to figure out which items have prerequisites?

Indeed, the web page runs a true simulation of the process, including the assignments and their prerequisites! Closer to the bottom of the page there is an ALL variable that you can expand to see what they look like.
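To illustrate what simulating prerequisites means, here is a minimal sketch. The data layout and names are hypothetical and are not taken from the estimator page or its ALL variable. It assumes an item unlocks once its level is reached and every component it depends on has been brought to Guru 1 (stage 5) or higher.

```python
GURU_1 = 5

# Hypothetical toy data: subject id -> (level, ids of components it depends on)
SUBJECTS = {
    "radical:ground": (1, []),
    "kanji:one": (1, ["radical:ground"]),
    "vocab:one": (1, ["kanji:one"]),
}


def is_unlocked(subject_id, srs_stage, current_level):
    """True if the subject's level is reached and all its components are at Guru 1+."""
    level, components = SUBJECTS[subject_id]
    if level > current_level:
        return False
    # Subjects that have not been started yet count as stage 0
    return all(srs_stage.get(c, 0) >= GURU_1 for c in components)


if __name__ == "__main__":
    stages = {}  # nothing learned yet
    print([s for s in SUBJECTS if is_unlocked(s, stages, current_level=1)])
    stages["radical:ground"] = GURU_1  # the radical reaches Guru 1
    print([s for s in SUBJECTS if is_unlocked(s, stages, current_level=1)])
```

In this toy data only the radical is available at first, and the kanji becomes available once the radical reaches Guru 1.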

Thanks!

3 Likes

I thought this was an interesting problem, so I wrote my own version in Python. It’s pretty clunky and I haven’t checked it yet (in hindsight, it probably wasn’t the best idea to do this in one sitting without planning anything), but I had fun and learned a bit.

import math
import matplotlib.pyplot as plt
import os
import pickle
import random
import requests


class Simulation(object):
    class SimulationUnit(object):  # It would probably be more elegant if this
                                   # were a subclass of Subject, but I want to
                                   # be explicit that this is only for the
                                   # simulation, while Subjects are more general
        def __init__(
            self,
            subject,
            f_correct=lambda: True  # Accuracy function
        ):
            self.subject = subject
            self.f_correct = f_correct
            self.unlocked = False
            self.srs_level = 0
            self.next_review = math.inf

        def simulate_review(self, hour):
            # See https://knowledge.wanikani.com/wanikani/srs-stages/ for more
            # information about the penalty factor and incorrect adjustment ct.
            srs_penalty_factor = 2 if self.srs_level >= 5 else 1
            if self.f_correct():
                self.srs_level += 1
            else:
                incorrect_times = 1
                while (not self.f_correct()):  # I am not completely sure how
                                               # this works, but it seems like
                                               # it's based on the number of
                                               # reviews you miss in a single
                                               # session of review
                    incorrect_times += 1
                incorrect_adjustment_ct = math.ceil(incorrect_times / 2)
                self.srs_level -= incorrect_adjustment_ct * srs_penalty_factor
                self.srs_level = max(1, self.srs_level)  # >= Apprentice 1
                
            if self.srs_level == 9:
                self.next_review = math.inf
            else:
                self.next_review = (
                    hour + self.subject.srs_hours[self.srs_level - 1]
                )

        def simulate_lesson(self, hour):
            self.srs_level = 1
            self.next_review = hour + self.subject.srs_hours[0]

    def __init__(
            self,
            subjects,
            max_apprentice=None,
            max_lessons_day=None,
            max_reviews_day=None,
            lesson_batch=5,          # Batch size for lessons; 5 by default
            lesson_times=range(24),  # Hours at which lessons are done (24-hour)
            review_times=range(24),
            hours_to_simulate=24 * 365
        ):
        self.units = dict(
            [(subject.identity, self.SimulationUnit(subject))
             for subject in subjects.values()]
        )
        self.max_apprentice = max_apprentice
        self.max_lessons_day = max_lessons_day
        self.max_reviews_day = max_reviews_day
        self.lesson_batch = lesson_batch
        self.lesson_times = lesson_times
        self.review_times = review_times
        self.hours_to_simulate = hours_to_simulate
        self.current_level = 1
        self.update_unlocks()

    def update_unlocks(self):
        for unit in self.units.values():
            if unit.unlocked or unit.subject.level > self.current_level:
                continue  # Don't bother with unlocked items or higher levels
            # Disqualify (don't unlock) items for which the prerequisites aren't
            # completed (at least a level of Guru 1)
            meets_prereqs = True
            if any([self.units[prereq].srs_level < 5
                    for prereq in unit.subject.i_depend_on]):
                meets_prereqs = False
            unit.unlocked = meets_prereqs

    def evaluate_level_up(self):
        kanji_in_level = [num for num, unit in self.units.items()
                          if (
                              unit.subject.level == self.current_level
                              and unit.subject.classification == "kanji"
                          )
                         ]
        kanji_passed_in_level = [num for num in kanji_in_level
                                 if self.units[num].srs_level >= 5]
        if (
            len(kanji_passed_in_level) / len(kanji_in_level) >= 0.9
            and self.current_level <= 59
        ):
            self.current_level += 1
            self.update_unlocks()
            print(f"Level up: {self.current_level}")
            return True
        return False

    def get_number_apprentice(self):
        return len([unit for unit in self.units.values()
                    if unit.srs_level >= 1 and unit.srs_level <= 4])

    def fetch_lesson_unit_nums(self):
        # Sorting ensures lessons are presented in the appropriate order
        return sorted(
            [num for num, unit in self.units.items()
             if unit.unlocked and unit.srs_level == 0],
            key=lambda num: (
                self.units[num].subject.level,           # Sort first by level
                self.units[num].subject.lesson_position  # and second by index
                                                         # within the level
            )
        )

    def fetch_review_unit_nums(self, hour):
        review_units = [num for num, unit in self.units.items()
                        if unit.next_review <= hour]
        random.shuffle(review_units)  # By default, reviews shouldn't appear in
                                      # a particular order
        return review_units

    def simulate(self):
        hour = 0
        lessons_today, reviews_today = 0, 0
        level_up_hours = []
        # By default (without limits), it's possible to do lessons whenever
        # there are lessons available; any limits on this are self-imposed
        lesson_decider = lambda: len(self.fetch_lesson_unit_nums()) > 0
        # Limiting both the Apprentice count and the daily lesson count
        if self.max_apprentice and self.max_lessons_day:
            lesson_decider = lambda: (
                self.max_lessons_day - lessons_today > 0
                and self.get_number_apprentice() < self.max_apprentice
                and len(self.fetch_lesson_unit_nums()) > 0
            )
        # Limiting only the Apprentice count and doing as many reviews as
        # possible (note that you can technically go over max. Apprentice count
        # with my algorithm because I allow lessons to start even if the sum of
        # the batch size and the Apprentice count is higher than the maximum)
        elif self.max_apprentice:
            lesson_decider = lambda: (
                self.get_number_apprentice() < self.max_apprentice
                and len(self.fetch_lesson_unit_nums()) > 0
            )
        # Limiting only the number of lessons completed each day
        elif self.max_lessons_day:
            lesson_decider = lambda: (
                self.max_lessons_day - lessons_today > 0
                and len(self.fetch_lesson_unit_nums()) > 0
            )

        while hour < self.hours_to_simulate:  # There's some repetition here,
                                              # but I prefer to write things out
                                              # multiple times when it makes
                                              # the process clearer

            for hour_within_day in sorted(set(
                list(self.review_times) + list(self.lesson_times)
            )):
                if hour_within_day in self.review_times:
                    available_review_nums = self.fetch_review_unit_nums(
                        hour + hour_within_day
                    )
                    if self.max_reviews_day:
                        review_ct = max(0, self.max_reviews_day - reviews_today)
                        available_review_nums = (  # Limits reviews based on the
                                                   # user-imposed maximum
                            available_review_nums[:review_ct]
                        )
                    for num in available_review_nums:
                        self.units[num].simulate_review(hour + hour_within_day)
                    reviews_today += len(available_review_nums)
                    # Completing a review session can result in kanji moving up
                    # to Guru 1, so we need to check whether the simulated user
                    # has leveled up
                    has_leveled_up = self.evaluate_level_up()
                    if has_leveled_up:
                        level_up_hours.append(hour + hour_within_day)

                if hour_within_day in self.lesson_times:
                    self.update_unlocks()  # Makes sure newly available lessons
                                           # are represented in the first call
                                           # to lesson_decider() below
                    while (lesson_decider()):
                        available_lesson_nums = (
                            self.fetch_lesson_unit_nums()
                        )[:self.lesson_batch]  # Add lessons by batch
                        lessons_today += len(available_lesson_nums)
                        for num in available_lesson_nums:
                            self.units[num].simulate_lesson(
                                hour + hour_within_day
                            )

            hour += 24
            lessons_today, reviews_today = 0, 0
            print(f"Fraction complete: {hour / self.hours_to_simulate}")

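        if not level_up_hours:  # Nothing to plot if no level-up occurred
            print("No level-ups occurred during the simulation.")
            return

        level_up_intervals = (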
            [level_up_hours[0]]
            + [level_up_hours[i + 1] - level_up_hours[i]
               for i in range(len(level_up_hours) - 1)]
        )
        level_up_intervals_days = [each / 24 for each in level_up_intervals]
        level_up_levels = range(1, len(level_up_hours) + 1)

        fig, ax = plt.subplots()
        plt.bar(level_up_levels, level_up_intervals_days)
        plt.xlim((0, max(level_up_levels) + 1))
        plt.xlabel("Level")
        plt.ylabel("Days in level")
        plt.show()
        plt.close()


class Subject(object):
    def __init__(self, json):
        self.identity = json["id"]
        self.classification = json["object"]
        data = json["data"]
        self.level = data["level"]
        self.lesson_position = data["lesson_position"]
        self.document_url = data["document_url"]
        # Annoyingly, the SRS timings for the first couple levels are different
        # (see https://knowledge.wanikani.com/wanikani/srs-stages/ for timings);
        # I'm just going to store the timings in each Subject instance, which is
        # inefficient but makes my life much easier
        self.srs_hours = [
                  # Start at Apprentice 1 upon completing the associated lesson
            4,    # Apprentice 2
            8,    # Apprentice 3
            24,   # Apprentice 4
            48,   # Guru 1
            168,  # Guru 2
            336,  # Master
            720,  # Enlightened
            2880  # Burned
        ]
        if self.level <= 2:
            self.srs_hours[0:4] = [2, 4, 8, 24]
        # In my experience, the keys for dependencies don't appear in the JSON
        # result unless there are actually dependencies; we need to handle these
        # differently and assume they do not exist
        keys = data.keys()
        self.depends_on_me = []
        self.i_depend_on = []
        if "amalgamation_subject_ids" in keys:
            self.depends_on_me = data["amalgamation_subject_ids"]
        if "component_subject_ids" in keys:
            self.i_depend_on = data["component_subject_ids"]


def cached_subject_info_fetch(
    request_url,
    parameters,
    filename="wk_estimation_cache.pickle",
    force_refresh=False
):

    def subject_info_fetch(request_url, parameters):
        subjects = {}  # Because we don't know the number of subjects, we need
                       # to define this dynamically
        # Recursively populate the subject information
        def continue_fetch(request_url, parameters):
            response = requests.get(url=request_url, headers=parameters)
            json = response.json()
            for item in json["data"]:
                subjects[item["id"]] = Subject(item)
            # The WaniKani API uses pagination to limit the size of responses;
            # in order to fetch all subject information, we need to send another
            # GET request with next_url, included in the response (str or None)
            if json["pages"]["next_url"]:
                continue_fetch(json["pages"]["next_url"], parameters)
        continue_fetch(request_url, parameters)
        return subjects

    # This isn't completely safe, so make sure you have the permissions set up
    # properly and don't trust random files from strangers
    if os.path.exists(filename) and (not force_refresh):
        with open(filename, "rb") as cache_file:
            subjects = pickle.load(cache_file)
    else:
        subjects = subject_info_fetch(request_url, parameters)
        with open(filename, "wb") as cache_file:
            pickle.dump(subjects, cache_file)

    return subjects


if __name__ == "__main__":
    request_url = "https://api.wanikani.com/v2/subjects"
    parameters = {
        "Wanikani-Revision": "20170710",
        "Authorization": "Bearer <your read-only API token>"
    }

    subjects = cached_subject_info_fetch(request_url, parameters)
    for subject in subjects.values():
        print(subject.__dict__)

    simulation = Simulation(subjects)
    avails = simulation.fetch_lesson_unit_nums()
    for avail in avails:
        print(subjects[avail].document_url, subjects[avail].lesson_position)

    simulation.simulate()

I haven’t really tested this thoroughly.
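One piece that is easy to check in isolation is the demotion rule used in simulate_review. The sketch below restates that rule as a standalone function so individual cases can be verified by hand. The function name is made up, and it only reflects my reading of the SRS stages page linked in the code, so treat it as an assumption rather than a reference implementation.

```python
import math


def stage_after_review(stage, incorrect_times):
    """Return the new SRS stage after one review.

    stage: current stage (1 = Apprentice 1, ..., 8 = Enlightened)
    incorrect_times: number of wrong answers given during this review
    """
    if incorrect_times == 0:
        return stage + 1
    adjustment = math.ceil(incorrect_times / 2)  # incorrect adjustment count
    penalty = 2 if stage >= 5 else 1             # SRS penalty factor
    return max(1, stage - adjustment * penalty)  # never below Apprentice 1


if __name__ == "__main__":
    print(stage_after_review(4, 0))  # Apprentice 4, correct
    print(stage_after_review(5, 1))  # Guru 1, one wrong answer
    print(stage_after_review(8, 3))  # Enlightened, three wrong answers
```

Under this rule, a single wrong answer at Guru 1 drops the item to Apprentice 3, and three wrong answers at Enlightened drop it to Apprentice 4.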

Edit: The performance is horrendous relative to the script made by @indutny-wani. There are a number of places where I could avoid calculating things multiple times. It works for me, though, and I’m not a professional programmer.

Edit 2: I fixed the logic for determining review and lesson times. They were handled independently before, but lessons and reviews can now be staggered throughout the day. I think this gives a better estimate of the time required to level up, and it seems to reflect the timings I’ve seen in some of the faster level 60 celebration posts:

The fastest possible time to reach level 60 is 352 days and 20 hours.

2 Likes

Wow! This is a really cool tool! Underrated for sure, since this is the first time I’ve heard of it! Gonna recommend it on the Let’s Durtle the Scenic Route thread-group. Should help some of us quite a bit, I think!

1 Like

If you were to do every lesson and review as soon as they became available without missing anything, this is what your SRS stages would look like over time.

It’d take 352 days and 20 hours to reach level 60 and 534 days and 8 hours to burn all 9,261 subjects. You’d start burning toward the end of level 26.
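For a sense of where numbers like these come from, here is a small sketch that adds up the review intervals. It assumes the rounded interval table from the Python script earlier in the thread and no mistakes at all. It does not reproduce the full simulation, only the minimum time a single item needs.

```python
# Interval table copied from the Python script earlier in the thread (hours).
SRS_HOURS = [4, 8, 24, 48, 168, 336, 720, 2880]
# Levels 1 and 2 use shorter Apprentice intervals in that script.
FAST_SRS_HOURS = [2, 4, 8, 24] + SRS_HOURS[4:]

GURU_1, BURNED = 5, 9


def hours_to_reach(stage, srs_hours=SRS_HOURS):
    """Minimum hours from finishing a lesson (stage 1) to `stage`, with no mistakes."""
    return sum(srs_hours[:stage - 1])


def as_days_hours(hours):
    """Split a number of hours into (days, remaining hours)."""
    return divmod(hours, 24)


if __name__ == "__main__":
    print(as_days_hours(hours_to_reach(GURU_1)))                  # (3, 12)
    print(as_days_hours(hours_to_reach(BURNED)))                  # (174, 12)
    print(as_days_hours(hours_to_reach(GURU_1, FAST_SRS_HOURS)))  # (1, 14)
```

Under these assumptions an item needs at least 3 days and 12 hours to reach Guru 1 and 174 days and 12 hours to burn. Subtracting that burn time from the 534 days and 8 hours quoted above gives 359 days and 20 hours, which is exactly 7 days after the quoted level 60 time. That would fit the last lessons being done about a week into level 60, if the quoted figures were produced with the same interval table.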

1 Like

I don’t think this is entirely accurate anymore since the “today’s lessons” feature was added (but it’s still very close).

The reason I think that is that the chart of lessons done per day dips below 15 on some days. I assume that’s because in the old lesson order, you could run out of lessons waiting for the second set of kanji to be unlocked. The new lesson picker is pretty good at making sure that never happens at 15 lessons/day.

2 Likes