About that workload graph... Is it correct?

Here is a better version of the workload chart btw, with a description of what we are seeing on the x and y axes:

image

I do, however, agree that calling this chart the “workload” chart is misleading, since it shows the number of items in the SRS, not the number of reviews. We would have to create a chart that is also based on hours instead of weeks to see how different our charts look from the original workload chart. Everything looks more drastic when it starts and ends at 0.

Ah, so the original graph was just the number of items in the system. Thank you, I was not aware of that. When I tried to do it by hours it just looked spiky. Even days looked spiky actually, so dropping to weeks was a necessary smoothing adjustment.
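For anyone wanting to try the same smoothing, here is a minimal sketch of binning review times into weeks. It assumes each review is recorded as hours since starting; the function name and the example times are made up for illustration and are not taken from the script in this thread.

```python
from collections import Counter

HOURS_PER_WEEK = 24 * 7

def reviews_per_week(review_hours):
    """Count reviews per week, given each review's time in hours since starting."""
    counts = Counter(int(h // HOURS_PER_WEEK) for h in review_hours)
    n_weeks = max(counts) + 1 if counts else 0
    # Weeks with no reviews are kept as explicit zeros so the x-axis has no gaps.
    return [counts.get(week, 0) for week in range(n_weeks)]

# Hypothetical review times (hours since the first lesson), not real data.
example_hours = [4, 12, 35, 82, 250, 586]
weekly = reviews_per_week(example_hours)
```

Swapping `HOURS_PER_WEEK` for 24 gives the per-day version, which is where the spikiness shows up.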

spiky

Yeah, I think the chart that we now know as the “workload chart” was also simplified to make a point. Probably by someone who was slightly annoyed by the always-repeated question of “why is WaniKani so sloooooow in the beginning?” (which I think is better these days because the intervals were not always shortened in the first two levels?).

But it would have been nice to have labels for the axes at least. Charts without axis labels are evil! :see_no_evil:

At least it’s accurately inversely proportional to a user’s free time

Agreed.

Here’s one last graph then I really should get back to studying. I added a small chance of failure (1% at apprentice decaying to 0.1% for enlightened).

So these are a little random now of course, but it has this crazy dip where the same radical kept failing to get out of apprentice and dropping back by three days. But the overall shape remains similar, more or less. Thank you for the input.
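A rough sketch of how a per-stage failure chance like that could be set up. The geometric decay between the four stage groups is my assumption (the post only gives the two endpoints), and the names `fail_rates` and `review_fails` are hypothetical, not from the actual script.

```python
import random

GROUPS = ["apprentice", "guru", "master", "enlightened"]

def fail_rates(start=0.01, end=0.001, groups=GROUPS):
    """Geometrically interpolate a failure chance from the first group to the last."""
    steps = len(groups) - 1
    ratio = (end / start) ** (1 / steps)  # assumed decay shape, not from the source
    return {group: start * ratio ** i for i, group in enumerate(groups)}

def review_fails(group, rates, rng=random):
    """Return True if a simulated review in this group is answered incorrectly."""
    return rng.random() < rates[group]

rates = fail_rates()  # 1% at apprentice down to 0.1% at enlightened
```

With rates this low, a single unlucky item failing repeatedly is enough to produce a visible dip, which fits what the graph showed.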

None of these charts look right; something seems significantly off as a measurement of workload. That, or the data is poorly presented. As an analyst, I find completely unlabeled charts a pretty big pet peeve…

Also, I think it’s a poor assumption that your accuracy will keep getting better and better until it reaches, say, a 0.1% failure rate at burn reviews. That’s a total fantasy IMO. Burn reviews are hard, and it’s a twofold effect.

One, it’s a test of whether you truly remember the information after a long span. Very easy for little bits of rendaku or odd on/kun combination readings to bite you. Two, while you may truly conceptually know the meaning of a kanji, remembering the specific answer WK expects is another story. You’ll find yourself putting in your own synonyms quite often.

I think I have at least average if not better than average accuracy or progression through WK, and I’d be thrilled if my burn reviews had as low as a 10% miss rate. I’m sure it’s quite a bit higher than that. The number of extra cards it drops down to the Guru level when you get them wrong absolutely makes a step change in your workload.

I understand your concern about the lack of labels. I made my graphs for me, but I should’ve been clearer in explaining what the axes represented. The X-axis in my graphs is weeks since starting and the Y-axis is the total number of reviews and lessons per week.

So for better representation of workload I would first be trying to estimate the relative work involved in different items. Lessons are more work than reviews and kanji and vocab are more work than radicals, so that would likely be a step better.

The fail rates were an experiment to see if the overall shape was affected. I can change the numbers, but it seems likely that different people learn in different ways. I could perhaps keep the fail rates at zero through apprentice to get rid of the weirdness that occurs when it runs out of new items (not that that can’t happen, it just seems like a distraction) and see how a 10% fail rate on the final stage affects the long tail.

It’s a model though, nothing more. I appreciate knowing that it doesn’t reflect your experience.

Feel free to use my accuracy levels:

This is 98.68% accuracy across all items; the only reason it’s this high is because this is my third time doing levels 1-20. My first time, my accuracy was much lower.

Okay, based on those ratios:

The overall shape is still very spiky. Does this match your experience? Were there weeks where your reviews dropped off a lot?

Quoting from my WK Guide, which is the main advertiser of this graph. I wrote the imperfections of it there, so that everything was clear. I also wasn’t the one creating it btw :slight_smile:

If anyone gets to build a better one, @ me and I’ll consider adding it to the guide :slight_smile:

How are you measuring this? By sides of a card (as in, getting meaning right and reading wrong gives 50% accuracy) or by items (getting just 1 side wrong gives 0% accuracy)? If the former, I’d say 80-90% might be a good average. If the latter… 70-80% :thinking:

I’m calculating the all-around accuracy by answers I provide, not by items. So for kanji, that’s 2 answers; if I get meaning wrong but reading correct, that’s counted as 50% accuracy.
image
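To make the two ways of counting concrete, here is a small sketch. The session data is invented for illustration; each tuple holds the answers for one item, with two answers for kanji and vocab and one for radicals.

```python
def accuracy_by_answers(reviews):
    """Fraction of individual answers (meaning or reading) that were correct."""
    answers = [ok for item in reviews for ok in item]
    return sum(answers) / len(answers)

def accuracy_by_items(reviews):
    """Fraction of items where every answer was correct."""
    return sum(all(item) for item in reviews) / len(reviews)

# Hypothetical session: (meaning, reading) for kanji/vocab, (meaning,) for radicals.
session = [(True, True), (False, True), (True,), (True, False)]

by_answers = accuracy_by_answers(session)  # 5 correct answers out of 7
by_items = accuracy_by_items(session)      # 2 fully correct items out of 4
```

The same session scores noticeably lower when counted by items, which is why the two averages quoted above differ by roughly ten points.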

Would you say they’re… the Axis of Evil? :stuck_out_tongue:

Not exactly. This is from my activity tracking spreadsheet, and keep in mind that I emptied my review queue daily (except for 2 days):

Number of items (reading+meaning, meaning only for radicals) I answered + number of lessons I did:


You’ll notice that I slowed down around level 15. I got into shakier territory + the number of daily reviews was getting to me. You can also tell that even though I slowed down, my workload didn’t change drastically.

Number of reviews the next day + available lessons:

ETA: You should keep in mind that I don’t do all my lessons as soon as they’re available. My brain is not big enough. (Most people on the forum don’t do them ASAP – sometime around level 5 it becomes unfeasible for a lot of people, including me.)

Okay, so I limited my model to 20 lessons per day and used your progression ratios and here’s what I got:

image

The overall contour is pretty close to yours actually. Mostly level on a week-by-week sort of basis but with a sharper dip here and there. This is encouraging, thank you.

Yep, that looks about right.

It’s going to get worse though, because as soon as I hit level 21, I’ll be in new territory and with a lot of new errors. You also have to keep in mind that the more advanced SRS stages feel more difficult, and also that the more levels and items under a person’s belt, the more leeches (items that bounce from high SRS stage to low to high to low to high to low …) they accrue.

I do think that the original graph isn’t all that wrong, however. Here’s the graph that compares my active items (apprentice to enlightened) to burnt items:

From here on, the active items part is going to get a little lower, and then probably stay the same until the end – it’ll depend on how many lessons I do and what my accuracy is like, though.

Okay, I’m done with these for now. I’m sharing all the source files here in case anyone wants to take it anywhere as well as screenshots of my final graphs and the following explanation.

I added weightings so that “activity” no longer just means reviews and lessons. Lessons are weighted twice as heavily as reviews and vocabulary and kanji are weighted twice as heavily as radicals. So those big spikes at the end are because for a few levels the new material becomes very vocabulary heavy, hitting both those weightings. Hopefully this is closer to what “workload” should actually mean. Also it smooths the start a little which was previously very high because it had disproportionately many radical reviews.

Secondly I modified it so that the user is definitely inactive (sleeping) for a consistently spaced 8 hours per day, just to be sure nothing weird was going on there. I don’t think it changed much, maybe extended the whole thing by a week.

Thirdly I’m using the progression ratios from @konekush but I’m overriding them for apprentice radicals and kanji to avoid weirdness in the overall advancement caused by insufficient learning items. I don’t think that’s necessarily realistic, just a compromise to make the graph more broadly representative. Otherwise the question will always be “why are there huge dips”.
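The weighting described above (lessons twice as heavy as reviews, kanji and vocabulary twice as heavy as radicals) can be sketched roughly like this. Multiplying the two weights together is my reading of “hitting both those weightings”; the names and the example events are hypothetical and may not match the shared script.

```python
# Weights from the description above; combining them by multiplication is an assumption.
ACTIVITY_WEIGHT = {"lesson": 2, "review": 1}
ITEM_WEIGHT = {"radical": 1, "kanji": 2, "vocabulary": 2}

def weighted_workload(events):
    """Sum the weights of (activity, item_type) events, e.g. for one week."""
    return sum(ACTIVITY_WEIGHT[activity] * ITEM_WEIGHT[item_type]
               for activity, item_type in events)

# Hypothetical week: one vocab lesson, one radical review, one kanji review.
week = [("lesson", "vocabulary"), ("review", "radical"), ("review", "kanji")]
total = weighted_workload(week)  # 4 + 1 + 2
```

Under this scheme a vocabulary lesson counts four times as much as a radical review, which would explain why vocabulary-heavy levels produce the big spikes.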

HTML Graphs + Python Script

Thank you for the input, everyone!

@jprspereira I’m not sure if these graphs would be better for your guide or not. They certainly seem to require a fair amount of explanation but then so does the one you are using. But I’ll leave that up to you.

Would it be difficult to do a graph with 90, 80 or 70% success rates on reviews? That would be really interesting!

Good work though, thanks! It’s really interesting!

That depends on whether the answers a person gets wrong are mostly kanji or vocab, and on the SRS stage at which the fails tend to happen, since certain categories don’t necessarily affect leveling up but do affect workload. Or maybe this is no issue at all; I am no statistician, nor do I know anything about making these sorts of graphs. It just sounds more complicated to me.

If you want complicated, why not some manner of applet that lets you pick accuracy levels in each category on sliders or whatever, then generate the corresponding graph on the fly? :slightly_smiling_face:

I do believe what was intended was 90% (et cetera) accuracy across the board.
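As a very rough sketch of what “across the board” accuracy does, the snippet below counts how many reviews one item needs before it is burned. It ignores timing entirely, so it is not a workload graph. The stage numbering and the drop rule (one stage below guru, two stages from guru up) are my understanding of the SRS and should be treated as assumptions, as should every name in the code.

```python
import random

BURNED = 9  # assumed stages: 1-4 apprentice, 5-6 guru, 7 master, 8 enlightened

def reviews_until_burned(accuracy, rng):
    """Simulate one item and count reviews until it is burned."""
    stage, reviews = 1, 0
    while stage < BURNED:
        reviews += 1
        if rng.random() < accuracy:
            stage += 1
        else:
            # Assumed penalty: drop one stage below guru, two stages from guru up.
            stage = max(1, stage - (2 if stage >= 5 else 1))
    return reviews

def mean_reviews(accuracy, items=2000, seed=0):
    """Average reviews per item over many simulated items (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(reviews_until_burned(accuracy, rng) for _ in range(items)) / items

for accuracy in (0.9, 0.8, 0.7):
    print(f"{accuracy:.0%} accuracy: {mean_reviews(accuracy):.1f} reviews per item")
```

With perfect accuracy every item takes exactly 8 reviews, so anything above that is the extra work caused by misses. Per-category sliders would only need a separate `accuracy` for each item type.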
