Wildly inflated accuracy

WK seems to believe I am far more accurate than I deserve. I will explain what I mean, and give details of what I do on the site, below. I couldn’t find anything relevant, but sorry if it’s a common question / answer.

For starters, this is ultimately based on what displays for me in WK Stats. Here is a screenshot:

While that level of accuracy would be awesome, there is z e r o % chance that I really am that accurate. I know that WK Stats is not WK, though, so I always just assumed there was an issue with my stats transferring. However, today, I went to my review page on the WK website for the first time in a while (more on that below), and it summarized my last review session as 1,000-some correct, and only 22 wrong. Now… I mean, no. Couldn’t be me.

Details of Usage:

-I use no userscripts on my own via a browser. (Not being a weird stickler, I just don’t understand how to use them and don’t care to figure it out).
-I do use Tsurukame app for iOS a lot. The app has userscripts built in.
-On Tsurukame, I will frequently manipulate the number of lessons displayed, and I use the app to reorganize my reviews (i.e. show me the ones from the current level first).
-I almost never use any other feature for “cheating,” such as “Ask Me Again Later.” For typos, like if I accidentally hit Enter when I’m still typing an answer and it gets marked wrong, I will use the feature. But that is not very common, and it simply causes the question to reappear in reviews. As far as I can tell, if you treat those typos like they never happened, my accuracy % is still correct. When I finish a session, it is 88-92% usually. Way lower on bad sessions. Occasionally higher if it’s just a batch of new words I remember from a few hours ago.

Is any of that commonly a trigger for messed up accuracy?
The reason I noticed it’s not limited to WK Stats is because reviews on Tsurukame are “saved up” and displayed on your Review Summary screen the next time you do reviews on the WK site. (Example: At 8 am, I do 20 reviews normally, on the WK site. I finish. It tells me I got 0 wrong, 20 right. After that, at 9 am, I use Tsurukame to do 20 reviews–5 wrong, 15 right. At 10 am, I use Tsurukame again for 20 reviews–0 wrong, 20 right. Later, I get back on the WK website and go to do some reviews. The Reviews Summary screen (where you go to click “Start Session”) tells me that my last session was 40 reviews–5 wrong, 35 right. Basically, it takes all the Tsurukame reviews I’ve done since the last time I used the website and shows me the combined total.)

So when I used Tsurukame exclusively for a while, and loaded the Reviews Summary today, it showed me that I got 22 wrong, in the same period that I got over 1,000 correct. That is simply… impossible. I have been so bad during the last 2 levels.

Here is some more detail if you’re intrigued / inclined:

-From January-May 2020, I covered about a level per week, and did maintain extremely high accuracy (upper 90’s).
-Then, I reset from Level 22 or 23 back to 20.
-From roughly June through September 2020, I was very busy at work and took a break.
-Ever since I re-started in October with daily reviews and lessons, I am proceeding more carefully (slowly) through lessons, but my accuracy is way lower. When I complete a review session, I feel good if it’s 90%.
-I have now done about as many levels at this new, lower accuracy as I did at the old one, yet my stats have remained very high.

Sorry for length, appreciate your thoughts.

Just to be clear, when you talk about your “end of session” accuracy, is that what shows up on the review summary page and not the percentage that was showing in the corner when you were on the last item still?

Review summary percentage is “correct items / total items.” It will never be higher, and is usually lower, than any calculation that is “correct reviews / total reviews,” since an item can consist of multiple reviews (such as meaning and reading) and you need to get all of them right for it to be a “correct item.”

So if you’re used to looking at correct item percentage, correct review percentage will be higher.

For example, you do 10 vocab items. On one of them, you get the meaning wrong one time, but get it right after that.

You got 90% (9/10) on items. That is what will show on the summary page.

You got 95.2% (20/21) on reviews. That is what shows on the stats page.
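If it helps to see the arithmetic, here is a minimal Python sketch of the two percentages for that example. The function name and the tuple layout are made up for illustration, and it assumes every vocab item has one meaning and one reading question that repeats until you answer it correctly. It is not how WK or WK Stats actually computes anything internally.

```python
def session_accuracy(items):
    """items: list of (meaning_misses, reading_misses), one tuple per vocab item.

    Assumption: each question is repeated until answered correctly, so every
    item contributes exactly 2 correct answers plus however many misses.
    """
    total_items = len(items)
    # An item only counts as correct if it had no misses at all.
    correct_items = sum(1 for m, r in items if m + r == 0)
    correct_answers = 2 * total_items
    total_answers = correct_answers + sum(m + r for m, r in items)
    return correct_items / total_items, correct_answers / total_answers


# 10 vocab items, one meaning missed once, everything else right first time.
items = [(1, 0)] + [(0, 0)] * 9
item_pct, answer_pct = session_accuracy(items)
print(f"items: {item_pct:.1%}, answers: {answer_pct:.1%}")
# prints: items: 90.0%, answers: 95.2%
```

Adding more misses to the same item lowers the answer percentage a little each time, while the item percentage stays at 90%.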

I suppose if I were good at visualizing numbers I wouldn’t be in this situation…

When I say that I usually get 88-92% for a review session, but sometimes worse, I mean the number that displays on the “Review Summary” at the end. For example, I just did reviews for 24 vocab words. While I was doing them, the number in the corner said 94% at the end. The Review Summary that displays after said it was: “Correct Answers: 87% (21/24).”

I have always understood the “corner %” and the “after %” were based on different things, as you described.

I am still having trouble seeing how I would have those numbers on my WK Stats in any case. And certainly, no matter how it’s measured 22 / 1,000+ is not something I achieved.

Clearly, it’s not the end of the world, as using a 3rd party site to see my stats isn’t the point, and has no impact on my learning. Maybe there’s no error and I am bad at visualizing averages (which is true to some extent either way).

(You added the examples at the end of your post after I had started replying. With those numbers, I can see how it would be possible that the stats have stayed really high, and I’m just bad at conceiving of it. But I’ve been convinced it’s wrong for so long and every review session in the 70s and 80s for the items has reinforced it. I guess math is crazy and I will delete this pending any other comments).

The reason you’re seeing a difference is because there are actually 2 accuracy measures: one reporting the total number of answers correct, and one reporting the total number of items correct. The first one counts meanings and readings separately, whereas the second one counts them as a single item. The first number is often higher, since unless you get both the meaning and the reading wrong, or mess up the same item a lot of times during a single session, you’ll still get part of the answer right.

WKstats reports the first, higher number, which is the same number you see while doing your reviews. After a review, the summary screen reports the second, lower number, hence the difference.

As an example, say you get (ひと) as a vocabulary review. If you mess up the reading once but get the meaning right in one go, you’ll have given 3 answers (2 readings + 1 meaning) and gotten 2 of them correct (the meaning, and the second reading attempt), so the first accuracy measure will be about 67%, whereas you got 0 items right in total, so your end-of-review accuracy will be 0%. The first number often seems a lot higher than what you’d expect from your accuracy.

The final thing that throws off the total reported on WKstats is history. Since it counts reviews from all the way back during the first levels, most people will have a bit of a “buffer” of sorts against mistakes if they completed the earlier levels with a high accuracy. The further you get the less likely you are to see your accuracy fluctuate. So even if you’re scoring really low for a month, if you had a decent accuracy all the months before that, you’re unlikely to see it vary more than a percentage point or 2.
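To put rough numbers on that “buffer,” here is a small Python sketch. The counts are hypothetical, picked only to illustrate the effect, and the function name is made up. It just pools correct and total answers across periods, which is an assumption about how a lifetime percentage behaves, not a description of WKstats’ code.

```python
def cumulative_accuracy(periods):
    """periods: list of (correct_answers, total_answers) tuples."""
    correct = sum(c for c, _ in periods)
    total = sum(t for _, t in periods)
    return correct / total


# Hypothetical history: 20,000 answers at 97%, then 2,000 answers at 85%.
early = (19_400, 20_000)
recent = (1_700, 2_000)

before = cumulative_accuracy([early])
after = cumulative_accuracy([early, recent])
print(f"before: {before:.1%}, after: {after:.1%}")
# prints: before: 97.0%, after: 95.9%
```

Even though the recent stretch is 12 points worse, the lifetime figure moves by only about 1 point, because the recent answers are a small share of the total.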

All in all, while the accuracy is a rough indication of how well you’ve been doing since you started, it’s not that informative of a measure by itself.

By the way, those numbers are ignoring a lot of things, since they are aggregated over a long period of time. Let’s say there’s an item you reviewed five times. You got it wrong four times and finally got it right. The aggregate summary will only show it once, as correct.
It’s not hard to have 22 wrong out of a lot of correct reviews, since a lot of wrong items just got overwritten in the overview.

Tl;dr the summary page does not give the full data when used with a 3rd party app like Tsurukame.
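A rough Python sketch of that overwriting effect, assuming the summary keeps only the latest result per item as described above. The event list, item names, and function name are made up for illustration, and this is not the actual WK or Tsurukame logic.

```python
def summarize(events):
    """events: chronological list of (item, was_correct) review results.

    Assumption: the summary keeps one entry per item, and a later result
    overwrites an earlier one.
    """
    latest = {}
    for item, was_correct in events:
        latest[item] = was_correct
    right = sum(1 for ok in latest.values() if ok)
    wrong = len(latest) - right
    return right, wrong


# item_a is missed four times across sessions, then finally answered correctly.
events = [("item_a", False)] * 4 + [
    ("item_a", True),
    ("item_b", True),
    ("item_c", False),
]

right, wrong = summarize(events)
raw_wrong = sum(1 for _, ok in events if not ok)
print(f"summary: {right} right, {wrong} wrong (raw misses: {raw_wrong})")
# prints: summary: 2 right, 1 wrong (raw misses: 5)
```

Five raw misses shrink to one “wrong” item in the summary, which is the kind of gap that could make 22 wrong out of 1,000+ look plausible.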

Thank you to all three of the very kind people who have taken the time to add important information to the topic…

Especially with the realization that the reviews summary page would overwrite previously wrong answers when I eventually get it right in a different session… I suppose my high-B accuracy-per-question could just be displayed as a solid A-per-item for the various reasons discussed.

Congrats. I applaud your full disclosure, but I never really understand these types of posts. Why do you need to explain, or justify, to anybody but yourself that you are not “cheating” when doing WK?

We are all in it to learn Japanese and in the end the actual accuracy has nothing to do with our ability to speak, read, understand Japanese. Just learn at your own pace, using whatever techniques work for you and enjoy it.

Sorry to be so candid, and nothing personal, but I’ve been seeing too many of these types of posts lately around here and they seem too much of a humblebrag to say it nicely;) I guess I need to take a break from the WK forums for a little :rofl:

Peace and out :facepunch:

Yeah, I mean, I fully recognize that’s how this reads now. However, please try reading my original post charitably–I get like 70-80% right when I do reviews (in the way I understood it), and I truly believed that WK had been malfunctioning for me in calculating the stats. I was curious as to why it was happening, not whether it was happening.

I acknowledged early on that it really doesn’t matter if it’s miscalculating it or not, but it felt like it was worth asking after 15 months when I saw the 22-1,000 screen. I was detailing how I use the app in case the features could explain the miscalculation.

The community of people learning Japanese is notorious for being unnecessarily competitive (there are lots of threads about it on here, even in my short time), and it annoys me as much as it appears to annoy you. I have posted many times about learning to ignore ex-pats who always try to one-up. Sorry if this raised your alarm–I truly don’t think there’s a wrong way to learn Japanese, and I can promise you I did not set up an elaborate ruse to brag about my review accuracy without userscripts.

It looks like I might have “assumed”, or read too much into your post. That’s on me. I apologize for not reading your initial post thoroughly, and a bit of “snarkiness” in my reply. Keep up the good work👍
