New Combined User Statistics!

Woo, stats! :partying_face:

I recently had an idea for how to get more in-depth, publicly available data on WaniKani users. Here are the aggregate statistics that came out of analyzing data from around 89000 WaniKani users.

You should probably take these with a grain of salt: the data could be skewed because I can only include users who have a WaniKani community account (I don’t know whether that really changes anything, but you never know).

Difficulty Of Items Per Level

The difficulty in the following charts is the number of people who have an item on their wall of shame, divided by the number of people who are on that item’s level or above. This way, the difficulty measures how many of the people who have already seen the item struggle with it.
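As a rough sketch of that calculation (not the actual script; the function name and all numbers are made up for illustration):

```python
def item_difficulty(item_level, shame_count, users_per_level):
    """Share of users on item_level or above who have the item on their wall of shame."""
    # Everyone on the item's level or higher has already seen the item.
    seen_by = sum(n for lvl, n in users_per_level.items() if lvl >= item_level)
    return shame_count / seen_by if seen_by else 0.0


# Hypothetical user counts per level and a hypothetical wall-of-shame count.
users_per_level = {1: 500, 2: 300, 3: 200, 4: 100}
print(item_difficulty(3, 60, users_per_level))  # 60 / (200 + 100)
```

Dividing only by the people who have reached the item keeps early-level items from looking easy just because most users are on low levels.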

Kanji

Here are the most difficult kanji of each level:

Kanji with big spikes, meaning they are very difficult, are:

  • 午: noon, lvl 3 (probably because it looks similar to 牛/cow from that same level)
  • 未: not yet, lvl 7 (looks similar to 末/end)
  • 評: evaluate, lvl 21 (???)
  • 莫: endless, lvl 25 (the radical 莫 is called greenhouse)
  • 又: again, lvl 51 (the radical 又 is stool; very big spike, maybe because it was lvl 2 some time ago)
  • 慕: yearn for, lvl 60 (???)

It seems that the free levels still have the biggest spikes. Probably because the users there still have to get used to the system?

Vocab

Here are the most difficult vocabulary items of each level:

These are the difficult vocab items with some of the highest spikes:

  • 生: fresh, lvl 3 (unfitting meaning because kanji 生 means “life” and different reading; very high spike)
  • 行う: to perform something, lvl 6 (kanji 行 means “going”)
  • 都合: one’s convenience, lvl 12 (very strange meaning because kanji are 都/metropolis and 合/suit)
  • 評判: reputation, lvl 21 (weird reading ひょうばん and meaning because kanji are 評/evaluate and 判/judge; very high spike)
  • 滞る: to be overdue, lvl 35 (probably a mix of kanji 滞 meaning “stagnate” and the long ass reading とどこおる)
  • 惨敗: crushing failure, lvl 53 (really don’t know about this one???)

Again, earlier levels have the most soul-crushingly difficult vocab according to this. But I can confirm that 都合 and 行う haunted me for a long time; I still have 評判 only at around guru or so :sweat_smile:

Difficult Items For Users Of A Certain Level

These differ from the charts above in that they analyze the levels of the users who find an item difficult instead of the level of the item itself. That means one item can come up multiple times, because users on multiple levels can find it difficult. This time the height of the graph represents the number of people having difficulties with that item.

Kanji


The y-axis is logarithmic, meaning each horizontal line represents a tenfold increase rather than, say, plus 10. The different colors show whether or not a kanji was introduced on the same level where it shows up as difficult.

Linear Kanji Graph (Not Logarithmic)

This shows that 午 is a difficult kanji all the way up to level 60 (it occupies the most levels on this graph) because of its very close resemblance to the cow kanji, which I can confirm has troubled me too.
In comparison, kanji like 評 that aren’t inherently difficult but are just a little unintuitive don’t take long to get sorted out (here it only lasts as the most difficult kanji for one level).

Vocab


The y-axis is logarithmic, meaning each horizontal line represents a tenfold increase rather than, say, plus 10. The different colors show whether or not a vocab item was introduced on the same level where it shows up as difficult.

Linear Vocab Graph (Not Logarithmic)

生 (なま) seems to be consistently the most difficult vocab item. Still, this chart is a bit more diverse than its kanji counterpart.

Reading/Meaning Analysis

For these stats I looked at how many times a user has had to review a difficult item’s reading and meaning and split them into separate numbers. As before, I scaled this by the number of people who have already seen the item to make the difficulty uniform.
Although I have the concrete numbers I will hide them because they don’t really tell you anything useful. With that out of the way, let’s get started! :grin:

Kanji

The meaning and reading difficulty for selected kanji:

From this we can see that the reading mnemonic for 了/finish didn’t stick that well for many people. In the beginning the りょう/row boat mnemonics (as used for 了) didn’t work for me either, but after many such readings I now automatically connect row boats to the sound りょう.

I think the kanji analysis here isn’t as interesting as the vocab one, but here are some more difficult kanji anyway:

1 2 3 4 5 6 7 8 9 10
SINGLE *
TOTAL
READING
MEANING

* for “SINGLE” I divided the number of times this item occurred only by the number of users on that single level, rather than also including all the levels above
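To make the difference between “SINGLE” and “TOTAL” concrete, here is a small sketch under assumed numbers (the function and the user counts are hypothetical, not taken from the spreadsheet):

```python
def difficulty(shame_count, item_level, users_per_level, single=False):
    """Scale a wall-of-shame count by the users who could have seen the item."""
    if single:
        # SINGLE: only users currently on the item's own level.
        denominator = users_per_level.get(item_level, 0)
    else:
        # TOTAL: users on the item's level and on every level above it.
        denominator = sum(n for lvl, n in users_per_level.items() if lvl >= item_level)
    return shame_count / denominator if denominator else 0.0


# Hypothetical user counts for a few levels.
users_per_level = {20: 400, 21: 250, 22: 150, 23: 200}
single = difficulty(50, 21, users_per_level, single=True)  # 50 / 250
total = difficulty(50, 21, users_per_level)                # 50 / (250 + 150 + 200)
print(single, total)
```

Because the “SINGLE” denominator is smaller, items that are mostly a problem while you are on their level rank higher there than in “TOTAL”.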

Vocab

The meaning and reading difficulty for selected vocab items:

Especially the meaning for 評判 seems to cause many problems, even though I think the reading is also difficult to remember. There are also some numbered day words, such as 八日, up there on the difficulty list for reading as they are often irregular.

Here are some more difficult vocab items:

1 2 3 4 5 6 7 8 9 10
SINGLE * 評判 疎か 滞る 歳暮 哀悼 惨敗 奔放 呆気 募る
TOTAL 評判 用いる 公用 外す 入力 都合 惨敗
READING 評判 疎か 哀悼 逃がす 入力 八日 大した 用いる
MEANING 評判 哀悼 陥落 逃がす 大した 公用 疎か 入力

* for “SINGLE” I divided the number of times this item occurred only by the number of users on that single level, rather than also including all the levels above

Other Item Stats

Here are just some more statistics that didn’t fit into the other categories.
The difference between “NORMAL” and “PERCENT” in the following stats is that “NORMAL” considers just the total number of people who find an item difficult, while “PERCENT” is the average percentage of correct reviews out of total reviews for that item. That’s why “PERCENT” has more of the items that are specifically hard for a single level.
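A minimal sketch of the two rankings, assuming one record per user and difficult item (the record layout and the numbers are invented for this example):

```python
from collections import defaultdict


def rank_items(entries, top=10):
    """entries: (item, correct_reviews, total_reviews), one per user who finds the item difficult."""
    counts = defaultdict(int)       # NORMAL: how many people find the item difficult
    percents = defaultdict(list)    # PERCENT: each person's share of correct reviews
    for item, correct, total in entries:
        counts[item] += 1
        if total:
            percents[item].append(100 * correct / total)
    average = {item: sum(p) / len(p) for item, p in percents.items()}
    normal = sorted(counts, key=counts.get, reverse=True)[:top]
    percent = sorted(average, key=average.get)[:top]  # lowest average = hardest
    return normal, percent


entries = [("生", 5, 10), ("生", 6, 10), ("生", 7, 10), ("諮る", 1, 10)]
normal, percent = rank_items(entries)
print(normal)   # ranked by number of people
print(percent)  # ranked by average percentage correct
```

An item that only a few people struggle with, but struggle with badly, can top “PERCENT” while barely registering in “NORMAL”.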

Without Free Level Users

Analysis of only those users who are past the free levels (1-3).

KANJI 1 2 3 4 5 6 7 8 9 10
NORMAL TOTAL
NORMAL READING
NORMAL MEANING
PERCENT READING
PERCENT MEANING
VOCAB 1 2 3 4 5 6 7 8 9 10
NORMAL TOTAL 用いる 入力 大した 八日 公用 少ない 外す
NORMAL READING 八日 大した 入力 用いる 公用 人生 少ない
NORMAL MEANING 公用 大した 入力 用いる 八日 少ない 人生 評判
PERCENT READING 人生 複雑 覆面 読む 諮る 警察官 資本 足跡 身振り
PERCENT MEANING 無言 熱心 研究 童話 落ちる 見直す 記念日 豊か 遠く

Only (Mainly) Fast Level Users

Analysis for users that are level 50 or over. This is around the time where the fast levels kick in.

KANJI 1 2 3 4 5 6 7 8 9 10
NORMAL TOTAL
NORMAL READING
NORMAL MEANING
PERCENT READING
PERCENT MEANING
VOCAB 1 2 3 4 5 6 7 8 9 10
NORMAL TOTAL 公用 大した 入力 用いる 八日 評判 惨敗 広げる
NORMAL READING 公用 評判 哀悼 疎か 大した 八日 逃がす 入力
NORMAL MEANING 公用 哀悼 評判 陥落 大した 人生 逃がす 観測
PERCENT READING 人生 浸透 結果 顧問 建前 火山 発射する 終える 総体的 見当たる
PERCENT MEANING 露呈 観測 謙虚 広げる 女子 十日 怠る 少女 衝突 上る

Difficult Kanji In Vocab

These are the kanji of difficult vocab items. For this I went through all of the difficult vocab items and split them into their kanji. Then I added the difficulty of each vocab item to every kanji it is made up of. So basically, this is the difficulty of vocab items that have that kanji as a part of them. Here are the hardest of them:
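Sketched in code, the accumulation could look roughly like this. The difficulty values are invented, the kanji check only covers the main CJK block, and counting a kanji once per word is my assumption, not something stated above:

```python
from collections import defaultdict


def is_kanji(ch):
    # Assumption: the main CJK Unified Ideographs block is enough for this sketch.
    return "\u4e00" <= ch <= "\u9fff"


def kanji_difficulty(vocab_difficulty):
    """Add each vocab item's difficulty to every kanji that appears in it."""
    totals = defaultdict(float)
    for word, difficulty in vocab_difficulty.items():
        for ch in set(word):  # assumption: a repeated kanji counts once per word
            if is_kanji(ch):
                totals[ch] += difficulty
    return dict(totals)


# Hypothetical vocab difficulties.
vocab = {"八日": 0.3, "十日": 0.2, "大した": 0.25}
for kanji, score in sorted(kanji_difficulty(vocab).items(), key=lambda kv: -kv[1]):
    print(kanji, score)
```

Kana such as した are skipped, so only the kanji part of a word like 大した receives its difficulty.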

You can see, for example, that the reading of 日 (にち、ひ、じつ、か…) inside words is very difficult to get right but the meaning is not that bad. The trend shows that the reading of vocab is harder to decipher from its kanji than the meaning, most of the time.

Here are some more:

1 2 3 4 5 6 7 8 9 10
TOTAL
READING
MEANING

Horoscope

If your username begins with L or Z you are more likely to have trouble with the kanji 了. Otherwise 午 is your archenemy.
If your username starts with H, then 入力 is difficult for you. If you happen to have a username starting with X, though, 大した is very hard for you to remember. Otherwise it’s 生.

Even though this was compiled from the data I have gathered, I am in no way, shape, or form responsible for the accuracy of these predictions. :innocent:
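One way such a horoscope could be compiled is to group users by the first letter of their username and pick the most common difficult item per letter. This is only a guess at the approach; the function name and the sample users are made up:

```python
from collections import Counter, defaultdict


def horoscope(users):
    """users: iterable of (username, list of difficult items)."""
    per_letter = defaultdict(Counter)
    for name, items in users:
        if name:
            per_letter[name[0].upper()].update(items)
    # Keep the single most frequent difficult item for every starting letter.
    return {letter: c.most_common(1)[0][0] for letter, c in per_letter.items() if c}


# Hypothetical users and their difficult items.
users = [("Lina", ["了", "午"]), ("leo", ["了"]), ("Hana", ["午"])]
print(horoscope(users))
```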

SRS Item Distribution

The charts are split into different colors based on the SRS stages of the items. Each stage uses its corresponding color, except burned, which is golden by convention.

Here you can see the average number of items a user of a certain level has.

Apparently, some people are waiting on level 59 (at least according to this graph). You know who you are, show yourself :face_with_raised_eyebrow:
Also, there are many people who rushed to 60, probably without having completed their lessons. That’s why there is a dip at 60, I presume.

I turned this graph into a pure percentage graph without absolute values:
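Turning the absolute averages into percentages is a simple normalization per level. A small sketch with invented numbers (the stage names follow the usual SRS stages; the counts are not from the data set):

```python
def to_percentages(stage_counts):
    """Convert average item counts per SRS stage into shares of the level's total."""
    total = sum(stage_counts.values())
    if total == 0:
        return {stage: 0.0 for stage in stage_counts}
    return {stage: 100 * n / total for stage, n in stage_counts.items()}


# Hypothetical averages for one level.
level_average = {"apprentice": 50, "guru": 100, "master": 50, "enlightened": 100, "burned": 200}
print(to_percentages(level_average))
```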

There are people on level 1 with burned items? surprised pikachu face

User Distribution

The following graphs use data from all of the around 109000 WaniKani forum users instead of the subset with 89000 users.
This has already been done by Kumirei and others, but I still want to include it for completeness’ sake and as an updated version of those statistics. Here is the link to the one @Kumirei made previously:
What is the level distribution on here? - Reply 12 by Kumirei

By Subscription

Forum users split into “Free”, “Paid”, and “Lifetime” users by their WaniKani subscription.

The same graph but logarithmic:


This one can be misleading. For example, on levels 5 and above it looks like there are more free users than the others but this is not the case; this is just an artifact of the logarithmic projection of the data.

By Trust Level

For more info on trust levels read this discourse article.

Here are all the Members, i.e. people who have trust level 2 on the forum, which means they have taken part in conversations on the forum for an extended period of time:

And, the Regulars. We all know them, we all love them :heart_eyes: and we all want to have that sweet Regular title too


There are actually 47 Regulars currently but 20 of them are not represented in this graph because they either have their level hidden or are staff members.

After seeing that chart, the best tip for becoming a Regular: be level 60.

And that’s about it with the charts.

Scraping Method

First I got all the usernames and levels from the basic badge page (Basic badge on WaniKani Community), because everyone who has entered the forums gets this badge. Collecting this data took a few minutes, I think.

What made it possible to collect this additional data is that many people use the same name for their forum account as for their WaniKani account. I took their forum name and then scraped their WaniKani profile page at https://www.wanikani.com/users/{username}, which has a lot more data. Collecting this data took sooo long, though. It was around 16 hours, because the WaniKani website has a rate limit of 100 requests per minute (which translates directly to only 100 users per minute).
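For the curious, a throttled loop along these lines would stay under such a limit. This is a sketch and not the script that was actually used: `fetch` is a hypothetical callable you would supply yourself (for example a wrapper around your HTTP client), and `sleep`/`clock` are parameters only so the loop can be tested without waiting:

```python
import time


def scrape_all(usernames, fetch, max_per_minute=100, sleep=time.sleep, clock=time.monotonic):
    """Call fetch(url) once per username, spacing requests to respect the rate limit."""
    min_interval = 60.0 / max_per_minute
    results = {}
    last = None
    for name in usernames:
        if last is not None:
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        results[name] = fetch(f"https://www.wanikani.com/users/{name}")
    return results


def estimated_hours(n_users, max_per_minute=100):
    """Lower bound on the scraping time implied by the rate limit alone."""
    return n_users / max_per_minute / 60


print(round(estimated_hours(89000), 1))  # hours for roughly 89000 profile pages
```

At 100 requests per minute, 89000 pages need just under 15 hours of pure waiting, which is in the same ballpark as the roughly 16 hours mentioned above.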

If you want the code or anything just ask but be warned it’s very ugly :person_shrugging:

Links/Documents And Stuff

Here is the spreadsheet with all the data I collected:
WaniKani user data 2022-09-24 (editable) - Google Spreadsheet
I made it so that anyone can edit so you can make your own charts if you want.

In case anything happens to the old spreadsheet (:thinking:) here is also a version that’s not editable: WaniKanji user data 2022-09-24 (uneditable) - Google Spreadsheet

Thanks to @ctmf for the idea. I hope you’re seeing this :)

Also, if you want to have a userscript that shows you the difficulty of an item based on these stats or something similar, feel free to leave a comment!

81 Likes

I’ll tag @ctmf here because he seems to be in desperate need of aggregate stats :grin:

I don’t think this is all of what you wanted but it should at least be some of it, I hope! And, your idea of creating a userscript for the difficulty of items to be shown in lessons: should I maybe do that and do you still want it?

2 Likes

So that’s the secret! Dammit, all this time I was trying so hard, and for what…?!

~rushes to achieve level 60 in under a year~

8 Likes

Perhaps it’s more like, best tip for making it to level 60: become a regular and stay regular :wink:

14 Likes

That’s a lot more in depth and put together than I ever bothered to do! This is great

12 Likes

Love it!

Never had a problem with these in the wild because I know if I am reading a menu or a schedule. On wani however, big problems!

行う this friend has recently come back to visit me…

Interesting about the number of burned items per level, I’m still on 0.

生 was a surprise for me to see in the difficult list because for me it’s one of the easiest. I have 100% for this one. Maybe I’ve had more nama biru than most?

6 Likes

Are there official stats on how many users finished WK, or at what level on average they drop out completely?

1 Like

Update: I added a “difficult kanji as part of vocab items” chart to the main post.

That’s the thing, there are very few official stats for WaniKani. There is the stat that Koichi once put out that 20 people finish WaniKani per month. And the dropout rate is discussed in this thread: WK Dropout Rate. The share of people who are level 60 (according to the data I have) is 2% (2211 out of 109468 people in total). But there are still active users who are going to get to level 60. So being generous, one could estimate that around 95% of users drop out before getting to level 60.

Glad you like it!

pssst… don’t tell the others or they’re all going to be Regulars soon. this is our secret!

6 Likes

I would honestly expect it to be close to 95% never paying for WK (and never visiting the forums)

6 Likes

No one loves me. :angry:

6 Likes

This is true

5 Likes

JK I love you Kazz

5 Likes

Great work! I’d love to do some analysis of my own like this one day.

Ironically my regular heyday was back when I was level 37. It’s a shame you can’t see historical data to see at what levels each user gained/lost regular status.

2 Likes

Gotcha, my lips are sealed

1 Like

Yeah that’s true, I really wish the WaniKani team would make more stats public! I already expected many people to have gotten their Regular status way before level 60 but the conclusion sounded more catchy the other way around ;D

1 Like

These are really cool statistics, thank you for doing the work of putting them all together like this! Also I love the horoscope lol

1 Like

Thank you!

Btw I don’t know if it’s obvious but I didn’t just bs the horoscope even though it might look like it :sweat_smile: It’s actually based on the data I collected with starting letters of usernames. I thought there would be more variety in users with different starting letters but the data set was probably too big for the statistical randomness to still be visible…

2 Likes

I had thought that it might not be bs, but I’ve been burned similarly before. That’s great though lol

1 Like

I’m so confused, I don’t remember writing the meaning of 莫 as endless at all, I always wrote greenhouse. Was greenhouse accepted in the reviews too, or is my memory really bad?

1 Like

I just unlocked that kanji recently, and yeah, it does accept “greenhouse”


2 Likes