Woo, stats!
I recently had an idea for how to get more in-depth, publicly available data on WaniKani users. Here are the aggregate statistics that came out of analyzing data from around 89000 WaniKani users.
Take these with a grain of salt: the data could be skewed because I can only include users who have a WaniKani community account (I don’t know if that really changes anything, but you never know).
Difficulty Of Items Per Level
The difficulty in the following charts is the number of people who have an item on their wall of shame, divided by the number of people who are on that item’s level or above. This way the difficulty reflects how many of the people who have already seen the item are struggling with it.
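To make the formula concrete, here is a minimal sketch of that calculation. It is not the code actually used for these charts; the data layout (a list of users with a `level` and a `wall_of_shame` set) and the function name `item_difficulty` are assumptions made up for illustration.

```python
from collections import Counter


def item_difficulty(users, item_levels):
    """Hypothetical sketch of the difficulty score described above.

    users:       list of dicts like {"level": 7, "wall_of_shame": {"午", "未"}}
    item_levels: dict mapping each item to the level where it is introduced
    """
    # How many users sit on each level.
    users_per_level = Counter(u["level"] for u in users)
    # How many users list each item on their wall of shame.
    shame_counts = Counter(item for u in users for item in u["wall_of_shame"])

    difficulty = {}
    for item, count in shame_counts.items():
        # Everyone on the item's level or above has already seen the item.
        seen_by = sum(n for lvl, n in users_per_level.items()
                      if lvl >= item_levels[item])
        if seen_by:
            difficulty[item] = count / seen_by
    return difficulty
```

Dividing by everyone at or above the item's level (instead of by all users) keeps high-level items from looking easy just because few people have reached them.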
Kanji
Here are the most difficult kanji of each level:
Kanji with big spikes, meaning they are very difficult, are:
- 午: noon, lvl 3 (probably because it looks similar to 牛/cow from that same level)
- 未: not yet, lvl 7 (looks similar to 末/end)
- 評: evaluate, lvl 21 (???)
- 莫: endless, lvl 25 (the radical 莫 is called greenhouse)
- 又: again, lvl 51 (the radical 又 is stool; very big spike, maybe because it was lvl 2 some time ago)
- 慕: yearn for, lvl 60 (???)
It seems that the free levels still have the biggest spikes. Probably because the users there still have to get used to the system?
Vocab
Here are the most difficult vocabulary items of each level:
These are the difficult vocab items with some of the highest spikes:
- 生: fresh, lvl 3 (unfitting meaning because kanji 生 means “life” and different reading; very high spike)
- 行う: to perform something, lvl 6 (kanji 行 means “going”)
- 都合: one’s convenience, lvl 12 (very strange meaning because kanji are 都/metropolis and 合/suit)
- 評判: reputation, lvl 21 (weird reading ひょうばん and meaning because kanji are 評/evaluate and 判/judge; very high spike)
- 滞る: to be overdue, lvl 35 (probably a mix of kanji 滞 meaning “stagnate” and the long ass reading とどこおる)
- 惨敗: crushing failure, lvl 53 (really don’t know about this one???)
Again, earlier levels have the most soul-crushingly difficult vocab according to this. But I can confirm that 都合 and 行う haunted me for a long time; I still have 評判 only at around guru or so
Difficult Items For Users Of A Certain Level
These are different from the above charts in that here the levels of the users having a difficult item are analyzed instead of the level of the item itself. That means one item can come up multiple times because users of multiple levels can find it difficult. This time the height of the graph represents the amount of people having difficulties with that item.
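As a rough sketch of this second view (again not the original script, and using the same made-up user layout with `level` and `wall_of_shame`), the grouping is by the user's level rather than the item's level:

```python
from collections import Counter, defaultdict


def hardest_item_per_user_level(users, item_levels):
    """Hypothetical sketch: for every user level, find the item that the
    most users on that level have on their wall of shame."""
    counts = defaultdict(Counter)
    for u in users:
        # Group by the level of the user, not the level of the item.
        counts[u["level"]].update(u["wall_of_shame"])

    result = {}
    for level, counter in sorted(counts.items()):
        if not counter:
            continue  # nobody on this level has a difficult item
        item, n = counter.most_common(1)[0]
        # The last field corresponds to the two colors in the charts:
        # was the item introduced on the level where it is difficult?
        result[level] = (item, n, item_levels[item] == level)
    return result
```

Because the counts here are raw numbers of people and not ratios, the same item can win on many levels at once.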
Kanji
The y-axis is logarithmic, meaning each horizontal line marks a tenfold increase rather than an increase of, say, 10. The colors show whether a kanji was introduced on the same level where it is difficult or on an earlier one.
This shows that 午 stays a difficult kanji all the way up to level 60 (it occupies the most levels on this graph) because it closely resembles the cow kanji, which I can confirm is troublesome.
In comparison to that, kanji like 評 that aren’t inherently difficult but are just a little unintuitive don’t take long to get sorted out (here it only lasts as the most difficult kanji for one level).
Vocab
The y-axis is logarithmic, meaning each horizontal line marks a tenfold increase rather than an increase of, say, 10. The colors show whether a vocab item was introduced on the same level where it is difficult or on an earlier one.
生 (なま) seems to be the consistently most difficult vocab item. But still this chart is a bit more diverse than its kanji counterpart.
Reading/Meaning Analysis
For these stats I looked at how many times a user has had to review a difficult item’s reading and meaning and split them into separate numbers. As before, I scaled this by the number of people who have already seen the item to make the difficulty uniform.
Although I have the concrete numbers I will hide them because they don’t really tell you anything useful. With that out of the way, let’s get started!
Kanji
The meaning and reading difficulty for selected kanji:
From this we can see that the reading mnemonic for 了/finish didn’t stick that well for many people. In the beginning the りょう/row boat mnemonics (as used for 了) didn’t work for me either, but after many such readings I now automatically connect row boats to the sound りょう.
I think the kanji analysis here isn’t as interesting as the vocab one, but still here’s some more difficult kanji:
KANJI | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
SINGLE * | 又 | 午 | 了 | 陵 | 喧 | 瑞 | 莫 | 久 | 嘱 | 抑 |
TOTAL | 午 | 又 | 了 | 慕 | 出 | 少 | 分 | 上 | 正 | 牛 |
READING | 午 | 了 | 上 | 又 | 出 | 慕 | 入 | 正 | 抑 | 力 |
MEANING | 午 | 了 | 又 | 慕 | 上 | 概 | 猶 | 出 | 抑 | 少 |
* for “SINGLE” I only divided the amount this item occurred by the number of users on that single level rather than all the levels above too
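The difference between the two denominators can be sketched like this. The function name and the numbers in the comment are invented for illustration, not taken from the real data set:

```python
def single_vs_total(shame_count, item_level, users_per_level):
    """Hypothetical sketch of the "SINGLE" and "TOTAL" scalings.

    shame_count:     number of users with the item on their wall of shame
    item_level:      level where the item is introduced
    users_per_level: dict mapping level -> number of users on that level
    """
    on_level = users_per_level.get(item_level, 0)
    at_or_above = sum(n for lvl, n in users_per_level.items()
                      if lvl >= item_level)
    # SINGLE: divide only by the users on the item's own level.
    single = shame_count / on_level if on_level else None
    # TOTAL: divide by everyone who has already seen the item.
    total = shame_count / at_or_above if at_or_above else None
    return single, total


# Made-up numbers: 25 users struggle with a level 51 item.
print(single_vs_total(25, 51, {51: 10, 52: 20, 60: 70}))
```

With “SINGLE” the denominator is small when few users sit on the item's level, which may be part of why later-level kanji show up in that row more than in “TOTAL”.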
Vocab
The meaning and reading difficulty for selected vocab items:
Especially the meaning for 評判 seems to cause many problems, even though I think the reading is also difficult to remember. There are also some numbered day words, such as 八日, up there on the difficulty list for reading as they are often irregular.
Here are some more difficult vocab items:
VOCAB | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
SINGLE * | 生 | 評判 | 疎か | 滞る | 歳暮 | 哀悼 | 惨敗 | 奔放 | 呆気 | 募る |
TOTAL | 生 | 評判 | 用いる | 土 | 公用 | 外す | 入力 | 都合 | 内 | 惨敗 |
READING | 生 | 評判 | 土 | 疎か | 哀悼 | 逃がす | 入力 | 八日 | 大した | 用いる |
MEANING | 評判 | 生 | 哀悼 | 陥落 | 土 | 逃がす | 大した | 公用 | 疎か | 入力 |
* for “SINGLE” I only divided the amount this item occurred by the number of users on that single level rather than all the levels above too
Other Item Stats
Here are just some more statistics that didn’t fit into the other categories.
The difference between “NORMAL” and “PERCENT” in the following stats is that “NORMAL” counts just the total number of people who find an item difficult, while “PERCENT” is the average percentage of correct reviews out of total reviews for that item. That’s why “PERCENT” has more of the items that are specifically hard for a single level.
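Here is a small sketch of how the two rankings could be computed. The input format (one `(item, correct_reviews, total_reviews)` tuple per user who finds the item difficult) and the function name are assumptions, not the layout of the actual spreadsheet:

```python
from collections import Counter, defaultdict
from statistics import mean


def rank_normal_and_percent(entries, top=10):
    """Hypothetical sketch of the "NORMAL" and "PERCENT" rankings.

    entries: iterable of (item, correct_reviews, total_reviews), one tuple
             per user who has the item among their difficult items.
    """
    people = Counter()
    percents = defaultdict(list)
    for item, correct, total in entries:
        people[item] += 1
        if total > 0:  # avoid dividing by zero for unreviewed items
            percents[item].append(100 * correct / total)

    # NORMAL: the more people struggle with an item, the higher it ranks.
    normal = [item for item, _ in people.most_common(top)]
    # PERCENT: the lower the average share of correct reviews, the higher.
    percent = sorted(percents, key=lambda i: mean(percents[i]))[:top]
    return normal, percent
```

An item that only a handful of people struggle with can still top “PERCENT” if those few get it wrong most of the time, which fits the observation above.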
Without Free Level Users
Analysis of only those users who are past the free levels (1-3).
KANJI | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
NORMAL TOTAL | 午 | 了 | 出 | 上 | 少 | 正 | 分 | 右 | 力 | 牛 |
NORMAL READING | 午 | 了 | 出 | 上 | 正 | 少 | 右 | 刀 | 力 | 夕 |
NORMAL MEANING | 午 | 了 | 出 | 少 | 上 | 正 | 右 | 分 | 牛 | 夕 |
PERCENT READING | 助 | 丸 | 表 | 中 | 住 | 放 | 祭 | 鏡 | 保 | 訴 |
PERCENT MEANING | 飽 | 詩 | 綺 | 偽 | 浄 | 清 | 二 | 伺 | 敬 | 職 |
VOCAB | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
NORMAL TOTAL | 生 | 土 | 用いる | 入力 | 大した | 八日 | 公用 | 少ない | 外す | 内 |
NORMAL READING | 生 | 土 | 八日 | 大した | 入力 | 用いる | 公用 | 内 | 人生 | 少ない |
NORMAL MEANING | 生 | 土 | 公用 | 大した | 入力 | 用いる | 八日 | 少ない | 人生 | 評判 |
PERCENT READING | 人生 | 複雑 | 覆面 | 読む | 諮る | 謎 | 警察官 | 資本 | 足跡 | 身振り |
PERCENT MEANING | 無言 | 熱心 | 研究 | 童話 | 落ちる | 血 | 見直す | 記念日 | 豊か | 遠く |
Only (Mainly) Fast Level Users
Analysis for users that are level 50 or over. This is around the time where the fast levels kick in.
KANJI | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
NORMAL TOTAL | 了 | 午 | 出 | 上 | 力 | 刀 | 入 | 慕 | 分 | 少 |
NORMAL READING | 了 | 午 | 出 | 上 | 刀 | 慕 | 授 | 猶 | 入 | 力 |
NORMAL MEANING | 午 | 了 | 概 | 慕 | 出 | 猶 | 刀 | 上 | 授 | 少 |
PERCENT READING | 少 | 陵 | 玉 | 泰 | 潮 | 犯 | 班 | 益 | 監 | 瞬 |
PERCENT MEANING | 土 | 少 | 遺 | 唆 | 叫 | 戚 | 享 | 探 | 鬼 | 錯 |
VOCAB | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
NORMAL TOTAL | 公用 | 生 | 大した | 土 | 入力 | 用いる | 八日 | 評判 | 惨敗 | 広げる |
NORMAL READING | 公用 | 土 | 生 | 評判 | 哀悼 | 疎か | 大した | 八日 | 逃がす | 入力 |
NORMAL MEANING | 公用 | 哀悼 | 評判 | 生 | 土 | 陥落 | 大した | 人生 | 逃がす | 観測 |
PERCENT READING | 人生 | 浸透 | 結果 | 顧問 | 建前 | 火山 | 発射する | 終える | 総体的 | 見当たる |
PERCENT MEANING | 露呈 | 観測 | 謙虚 | 広げる | 女子 | 十日 | 怠る | 少女 | 衝突 | 上る |
Difficult Kanji In Vocab
These are the kanji of difficult vocab items. For this I went through all of the difficult vocab items and split each into its kanji. Then I added the difficulty of the vocab item to every kanji it contains. So basically, this is the summed difficulty of the vocab items that have that kanji as a part of them. Here are the hardest of them:
You can see, for example, that the reading of 日 (にち、ひ、じつ、か…) inside words is very difficult to get right but the meaning is not that bad. The trend shows that the reading of vocab is harder to decipher from its kanji than the meaning, most of the time.
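The splitting step could look roughly like the sketch below. It is an illustration only: the kanji check just tests for the main CJK Unified Ideographs block, and counting a kanji once per word (even if it appears twice) is my assumption, not something confirmed by the original analysis.

```python
from collections import defaultdict


def is_kanji(ch):
    # Assumption: the main CJK Unified Ideographs block is good enough here.
    return "\u4e00" <= ch <= "\u9fff"


def kanji_difficulty_from_vocab(vocab_difficulty):
    """Hypothetical sketch: add each vocab item's difficulty to every
    kanji it contains; kana such as した are skipped."""
    totals = defaultdict(float)
    for word, difficulty in vocab_difficulty.items():
        # set() so a kanji repeated inside one word is only counted once
        # (an assumption about how repeats were handled).
        for ch in set(word):
            if is_kanji(ch):
                totals[ch] += difficulty
    return dict(totals)
```

Running the same function on the reading and meaning difficulties separately would give the “READING” and “MEANING” rows.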
Here are some more:
KANJI | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
TOTAL | 人 | 日 | 大 | 生 | 力 | 上 | 入 | 用 | 外 | 下 |
READING | 日 | 生 | 人 | 大 | 力 | 上 | 入 | 土 | 用 | 八 |
MEANING | 生 | 人 | 日 | 大 | 用 | 力 | 上 | 入 | 土 | 外 |
Horoscope
If your username begins with L or Z you are more likely to have trouble with the kanji 了. Otherwise 午 is your archenemy.
If your username starts with H then 入力 is difficult for you. If you happen to have a username starting with X, though, 大した is very hard for you to remember. Otherwise it’s 生.
Even though this was compiled from the data I have gathered, I am in no way, shape, or form responsible for the accuracy of these predictions.
SRS Item Distribution
The charts are split into the different colors based on the SRS stages of the items. They all use the corresponding colors except for burned because of the convention of coloring it golden.
Here you can see the average number of items a user of a certain level has.
Apparently, some people are waiting on level 59 (at least according to this graph). You know who you are, show yourself
Also, there are many people who rushed to 60, probably without having completed their lessons. That’s why there is a dip at 60, I presume.
I turned this graph into a pure percentage graph without absolute values:
There are people on level 1 with burned items? surprised pikachu face
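Turning the absolute chart into the percentage chart is just a per-level normalization. A minimal sketch, assuming the averages are stored as a nested dict (the layout and the function name are made up for this example):

```python
def to_percentages(avg_items_by_stage):
    """Hypothetical sketch: convert average item counts per SRS stage
    into each stage's share of a level's items, in percent.

    avg_items_by_stage: {level: {stage_name: average_item_count}}
    """
    result = {}
    for level, stages in avg_items_by_stage.items():
        total = sum(stages.values())
        result[level] = {
            stage: (100 * count / total if total else 0.0)
            for stage, count in stages.items()
        }
    return result
```

After this step every level adds up to 100%, so even a tiny number of burned items on level 1 becomes visible next to the much larger counts on later levels.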
User Distribution
The following graphs use data from all of the around 109000 WaniKani forum users instead of the subset with 89000 users.
This has already been done by Kumirei and others, but I still want to include it for completeness’ sake and as an updated version of those statistics. Here is the link to the one @Kumirei made previously:
What is the level distribution on here? - Reply 12 by Kumirei
By Subscription
Forum users split into “Free”, “Paid”, and “Lifetime” users by their WaniKani subscription.
The same graph but logarithmic:
This one can be misleading. For example, on levels 5 and above it looks like there are more free users than the others but this is not the case; this is just an artifact of the logarithmic projection of the data.
By Trust Level
For more info on trust levels read this discourse article.
Here are all the Members, meaning people with trust level 2 on the forum, who have taken part in conversations on the forum for an extended period of time:
And, the Regulars. We all know them, we all love them and we all want to have that sweet Regular title too
There are actually 47 Regulars currently but 20 of them are not represented in this graph because they either have their level hidden or are staff members.
After seeing that chart, the best tip for becoming a Regular: be level 60.
And that’s about it with the charts.
Scraping Method
First I got all the usernames and levels from the basic badge page (Basic badge on WaniKani Community) because everyone who entered the forums has got this badge. Collecting this data took a few minutes, I think.
What made it possible to collect this additional data is that many people use the same name for their forum account as for their WaniKani account. I took their forum name and then scraped their WaniKani profile page at https://www.wanikani.com/users/{username}, which has a lot more data. Collecting this data took sooo long, though: around 16 hours, because the WaniKani website has a rate limit of 100 requests per minute (which translates directly to only 100 users per minute).
If you want the code or anything just ask but be warned it’s very ugly
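In the meantime, here is a minimal sketch of what a rate-limited scraping loop of this kind could look like. It is not the actual script: `fetch` is a placeholder for whatever HTTP client you use, and the function names and the pacing strategy are assumptions. Only the URL pattern and the limit of 100 requests per minute come from the description above.

```python
import time
from urllib.parse import quote

REQUESTS_PER_MINUTE = 100  # rate limit mentioned above


def estimated_hours(user_count, per_minute=REQUESTS_PER_MINUTE):
    """Lower bound on scraping time at one request per user."""
    return user_count / per_minute / 60


def profile_url(username):
    # URL pattern from the post; quote() escapes unusual characters.
    return f"https://www.wanikani.com/users/{quote(username)}"


def fetch_all(usernames, fetch, per_minute=REQUESTS_PER_MINUTE,
              sleep=time.sleep, clock=time.monotonic):
    """Call fetch(url) for every username without exceeding the limit.

    fetch is a hypothetical callable supplied by the caller, e.g. a thin
    wrapper around an HTTP library that returns the page text.
    """
    interval = 60 / per_minute  # seconds between request starts
    pages = {}
    next_allowed = clock()
    for name in usernames:
        wait = next_allowed - clock()
        if wait > 0:
            sleep(wait)
        started = clock()
        pages[name] = fetch(profile_url(name))
        next_allowed = started + interval
    return pages


print(round(estimated_hours(89000), 1))  # about 14.8 hours at best
```

The estimate for 89000 users is in the same ballpark as the roughly 16 hours it actually took, since the real run also had to handle slow responses and names that didn't match a WaniKani account.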
Links/Documents And Stuff
Here is the spreadsheet with all the data I collected:
WaniKani user data 2022-09-24 (editable) - Google Spreadsheet
I made it so that anyone can edit so you can make your own charts if you want.
In case anything happens to the editable spreadsheet, here is also a version that’s not editable: WaniKanji user data 2022-09-24 (uneditable) - Google Spreadsheet
Thanks to @ctmf for the idea. I hope you’re seeing this :)
Also, if you want to have a userscript that shows you the difficulty of an item based on these stats or something similar, feel free to leave a comment!