For those of you who don’t know, the Kanji Kentei (KanKen) is a test of Japanese knowledge, namely kanji, vocab, and yojijukugo, spanning many levels. Beyond the scope of the 2136 joyo kanji lie over 800 jinmeiyou kanji. Beyond the scope of these 3000 kanji lie an extra 3000+ kanji covered by the highest level of the KanKen, for a total of ~6300. Seeing as they aren’t included on the “common use” kanji list or the jinmeiyou list, you might assume these 3200+ kanji are rare and the words containing them are not worth learning, but is that really true for all of them?
On the other end of the spectrum, we have the WaniKani pleasant levels: WaniKani levels 1-10. These are super simple words that everyone knows. Surely you’ll see them more than anything that didn’t make the joyo cut and that the KanKen doesn’t even bother to add until the notoriously difficult level 1, right?
The Game:
Using jpdb.io, I have gathered the frequency in novels for 15 kanken level 1 words and 15 wanikani pleasant words. I also have 2 yojijukugo at the end for a total of 17 questions. Using your best guess, you need to try and pick the one that is more common within the series on jpdb’s database.
Low frequency words might differ only by a few percent, but higher frequency words differ by ~10% or more.
The frequency is for the word exactly as it appears. No alternate forms, and no kana forms for anything that is written in kanji.
Question 15 surprised me. 投擲 I’ve never seen before. Looks like I need to up my light novel game.
I was also pretty sure that words like 囁く and 呟く were used more frequently than their counterparts. I guess I underestimated 相手 and … 星? Haha
I’m surprised anyone could get number 4. This is one of those “makes sense after the fact” things that should be near impossible before seeing the answer.
Sure, 女性 or 女の子 would be expected more than 女の人, but I’d still expect it to occur once, randomly, somewhere in a whole novel.
More seriously, 12/17, using the knowledge that the original is biased towards light novels.
My answers and reasons
咆哮 because light novels, raging barbarians fighting dragons and whatnot. Both sides are responsible for the roaring.
病気 all those protagonists falling in ponds and getting sick.
火事 I really hesitated for that one. 儚い is often used in descriptions, but I have encountered quite a few places set on fire… well,
躊躇 everybody is hesitant at some point, but 女の人 isn’t a common word usage in novels.
呟く I also hesitated a lot. Muttering also happens a lot… but I guess 相手 is just that common. Every time you talk to someone, they are your 相手, so it’s bound to come up.
医者 just felt more common in what I read. And it was. So for once my gut feeling was right
囁く If 村上 taught me one thing, it’s that nobody looks at the night sky. I was lied to.
曰く comes up all the time. Meanwhile, I can remember only one instance of 助力, maybe.
終焉 I’ve never seen 決 in the wild, I think.
邂逅 same argument, except I have seen 自決 once or twice.
中止 I’ve seen it a lot. 漲る isn’t uncommon, but it just doesn’t show up as often.
戸 I hesitated a lot on that one as well. Well, I was lucky.
赤ちゃん I have seen both, but 赤ちゃん just feels like it would have more chances to come up… well, never mind (or just bias)
杞憂 part of my bias, it just showed up every chapter or so in the last LN series I read. Well, I was not wrong
投擲 Volcano is just too specific, but people throw stuff all the time.
魑魅魍魎 I keep telling everyone and their talking dog that it’s a common word. Might as well put my money where my mouth’s at.
天真爛漫 because I like that word and 十月 in full kanji seems too specific.
Great quiz! I got tricked left and right. It’s very interesting how the fact that the frequencies come from novels influences the results.
Also, a while ago I was playing with a script matching WK vocab to the Balanced Corpus of Contemporary Written Japanese (BCCWJ). If you want to create a round 2, here are all the words from levels 1-10 that are outside the first 20000 most frequent words of the BCCWJ. Some are quite surprising too.
You don’t remember how Miya Kazuki somehow freaking loves that word and uses it all the time in 本好き? (Already 6 occurrences just in the volume I’m reading at the moment.)
Honestly, I was a bit surprised by this one as well. The best reasoning I could come up with is its usage in similes, but looking through the books I have read, that seems to be an underwhelming minority. Japanese people really just like talking about being under the stars or talking about the stars’ light, it seems. Or the lack of stars in a handful of cases.
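For anyone curious what that kind of filter could look like, here is a minimal Python sketch. It is not the actual script: the function name, the dictionary layout, and the sample ranks are all made up for illustration, and loading the real BCCWJ list into such a dictionary is left out.

```python
def words_outside_top_n(vocab, rank_by_word, n=20000):
    """Return the vocab items ranked worse than n, or missing from the list.

    vocab        -- iterable of words, e.g. WK level 1-10 vocab (assumed input)
    rank_by_word -- dict mapping word -> corpus rank, 1 = most frequent
                    (hypothetical layout, not the real BCCWJ file format)
    """
    outside = []
    for word in vocab:
        rank = rank_by_word.get(word)
        # A word that is absent from the list is treated as "outside" too.
        if rank is None or rank > n:
            outside.append(word)
    return outside


# Made-up ranks, NOT real BCCWJ values.
sample_ranks = {"相手": 150, "病気": 900, "円い": 25000}
sample_vocab = ["相手", "円い", "女の人"]
print(words_outside_top_n(sample_vocab, sample_ranks))
```

Treating missing words as "outside" is a design choice; a stricter version could report them separately, since a missing entry may just mean the corpus splits the word differently.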
I went for some level of trickery in selecting some words whose frequencies were different than you would expect from their definition or complexity. This is very apparent in 魑魅魍魎 (Evil spirits of rivers and mountains) vs. 円い (round). 躊躇う and 躊躇 are quite common and aren’t completely overshadowed by 迷う. On the other hand, 女の人 is a bit more overshadowed by the other options for woman in novels. Still not rare by any means, though.
I went for words with a similar frequency and tried to pick common words this time around, with a couple of exceptions. I thought it would be less exciting if I did rare words since, well, finding out that a KanKen 1 word is exceedingly rare and seen less than a WaniKani word is probably what a lot of people would expect haha. The problem was actually finding WaniKani words that weren’t obviously rare. Looking for words in the 30% range was tough. It was either a really low percentage or 50%+.
I’m not opposed to doing another quiz with the actually hard kanken 1 words I know, but the problem is that there will only be a couple users on here who will actually know them, so it will be 100% guessing for everyone else haha.
I got 11/17. Reading a lot really helps to get a feeling for such statistics. Though he did purposefully use some really rare words from the pleasant levels (marui). The 2 you were surprised by were also the 2 I was surprised by the most.
I got 10/17. Most were hard picks (well, other than the 相手, 病気, and 終焉 ones), but 朦朧 being more common than 赤ちゃん surprised me. I feel like 赤ちゃん appears in most things I’ve read, while 朦朧 is more inconsistent.
Alright, maybe I did this incorrectly then. But FWIW, I downloaded their zip and compared their frequency based on your list above… I was too curious.
The grey box is the BCCWJ 全体の順位 (overall ranking). I didn’t look too deeply into how they come up with that, but this is supposedly the resource allocation.
The spreadsheet in this zip has 841,912 entries (rankings from 1 to 536048), and many are shared rankings (not sure why, but it looks like it’s just based on the frequency counts coming in as ties). Some searches didn’t yield the individual word, so I posted the listing of the compound that came up (assuming it was the most frequent from the search and the spreadsheet word search worked correctly).
Q1
Q2
Q3
Q4
Could not parse 女の人 for some reason, perhaps it’s the kana
Q5
There were 295 matches for 相手, 相手方 was the top it appears
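A plain spreadsheet search matches substrings, which is probably why a compound like 相手方 shows up for 相手. A rough Python sketch of separating exact hits from compounds follows. The `(word, rank)` row layout, the function name, and the ranks are assumptions for illustration, not the real spreadsheet columns or values.

```python
def split_matches(rows, query):
    """Split rows into exact matches and longer words that merely contain query.

    rows  -- list of (word, rank) tuples (hypothetical layout)
    query -- the word to look up, e.g. "相手"
    """
    exact = [row for row in rows if row[0] == query]
    partial = [row for row in rows if query in row[0] and row[0] != query]
    return exact, partial


# Made-up ranks, NOT taken from the BCCWJ spreadsheet.
sample_rows = [("相手", 10), ("相手方", 20), ("話し相手", 30), ("病気", 40)]
exact, partial = split_matches(sample_rows, "相手")
print("exact:", exact)
print("compounds:", partial)
```

If the exact list comes back empty, as with 女の人 above, the word may simply not exist as a single entry in the list, so falling back to the top compound compares a different word.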
To be honest, if 6’s first option was “Careless”, and 7’s first option was “Whisper”, then 8’s should have been “Never” and 9’s should have been “Dance”. Then perhaps “Guilty” and “Feet”.