While working on KameSame, I’ve become really interested in finding a better way to order a “re-study” experience for the WK curriculum of vocab & kanji. One thing I’ve learned at level ~53 is that the vocab WK introduces at higher levels strikes my conversation partners as a bit stilted/academic/unusual. I think that follows from the fact that every WK vocab word is there because a kanji demands it, and some joyo kanji aren’t very 常用 at all.
As a result, I’ve started looking into ways to establish a sort order based on practical usage. Unfortunately, every social media & search API heavily restricts the kind of aggregate searching needed to find this out. While I look into academic corpora like NINJAL’s, I decided to pay for the Azure/Bing search API and simply ask for the number of search results for each term in WK.
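For anyone curious, the approach boils down to something like the Python sketch below. It is illustrative rather than the exact script used here: the endpoint, the `Ocp-Apim-Subscription-Key` header, and the `totalEstimatedMatches` field are what I understand the Bing Web Search v7 API to use, while the exact-phrase quoting, the `ja-JP` market, and the function names `bing_result_count` and `rank_by_usage` are assumptions made for the example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Bing Web Search v7 endpoint; your Azure resource may use a different host.
BING_ENDPOINT = "https://api.bing.microsoft.com/v7.0/search"


def bing_result_count(term, api_key):
    """Return Bing's estimated number of web results for one term."""
    # Assumption: quote the term for an exact-phrase match and ask for the ja-JP market.
    query = urllib.parse.urlencode({"q": f'"{term}"', "mkt": "ja-JP", "count": 1})
    request = urllib.request.Request(
        f"{BING_ENDPOINT}?{query}",
        headers={"Ocp-Apim-Subscription-Key": api_key},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Treat a response with no web results section as zero hits.
    return payload.get("webPages", {}).get("totalEstimatedMatches", 0)


def rank_by_usage(terms, count_for):
    """Look up a count for each term and sort from most to fewest results."""
    counts = {term: count_for(term) for term in terms}
    return sorted(counts.items(), key=lambda pair: pair[1], reverse=True)


# Offline usage with made-up counts (not real search data), just to show the shape.
made_up_counts = {"勿論": 50, "大丈夫": 900, "為替": 300}
ranking = rank_by_usage(made_up_counts, made_up_counts.get)
```

With a real key, you would pass something like `lambda term: bing_result_count(term, api_key)` as `count_for`. Keeping the lookup separate from the sorting makes it easy to swap in a corpus-based count later.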
Here are the results for you to take a look at:
What do you think of these? There are some very obvious false positives, especially near the top, and some false negatives (for instance, cases where WK uses kanji for words that Japanese speakers almost always write in hiragana).