Sorting WaniKani by usage frequency

searls · September 16, 2018, 1:09pm

While working on KameSame, I’ve become really interested in building a better-sorted “re-study” experience for the WK curriculum of vocab & kanji. One thing I’ve learned at level ~53 is that the vocab WK introduces at higher levels comes across as a bit stilted/academic/unusual to my conversation partners. I think that’s because WK vocab is always chosen to support a kanji, and some joyo kanji aren’t very 常用 at all.

As a result, I’ve started looking into ways to establish a sort based on practical usage. Unfortunately, every social media & search API heavily restricts the kind of aggregate searching needed to find this out. While I look into academic corpora like NINJAL’s, I decided to pay for the Azure/Bing search API and simply ask for the number of search results for each term in WK.

Here are the results for you to take a look at:

Naive search results count for every kanji & vocab character string in WaniKani against the Bing Web Search API v7 · GitHub

What do you think of these? There are some very obvious false positives, especially near the top, and some false negatives (cases where WK uses kanji in words that Japanese people almost always write in hiragana, for instance).
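
For anyone who wants to try the same naive approach, here is a minimal sketch in Python. It is not the actual script behind the gist. The endpoint URL, the `Ocp-Apim-Subscription-Key` header, and the `webPages.totalEstimatedMatches` field are how I understand the Bing Web Search API v7 to work, so check them against the current docs. The function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumed v7 endpoint; verify against the current Bing Web Search documentation.
BING_ENDPOINT = "https://api.cognitive.microsoft.com/bing/v7.0/search"


def extract_hit_count(response_json):
    """Return the estimated match count from a parsed response, or 0 if absent."""
    return response_json.get("webPages", {}).get("totalEstimatedMatches", 0)


def fetch_hit_count(term, api_key):
    """Query the search API for one term and return its estimated match count."""
    query = urllib.parse.urlencode({"q": term})
    request = urllib.request.Request(
        f"{BING_ENDPOINT}?{query}",
        headers={"Ocp-Apim-Subscription-Key": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return extract_hit_count(json.load(response))


def rank_by_hits(terms, count_for):
    """Sort terms from most to fewest hits, using count_for(term) as the lookup."""
    counts = {term: count_for(term) for term in terms}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)
```

With a real key you would call something like `rank_by_hits(terms, lambda t: fetch_hit_count(t, api_key))`. Passing the lookup in as a function also makes it easy to swap in cached counts, which matters when every request is billed.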

rfindley · September 16, 2018, 3:36pm

You could look at the free BCCWJ word frequency list for comparison [here]. It separates the occurrence count by source type, so you can exclude sources like legal documents and textbooks if you want.

I assume you’re familiar with the BCCWJ since you know about ninjal.
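
A rough sketch of that kind of exclusion in Python, assuming the list is a tab-separated file with one count column per source type. The column names used here (`lemma`, and anything ending in `_frequency`) are placeholders, so adjust them to the headers in the file you download.

```python
import csv
import io


def rank_excluding(tsv_text, excluded_columns):
    """Rank words by summed counts, skipping the source-type columns listed."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Keep every per-source count column except the ones being excluded.
    count_columns = [
        name
        for name in reader.fieldnames
        if name.endswith("_frequency") and name not in excluded_columns
    ]
    totals = {}
    for row in reader:
        word = row["lemma"]
        totals[word] = totals.get(word, 0) + sum(int(row[c]) for c in count_columns)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Dropping the legal and textbook columns this way should push words that mostly appear in statutes or school materials further down the ranking.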

system · September 16, 2019, 3:49pm

This topic was automatically closed 365 days after the last reply. New replies are no longer allowed.
