Hi everyone,
I’m trying to develop a small app where you can input a Japanese document and it will tell you how many of the kanji used belong to each level. That way, it will be easier to determine beforehand whether you are able to read/understand a certain book, manga, form, etc.
Does anyone know of an easy way to get the WaniKani-dataset with all the kanji and their level? Something like:
[
{"kanji": "木", "level": 1},
{"kanji": "売", "level": 9}, ...
]
would be perfect, but any format would be okay for me. I could just scrape WK and/or make tons of requests while using the app, but I would rather have a small local database (or JSON file) with all the kanji/level pairs.
Thanks in advance
You can easily copy this, and it’s in JSON format (but it can just be pasted into JavaScript as well). Do you really only need the level and the kanji itself? Anyway, it should look like this:
Thank you so much, you’ve made my day! <3
Yes, for now that’s totally fine. I just want an output like: “With your current Level X you can read Y% of all the kanji in this document” or maybe a nice little graph showing the “difficulty” of the document. Later I’ll probably add vocab too, but for a start that’s enough. May I ask how you created the JSON file?
I used WKOF, the tool that @Gorbit99 mentioned. If you run it in the browser, this is the command that gets you the list I posted here:
await wkof.Apiv2.get_endpoint('subjects').then(data => Object.values(data)
  .filter(e => e.object === 'kanji') // keep only kanji, drop radicals and vocabulary
  .map(e => ({kanji: e.data.characters, level: e.data.level}))) // reduce each item to a kanji/level pair
It uses the WaniKani API behind the scenes which can give you a lot of information about WaniKani items and reviews and so on (this is also what drives most userscripts here).
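With a list in that kanji/level shape, the “you can read Y% of the kanji” output asked about earlier could be computed roughly as below. This is only a sketch: kanjiCoverage is a made-up helper name, the two sample entries are the ones from this thread rather than the real dataset, and it counts every occurrence of a kanji (not unique kanji), which is an assumption about what the percentage should mean.

```javascript
// Kanji/level pairs in the shape discussed in this thread.
// Only two sample entries here; the real list would hold every WaniKani kanji.
const kanjiLevels = [
  { kanji: '木', level: 1 },
  { kanji: '売', level: 9 },
];

// Hypothetical helper: share of the kanji occurrences in `text`
// that are taught at or below `userLevel`.
function kanjiCoverage(text, kanjiList, userLevel) {
  // Lookup table from kanji to its level.
  const levelOf = new Map(kanjiList.map(e => [e.kanji, e.level]));
  // \p{Script=Han} matches Han characters; kana, Latin letters and punctuation are skipped.
  const found = text.match(/\p{Script=Han}/gu) || [];
  // Kanji missing from the list count as unknown.
  const known = found.filter(k => levelOf.has(k) && levelOf.get(k) <= userLevel);
  return {
    total: found.length,
    known: known.length,
    // A text without any kanji is treated as fully readable.
    percent: found.length === 0 ? 100 : (known.length / found.length) * 100,
  };
}
```

For example, kanjiCoverage('木を売る', kanjiLevels, 1) gives a total of 2, of which 1 is known, so 50 percent. Counting unique kanji instead would just mean wrapping found in a Set first.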
Yeah, and bear in mind that you need to create an API token for that (on the WaniKani main page under settings) to be able to send requests at all. In your code, this has to be put in a global variable called token.
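If you would rather skip WKOF and call the API yourself with that token, a minimal sketch could look like the one below. It assumes the v2 subjects endpoint with a types=kanji filter, a Bearer token in the Authorization header, and paginated responses that carry a pages.next_url field; please check those details against the API documentation. fetchKanjiLevels and the fetchFn parameter are names made up for this example, and the token is passed in as an argument here instead of being read from a global.

```javascript
// Sketch: collect all kanji/level pairs directly from the WaniKani API (v2).
// `fetchFn` is injectable so the function can be tried without network access;
// it defaults to the global fetch (browsers, Node 18+).
async function fetchKanjiLevels(token, fetchFn = fetch) {
  const result = [];
  // Assumed endpoint: subjects filtered down to kanji.
  let url = 'https://api.wanikani.com/v2/subjects?types=kanji';
  while (url) {
    const response = await fetchFn(url, {
      headers: { Authorization: `Bearer ${token}` },
    });
    if (!response.ok) {
      throw new Error(`WaniKani API request failed with status ${response.status}`);
    }
    const page = await response.json();
    // Reduce each subject to the kanji/level pair used in this thread.
    for (const subject of page.data) {
      result.push({ kanji: subject.data.characters, level: subject.data.level });
    }
    // Follow pagination until there is no next page.
    url = page.pages.next_url;
  }
  return result;
}
```

Running this once and saving the result with JSON.stringify would give you the small local file you were after, so the app itself never has to send requests.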
I ended up writing a script to get the running total of kanji per WK level (for something unrelated to this topic), here’s the current output for reference: