I'm looking for a spreadsheet with Wanikani Kanji by level

Does anyone have one? A comma-separated format would be fine as well.

I sense something shady about to go down


11 Likes

I don’t think something like that should be shared

10 Likes

[image]

3 Likes

It wouldn’t be too difficult for you to make one yourself, no?

3 Likes

If there isn’t, you can go to WK stats (Items section) and copy the kanji from there. Highlight all the kanji on a level, then copy and paste them into a text file. They appear to be separated by two commas, though. If you know programming, you can take care of that and make your own CSV file.
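
A rough Python sketch of that cleanup, assuming the pasted text for one level really is kanji separated by doubled commas. The sample string and the `kanji_rows` name are made up for illustration; the real text would come from the wkstats Items page.

```python
import csv
import io

def kanji_rows(pasted, level):
    """Split pasted text on commas, dropping the empty fields that doubled separators leave."""
    fields = [field.strip() for field in pasted.split(",")]
    return [(kanji, level) for kanji in fields if kanji]

# Hypothetical paste for level 1 (assumption, not copied from the site).
rows = kanji_rows("一,,二,,三", 1)

# Write "kanji,level" rows to an in-memory buffer; swap in open(...) for a real file.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
print(buffer.getvalue(), end="")
```

Splitting on a single comma and discarding empty fields means it does not matter whether the separator turns out to be one comma or two.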

2 Likes

Copy and paste into Word, then replace all “;;” with “;” :smiley:

2 Likes

I just want to compare against the spreadsheets of Kanji per chapter on the Tobira website to get an exact measure of Wanikani level vs Tobira to balance my WK vs textbook time investment. E.g. Wanikani level X translates to knowing Y percent of Tobira kanji at chapter Z. I know programming and copy+paste from wkstats website into text file looks good enough for me.
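
The comparison itself can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `coverage_percent` name, the dictionary layout, and the toy kanji sets, which are not the real WaniKani or Tobira lists.

```python
def coverage_percent(wk_levels, tobira_kanji, level):
    """Percent of tobira_kanji taught by WaniKani at or below the given level."""
    known = set()
    for wk_level, kanji in wk_levels.items():
        if wk_level <= level:
            known.update(kanji)
    target = set(tobira_kanji)
    if not target:
        return 0.0
    return 100.0 * len(target & known) / len(target)

# Made-up toy data: WaniKani level -> kanji taught on that level.
wk_levels = {1: {"一", "二"}, 2: {"日", "月"}, 3: {"文"}}
# Made-up kanji list for one Tobira chapter.
tobira_chapter = {"一", "日", "文", "語"}

print(f"{coverage_percent(wk_levels, tobira_chapter, 2):.2f}%")
```

Running this for every (level, chapter) pair would give the X, Y, Z table described above.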

2 Likes

Ah, I’m pretty sure I’ve done that exact comparison already. Lemme rummage


Edit: No, it was Genki. Well, if you could make me a list of all the kanji in Tobira, I could do that for you.

Edit 2: Never mind, found the list on the Tobira website. Working on it now. :slightly_smiling_face:

Edit 3: And done. Same link as above. Colour-coding is same as Tobira: black = taught to read and write; blue = taught to read only; pink = read-and-write kanji that were taught as read-only on a previous level (I deleted the read-only versions of those to avoid them messing with percentages). Summary of findings: you’ve learnt more than 80% of the Tobira kanji by the time you hit level 30.

It got somewhat less un-kosher when they stopped restricting public access to the premium levels. What objection were you thinking of?

7 Likes

That was exactly my thought too. What ‘shady’ thing could possibly go down?

I suppose you are right about that

Grab material from the site + shove into Anki deck = free WaniKani.
Or worse, repackage content and offer for sale elsewhere.

Granted, it wouldn’t be too hard to utilize the API to “steal”.

1 Like

but isn’t all that info available for everyone to see anyway? If this guy really wanted to he could easily just copy the kanji from every level himself.

True, but still, should we make it easy for people that want to do that?
Taking something that would be a fair amount of work, and packaging it up all nice nice for everyone, is still a bad thing in my opinion. It would just encourage people to do the wrong thing, even if the severity of ‘wrong’ is subjective. “Boy, this would be a lot of work and I’m not sure how to go about it” turns into “oh snap, free”. This in turn leads to lost revenue, hurting WK, and us.
Is it technically stealing if it’s all there, available for free? No. Is it still shady? Yes, IMO.

1 Like

fair enough. If I was him though and serious I wouldn’t make a post here about it lol.

2 Likes

Awesome. I started to make one before I saw this, although I only had about an hour, so it does not yet include the pre-requisite kanji in the calculation. The only advantage of mine is that it breaks down the percentage by Tobira chapter.

Basically the difference is:
Mine shows Wanikani level X translates to knowing Y percent of new Tobira kanji at chapter Z

Belthazar’s shows Wanikani level X translates to knowing Y percent of all Tobira kanji (including the pre-requisite Kanji).

Tobira Chapter (New kanji) vs WaniKani Level

I will include pre-requisite kanji and add axis labels later when I have time.

2 Likes

Aye, I realised exactly what it was you were asking on my second reading. Not exactly sure how you did that calculation, but it looks very impressive.

If I could make one suggestion, though: change the data format to percentage (with at most two decimal places). Makes it much easier to read, and you simply don’t need ten decimal places. 0.2452830189 out of 503 kanji represents an accuracy of five hundred millionths of a kanji, or maybe ten picometres of a stroke. :slightly_smiling_face:
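
If the numbers come out of a Python script, the percent format specifier does this in one step. A tiny sketch using the raw value quoted above:

```python
fraction = 0.2452830189  # the raw value quoted above
print(f"{fraction:.2%}")  # prints 24.53%
```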

I wonder if Google sheets can do 3D graphs


Yeah I created it using a Python script and was very time-constrained.

If you did want to achieve something like that (human-readable multi-dimensional data) you would probably want to do something called a “break down” (for lack of a better term) using some data visualization software (or maybe some R or Python library). The basic idea is that you would input the data with however many dimensions you want:

(Wanikani level, Tobira level, Kanji name, Dimension D, etc)

And then you would just select whatever dimensions you want to display (e.g. Breakdown by WK level and Tobira level or Breakdown by Wanikani level and filter to only show Tobira level 30) using some automatically generated interactive chart. That way, even if you have 10-dimensional data, you would most likely only ever want to look at 2 dimensions at a given time, maybe with some filters on other dimensions.

(Personally I didn’t need anything this fancy but just thought I’d mention it.)
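
Without any visualization software, the same idea can be sketched in plain Python as a group-and-count. The records, the `FIELDS` mapping, and the `breakdown` function are all made up for illustration; a real tool would add the interactive chart on top.

```python
from collections import defaultdict

# Made-up records: (WaniKani level, Tobira chapter, kanji).
records = [
    (1, 1, "一"),
    (2, 1, "日"),
    (2, 2, "文"),
    (5, 2, "語"),
]
# Dimension name -> position in each record (hypothetical layout).
FIELDS = {"wk_level": 0, "tobira_chapter": 1, "kanji": 2}

def breakdown(records, by, where=None):
    """Count records grouped by the `by` dimensions, after applying equality filters."""
    where = where or {}
    counts = defaultdict(int)
    for record in records:
        if all(record[FIELDS[name]] == value for name, value in where.items()):
            key = tuple(record[FIELDS[name]] for name in by)
            counts[key] += 1
    return dict(counts)

# Breakdown by two dimensions.
print(breakdown(records, by=["wk_level", "tobira_chapter"]))
# Breakdown by one dimension, filtered on another.
print(breakdown(records, by=["wk_level"], where={"tobira_chapter": 2}))
```

Adding another dimension only means adding a field to the records and an entry to `FIELDS`; the grouping code stays the same.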

I maintain an incomplete WaniKani toolkit (mostly untouched nowadays). Sometimes I find posts like these and implement more for fun (I made a patch today).

If you have Bash or something similar:

$ git clone git@github.com:iamorphen/wktoolkit.git
$ cd wktoolkit/python

$ for level in {1..60}
> do
> ./get_subjects.py <your_wk_username> --token <your_api_v2_token> --level $level --kanji --characters-only |\
> awk -v level=$level '{ print $0 ", " level }'
> done

The loop is because I haven’t taken the time to implement the trivial repeated GET to handle pagination. Also, I don’t deal with CSV, so we can exploit the loop counter and awk (or a stream editor of your choice) to format the subjects as CSV.

This will give you

...
ć·, 1
ć·„, 1
戀, 2
期, 2
...

which you can stream to a file.
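
For anyone who would rather follow the pagination than loop over levels, here is a minimal Python sketch. It is not part of wktoolkit. It assumes the API v2 subjects endpoint at `https://api.wanikani.com/v2/subjects`, a `types=kanji` filter, bearer-token authentication, and a `pages.next_url` field in each collection response; check those against the current API documentation before relying on them. The `fetch_json` and `kanji_by_level` names are made up.

```python
import json
import urllib.request

# Assumed first page of the kanji collection.
FIRST_URL = "https://api.wanikani.com/v2/subjects?types=kanji"

def fetch_json(url, token):
    """GET one page of a collection and decode it as JSON."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)

def kanji_by_level(token, fetch=fetch_json):
    """Follow next_url links until there are none, yielding (kanji, level) pairs."""
    url = FIRST_URL
    while url:
        page = fetch(url, token)
        for subject in page["data"]:
            yield subject["data"]["characters"], subject["data"]["level"]
        url = page["pages"]["next_url"]  # assumed to be null on the last page

# Usage (needs a real token and network access):
# for kanji, level in kanji_by_level("<your_api_v2_token>"):
#     print(f"{kanji}, {level}")
```

The `fetch` parameter is only there so the paging logic can be tried against canned pages without touching the network.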

1 Like

Anyone can find an Anki deck with Wanikani online. I downloaded one so I could check the old mnemonics that they didn’t keep (KKK, Joseph Stalin, etc.).