Export of WaniKani radicals, kanji, and vocab?


#1

To avoid having to query the APIs myself: has anyone already exported the data that is in WaniKani?

I want to build an Anki deck based on the Accel World light novel, link it back to WaniKani content where that content already exists, and work out how much doesn’t exist.

I can’t find anything via searching the forums and I’m just trying to avoid doing the work myself :slight_smile:


#2

Ok, I’ve hacked together a node.js script to download the resources for radicals, kanji, and vocabulary.

The GitHub project https://github.com/baerrach/wanikani_exporter has the script, instructions on how to use it, and a Dec 2014 export of radicals.json, kanji.json, and vocabulary.json.

I’ll see about getting this added to https://github.com/WaniKani as well.



#3

Thanks this is exactly what I was looking for :slight_smile:


#4

I did my best to follow the instructions, and yet failed miserably. 
The error I get is:

    SyntaxError: Unexpected token <
        at Module._compile (module.js:439:25)
        at Object.Module._extensions..js (module.js:474:10)
        at Module.load (module.js:356:32)
        at Function.Module._load (module.js:312:12)
        at Function.Module.runMain (module.js:497:10)
        at startup (node.js:119:16)
        at node.js:929:3

Any idea as to what I did wrong? Thanks

#5
mlsacg said... I did my best to follow the instructions, and yet failed miserably. 
The error I get is:

SyntaxError: Unexpected token <
    at Module._compile (module.js:439:25)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Function.Module.runMain (module.js:497:10)
    at startup (node.js:119:16)
    at node.js:929:3

Any idea as to what I did wrong? Thanks

I’m running node --version
v0.10.35

Can you paste in the commands you have done and where it is failing?

The alternative is to not bother at all!
Just use the *.json files that I have already exported.
I don’t believe there have been any changes since I did that export.

#6

In the end I used the Anki exporter to retrieve the text files, which I imported into a spreadsheet. All I wanted was an offline list of what I'd learnt so far. 

Still, I'm curious as to where it went wrong:


    Last login: Fri Jan 23 12:13:06 on console
    junior:~ mlsacg$ node wanikani_export.js radicals

    module.js:340
        throw err;
              ^
    Error: Cannot find module '/Users/mlsacg/wanikani_export.js'
        at Function.Module._resolveFilename (module.js:338:15)
        at Function.Module._load (module.js:280:25)
        at Function.Module.runMain (module.js:497:10)
        at startup (node.js:119:16)
        at node.js:929:3

    junior:~ mlsacg$ cd /Users/mlsacg/wanikani
    junior:wanikani mlsacg$ node wanikani_export.js radicals

    /Users/mlsacg/wanikani/wanikani_export.js:5
    <!DOCTYPE html>
    ^
    SyntaxError: Unexpected token <
        at Module._compile (module.js:439:25)
        at Object.Module._extensions..js (module.js:474:10)
        at Module.load (module.js:356:32)
        at Function.Module._load (module.js:312:12)
        at Function.Module.runMain (module.js:497:10)
        at startup (node.js:119:16)
        at node.js:929:3
    junior:wanikani mlsacg$




#7

Did you git clone my repository?
Can you have a look inside the wanikani_export.js file?
I suspect it is a 404 error HTML page rather than the actual JavaScript file, since the real script contains no <!DOCTYPE html>.

Oh, and the exporter doesn’t export just your stuff, it exports everything.
It still needs an API key to find everything though.
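
If you want to check from node itself whether a downloaded file is secretly an HTML error page, a throwaway helper like this works (hypothetical, not part of the exporter):

```javascript
// Heuristic check: does this "script" actually start like an HTML page?
function looksLikeHtml(text) {
  return /^\s*(<!DOCTYPE|<html)/i.test(text);
}

// Usage against the suspect file:
// var fs = require('fs');
// console.log(looksLikeHtml(fs.readFileSync('wanikani_export.js', 'utf8')));
```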


#8

Hi! Quick question!
Does it extract the vocab audio as well? 
If not, do you guys know if there is any WaniKani Anki vocabulary deck with audio in it?
I would really like to have a system like in Core2000 but with the vocab of wanikani on it. 
That way I could practice my recalling from audio, and my production, all in one anki pack. 
Thanks! 


#9

The vocab audio is not available in the API.
And there is no easy way to work out the audio link: it’s an Amazon S3 URL with a long unique identifier.

You can scrape this stuff yourself with some scripting.
I’ve not yet got around to trying it, but another third-party API tool is doing it via JavaScript and jQuery;
here is the bit I’ve pulled out for when I get around to it (I forgot which one I stole this bit from):

    $.get('https://www.wanikani.com/vocabulary/' + word, function(data) {
      var html = $(data);
      var audio = $('audio source[type="audio/mpeg"]', html);
      if (audio.length == 0) {
        return false;
      }

When I find some spare time, I’m looking to contribute to http://wanikanitoanki.com/ so it can reference the audio links.
I’m making use of the critical items exporter to pound on those kanji I just keep getting wrong.
Damn you 日 or 日
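
If jQuery isn’t handy, the same extraction can be done in plain node with a regex. This is a fragile sketch: the sample markup is my guess at the vocabulary page structure, and a real scraper should use a proper HTML parser.

```javascript
// Sketch: pull the first audio/mpeg <source> URL out of a vocabulary
// page's HTML. Regex-based and fragile; the sample markup below is an
// assumption about the page structure, not the real page.
function extractAudioUrl(html) {
  var match = html.match(/<source[^>]*type="audio\/mpeg"[^>]*src="([^"]+)"/i);
  return match ? match[1] : null;
}

var sample = '<audio><source type="audio/mpeg" src="https://example-s3-bucket/abc123.mp3"></audio>';
console.log(extractAudioUrl(sample)); // → https://example-s3-bucket/abc123.mp3
```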


#10
baerrach said... The vocab audio is not available in the API.
And there is no easy way to work out the audio link: it's an Amazon S3 URL with a long unique identifier.

You can scrape this stuff yourself with some scripting.
I've not yet got around to trying it, but another third-party API tool is doing it via JavaScript and jQuery;
here is the bit I've pulled out for when I get around to it (I forgot which one I stole this bit from):

    $.get('https://www.wanikani.com/vocabulary/' + word, function(data) {
      var html = $(data);
      var audio = $('audio source[type="audio/mpeg"]', html);
      if (audio.length == 0) {
        return false;
      }

When I find some spare time, I'm looking to contribute to http://wanikanitoanki.com/ so it can reference the audio links.
I'm making use of the critical items exporter to pound on those kanji I just keep getting wrong.
Damn you 日 or 日

Getting that audio into an Anki deck would be really, REALLY awesome. I don't have any idea about coding and stuff, so I can't really keep up with this. If you manage to make it work, please let me know! I'm really looking forward to it. (Maybe email? :S "mr.rash.boy@gmail.com")

Anyway thanks!
Mario.