When can I read [X]? - WaniKani Level Checker

pohlondrej said... OK, Xyresic, I love you! But I am too lazy to open the console every time I want to check my level, so I made this Google Chrome extension using your script. To use it, go to a page with kanji and click the Japanese language icon in the extension bar.

Download here : https://www.dropbox.com/s/p5en168obuvvqlz/WaniKani.crx?dl=0

Note: I am a hardware engineer. I do VHDL, C, C++. I have absolutely no idea what I'm doing here, I don't know JavaScript, and this is my first Chrome extension ever. Also, it is 1:00 AM and I did this in ~2 hours just out of curiosity. Here is a link to the GitHub repository: https://github.com/pohlondrej/WK-reading-ability-checker . Feel free to contribute.
 A fellow Computer Engineer =) I salute thee!

Chrome turns the extension off when restarted and it cannot be turned back on. Any way to get past that?

This is because of security restrictions on Windows. I have published it, so now you can install it from the Chrome Web Store:

https://chrome.google.com/webstore/detail/wanikani-reading-ability/geilaheefnofbnocgibjjdeopmmipanc

I will improve the description and add screenshots when I have more time…

 ありがとうございました。

Oh wow will definitely be checking this out.
ありがとう

Haven’t used it yet. I know you said it ignores Kanji that aren’t in WK, but should it at least have a list of “Kanji not taught” rather than them just being 100% ignored?

Syphus said... Haven't used it yet. I know you said it ignores Kanji that aren't in WK, but should it at least have a list of "Kanji not taught" rather than them just being 100% ignored?
I looked into incorporating this. My idea was to count the characters that fall within the bounds of the character space that includes kanji, then subtract the number of kanji counted by the tool. However, the CODE() function only works on ASCII characters, and the UNICODE() function was only introduced in Excel 2013, and I don't have Excel 2013. :(

But one of these fancy programmer types might have a better idea of how to do it in JS or otherwise.

You wrote this whole thing in Excel? 

Hmmm…I shall give this a think.

A five-second look shows that Sheets appears to have CODE() (and a CHAR() function), though I’m not sure that Google Docs is really any better.

Syphus said... You wrote this whole thing in Excel? 

Hmmm...I shall give this a think.

A five-second look shows that Sheets appears to have a CHAR() function, though I'm not sure that Google Docs is really any better.
In Excel at least, CHAR() generates a character from its ASCII code. CODE() is the opposite, generating an ASCII code from an input character. The comparable functions for the Unicode character space are UNICHAR() and UNICODE(). CHAR() and CODE() don't work for this task because CODE() returns the kanji's value in some kind of Japanese character space (国->118), but in reverse CHAR() seems to generate the value in the English character space (118->v). The problem is that any roman text in the input (like a lower case v) will register with the same value as the kanji (either "v" or "国" -> 118). The UNICODE() function should be able to find unique values for the different characters, but again, I don't have a version of Excel that supports it. I don't know whether Sheets does, but you should check whether its CODE() function supports Unicode, or whether Sheets has another function that does, since the function needs to evaluate western and Japanese characters differently.
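
For comparison, JavaScript works with Unicode code points directly, so the "v"/"国" collision described above should not arise there. The following is only a minimal sketch; the helper names `codePointOf` and `charFromCodePoint` are made up for illustration and are not part of any script in this thread.

```javascript
// Hypothetical helper: Unicode code point of the first character of a string
// (roughly what UNICODE() does in Excel 2013+).
function codePointOf(ch) {
    return ch.codePointAt(0);
}

// Hypothetical helper for the inverse direction (roughly UNICHAR()).
function charFromCodePoint(code) {
    return String.fromCodePoint(code);
}

console.log(codePointOf("v"));          // 118
console.log(codePointOf("国"));         // 22269 (U+56FD)
console.log(charFromCodePoint(22269));  // 国
```

Because the two characters get different values, a range test on the code point can tell kanji apart from roman text.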

As I mentioned in my edit, it seems they have both. Just looking at the documentation, it claims to use Unicode: https://support.google.com/docs/answer/3094120?hl=en

I just tried using them really quick since I’m using Sheets for work, and it seems to work okay. Or at the very least I didn’t see an immediate problem.

Awesome idea!

Here’s my version in Ruby.

BreadstickNinja said... I looked into incorporating this. My idea was to count the characters that fall within the bounds of the character space that includes kanji, then subtract the number of kanji counted by the tool. However, the CODE() function only works on ASCII characters, and the UNICODE() function was only introduced in Excel 2013, and I don't have Excel 2013. :(

But one of these fancy programmer types might have a better idea of how to do it in JS or otherwise.
Does Excel support regex? You should be able to do the same thing I did in JavaScript, with an equivalent function that tests whether a Unicode character falls within the [\u4e00-\u9faf] character class (more info: http://www.rikai.com/library/kanjitables/kanji_codes.unicode.shtml )

And it's already supported in the JS version, under the "Unknown" section.
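
A rough sketch of the regex test described above, separating kanji that are on a known list from kanji that are not. This is not the actual script from the thread: the `knownKanji` set is a made-up three-character sample and `splitKanji` is a hypothetical helper name.

```javascript
// Matches a single character in the range quoted above.
const KANJI = /[\u4e00-\u9faf]/;

// Made-up sample of "taught" kanji; a real list would come from WaniKani.
const knownKanji = new Set(["国", "日", "本"]);

// Hypothetical helper: sort the kanji in `text` into known and unknown,
// ignoring kana, punctuation and roman text.
function splitKanji(text) {
    const known = [];
    const unknown = [];
    for (const ch of text) {
        if (!KANJI.test(ch)) continue; // not in the kanji range, skip it
        (knownKanji.has(ch) ? known : unknown).push(ch);
    }
    return { known, unknown };
}

const result = splitKanji("日本語 is fun, v国");
console.log(result.known);   // [ '日', '本', '国' ]
console.log(result.unknown); // [ '語' ]
```

The `unknown` array plays the role of a "Kanji not taught" list: characters that are kanji by code point but missing from the lookup set.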
pohlondrej said... This is because of security restrictions on Windows. I have published it, so now you can install it from the Chrome Web Store:

https://chrome.google.com/webstore/detail/wanikani-reading-ability/geilaheefnofbnocgibjjdeopmmipanc

I will improve the description and add screenshots when I have more time...
You guys are all geniuses! Damn I feel like a complete idiot ...
I'm so glad that I've discovered WaniKani and its amazing community, it makes the kanji learning process way more doable!

pohlondrej thank you so much for the Chrome extension! I'm sure I'll use it everyday.
A little suggestion for the future (if you'll ever want to improve it). It would be even more awesome if you could have a WK estimate only from a selected portion of text and not the whole webpage. Maybe I'm asking too much, I don't even know if it's possible, but I thought it could be an idea.

It is definitely possible, but as I stated a few pages back, I am not a JavaScript guru - I just took the existing code and made a Chrome extension from it.

But if someone actually IS a JavaScript guru, have a look at the public repository on GitHub: https://github.com/pohlondrej/WK-reading-ability-checker

Well, I’m certainly no JavaScript guru, nor do I know how to contribute through GitHub, but you can get that functionality to work if you define these functions:

function anyTextIsSelected() {
    return selectedText() !== "";
}
function selectedText() {
    return window.getSelection().toString();
}
function allTextInDocument() {
    return document.body.innerText;
}

and change this line:
}((document.body.innerText || document.body.innerText).split(""))).filter(function (x) {
to this:
}((anyTextIsSelected() ? selectedText() : allTextInDocument()).split(""))).filter(function (x) {
When a portion of the page’s text is highlighted only that part will be tested, otherwise it will check the whole page as it does now.

Edit: Fixed formatting (I hope).

GangsterOfBoats said... Well, I'm certainly no JavaScript guru ...
Let me crown you! Now you're a JavaScript guru!

There is a story behind calling somebody a 'guru':

At our company, we were developing a native Android application. Our client designed the UI layer and did the "managed stuff"; we did the application logic. However, the client had to deal with something in Android they knew nothing about, so they hired an "Android Expert". We laughed about it for a while, because of the "we are too lazy to google the thing, let's just hire another guy who will do it for us" approach...

One day, our manager told us that we would need to port the application to iOS (as our client wished). My colleague said:
"Hey, but we don't know anything about iOS development... we need someone who knows the platform, you know... some iOS Expert."
The manager took a MacBook and handed it to my colleague: "Can you turn this thing on?"
He said, "Of course I can!" and turned it on. Then the manager said:
"Cool. From now on, you're an iOS Guru!"

Anyways, I updated the code, tested the extension and republished it in the Chrome Web Store, so now you can use it on a selection of Japanese text. Thank you, GangsterOfBoats!

It should update automatically. If it does not update, here is the URL : https://chrome.google.com/webstore/detail/wanikani-reading-ability/geilaheefnofbnocgibjjdeopmmipanc
pohlondrej said... It is definitely possible, but as I stated a few pages back, I am not a JavaScript guru - I just took the existing code and made a Chrome extension from it.

But if someone actually IS a JavaScript guru, have a look at the public repository on GitHub: https://github.com/pohlondrej/WK-reading-ability-checker
 && Breadstick Ninja for the original --

Thank you so much! I have both the Excel file and the extension (aforementioned) ready to go. I definitely think this will inspire me to keep going, as now I can look at websites or content that interests me and gauge approximately where I am.



ERROR=>FIX
I love the WK community so hard.

I was looking at Breadstick’s original file for a project of my own, found an error, came back here to report the solution, and found such wonderful implementations that I’mma use hard. I hope you guys actually see this though. Weirdly, Xyresic’s script seems just fine, which probably secures pohlondrej’s extension - so this is principally aimed at Breadstick and other users of that file.

Basically, unless my file is somehow corrupted (downloaded it again and checked on dropbox to be sure), two kanji appear twice:

32, 41
28, 46

What should appear in place of the second instances are the following omitted characters:
41
46

Hope that's correct and useful.

PS: There was interesting discussion about using UNICODE() in Excel 2013. I have Excel 2013! What might the formula be? Or, please teach me how to RegEx as it would be useful for this project, but my relevant -fu is weak.

PPS: Breadstick, this is so helpful for the corpus project I'm working on, twice over with a little customisation. Its potential is hot hot hot.
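
A repeated entry like the ones reported above can be caught mechanically. The following is only a sketch with made-up sample data (not the contents of the actual file), and `findDuplicates` is a hypothetical helper name.

```javascript
// Hypothetical helper: return every entry that occurs more than once,
// in the order in which the repeats are encountered.
function findDuplicates(items) {
    const seen = new Set();
    const dupes = new Set();
    for (const item of items) {
        if (seen.has(item)) dupes.add(item); // second or later occurrence
        seen.add(item);
    }
    return [...dupes];
}

// Made-up sample list with two repeated entries.
console.log(findDuplicates(["一", "二", "三", "二", "一"])); // [ '二', '一' ]
```

Running a check like this over the full kanji column would flag any character listed twice; finding the characters that were omitted still needs a comparison against the source lists.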

Bookmarking this for later use when I’m level 15-20 and/or feel more confident in my grammar… :smiley:

Thank you, everyone! You’re all AMAZING! すごい!

ocac said... *ERROR=>FIX*
I love the WK community so hard.

I was looking at Breadstick's original file for a project of my own, found an error, came back here to report the solution, and found such wonderful implementations that I'mma use hard. I hope you guys actually see this though. Weirdly, Xyresic's script seems just fine, which probably secures pohlondrej's extension - so this is principally aimed at Breadstick and other users of that file.

Basically, unless my file is somehow corrupted (downloaded it again and checked on dropbox to be sure), two kanji appear twice:


32, 41


28, 46

What should appear in place of the second instances are the following omitted characters:


41


46

Hope that’s correct and useful.

PS: There was interesting discussion about using UNICODE() in Excel 2013. I have Excel 2013! What might the formula be? Or, please teach me how to RegEx as it would be useful for this project, but my relevant -fu is weak.

PPS: Breadstick, this is so helpful for the corpus project I’m working on, twice over with a little customisation. Its potential is hot hot hot.

 Hey, I just saw this!!! Thank you so much for catching the error. I don’t even know how it could have got in there since I thought I built this directly from the kanji lists, but you were absolutely right. I’ve fixed the files now.

My idea surrounding the UNICODE() function would be to find the upper and lower limits of the kanji character space in Unicode to identify characters that a) are kanji but b) aren’t contained in the WK kanji list.

You’d use the VLOOKUP() function to test whether each character (including things like punctuation and English characters) is in the list of WK kanji, using “FALSE” as the fourth argument to ensure that VLOOKUP() returns only an exact match. This results in a positive match for WK kanji, but an error for both punctuation and non-WK kanji. You can use the IFERROR() function to make the cell evaluate a new formula in that error case. As the error result, you test whether the Unicode value of the character is within the kanji character space with AND(UNICODE(cell)>=X,UNICODE(cell)<=Y), where X and Y are the lower and upper kanji bounds. This identifies that the cell is a kanji rather than punctuation or something else, just one not taught by WK. You make that return some value that gets counted in the master list as a non-WK kanji (in my projects I usually call it “61” to represent all kanji that aren’t assigned a WK level). If the range test is false, the IF() returns "", a blank cell.

The whole formula would look like

=IFERROR(VLOOKUP(cell,kanji:array,2,FALSE),IF(AND(UNICODE(cell)>=X,UNICODE(cell)<=Y),61,""))

where “cell” is the cell you’re testing on the left, kanji:array is the list of kanji and their levels, and X and Y are the unicode kanji space bounds described above.
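
The same lookup-then-range-test logic can be sketched in JavaScript. Everything here is an assumption for illustration: the `wkLevels` table holds made-up levels for two sample characters, the bounds reuse the [\u4e00-\u9faf] range quoted earlier in the thread, and `levelFor` is a hypothetical helper name.

```javascript
// Stand-in for kanji:array (kanji -> level). The levels are made up.
const wkLevels = new Map([["国", 1], ["日", 2]]);

// X and Y: assumed kanji bounds, taken from the range quoted earlier.
const KANJI_LOWER = 0x4e00;
const KANJI_UPPER = 0x9faf;

// Placeholder level for kanji that are not in the list, as in the formula.
const NOT_IN_WK = 61;

function levelFor(ch) {
    // VLOOKUP(cell, kanji:array, 2, FALSE): exact match against the list.
    if (wkLevels.has(ch)) return wkLevels.get(ch);
    // IFERROR branch: AND(UNICODE(cell)>=X, UNICODE(cell)<=Y).
    const code = ch.codePointAt(0);
    if (code >= KANJI_LOWER && code <= KANJI_UPPER) return NOT_IN_WK;
    // Punctuation, kana, roman text: the blank cell.
    return "";
}

console.log(levelFor("国")); // 1 (from the made-up table)
console.log(levelFor("語")); // 61 (kanji, but not in the table)
console.log(levelFor("v"));  // "" (not a kanji)
```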

AnimeCanuck said…
Bookmarking this for later use when I’m level 15-20 and/or feel more confident in my grammar… :smiley:

Thank you, everyone! You’re all AMAZING! すごい!

 Thank you!!