Wanikani Phonetic-Semantic Composition 1.0.5 [No longer supported]

Wanikani Phonetic-Semantic Composition - Userscript

by ruipgpinheiro (LordGravewish)

THIS SCRIPT IS NO LONGER SUPPORTED BY ME.

Look at this post for further information.

Description:

It seems that many kanji were created using a process called
phonetic-semantic composition. This process joins two (or more) components
(radicals): one (or more) of them, usually called the bushu or dictionary
section header, establishes the meaning of the kanji, and another one,
the phonetic component, establishes the (on’yomi) sound.

This means that a lot of kanji have a built-in mnemonic that I
haven’t seen being referred to in Wanikani, and so it’s quite useful to
know some of them, especially when having trouble with a specific
reading!

For example (using non-Wanikani kanji names):
反・はん “to rebel” (“anti” by Wanikani mnemonics)

飯・はん “rice”

版・はん “print”

板・はん “a board”

坂・はん “slope”

販・はん “sale”

叛・はん “to betray”

As you can see, these kanji all use the first one as a phonetic
component, placing it to the right of the semantic component (phonetic
components are mostly drawn right-most). Due to the evolution of the
language, the pronunciation of many such kanji has since changed slightly
(仮・か “temporary”), but knowing this information can still be a major help.

This script imports into Wanikani a
database of over 100 phonetic components with over 400 regular Kanji
that use their on’yomi reading.
This means that over a fourth of Wanikani’s Kanji should be included in here somewhere and have
a “built-in mnemonic” of sorts. Depending on how you study, it could be
a huge help (or no help at all - you decide what’s best for your
brain). The information will be shown on the Kanji info page, during
reviews (if you check the details for a Kanji) and during lessons,
provided the relevant Kanji is included in the database.

Note that the database used in this script was automatically
generated from a PDF file, and even though I tried to check it for
mistakes, it is possible that it contains an error or two.
This userscript contains the whole Kanji table from Hiroko Townsend’s thesis
about phonetic components, which means the script’s database includes
143 different phonetic components encompassing 417 regular kanji (kanji
that use the on’yomi reading from the phonetic component) and 210
irregular ones (kanji that use a different reading, though with,
supposedly, similar roots). Some of these Kanji aren’t
available on Wanikani, but they’ll still be shown by the
userscript as they are part of its database.

This script is also a sort of experiment in adding information to Wanikani.
Now that most of the hard work has been done, I’ll be able to
add/modify information quite easily, and maybe fulfill a couple common
requests from the community. Of course, this is all if I ever feel in
the mood (for me, programming is a hobby, so once I finish a project
that takes a couple of days - like this one -, I rarely feel like doing
more for some time).


More screenshots available on the userscripts.org download page

Changelog

1.0.4 (23 January 2014)
- Now supports the HTTPS protocol.

1.0.3 (24 November 2013)
- Corrected 症, which has the wrong reading in the thesis used for creating the DB. It’s now listed as irregular with the reading しょう, even though its phonetic component can also (rarely) be read the same way.

1.0.2 (24 November 2013)
- Fixed a bug in the code that automatically generated the DB, which would misread phonetic components with a single, irregular kanji, like 刃, and put them inside the DB entry of the previous phonetic component.
Therefore, the DB was regenerated from scratch. Updated the DB count in the description accordingly.

1.0.1 (23 November 2013)
- Kanji links now open in a new tab, to fix a bug where clicking them would just restart the current reviews/lessons session.

1.0.0 (22 November 2013)
- First release.

DISCLAIMER:

I am not responsible for any problems caused by this script.
This script was developed on Firefox 25.0.1 with Greasemonkey 1.12.
It was also tested on Chrome 31.0.1650.57 with Tampermonkey 3.5.3630.77.
Because I’m just one person, I can’t guarantee the script works anywhere else.

LICENSE INFORMATION:

This script contains a database of phonetic components adapted under Fair Use (for nonprofit educational purposes) from 

Phonetic Components in Japanese Characters
by Hiroko Townsend,
Master of Arts in Linguistics, San Diego State University, 2011 
Thank you Hiroko Townsend for the very useful thesis! Obtain a complete copy of it at http://sdsu-dspace.calstate.edu/bitstream/handl… 

The rest of the script is licensed under GPLv3. More information at http://www.gnu.org/licenses


Download Version 1.0.5:

This script can be installed here. Note that this is an old version (the script is no longer supported by me) and as such it will likely not work.

Warning for Chrome users: Due to some limitations imposed on us by Chrome's userscript API (and because I developed the script primarily for Firefox), it uses some stuff only available in the Greasemonkey userscript engine (and derivatives, like Tampermonkey). You should install the Tampermonkey extension for Chrome. Once you try to install my userscript, you should get a choice to install it using Tampermonkey, and if you choose that, everything should work fine!


… this may be one of the best userscripts of all time! Thanks! (will test it later though since I’m still sleepy… zzz)

Wow! This is awesome! Thank you!

This makes some of the kanji readings disappear. 

http://i.imgur.com/hbpptz2.jpg

Huh. That's super weird. I'll do some reviews now and hopefully I'll be able to reproduce it. I did about 150 reviews yesterday without any problem using this script, though, and didn't notice this.

Sorry for the inconvenience!

Any further details? What browser do you use? Was it shown correctly before you expanded the item info dialog (to show both Reading and Meaning)? Did it start happening on every Kanji, or only on that specific one? After a long reviews session, or right during the beginning of the session?

I’m using Chrome. It’s for all kanji at the moment. When I fail a kanji meaning and open the item info the meaning is shown but when I expand the info to show the reading it is blank. When I fail the reading and open the item info the reading shows up for a split second and then disappears. 

It’s in the beginning of a review session, haven’t used it very long. 

For ALL kanji? O.o
Really weird! I can't reproduce it here. What versions of Chrome and Tampermonkey do you have? Does that information appear correctly in the Kanji pages (for example this one), including the information added by the script? Could you try redownloading the script to make sure you have the same version I do? I originally screwed up the upload of version 1.0.1 (for like 10 seconds, then I fixed it, so I doubt this is the cause, but this way I can make sure it is not).

What other userscripts/addons do you have that affect Wanikani? Could you try disabling them all and see whether the problem still occurs, or post the list here so I can do it myself?

Ok, I figured it out. Sorry for making you try to reproduce it, it’s actually the fault of the “Transform on’yomi readings into katakana” userscript.
When I disable it your script works fine.

Thanks! I didn't use that script so I couldn't reproduce it. I'll install it to try and fix (or at least work around) the incompatibility. Or does the script actually cause problems without mine?

EDIT: Just tested it, the problem only occurs on Chrome, but the kana still disappear even with my script disabled. Not my fault, then!

I know there’s probably an easy guide somewhere around here, but I can’t seem to find it so I guess I’ll ask here. How do you install a userscript/extension? I’ve never used one before, so a guide would be appreciated.

Firefox:
Install the Greasemonkey addon. Open the link to the userscript you want to install, and Greasemonkey will ask you for confirmation or let you look at the source. Just press "Install" and you're done.

Chrome:
Depends on whether the userscript is natively compatible with Chrome. If so (not the case here), just open it and Chrome will ask you for confirmation and then install it as an add-on. If not (since my scripts are made primarily on Firefox/Greasemonkey, this is the case), install the Tampermonkey addon. Open the link to the userscript and it'll ask you whether to install natively or through Tampermonkey. Choose Tampermonkey, and you're done.

You need to reload affected pages for the userscripts to take effect.

ruipgpinheiro said... Thanks! I didn't use that script so I couldn't reproduce it. I'll install it to try and fix (or at least work around) the incompatibility. Or does the script actually cause problems without mine?

EDIT: Just tested it, the problem only occurs on Chrome, but the kana still disappear even with my script disabled. Not my fault, then!
Sorry about that, it seems I made a mistake while updating my userscript. I'm looking into it.

By the way, thank you for all your amazing userscripts ruipgpinheiro! 


ooo, thanks for info ruip.

Hotfix update for the DB. I noticed that some phonetic components were showing as irregular kanji in the sets belonging to different phonetic components. My code that automatically generated the DB from Hiroko Townsend’s thesis had a bug in that completely irregular phonetic components with a single member were being misinterpreted as members of the previous phonetic component.

1.0.2 (24 November 2013)
- Fixed a bug in the code that automatically generated the DB, which would misread phonetic components with a single, irregular kanji, like 刃, and put them inside the DB entry of the previous phonetic component.
Therefore, the DB was regenerated from scratch. Updated the DB count in the description accordingly.

Greasemonkey/Tampermonkey should automatically update the script for you.

… something isn’t quite right with this picture:


http://www.wanikani.com/kanji/%E7%97%87

The on’yomi of 症 isn’t せい but しょう. Weird.

The table in Hiroko Townsend’s thesis (the one I used for the script) lists that one as having せい as a reading. Must be a mistake? Every dictionary seems to list that one as しょう, so it seems unlikely that’s an alternative reading.

Shame! Exceptions are always annoying :P

EDIT: Ah, I understand the problem now, I think. It might be a mistake, but it’s not that it’s irregular, just that it uses a different reading from the phonetic component: 正 has an on’yomi reading しょう that’s not used much, and the more common reading せい.

I’ll change that one manually as if it were irregular, even though this isn’t the exact case.
Thanks for the report!

EDIT2:
Here you go:

1.0.3 (24 November 2013)
- Corrected 症, which has the wrong reading in the thesis used for creating the DB. It’s now listed as irregular with the reading しょう, even though its phonetic component can also (rarely) be read the same way.

Woah, that was quick! Great work! ;D

:)

The DB is very simple and easy to edit, so fixes like this are almost instantaneous. The script then generates the text/HTML it shows dynamically from the information in the DB, so it requires absolutely no changes.

An example entry in the DB (all 143 of them are listed exactly like this near the end of the script source file):
{
    phonetic:  "皮", reading:"ひ",
    regular:   ["彼","被","疲","被","披"],
    irregular: [["破","は"],["波","は"]]
}
The hard part in making the DB was converting it from the tables in the .pdf file automatically. The code for that is not included in the released script (it's pretty ugly, filled with tons of huge cryptic regular expressions haha). It took me quite some time to get that working correctly...

I think this latest version of the DB has been converted from the tables without any mistakes. Any remaining mistakes (like the one you pointed out) are probably present in the thesis, too.
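
To illustrate how display text can be derived from a single DB entry, here is a minimal sketch. It is not the script's actual rendering code: `describeEntry` is a hypothetical helper, and it assumes that the `regular` and `irregular` arrays may be missing from an entry.

```javascript
// Hypothetical helper, not taken from the script:
// builds a plain-text summary from one DB entry.
function describeEntry(entry) {
    // Assumption: both arrays are optional, so default to empty lists.
    const regular = entry.regular || [];
    const irregular = entry.irregular || [];

    const lines = [`Phonetic component ${entry.phonetic} (${entry.reading})`];
    if (regular.length > 0) {
        lines.push(`Regular: ${regular.join(" ")}`);
    }
    if (irregular.length > 0) {
        // Each irregular item is a [kanji, reading] pair.
        const pairs = irregular.map(([kanji, reading]) => `${kanji} (${reading})`);
        lines.push(`Irregular: ${pairs.join(" ")}`);
    }
    return lines.join("\n");
}

const example = {
    phonetic: "皮", reading: "ひ",
    regular: ["彼", "被", "疲", "披"],
    irregular: [["破", "は"], ["波", "は"]]
};
console.log(describeEntry(example));
```

Because the output is computed from the entry each time, correcting a reading in the DB changes what is displayed without touching any other code.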

The thesis tables seem to be missing a few components. Any strong radicals that involve enclosures appear to have been categorically ruled out, so there are kanji using some components listed that are omitted too. If you’re looking to expand it at all, Kanji Damage has a similar list. You’d have to go through each item in the strong radical index not already in your database and check the cross references (or just throw it into EDICT’s multi radical lookup and pull out the appropriate results).

As an example, the following spring to mind at a glance:
曽 ソウ in 層
竟 キョウ in 境
麻 マ in 魔

If you’re willing to throw me a plaintext copy of the database, I could even do the lookup myself and just list a bunch of additions here.

The script itself is awesome though and fits perfectly with the rest of the site. I’m now going to find myself wishing this were a native feature so I’ll have it on Android too. :)
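
The cross-check suggested above (finding components that are not yet in the database) could be sketched roughly as follows. The names `db` and `missingComponents` are hypothetical, and the candidate list is typed in by hand here rather than pulled from Kanji Damage or EDICT.

```javascript
// Hypothetical stand-in for the script's database (a single entry for brevity).
const db = [
    { phonetic: "半", reading: "はん", regular: ["絆", "拌"] }
];

// Returns the candidate components that have no entry in the database yet.
function missingComponents(db, candidates) {
    const known = new Set(db.map((entry) => entry.phonetic));
    return candidates.filter((component) => !known.has(component));
}

console.log(missingComponents(db, ["曽", "麻", "半"]));
```

With the one-entry stand-in above, 曽 and 麻 are reported as missing while 半 is not.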

Glad you find it useful!

It’s likely the database is missing a lot of stuff. There might be better places to pull stuff from, I just found this one and stuck with it, since it seems to be quite good, though it’s missing a lot of cases where, for example, a radical can be used regularly for two different on’yomi (something that happens a lot).

I still want to add the same functionality to the radical pages/reviews/lessons, too. Having that information on the ones that match would also be pretty useful.

I’m really busy with university work so I don’t see myself working more on this (other than possible bug fixes) until the end of January, once exams end. If you really want to work on it, be my guest! I’d actually be honored :)

To sum up quickly: open the userscript source code and navigate to line 505, where the database begins. You can edit it directly in the script, refresh Wanikani, and test the changes. I left the database in quite a readable form on purpose, since it’s easier to read, debug, and add stuff.

It consists of an array of entries like the following (separated by commas):

{
    phonetic:  "半", reading:"はん",
    regular:   ["絆","拌"],
    irregular: [["伴","はん/ばん"],["判","ばん"]]
}
phonetic - A string containing a single character, which is the radical. If there is no Unicode character representing the radical (because it’s obsolete), then type “obsolete” (look at the end of the database for a few examples). In the example above, it’s 半.

reading - A string containing the regular reading, as hiragana. In the example above, it’s はん.

regular - An array of kanji that use the regular on’yomi reading. Optional (for example, if there are no regular Kanji). In the example above, both 絆 and 拌 use the regular reading.

irregular - An array of arrays with 2 entries - the first one is the Kanji, the second one is the reading(s) (in hiragana). Optional (for example, if there are no irregular Kanji). In the example above, I list 伴 with reading(s) はん/ばん, and 判 with ばん.
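
As a rough illustration of how the fields described above fit together, the following sketch looks up a kanji in an array of entries. The names `db` and `findByKanji` are hypothetical; this is not the lookup code used by the script.

```javascript
// Hypothetical stand-in for the database array, using the example entry above.
const db = [
    {
        phonetic: "半", reading: "はん",
        regular: ["絆", "拌"],
        irregular: [["伴", "はん/ばん"], ["判", "ばん"]]
    }
];

// Returns the phonetic component and reading for a kanji, or null if it is not listed.
function findByKanji(db, kanji) {
    for (const entry of db) {
        // The component itself and the regular kanji share the regular reading.
        if (entry.phonetic === kanji || (entry.regular || []).includes(kanji)) {
            return { phonetic: entry.phonetic, reading: entry.reading, isRegular: true };
        }
        // Irregular kanji carry their own reading(s) as the second pair element.
        const match = (entry.irregular || []).find(([k]) => k === kanji);
        if (match) {
            return { phonetic: entry.phonetic, reading: match[1], isRegular: false };
        }
    }
    return null;
}

console.log(findByKanji(db, "絆"));
console.log(findByKanji(db, "判"));
```

Both optional arrays fall back to an empty list, so entries that omit `regular` or `irregular` do not need special handling.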

If you want to add your own stuff, I’d suggest going to the end of the database (line 1003), adding a comment separating the old entries from yours (look at the comment on line 942, for example), and just add stuff :)
Make sure to post your changes here! I’d love to improve the script even more.
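
Before posting new entries, it may help to check them against the format described above. The following is only a sketch: `validateEntry` and its helpers are hypothetical and not part of the script, and the hiragana ranges in the regular expressions are an assumption about which characters the readings contain.

```javascript
// Hypothetical validator; not part of the original script.
// Assumption: readings use only characters from the Unicode hiragana block.
const HIRAGANA = /^[\u3041-\u309F]+$/;
// Irregular readings may list alternatives separated by "/", e.g. "はん/ばん".
const HIRAGANA_LIST = /^[\u3041-\u309F]+(\/[\u3041-\u309F]+)*$/;

function isSingleChar(value) {
    // Array.from splits by code point, so rare kanji outside the BMP count as one character.
    return typeof value === "string" && Array.from(value).length === 1;
}

// Returns a list of problems found in the entry (empty if it looks well-formed).
function validateEntry(entry) {
    const problems = [];
    if (!isSingleChar(entry.phonetic) && entry.phonetic !== "obsolete") {
        problems.push("phonetic must be a single character or \"obsolete\"");
    }
    if (typeof entry.reading !== "string" || !HIRAGANA.test(entry.reading)) {
        problems.push("reading must be a hiragana string");
    }
    for (const kanji of entry.regular || []) {
        if (!isSingleChar(kanji)) {
            problems.push(`regular item is not a single kanji: ${kanji}`);
        }
    }
    for (const pair of entry.irregular || []) {
        const ok = Array.isArray(pair) && pair.length === 2 &&
            isSingleChar(pair[0]) &&
            typeof pair[1] === "string" && HIRAGANA_LIST.test(pair[1]);
        if (!ok) {
            problems.push(`irregular item is not a [kanji, reading] pair: ${JSON.stringify(pair)}`);
        }
    }
    return problems;
}

console.log(validateEntry({
    phonetic: "半", reading: "はん",
    regular: ["絆", "拌"],
    irregular: [["伴", "はん/ばん"], ["判", "ばん"]]
}));
```

Returning a list of problems instead of throwing on the first one makes it easier to fix a hand-edited entry in a single pass.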