[Userscript] Niai 似合い Visually Similar Kanji

Ah, thanks! Found it :slight_smile:

Is there an API to get visually-similar Kanji/Hanzi? I am working on my Chinese project.

Not yet, at the moment it is a DB with similar kanji for the jouyou kanji (and a ~80 MB matrix with all mutual scores). But I can imagine it as a service, and even auto-generate missing lists for a kanji query on demand. The latest version generates PNGs of all kanji in different fonts and calculates the similarity based on mutual vs differing pixels, so as long as the characters are included in fonts you could compare anything with everything automatically.
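
To make the "mutual vs differing pixels" idea concrete, here is a minimal sketch. It assumes the score is shared inked pixels divided by all inked pixels (a Jaccard ratio); the exact formula in the script is not stated above, so treat that as an assumption. The toy 5x5 bitmaps and the names `glyph_pixels` and `pixel_similarity` are made up for illustration and stand in for the rendered PNGs.

```python
# Toy 5x5 bitmaps standing in for rendered glyph images ('#' = inked pixel).
# These are illustrative shapes, not real font renderings.
GLYPH_A = ["#####", "..#..", "..#..", "..#..", "#####"]
GLYPH_B = ["..#..", ".###.", "..#..", "..#..", "#####"]
GLYPH_C = ["#####", "#...#", "#...#", "#...#", "#####"]


def glyph_pixels(rows):
    """Return the set of (row, col) positions that are inked."""
    return {
        (r, c)
        for r, row in enumerate(rows)
        for c, ch in enumerate(row)
        if ch == "#"
    }


def pixel_similarity(a, b):
    """Assumed score: shared inked pixels / all inked pixels (Jaccard ratio)."""
    pixels_a, pixels_b = glyph_pixels(a), glyph_pixels(b)
    union = pixels_a | pixels_b
    if not union:
        return 1.0  # two empty images are treated as identical
    return len(pixels_a & pixels_b) / len(union)


if __name__ == "__main__":
    print(pixel_similarity(GLYPH_A, GLYPH_B))
    print(pixel_similarity(GLYPH_A, GLYPH_C))
```

With several fonts, one could compute this per font and average the results, which is again only a guess at how the fonts might be combined.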

I really like this script, it’s one of my favourites.

I noticed that it rarely has the kanji I have mistaken it with, though, haha. I almost always remember the right side of the kanji, but mix up the left side, whereas the database (is that what it is?) that comes with the script almost always has a list of kanji with the left side the same, but the right side changed.

I like how you can add your own kanji mistakes, though! Super useful. I change the number-y thing to 0.99 to just show my own mistakes.

What do you mean by that?

Thanks! The similar kanji come from different sources, and you can experiment a bit with the “Original Sources” setting in the options. When it is enabled, it adds more kanji that are related by a “logical” relation (like how many strokes must be changed). The “new sources” focus on real visual difference, like how many pixels must be changed to turn one kanji into another.

If you want to focus on differences in, for example, 誰椎推崔進堆, you can also give my other script, Keisei, a try. The right-hand side is often helpful for guessing the reading of a kanji as well, and the left-hand side (or rather the “real radical”) is related to the meaning of a kanji.

I prefer to see lots of similar kanji to get an idea of what the future problems will be, but manually editing everything is also an option. There is a “manual database” with score 1.0 though, so even with 0.99 some stuff from me might show up.
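
As a rough sketch of how the threshold interacts with the score-1.0 entries: the function below keeps every candidate whose score is at least the threshold and treats manual entries as score 1.0. The name `visible_candidates`, the `>=` comparison, and all scores in the example are assumptions for illustration, not values from the script.

```python
def visible_candidates(scored, manual, threshold):
    """Return the kanji to display, highest score first.

    scored:    dict mapping kanji -> similarity score in [0, 1] (made-up values)
    manual:    kanji from a manual list, assumed to count as score 1.0
    threshold: minimum score an entry needs to be shown (assumed inclusive)
    """
    merged = dict(scored)
    merged.update({kanji: 1.0 for kanji in manual})
    kept = [(kanji, score) for kanji, score in merged.items() if score >= threshold]
    kept.sort(key=lambda item: -item[1])  # stable sort, best score first
    return [kanji for kanji, _ in kept]


if __name__ == "__main__":
    scored = {"椎": 0.62, "推": 0.58, "進": 0.31}  # invented scores
    print(visible_candidates(scored, ["維"], 0.99))
    print(visible_candidates(scored, ["維"], 0.5))
```

Under these assumptions, a threshold of 0.99 hides everything that was scored automatically and leaves only the manual entries.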

Thanks for this plugin.

I was doing my kanji lessons and opened the settings, but I couldn’t change that number. How come that happens?

By the way… why doesn’t 次 come up in the similar kanji list when I learned 欠? It’s probably the most obvious one… I had to add it myself. We just learned it a lesson before!

What happens when you try to change the number? Just no difference, or some more serious problem? You can try to open the console (Ctrl+Shift+I in Chrome, maybe F12 elsewhere, then “Console”) and look for red error messages while changing the number.

For 欠+次 you have to enable the “Original Sources” in the settings; the pair is included there. The problem with the new method is that it measures how many pixels are shared, and in your example the main part is compressed a bit to make space for the water (ice?) radical. In the end there are not enough mutual pixels for the pair to count as similar …

The original sources contain more things that are of questionable similarity, but they cover interesting things like that pair.

Oh, really? That’s the algorithm? Wow. Now I understand why I get so many “non” similar kanji… mmm. I guess using the radicals they use would add more similar ones, but well, thanks!!

Version 1.1.3, adjusted the sorting of elements a bit, especially elements from Keisei should now appear more prominently.

I used to be able to delete all but the one or two kanji that I felt were similar but as of recently only a handful will delete before the delete buttons just stop working. Assuming this is a bug?

Fixed now in 1.1.4. Under some circumstances kanji at the end could survive any elimination attempt (a regression in 1.1.3).

Amazing! Just tested myself and all looks to be working. Really appreciate it.

Is there any chance a similar radicals feature could be added? I just came across 亜, and it was a bit of a faff to compare it with 覀 because they’re not actually connected in any way. I don’t know if it’s a common issue, but I do seem to confuse radicals enough times that it would be useful (夂 and 久 caused me a lot of pain). Even if they had to be manually added by the user, I wouldn’t mind (it might actually be beneficial to have to add them manually…) :slight_smile:

You want to add the similar kanji section to radical pages as well, right? I was considering something similar to improve the kanji suggestions (first divide the kanji into two parts, then check the similarity of each part/radical), but not for the radicals themselves yet.

I will think about it, but I’m not sure if it is a big enough deal to generate a list. Adding an option to show an empty list to add stuff manually might be the better option.

Some things:

  1. Many radicals are kanji themselves, so you can already see similar kanji if you go to the corresponding kanji page of a radical
  2. 久 appears exactly once, how can it cause trouble? :slight_smile: When in doubt it is guaranteed to be 夂. [Btw, 夂 is related to 攴, I imagine it to be something like “beating stuff with a stick”.]
  3. 亜 is a full-size kanji (and has only one compound in WK, 悪), 覀 is very likely crammed somewhere in-between. When in doubt, 覀 :slight_smile:

If you use + to add the original kanji again, this happens, and it’s not removable.

[Screenshot]

(happened by accident)

Thanks, fixed now in version 1.1.6. There was no check for adding the same kanji again, so the duplicate got the same treatment as the head item (which cannot be removed).

Note that there is no automatic fix for an existing DB at the moment, you need to reset it (reset the item only with the undo-looking button) to get rid of the double city :slight_smile:
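
The kind of check that was missing can be sketched as below. This is only an illustration of the idea, not the script's code: `add_similar` is a hypothetical name, and the list is modelled as a head kanji plus a plain list of similar kanji.

```python
def add_similar(head, similar, new):
    """Return the similar-kanji list with `new` appended.

    The list is returned unchanged when `new` is the head item itself
    or is already present, so no duplicate entry can be created.
    """
    if new == head or new in similar:
        return list(similar)
    return list(similar) + [new]


if __name__ == "__main__":
    print(add_similar("欠", ["次"], "欠"))  # re-adding the head item is ignored
    print(add_similar("欠", ["次"], "吹"))  # a genuinely new kanji is appended
```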

Yeah I reset it immediately, it’s not a big issue, thanks.

I’ve been seeing kanji from lower levels still being marked as locked. Is this a bug?

[Screenshot]

^ 想 and 暗 are both level 13, but I’m on level 16 and they’re still locked.

The script is very simple: it tries to read the number in the level bubble on top of the page and remembers it. I tried to restrict the script to load only when necessary, so the only chance to grab a new level is when you visit a kanji page, like https://www.wanikani.com/kanji/一. If you never go there the level will be stuck :slight_smile:

You have to go to a kanji page once per level in the current implementation. The items marked as locked are on your current level and above. :upside_down_face:
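
A small sketch of that behaviour, under stated assumptions: the stored level is refreshed only when the URL is a kanji page, and an item counts as locked when its level is at or above the stored level. The function names and the dashboard URL in the example are hypothetical; the real script reads the level bubble from the page rather than taking it as an argument.

```python
def remembered_level(stored, url, level_on_page):
    """Refresh the stored level only on kanji pages (assumed URL check)."""
    if "/kanji/" in url:
        return level_on_page
    return stored


def is_locked(item_level, stored_level):
    """Items on the stored level and above are shown as locked."""
    return item_level >= stored_level


if __name__ == "__main__":
    stored = 13
    # Visiting the dashboard at level 16 does not refresh the stored level,
    # so a level-13 kanji still looks locked.
    stored = remembered_level(stored, "https://www.wanikani.com/dashboard", 16)
    print(stored, is_locked(13, stored))
    # Visiting any kanji page once picks up the new level.
    stored = remembered_level(stored, "https://www.wanikani.com/kanji/一", 16)
    print(stored, is_locked(13, stored))
```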

I can change the script to load on more pages like the dashboard as well; that is more robust anyway. I’m also considering using the WK framework, at least optionally (for example, the kanji meanings are taken from a private database, not WK, so if WK changes something they will differ).