[Userscript] Open Framework: Visually Similar Kanji Filter

Right here


Very excited to try this. Thanks for yet another awesome script! :purple_heart:



But Kumi was going script crazy before the competition.



It’s the Viking DNA


I see it coming

Yet I thought, “nah it can’t be”

And while clicking I was losing hope… And there it was


Honest answer: I just learned by trying to do stuff while looking at other scripts and googling


That’s great. You sure have learned a lot, I remember you started with way more simple scripts like the Kanna one.

I might be looking at your code if you don’t mind


Woah Burns (the kanna one) was actually pretty recent. I made that one in June 2018, while I started learning in October 2017. The oldest script that I still have listed is Spongebob Time Cards from November 2017, then there’s Rainbow Flairs from December 2017, and then a few more before my first notable script, the like counter, in January 2018. In total I have 51 scripts published, but only 7 with more than 100 installs, if you don’t count the pure CSS ones.


Nice work! There are several files with similar kanji using different criteria there, why did you choose the “stroke edit distance” specifically? In the Niai script I use several sources because each criterion misses some kanji that are definitely very similar.

I remember that the stroke edit distance had some issues because it occasionally includes 「一」 with a very good score, and it lists exactly 10 items, even though there are more (or some items are not similar at all).

I chose that one because it had ten similar kanji for every kanji, which makes it predictable. Also, when studying similar kanji, I feel it won’t be as effective if you study only the similar ones right after each other, so some less similar kanji help with interleaving.

Also it seemed a lot of work to use them all, unless I compiled them into one source.

I was going to ask why the script didn’t work for me, but I guess the answer is that just having a list of kanji wasn’t enough. I had to select them and also click the option to “include this source” for them to appear on the quiz. Did I get it?

Anyway, assuming that’s it, thanks very much for making this. It looks like something that will be very helpful to me.


Ah, yes. I did it this way so you can keep a list of kanji that have been problematic in the past, in case you want to study them again sometime, and not have to enter it into an input box every time (can be hard if you’re already having trouble remembering the kanji).

Yep. That’s a quirk of the self study quiz, though.

That should do it!


I mean, I was suspicious and suspected it a teeny bit, but still glad I can be surprise rickrolled. Successfully baited :smile:


Wanted to ask the exact same thing lol

@Kumirei thank you for this script, I’ll definitely use it to prepare for the JLPT because the visually similar kanji are killing me.


Although this answer discouraged me lol


I have a few questions about Visually Similar.

  • Is there any reason the GitHub file is not cached? I see network delays at startup.

  • Is there a way to populate all the items with visually similar data, as opposed to just those that are configured in the settings? I would then be able to populate the Item Inspector popups with the relevant visually similar kanji icons.

  • Are you open to making a search filter that takes a list of kanji separated by commas, like I did in the search filters I just posted? This would let people search for any list of kanji they please in a manner that is consistent with the other search filters.

Is it not? I’m fetching it with $.get(), which I believe should default to cache?

Nope. This script was very hastily made and I never bothered to polish it.

The thing is I’m not happy with the interface of this script. It’s poorly designed and poorly executed. What you suggest makes sense, but I’m not sure I want to spend the time to fix this script right now, because I would want to redo most of it.

If you want to make any changes to the script yourself, go ahead and I will merge them. I have all of my scripts on GitHub, so it would be easy to make a pull request, or if you prefer, just post the changes here. I would eventually want to have another go at this script, but it’s very low priority for me right now.


I am not familiar with $.get. I didn’t see code for caching so I assumed there was no cache. But you are right, it defaults to cache, so it is cached. Probably the network delays I saw are due to the fact I was loading for the first time.

Will do. These changes are easy and will be done quickly. I will rip out the old interface if you don’t mind. I don’t think it makes sense to keep it once there is a search by search terms. There aren’t many existing users, so I don’t risk upsetting anyone.


My changes are now on GitHub.

I got rid of the old user interface. Most of your code pertained to this interface, so I ended up replacing the bulk of it. I hope you don’t mind that it turned out this way.

I have changed the label of the filter to LY Visually Similar. ‘LY’ stands for Lars Yencken. This is to distinguish it from the WK Visually Similar search filter I wrote, which searches WK’s own visually similar data.

The new interface just takes search terms separated by commas in the search box. There is no dialog in the settings anymore.

I expect people to type with an IME. Both Latin and Japanese commas are accepted.

People may start their search with a real number between 0 and 1, separated from the search terms by a comma. Then only matches whose score in the database is greater than or equal to that number will be returned. This number is optional. The default is 0, which returns all visually similar kanji.
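To illustrate the threshold, here is a minimal sketch of the filtering step. It is not the code from the script: the data shape, the scores, and the names `similarityData` and `similarTo` are all invented for the example.

```javascript
// Hypothetical data shape: each kanji maps to a list of candidates with a
// similarity score between 0 and 1. The scores below are made up.
const similarityData = {
  '人': [
    { kanji: '入', score: 0.9 },
    { kanji: '八', score: 0.6 },
    { kanji: '一', score: 0.2 },
  ],
};

// Return the kanji similar to `kanji` whose score is at least `threshold`.
// A threshold of 0 (the default) keeps every candidate.
function similarTo(data, kanji, threshold = 0) {
  return (data[kanji] || [])
    .filter(entry => entry.score >= threshold)
    .map(entry => entry.kanji);
}

console.log(similarTo(similarityData, '人', 0.6)); // candidates scoring 0.6 or more
```

With this shape, leaving the number out is the same as passing 0, which matches the default described above.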

Both comma and period are accepted as a decimal separator for internationalization.

Both Latin and Japanese numerals are accepted to support IME input. The period and comma decimal separators may also be either Latin or Japanese.
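For anyone curious how input like this could be handled, here is a rough sketch of a parser. It is not the script's actual code. It assumes "Japanese numerals" means full-width digits (０ to ９) as produced by an IME, and the rule it uses to tell a decimal comma from a separator comma is my own guess. The names `normalize` and `parseSearch` are hypothetical.

```javascript
// Convert IME punctuation and full-width digits to their Latin equivalents.
function normalize(input) {
  return input
    // Full-width digits U+FF10..U+FF19 sit 0xFEE0 above ASCII '0'..'9'.
    .replace(/[０-９]/g, ch => String.fromCharCode(ch.charCodeAt(0) - 0xFEE0))
    // Japanese commas to a Latin comma.
    .replace(/[、，]/g, ',')
    // Japanese periods to a Latin period.
    .replace(/[。．]/g, '.');
}

// Split a search string into an optional threshold and a list of search terms.
function parseSearch(input) {
  const tokens = normalize(input)
    .split(',')
    .map(t => t.trim())
    .filter(t => t !== '');
  let threshold = 0; // default: return all visually similar kanji
  if (tokens.length > 0 && /^(0(\.\d+)?|1(\.0+)?)$/.test(tokens[0])) {
    let first = tokens.shift();
    // Assumption: "0,5,人" uses the comma as a decimal separator, so a bare
    // "0" or "1" followed by an all-digit token is read as one number.
    if (/^\d+$/.test(first) && tokens.length > 0 && /^\d+$/.test(tokens[0])) {
      first += '.' + tokens.shift();
    }
    threshold = Math.min(1, parseFloat(first));
  }
  return { threshold, kanji: tokens };
}

console.log(parseSearch('０．５、人、入'));
```

Normalizing first keeps the rest of the parsing identical for Latin and IME input.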

Explaining all this makes for a lengthy hover tip. I don’t see a way to shorten it without removing information the user needs to know.

The top post of this thread should now be rewritten to include all this.

There is a typo in Lars Yencken’s name in the top post. Yencken needs a c.

According to Lars Yencken’s site, the data is available under the Creative Commons Attribution 3.0 Unported license. This license requires that a link to the license be posted somewhere. I suggest you include it in the top post for compliance.

Thanks to the filter, Lars Yencken’s visually similar data now displays nicely in the version of Item Inspector currently in development. It works fine in exports too.

I investigated the cache because I still had network delays. I found that the GitHub server the file comes from instructs the browser to expire the cached copy after 600 seconds. For this file, that makes the cache nearly worthless. I found no way to override the server directive in the browser. I tried to set up request headers asking the server for a longer expiry time, but the server rejects these requests. I gave up on $.get() for these reasons. I now use wkof.load_file() and manually expire the cache after one month.
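Roughly, the manual expiry can be sketched like this. It is not the exact code from the script. It assumes the framework exposes `file_cache.dir` (entries with an `added` timestamp), `file_cache.delete(name)`, and `load_file(url, use_cache)`, so check the Open Framework documentation before relying on it. The names `isStale` and `loadWithMonthlyExpiry` are hypothetical.

```javascript
const ONE_MONTH_MS = 30 * 24 * 60 * 60 * 1000;

// True when a cache entry added at `addedIso` is more than a month old.
function isStale(addedIso, now) {
  return now - new Date(addedIso).getTime() > ONE_MONTH_MS;
}

// `framework` is expected to look like the wkof global (an assumption);
// passing it in as a parameter lets the function be tried out with a stub.
async function loadWithMonthlyExpiry(framework, url, now = Date.now()) {
  const entry = framework.file_cache.dir[url];
  if (entry && isStale(entry.added, now)) {
    // Drop the stale copy so the next load goes back to the network.
    await framework.file_cache.delete(url);
  }
  // The second argument asks the framework to use (and fill) its file cache.
  return framework.load_file(url, true);
}
```

In the userscript it would be called as `loadWithMonthlyExpiry(wkof, url)` with the URL of the data file, so the file is fetched from the network at most about once a month.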