Anki Word Frequency Inserter: Learn most common words first

This website/script inserts word frequencies from the InnocentCorpus (5000+ novels) into your Anki cards. That way, you can choose more common words to learn first. (Enables sorting!)

Yomichan shows you these frequencies for most words (that aren’t too rare).
This number tells you how often the word occurs in a corpus of ~5000 books.
Anything over 10k is very common; anything below 100 is rather rare (私: ~900k, 新聞: ~30k, とろ火: 72).
(Below 100 doesn’t mean obscure, though; the word can still be useful.)
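
As a rough sketch of that rule of thumb in code: `describeFrequency` is a made-up helper, and the "in between" label for everything from 100 to 10k is my own wording, not something Yomichan or the corpus defines.

```javascript
// Hypothetical helper that applies the rule of thumb above.
// Thresholds come from the post; the middle label is an assumption.
function describeFrequency(count) {
  if (count > 10000) return "very common";
  if (count < 100) return "rather rare";
  return "in between";
}

// Approximate counts quoted above.
console.log(describeFrequency(900000)); // 私
console.log(describeFrequency(30000));  // 新聞
console.log(describeFrequency(72));     // とろ火
```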

Yomichan can even export these frequencies with its Anki export feature (which is great; see the plus sign). However, it wraps them in HTML, which makes them hard to sort by. Also, it can’t add frequencies to existing Anki cards.

Requirements

  • The Anki addon AnkiConnect needs to be installed (which should already be the case if you use Yomichan and its Anki export).
  • Anki needs to be running.
  • Your notes need to have a field FrequencyInnocent (you can change that name if you’re technically minded / can use the browser console).
    If your notes don’t have that field yet, you can add it in Anki via Tools → Manage Note Types → Fields.
  • You should close the Anki Browse window while the changes are being made. I think the worst that can happen is that the currently opened card will not be updated. I tested this with ~900 changes and everything else was fine.
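
For the curious: talking to AnkiConnect boils down to POSTing small JSON objects to a local port, which is why Anki needs to be running. The sketch below is not the inserter’s actual code. `buildRequest`, `buildFrequencyUpdate` and `invoke` are made-up helper names, and the URL is AnkiConnect’s usual default address, which is an assumption about your setup. `findNotes` and `updateNoteFields` are real AnkiConnect actions.

```javascript
// Assumed default AnkiConnect address; adjust if your setup differs.
const ANKI_CONNECT_URL = "http://127.0.0.1:8765";

// Build the JSON body AnkiConnect expects: action name, API version, params.
function buildRequest(action, params) {
  return { action, version: 6, params };
}

// Body for setting one field on one note. Field values are sent as strings.
function buildFrequencyUpdate(noteId, frequency, fieldName = "FrequencyInnocent") {
  return buildRequest("updateNoteFields", {
    note: { id: noteId, fields: { [fieldName]: String(frequency) } },
  });
}

// Send a request and return the result, or throw if AnkiConnect reports an error.
async function invoke(action, params) {
  const response = await fetch(ANKI_CONNECT_URL, {
    method: "POST",
    body: JSON.stringify(buildRequest(action, params)),
  });
  const { result, error } = await response.json();
  if (error) throw new Error(error);
  return result;
}
```

With Anki running, something like `await invoke("findNotes", { query: "deck:Yomichan" })` should return an array of note ids, which could then be passed to `buildFrequencyUpdate` one at a time.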

Disclaimer

This script should be very safe, since it only updates the FrequencyInnocent field of notes, and only if that field already exists. Still, please back up your Anki collection via File → Export beforehand; it’s a good idea anyway. Use the script at your own risk: I won’t be responsible for changes to your Anki decks. The code is public though, so you can check it here.

Using this to sort/search your Anki cards by frequency in Browse

You can either use another addon like Advanced Browser to be able to sort by custom fields.

Or, if you just want to search without sorting or addons, you can use a query like deck:Yomichan FrequencyInnocent:9___ (3 underscores), which will find all cards with frequency 9xxx. For frequency >10k, use _____* (5 underscores + * wildcard).
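
The underscore patterns are easy to get wrong by one character, so here is a small sketch that builds them. The helper names are made up for illustration; the only thing it relies on is that in Anki’s search `_` matches exactly one character and `*` matches any number of characters.

```javascript
// Pattern for a fixed number of digits with known leading digits,
// e.g. ("9", 4) gives "9___" (frequency 9xxx).
function digitPattern(leading, totalDigits) {
  const remaining = Math.max(0, totalDigits - leading.length);
  return leading + "_".repeat(remaining);
}

// Pattern for "at least this many digits", e.g. 5 gives "_____*".
function atLeastDigits(minDigits) {
  return "_".repeat(minDigits) + "*";
}

// Combine deck, field name and pattern into one Browse query.
function fieldQuery(deck, field, pattern) {
  return `deck:${deck} ${field}:${pattern}`;
}

console.log(fieldQuery("Yomichan", "FrequencyInnocent", digitPattern("9", 4)));
console.log(fieldQuery("Yomichan", "FrequencyInnocent", atLeastDigits(5)));
```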

To learn the most frequent words first, I select some cards → right click → Reschedule → place in review queue (0/0).
There’s probably a smarter way, since this makes the first interval 3 days for me, so I have to mark it ‘Again’ on the first review.

This is also just nice information, even if you don’t want to always learn the most frequent words first. The frequency column is nice to have in the Browse window.

Further usage and technical information

See the GitHub repository sschmidTU/anki-frequency-inserter (inserts Japanese word frequencies from the InnocentCorpus into your Anki notes/cards).

My other website: wtk-search

It has its own thread.

Enjoy :slight_smile: And of course, feel free to leave any feedback here.

Yet another immensely useful tool. Can’t wait to try it out! I still use Multi-Radical Kanji Search daily (by the way, are you still actively adding kanji to it?)

Thank you, that’s great to hear :slight_smile:
Yes, I’m still adding kanji to the kanji search. It’s at 3071 now, and in fact I was about to start adding a whole bunch more very soon :slight_smile: (~60 from the Aozora frequency list, and more)
I actually haven’t encountered any kanji in the wild lately that weren’t already in my search, but I’ll always keep adding any that I find!

Oh, this looks very interesting and useful. I’m still an Anki novice, but I’ll give this one a go! ^>^ Thanks!

I added an improvement: if no frequency is found, it checks the Furigana field of the card (if it exists) and extracts the dictionary entry and frequency from that.

This gave me 50 more frequency updates after my first batch of ~900, which only looked up the Front of the card. (Often the miss is just due to me messing with the Front field, but, you know.)
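
Roughly, the fallback works like the sketch below. This is an illustration and not the script’s real code: the helper names are made up, the frequency table is a tiny stand-in using the rough counts from the first post, and it assumes Anki-style furigana such as 新聞[しんぶん], where readings sit in square brackets.

```javascript
// Tiny stand-in frequency table (rough counts quoted in the first post).
const frequencies = new Map([
  ["新聞", 30000],
  ["とろ火", 72],
]);

// Assumes Anki-style furigana: drop bracketed readings and any spaces.
function wordFromFurigana(furigana) {
  return furigana.replace(/\[[^\]]*\]/g, "").replace(/\s+/g, "");
}

// Try the Front field first; fall back to the Furigana field if present.
function lookupFrequency(fields, table) {
  const front = (fields.Front || "").trim();
  if (table.has(front)) return table.get(front);
  if (fields.Furigana) {
    const word = wordFromFurigana(fields.Furigana);
    if (table.has(word)) return table.get(word);
  }
  return null; // nothing found in either field
}

// Front was edited by hand, so only the Furigana field still matches.
console.log(lookupFrequency({ Front: "newspaper", Furigana: "新聞[しんぶん]" }, frequencies));
```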

This is version 1.1.2 now. Note that Chrome likes to keep using an old cached version of the website’s JavaScript instead of updating it, so try a guest/incognito window or another browser (restarting Chrome also seems to do it). Firefox seems more willing to do a hard refresh (Ctrl+F5).

By the way, you can change the field names in the console via ankiInserter.ankiFuriganaFieldName etc. Just be careful: when you change ankiInserter.ankiFrequencyFieldName, you also need to change ankiInserter.ankiSearchQuery. The page will warn you anyway, though.
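
A sketch of keeping the two settings in sync when renaming the field. `renameFrequencyField` is a hypothetical helper, and the stand-in object (including its query string) is invented so the snippet runs outside the page; on the website the real `ankiInserter` already exists and its default query may look different.

```javascript
// Stand-in for use outside the page. The query string here is an assumption,
// not the tool's real default.
const inserter = globalThis.ankiInserter ?? {
  ankiFrequencyFieldName: "FrequencyInnocent",
  ankiSearchQuery: "deck:Yomichan FrequencyInnocent:",
};

// Hypothetical helper: rename the frequency field and replace the old
// field name inside the search query so the two stay consistent.
function renameFrequencyField(target, newName) {
  const oldName = target.ankiFrequencyFieldName;
  target.ankiFrequencyFieldName = newName;
  target.ankiSearchQuery = target.ankiSearchQuery.split(oldName).join(newName);
  return target;
}

renameFrequencyField(inserter, "Freq");
console.log(inserter.ankiFrequencyFieldName, "|", inserter.ankiSearchQuery);
```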

The next big step will be inserting frequencies for (individual) kanji; there’s separate frequency data for kanji as opposed to vocab, also from InnocentCorpus. Since kanji and vocab cards are difficult to separate, I’ll probably have to make the user say what their kanji deck is named, so that it doesn’t accidentally use the vocab frequency.

Another idea: insert the WK kanji and vocab level into your Anki cards. I’ve put a lot of WK items into Anki, for example because I burned them on WK and then forgot them, and I’d like to know how many. I already have both fields in my notes; it’s just a hassle to fill them out. It’s also interesting to know whether a kanji or vocab is on WK or not. If it’s not, I put 0 in the level field.
(By the way, jisho.org’s WK information is outdated, but still an indication, I guess.)


next comment in preparation:

  • v1.1.8: Implemented stripping the HTML from the Front field before looking up the frequency.
    This didn’t find any new frequencies for my Anki decks, but it might for yours.
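
A minimal sketch of what stripping the HTML can look like, assuming a simple regex approach is good enough for typical card fields. `stripHtml` is a made-up name and this is not necessarily how the script does it; in a browser, parsing the field with DOMParser would be more robust.

```javascript
// Remove anything that looks like an HTML tag, turn non-breaking space
// entities into normal spaces, then trim the result.
function stripHtml(fieldValue) {
  return fieldValue
    .replace(/<[^>]*>/g, "")
    .replace(/&nbsp;/g, " ")
    .trim();
}

// A Front field that picked up some formatting in the Anki editor.
console.log(stripHtml("<div><b>新聞</b></div>"));
```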