ChristopherFritz's Study Log

what ever happened with your script to take the leeches and move them to the bottom of the pile…
curious because I’ve been doing nothing but reviews (no new lessons for over 3 months, just doing 75-100 reviews a day) and leeches just churn… the intervals on WK are horrible for me… and as long as WK doesn’t have a built-in manager we are pretty much screwed…

does no more SRS mean you are gonna probably drop WK? just curious… I’m tempted to just use the reorder script to learn the kanji as that’s what I really want and maybe somehow ditch all the vocab and switch something else to learn the vocab I care about (already using Kitsun for that purpose anyway)

3 Likes

I still have the script somewhere, but it was too much bother to keep using just to do lessons on even more kanji that never or almost never show up in what I read. It’s like learning medical terminology when you want to read fantasy novels — it doesn’t help much!

I haven’t done a lesson or review in forever and I didn’t set vacation mode, so right now it looks like:

image

I was using Migaku’s kanji extension for Anki for a bit, but that wasn’t helping me learn new kanji very well either: I’ve learned all the most common ones I see when reading, so I’m down to the less common (but not uncommon) ones.

9 Likes

thanks for the update :wink:

3 Likes

blink if you’ve been kidnapped

Joking aside, you dropped your 1/day plan?! But, but you had a streak!! I tried Anki for a few months, and while its review system is “forgiving” in a way that WK is not, it still didn’t do it for me. Does the SRS burnout continue? Not sure. My newest strategy is to attempt Japanese class at the local community college to see how far that gets me. And then use WK as a review system versus a learning one. Maybe reframing it in my mind will lower the affective filter.

Even with a lifetime subscription, it would be nice to make it to level 60… one day?

5 Likes

Yeah, I forgot to do reviews one day, so that went bye-bye.

The other reframe for SRS is to use it as a placeholder for content you will be encountering (which is sort of a cross between learning and reviewing). In this case, you learn the item, you add it to your SRS reviews to not forget it, and then when you encounter it not long after you recall it and create a better memory of it. But this method works best with common words, or if you know in advance what words you will encounter.

I think having a mindset of systems (reading lots) over goals (level 60) keeps me from considering eventually reaching level 60. Especially considering all the time that was freed up for reading by no longer constantly reviewing the same kanji and vocabulary that I just wasn’t learning.

7 Likes

I found I didn’t have my mpv setup recorded anywhere (as I recently had to get everything set up from scratch), so I figured I’d include it in my study log in case I need to reference it in the future.

Scripts

Migaku MPV

Installation and use guide.

Provides easy subtitle navigation (left for the previous line, and right for the next line).

Provides easy hiding/showing of subtitles (up to hide and unhide).

sub-pause

Provides auto-pause at the end of each subtitle.

Provides replaying the current subtitle.

Usage

  1. Open a video in mpv.

  2. Press n to enable auto-pause at the end of a subtitle.

  3. Press up to hide subtitles.

  4. Play video until a subtitle (or press right to jump to the start of the next one).

  5. At the end of the subtitle, the video auto-pauses.

  6. Try to work out what was said in the spoken dialogue. If unsure:

    • Press Ctrl+r to repeat the current line.
    • Press up to show the subtitle. (Afterwards, press up to re-hide subtitles before continuing.)

  7. Press space to unpause and continue.

Missing Features

Some things I may have to look into implementing:

If pausing at the end of a subtitle is enabled, the video is paused, and subtitles are visible, have it hide subtitles upon pressing space to unpause. (Purpose: Removes the need to re-hide subtitles manually.)

If pausing at the end of a subtitle is enabled, and the line gets repeated x number of times, enable the subtitle. (Purpose: Removes the need to unhide the subtitle manually, although I still can if desired. Encourages listening to the audio x number of times before checking the subtitle.)

I may remap repeat from Ctrl+r to a single key without a modifier. This one’s simple to do, but I need to see which keys aren’t already mapped.

7 Likes

I’m thinking about level 60 like I would a trip to Australia: while it would be nice to visit, it’s not on my list of places I need to check out. :sweat_smile:

Since I’m getting a master’s, I have been availing myself of the university library (hey, I’ve got to get my money’s worth!). Anyway, I’ve read some case studies about using an SRS system in the classroom as part of a learning/studying strategy, and the jury is out on how to keep a lot of individuals interested for a sustainable amount of time. The WK community is basically a microcosm of the challenge of long-term usage of an SRS system by a majority of people. Looking at the graphs, there’s always a fall-off. There’s also the question of best practices, but that’s a discussion for another day. Anyway, my point is that I envision using an Anki deck (for whatever textbook the class uses) closer to quiz times, and suspending the cards for stuff I already know. And then perhaps using WK as a general review session at strategic points, like between semesters or over the summer.

I don’t know how I did it every day for so long, but even with my second Anki attempt, I kept pushing back my reviews until the end of the day. So… I need a purpose and deadlines that don’t have anything to do with the algorithm (if that makes sense).

This issue has come up in the literature that I’ve read as the downside to SRS systems. If you’re just looking at flashcards in a vacuum without the opportunity to apply them somewhere, the usefulness/motivation, or maybe I should say interest in continuing to use them, diminishes. This is why WK is both great and not so great at what it does. The learning system has a singular focus and it does not deviate from it. And obviously the forum exists to support learning in a way that the system does not, but that doesn’t make up for learning kanji in isolation. It becomes harder to stick it out when either you’re not seeing the kanji you want to learn, or you’re learning kanji that you don’t see anywhere else.

Before I go off on yet another long tangent, there are definitely changes that could be made if the company wanted more users to make it to the end.

5 Likes

To add to this, I took some random WaniKani level 30’s kanji that I don’t recognize at a glance and ran a search of them through a bit over 200 manga volumes I’ve read. Here are the number of occurrences of these unrecognized kanji:

| Kanji | Hits | Notes |
|  | 2 | Both are 「推薦」. |
|  | 8 |  |
|  | 37 | Should recognize this one a little by now…? |
|  | 10 |  |
|  | 18 |  |
|  | 2 | Both in 「アウト満塁」. |
|  | 6 | I expected this one to be more common. |

There are plenty of level 30’s kanji on WaniKani’s list that I did recognize at a glance, so this list is just focusing on the kind of kanji that would become my leeches.

Since I don’t do production, applying for me is seeing them when reading manga.

I think the main issue faced here is that the changes to make things work for a larger audience can break the cohesion of what they have now.

Personally, I’d like something where I can feed it the kanji/vocabulary I want to focus on learning, and let it create a path of radicals, kanji, and vocabulary from that. Of course, there’s Migaku’s kanji add-on for Anki that does that, and I didn’t keep up with using that long-term.

I’ve also looked into whether I can do something similar in Kitsun, but it’s unfortunately not been flexible enough for me to utilize.

6 Likes

This. :arrow_up:
It’s fine until it isn’t.

Yes! This is what I meant by what they do well- stick to their focus and don’t deviate from it. I assume this is why they give users the opportunity to create and use scripts to customize their learning experience, but even this has limitations. It’s laudable to focus on one thing and make it “great”, as defined by its creators, but it’s also frustrating as a user when you reach a point when you need a system to do something else, but then you realize that you’re trying too hard to make it work, so maybe it’s just not the right thing for you at the moment. Or anymore.

4 Likes

All said, I still recommend WaniKani to anyone to start out with. Eventually, those who keep learning Japanese reach a point where they either know they want to continue with WaniKani or else know better how to learn kanji using another system.

Between that and the book clubs here, I’m very glad WaniKani exists in its current form!

6 Likes

Oh absolutely. There’s no ‘one size fits all’ for anything when it comes to learning. Sometimes it takes a bit of time to find the tool, or collection of tools that work best for oneself, but there’s something for everyone. And I look forward to experimenting on myself to see how SRS works in conjunction with a more formal education (versus self study).

Happy 2023. I haven’t spammed your thread in a while even though I keep up with your various projects. Okay, I just skimmed over the code stuff :face_with_spiral_eyes:, but otherwise, it’s cool to see you plugging away.

4 Likes

Wow that’s a long list of readings!! Well done! It’s very inspiring, I can’t wait to get there!

6 Likes

Also I completely share your view on WK vs learning Japanese:)
Now that I’ve learned all the N5 and N4 kanji, I don’t see the point in rushing anymore, and I have slowed down a lot to free up time for more grammar and immersion.
So far I still plan on making it to level 60 one day, but let’s see; maybe when my reading is more advanced I’ll see it the way you do and drop it completely:)

5 Likes

Just popped in to say thanks for posting that. Looks really cute and looks like a good read in-between harder material. Both volumes are also on sale on Bookwalker right now in case anyone else is interested.

6 Likes

This weekend’s work: new code.

The following issues exist within my manga frequency lists:

  1. Non-content pages (such as the copyright page) are included in the vocabulary extract. Sometimes I can skip these based on the page’s filename, but other times I cannot.

  2. Imperfections in recognizing whether a word is valid or not allow misparsed words to make it into the vocabulary lists.

  3. Character names are often included in the lists. I consider names a special case and would prefer to exclude them from the frequency lists and word counts.

  4. Page count isn’t factored in. The first volumes of orange and ARIA the Masterpiece each have about 3,900 total words, but the former spans about 175 pages, the latter almost 300.
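To put rough numbers on that fourth point, here is a small sketch using only the approximate figures quoted above. The function name is made up for illustration and is not part of my actual tooling.

```python
def words_per_page(total_words: int, page_count: int) -> float:
    """Average number of words per page for one volume."""
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    return total_words / page_count


# Approximate figures from the list above (both volumes: about 3,900 words).
orange = words_per_page(3900, 175)
aria = words_per_page(3900, 300)

print(f"orange: {orange:.1f} words/page")
print(f"ARIA the Masterpiece: {aria:.1f} words/page")
```

The same total word count works out to roughly 22 words per page for one volume and 13 for the other, which is why a raw word count alone says little about how dense a volume feels.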

My updated process (being developed) gets me close to addressing all these items.

The “fun” part of excluding non-comic pages (such as the cover page, the copyright page, and the chapter title pages) is the manual work per volume, as I have to collect the file names of which pages to skip over. Likewise for compiling lists of character names to exclude from the vocabulary lists. But it might be worth it in the long run.

It’s a bit difficult to provide a visual of this, but the current progress outputs files that look something like this:

Words:

{
  "dictionaryWords": {
    "私": 44,
    "みんな": 23,
    "翔": 132,
    "思う": 19,
    "後悔": 8,
    ...
    "減る": 1,
    "つづく": 1
  },
  "nonDictionaryWords": {
    "フツー": 1,
    "アハハ": 2,
    ...
    "バイバーイ": 1,
    "NGU": 1
  },
  "pageCount": 155
}

Stats:

{
  "stats": {
    "uniqueWordCount": 743,
    "totalWordCount": 2555,
    "pageCount": 155,
    "wordsPerPage": 16,
    "newUniqueWordsCount": 239
  },
  "checkpoints": {
    "4%": 1,
    "7%": 2,
    "8%": 3,
    ...
    "98%": 680, 
    "99%": 705, 
    "100%": 731
  }
}
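For anyone curious how a stats file like this could be derived from the words file, here is a minimal Python sketch. It is an illustration under assumptions, not the code I'm running: it assumes the counts come only from "dictionaryWords", that "wordsPerPage" is a rounded average, and that each checkpoint records how many of the most frequent words are needed before cumulative coverage first reaches that (rounded-down) percentage. "newUniqueWordsCount" is left out because it would need data from earlier volumes.

```python
from collections import Counter


def volume_stats(dictionary_words: dict[str, int], page_count: int) -> dict:
    """Build a stats structure shaped like the sample above (assumed semantics)."""
    counts = Counter(dictionary_words)
    total = sum(counts.values())

    # Walk the words from most to least frequent and note the first rank
    # at which each whole-number coverage percentage is reached.
    checkpoints: dict[str, int] = {}
    running = 0
    for rank, (_word, count) in enumerate(counts.most_common(), start=1):
        running += count
        percent = running * 100 // total
        checkpoints.setdefault(f"{percent}%", rank)

    return {
        "stats": {
            "uniqueWordCount": len(counts),
            "totalWordCount": total,
            "pageCount": page_count,
            "wordsPerPage": round(total / page_count) if page_count else 0,
        },
        "checkpoints": checkpoints,
    }


# Tiny made-up sample; the real files hold hundreds of words.
sample = {"私": 44, "みんな": 23, "思う": 19, "後悔": 8, "減る": 1, "つづく": 1}
result = volume_stats(sample, page_count=6)
print(result["stats"])
print(result["checkpoints"])
```

Because a single high-frequency word can jump coverage by several points, some percentages never appear as keys, which matches the gaps in the sample output (4%, 7%, 8%, and so on).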

Probably the worst part of this is that once I have all this information in JSON format, and with the various JavaScript I’ve learned recently, I can streamline my process for tracking learned words and the words I should learn next. (It’s “the worst part” because I’m already in the middle of too many projects as it is…)

I intended to be reading manga lots today, but that’s clearly not happening!


Edit: And of course now I have a very basic script to take a frequency list, load it in a browser, let me mark which words I know (one list applies across all series), and tell me what percentage of total words from the volume I know.
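The percentage itself is simple to compute. Here is a minimal sketch of the calculation in Python (the real script runs in the browser, so this only illustrates the idea, and the function and variable names are hypothetical):

```python
def known_coverage(frequencies: dict[str, int], known_words: set[str]) -> float:
    """Percentage of a volume's total word occurrences that are known words."""
    total = sum(frequencies.values())
    if total == 0:
        return 0.0
    known = sum(count for word, count in frequencies.items() if word in known_words)
    return 100 * known / total


# Made-up frequency list for one volume, plus a shared known-words list.
volume = {"私": 44, "みんな": 23, "思う": 19, "後悔": 8}
known = {"私", "みんな"}

coverage = known_coverage(volume, known)
print(f"{coverage:.1f}% of the words in this volume are known")
```

Weighting by occurrences rather than by unique words means a handful of very common known words can account for most of a volume.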

I suppose now I’ll kind of have to clean it up sometime, and add it to my site.

image

4 Likes

Following my last post, I’ve put a preliminary version online.

It’s not designed with the intention for others to use at this time, but for anyone who wants to see an early version of what I may or may not add to my site for manga word tracking, here are some sample pages:

2 Likes

Couldn’t you search for words in these pages and skip them? Like 第ー章 or manga title, author, publisher, stuff like that. I don’t know how you’re doing the code, but it seems plausible.

Also for names, parsing stuff like X-くん, or comparing it to a name database, and excluding those results.

3 Likes

Good ideas all around. Alas, some of them have already gone into consideration and didn’t work out.

I’ve considered that, but I have this potentially unreasonable fear that I’ll land on a word that legitimately appears in dialogue in a series.

There’s also catching (for example) pages that are ads for another series:

The final piece is that if I want to get a count of pages with content (excluding non-content pages between chapters), it would be complex to write an algorithm that can distinguish between a blank between-chapters page with the manga logo on it, and an actual content page that has no dialogue but also has the manga logo.

The only weakness here is that it’ll miss names without the suffix.

image

However, it could be a good way to auto-generate a list of all the names in a volume to put into my exclusion list for a series, as that relies only on a name suffix being used at least once. I’ll have to consider this.
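As a rough idea of what suffix-based detection could look like, here is a Python sketch built on a plain regular expression. This is an assumption-heavy illustration: it only looks at runs of kanji or katakana directly before a few kana honorifics, and the honorific list and all names are chosen just for the example. A tokenizer-based approach would behave differently.

```python
import re
from collections import Counter

# Honorifics to look for (an illustrative subset, not a complete list).
HONORIFICS = ("さん", "くん", "ちゃん", "さま")

# A run of kanji, or a run of katakana, immediately followed by an honorific.
NAME_BEFORE_HONORIFIC = re.compile(
    r"([\u4e00-\u9fff々]+|[\u30a1-\u30faー]+)(?:" + "|".join(HONORIFICS) + r")"
)


def find_name_candidates(lines):
    """Count every kanji/katakana run that appears right before an honorific."""
    candidates = Counter()
    for line in lines:
        for match in NAME_BEFORE_HONORIFIC.finditer(line):
            candidates[match.group(1)] += 1
    return candidates


# Made-up dialogue lines for demonstration.
dialogue = [
    "高木さん、おはよう",
    "西片くんは高木さんに勝てない",
    "サナエちゃんも来る?",
]
candidates = find_name_candidates(dialogue)
print(dict(candidates))
```

A pattern like this will also pick up non-names (for example, the 母 in お母さん), so the result is a list of candidates to curate rather than a finished exclusion list.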

For now, I’ve been looking up web pages with names of all the characters in a series to quickly put together a list, but some smaller series just don’t have anything like that out there. The alternative has been to look at my generated lists for high-frequency words I don’t know (easy to find when I’ve filtered out the words I do know) and then check if those are names.

One thing I am doing is using a vocabulary word database I put together to exclude words that aren’t in the database. This actually does a decent job at removing a lot of names, but my database has some common names in it simply due to the various sources I extracted it from. I should probably download a names database and compare the two to find names in my database to remove.
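Both steps boil down to set membership checks. A minimal Python sketch, assuming the vocabulary database and a downloaded names database can each be loaded as a plain set of strings (the function names and sample data are hypothetical):

```python
def split_by_database(frequencies: dict[str, int], vocabulary_db: set[str]):
    """Separate words found in the vocabulary database from those that are not."""
    in_db = {w: c for w, c in frequencies.items() if w in vocabulary_db}
    not_in_db = {w: c for w, c in frequencies.items() if w not in vocabulary_db}
    return in_db, not_in_db


def names_in_vocabulary(vocabulary_db: set[str], names_db: set[str]) -> set[str]:
    """Entries present in both databases: candidates for manual review."""
    return vocabulary_db & names_db


# Made-up sample data.
frequencies = {"思う": 19, "翔": 132, "フツー": 1}
vocabulary_db = {"思う", "翔", "律"}
names_db = {"翔", "律", "高木"}

dictionary_words, non_dictionary_words = split_by_database(frequencies, vocabulary_db)
overlap = names_in_vocabulary(vocabulary_db, names_db)
print(dictionary_words, non_dictionary_words, overlap)
```

The overlap is something to review by hand rather than delete blindly, since an entry can be both a name and ordinary vocabulary.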

Another consideration is that in some cases names are also valid vocabulary words. Although I mean real names, there are also manga characters with intentional names, such as うさぎ.

image

(At one point I forgot this guy’s name was 律, and couldn’t figure out why someone suddenly was talking about the law in a way that made no sense whatsoever.)

I feel like the more progress I make in projects, the more work I end up giving myself!

Edit: The newly-added code for locating names based on さん etc. is already catching names I missed (where they are written in kana rather than kanji):

からかい上手の高木さん 1
{'高木': 91, '西片': 2, 'ホ': 17, '中井': 1, 'おかげ': 1, 'R': 1, 'サナエ': 2, '本屋': 1, '表紙': 1}

からかい上手の高木さん 2
{'高木': 70, 'ホ': 22, '、': 1, '高尾': 1, '君': 1, 'サナエ': 1, 'ユカリ': 1, '中井': 4, '真野': 1, '私': 1, '9': 1, '.': 1}

からかい上手の高木さん 3
{'ホ': 45, '高木': 63, 'ダンディ': 1, '西片': 2, '中井': 3, '木村': 2, '真野': 1, 'ホラユカリ': 1, 'ユカリ': 1, 'サナエ': 1}

からかい上手の高木さん 4
{'高木': 61, 'ホ': 25, '高尾': 1, '西片': 1}

からかい上手の高木さん 5
{'ホ': 17, '高木': 57, '真野': 2, '中井': 3, 'サナエ': 1, '北条': 2}

からかい上手の高木さん 6
{'ホ': 13, '高木': 47, '私': 1, '中井': 14, '真野': 9, 'ちゃん': 1, 'オレ': 1, '地蔵': 1, '天川': 2, '北条': 3, '.': 1, '西片': 2}

からかい上手の高木さん 7
{'西片': 5, '高木': 46, 'ホ': 17, '木村': 1, 'か': 1}

からかい上手の高木さん 8
{'高木': 45, 'もらっと': 1, '北条': 1, 'ホ': 11, 'ミナ': 1, 'なかい': 1, '中井': 1, '私': 1}

からかい上手の高木さん 9
{'高木': 47, 'ホ': 9, 'ユカリ': 1, 'ちゃん': 1, '西片': 1, '真野': 1, '中井': 1, 'あんた': 1, '北条': 5, 'オレ': 1, '私': 1, 'サンタ': 1, '木': 1}

からかい上手の高木さん 10
{'ホ': 10, '高木': 35, 'ほめるおこるかみつくおばさじ': 1, '真野': 1, '私': 1, '西片': 2, '月本': 3}

A tiny bit of curation is still needed, but this will save me a good amount of time.

3 Likes

Right, this is unfortunate, but I guess it does happen in most languages.

That’s the fun part. :eyes:

3 Likes

Once again today, I have ~~done lots of reading~~ spent a lot of time coding.

I’m getting closer and closer to something that feels like it can completely replace my Google Sheets for tracking my known words, and more importantly knowing which words I should target learning next.

Recent developments:

Links to vocabulary lists now appear on the stats/frequency list page for a series:

This gives me easy access to the highest frequency words that I don’t yet know, as well as to marking which I do know:

image

And I finally have a page that lists all the words I’ve marked as known, and lets me unmark in case I marked one as known by mistake:

image

Top to-do items:

  1. Add an option to export/import the known words list.
  2. Add volumes for more series.
  3. Add series vocabulary pages that list vocabulary frequency for a whole series.
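Items 1 and 3 are both small in principle. Here is a hedged Python sketch of each, assuming per-volume frequency lists are simple word-to-count mappings and that the known words list can be stored as a JSON array (the site itself runs on JavaScript, so all names here are illustrative only):

```python
import json
from collections import Counter


def series_frequency(volumes):
    """Sum per-volume word counts into one frequency list for the series."""
    total = Counter()
    for volume in volumes:
        total.update(volume)
    return total


def export_known_words(known_words: set[str]) -> str:
    """Serialize the known words list as a sorted JSON array."""
    return json.dumps(sorted(known_words), ensure_ascii=False)


def import_known_words(text: str) -> set[str]:
    """Load a known words list previously produced by export_known_words."""
    return set(json.loads(text))


# Made-up per-volume counts.
volume_1 = {"私": 44, "思う": 19}
volume_2 = {"私": 30, "後悔": 8}
series = series_frequency([volume_1, volume_2])
print(series.most_common())

known = {"私", "思う"}
restored = import_known_words(export_known_words(known))
print(restored == known)
```

Keeping the export as a flat list matches the idea of one known-words list that applies across all series.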

Maybe I’ll read a book club chapter a day early so I don’t miss a day of reading today.

6 Likes