[Userscript]: Anime Context Sentences

There is also Youglish, but I am not sure whether it has an API that a userscript could use. Also, its content is neither anime nor movies, but real speech (though perhaps mostly monologue).

Also unlike ImmersionKit,

  • Sound files cannot be extracted, to use in Anki for example.
  • Rewinding is easy, and the audio is continuous, not chopped into segments.

I found this from this Discord, btw.

1 Like

Yeah, I’m aware of Youglish.

The problem with YouTube is simply the lack of good subtitled Japanese content. Very often you end up with personal vlogs and VTuber videos that may be mildly interesting, but lack the context (interesting visuals and characters) that reinforces memory.

Visual novels and drama share the same problem, but to a lesser extent. Anime just has this combination of dramatic scenes, poppy colors, and exaggerated intonation that helps you remember the word or the phrase.

Speaking of updating the userscript… I’ve added a ton more anime (and drama) to the API, but since this userscript filters by whitelist, all the new content is filtered out. Maybe @psdcon can update it when they’re back.
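To illustrate why new titles disappear under a whitelist, here is a minimal sketch. The title names, the `title` field, and the function are all hypothetical; this is not the userscript's actual code or the API's real response shape.

```javascript
// Hypothetical whitelist baked into a script release.
const whitelist = new Set(["Title A", "Title B"]);

// Keep only sentences whose source title is on the whitelist.
function filterByWhitelist(sentences, allowedTitles) {
  return sentences.filter((s) => allowedTitles.has(s.title));
}

// Hypothetical API results, including one title added after the release.
const sentences = [
  { title: "Title A", text: "..." },
  { title: "Newly Added Title", text: "..." },
];

const visible = filterByWhitelist(sentences, whitelist);
console.log(visible.map((s) => s.title)); // the newly added title is dropped
```

Anything not listed when the script was released is silently dropped, which is why the whitelist itself has to be updated.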

8 Likes

Not to be nitpicky, but it’s about the original website.

  • Sound files aren’t really continuous, and some segments in the middle, presumably judged to be silent, are missing.

I know it is impossible to audit every segment to check whether the vocabulary appears, and in the correct form, but there are some instances where Youglish does better.

This is from Homophone Dictionary, but Jisho only has 会う (listing 遭う as an alternative form).

(I have also just noticed the thing about <title> or og:title meta tags, but well, Jisho failed on that too.)

Another important case: not all Japanese vocabulary is written in Kanji, or even in Kana.

Again, somehow, Youglish got this right.

Of course, there are cases where both fail. I can’t think of a good example right now, but I guess the text is first split into words with MeCab.

Somehow, Youglish wins again.

Also, not all vocabulary has a Jisho entry, phrases in particular. Some manual labor might be needed to fix this.

I am also thinking of using community manual labor, or perhaps only my own, to fix this, even if only partially.

1 Like

Just a quick reply: exact searches are what you want for such cases.

https://www.immersionkit.com/dictionary?keyword=「遭う」

It’s noted in the search section. I know most people won’t read it, and I’ve been thinking of adding an exact search toggle. That will make it more obvious.
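As a rough sketch, an exact-search link like the one above can be built by wrapping the keyword in 「」 before putting it in the `keyword` query parameter. The helper name `exactSearchUrl` is made up for illustration.

```javascript
// Build an Immersion Kit dictionary URL that requests an exact search
// by wrapping the keyword in Japanese corner brackets.
function exactSearchUrl(keyword) {
  const url = new URL("https://www.immersionkit.com/dictionary");
  url.searchParams.set("keyword", `「${keyword}」`);
  return url.toString(); // the keyword is percent-encoded in the result
}

console.log(exactSearchUrl("遭う"));
```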

I would refrain from tagging searches as “winning” or “losing”. It isn’t a case of not returning results, but of what is shown and whether that is what the user/learner wants.

From a learner’s perspective, it’s fine to return inflections, for example 泣く could return 泣いた、泣かない. But how about なく? What should that return? I don’t recall anyone saying there is a problem with jisho.org when they provide 20 entries for なく, or other forms of あう when searching for 遭う, but some find that issue with immersion kit.

On a side note, Sudachi is used, not MeCab.

I agree there are some cases where a different parser or a different search algorithm would make sense, and in fact I have added quite a number of hard mappings on top of the Sudachi parser. I guess if you were to sit someone down for a full-time summer job to go through the parsing of the 10k most common vocab, that would be helpful to the site.

https://youglish.com/pronounce/かんとく/japanese?

^ Just raising an example to point out problems with different parsing.

Well, I already open-sourced the early data on GitHub, and Jo Mako has all the data on his public spreadsheet, so you’re welcome to patch the data.

2 Likes

It’s more a matter of Kanji choice, actually. Since this is a Kanji learning site, it can matter.

Also, I didn’t really test the UserScript, so I can’t tell if it would fail, but I don’t think 「」 is a part of the script.

I would consider making a PR. The relevant data is perhaps in the word_base_list key of /resources/*/*/data.json. I also noticed that I don’t really need to use the API directly: looking at the sound key and prepending your static base URL is enough.

That being said, the PR may or may not be accepted, but either way I can still make my own search engine.
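A minimal sketch of that idea, assuming each entry in data.json carries a `word_base_list` array of base forms and a `sound` file name. The entry shape, the `sentence` field, the sample data, and the base URL are all assumptions and have not been checked against the repository.

```javascript
// Assumed shape of data.json entries (not verified against the repo).
const entries = [
  { sentence: "事故に遭った。", word_base_list: ["事故", "に", "遭う", "た"], sound: "clip_001.mp3" },
  { sentence: "友達に会った。", word_base_list: ["友達", "に", "会う", "た"], sound: "clip_002.mp3" },
];

// Placeholder; replace with wherever the audio files are actually hosted.
const BASE_URL = "https://example.com/media/";

// Return entries whose base-form list contains the word exactly,
// each paired with a full audio URL built from its sound key.
function findByBaseForm(data, word, baseUrl) {
  return data
    .filter((e) => e.word_base_list.includes(word))
    .map((e) => ({ sentence: e.sentence, audioUrl: baseUrl + e.sound }));
}

const hits = findByBaseForm(entries, "遭う", BASE_URL);
console.log(hits);
```

Matching on the base form keeps inflections such as 遭った while still telling 遭う apart from 会う.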

2 Likes

I think it’s fine to return results with とりあい for 取り合い, とにかく for 兎に角, or はずかしめ and 恥ずかしめ for 辱め. I know some might want only the exact search result returned, but I think that is a compromise that works better for learning purposes.

I’m not responsible for the userscript. For the Game Gengo textbook, in the “extra examples” section, some items use regular search, some use exact search, and some are hand-picked from the database. That is a better approach.

Edit: I archived the old repo and added the link to the latest one.

1 Like

Released v1.1.1

  • Supports the new titles on immersionkit
  • Updated to v1.4 of Item Info Injector

Thank you @Sinyaven and @mathewthe2 for answering the questions in the thread, and for the updates and new titles. You’re awesome!

P.S. @mathewthe2 I won’t implement the sentence length from the API since the user can sort by length, or scroll past the short ones. :slight_smile:

8 Likes

sorry if this has been answered before, i’ve done a quick scan and can’t find it

sometimes when new titles get added, all of the other titles which i’d selected in that list get deselected? even though i’d not done anything or even gone into the settings at all. so this time, in the ghibli list everything was fine. the other titles were selected as i’d left them, and the new ones were where they should be. in the (tv) anime list, the new ones had come in, but all of the other ones had been deselected. last time i remember the same thing happening to both lists, not just the one like this time

is this a common thing or is it just me :sweat_smile: again, sorry if this has been brought up before

Hello :slight_smile: Good question! I’ll be honest, I don’t know how to preserve settings between updates, and I justified it by saying at least if the title choices are lost, this will prompt people to open settings and see the new titles! Not very user friendly, I know. :sweat_smile: I’ll do more tests before the next update to see if I can avoid it.
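One possible approach, sketched under assumptions: store the choices per title, and on update merge the saved choices with the current title list so that only the new titles fall back to a default. The function name and the settings shape below are hypothetical, not how the script currently stores its settings.

```javascript
// Merge saved per-title choices with the current list of titles.
// Titles the user has already seen keep their saved value;
// titles that are new in this update get the default value.
function mergeTitleSettings(saved, currentTitles, defaultValue = true) {
  const merged = {};
  for (const title of currentTitles) {
    const known = Object.prototype.hasOwnProperty.call(saved, title);
    merged[title] = known ? saved[title] : defaultValue;
  }
  return merged;
}

// Hypothetical saved state from before the update.
const saved = { "Title A": true, "Title B": false };
const currentTitles = ["Title A", "Title B", "Title C"]; // "Title C" is new

const merged = mergeTitleSettings(saved, currentTitles);
console.log(merged);
```

Titles removed from the list are dropped by the merge, and whether new titles should default to selected or deselected is a design choice.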

2 Likes

no problem, thanks!

I just came to the forum after thinking that the WaniKani context sentences are a bit bland and poorly implemented, and stumbled upon this amazing script! Thank you so much for creating this, it really breathes new life into my lessons! And of course also a big thank you to the creators and contributors of immersionkit.com, which I wouldn’t have discovered otherwise.

3 Likes

hi. thank you for the script. i really wanna use this but I don’t know how to navigate to the settings panel where you select anime from. and this settings icon is not clickable

image

Do you have Open Framework installed? It is necessary for displaying the settings dialog.

1 Like

that was the problem. thank you so much^^

1 Like

Is it possible to autoplay the audio of the first context sentence (or of a specifically chosen one, with a way to opt out) after the voice actor has finished speaking the vocabulary?
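In principle this could be done by listening for the `ended` event on the vocabulary audio and then starting the sentence audio. A minimal sketch follows; the function name and the `enabled` flag are hypothetical, and the stand-in objects exist only so it runs outside a browser.

```javascript
// Start `next` when `first` finishes. In a browser both would be
// HTMLAudioElement instances, which fire "ended" when playback completes.
function playAfter(first, next, enabled = true) {
  if (!enabled) return; // opt-out switch (hypothetical setting)
  first.addEventListener("ended", () => next.play(), { once: true });
}

// Stand-ins so the sketch runs outside a browser.
const vocabAudio = new EventTarget();
const played = [];
const sentenceAudio = { play: () => played.push("sentence") };

playAfter(vocabAudio, sentenceAudio);
vocabAudio.dispatchEvent(new Event("ended")); // simulate the vocab audio finishing
console.log(played);
```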

2 Likes

Yep i’d love that too, or even have the audio options to automatically open after answering

Hey, this is awesome! I have a few critical suggestions, ranked in order of importance:

  1. Sentences from the anime chosen in the filter should appear at the top, with the rest displayed below them. This way I avoid having no example sentences at all because I’ve filtered too much. Usually I just want the anime I like to be on top, but having something is better than nothing when my specific anime does not have that vocab word :slight_smile:

  2. An option for the anime example sentences to be at the top when opening the info on a card, i.e. when I get a card wrong and press F to see the details, the anime context is at the top.

  3. An option for the card info to open automatically, because I want to hear the recordings when I review.

  4. An option to “like” a specific recording so that it plays back during reviews, or at least sits above the other anime sentences when you open the additional info.

  5. An option to remove anime sentences that consist of one word, or where the word is in brackets. Subtitles often put certain words in brackets, and these words are not part of the sentence but describe the scene, i.e. words in brackets are not actually voiced.

  6. An option to mark anime sentences, then export the marked sentences as downloadable audio files so I can make Anki cards out of them :smiley:
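Suggestion 1 and the bracket part of suggestion 5 could look roughly like the sketch below. The `title` and `text` fields and both function names are assumptions for illustration, and the bracket check only catches sentences that are entirely one bracketed caption.

```javascript
// True when the whole sentence is a bracketed caption, e.g. "（笑い声）" or "(laughs)".
function isBracketedOnly(text) {
  return /^\s*[（(\[【].*[）)\]】]\s*$/.test(text);
}

// Drop bracket-only captions, then put preferred titles first
// while keeping the original order within each group.
function arrange(sentences, preferredTitles) {
  const kept = sentences.filter((s) => !isBracketedOnly(s.text));
  const top = kept.filter((s) => preferredTitles.has(s.title));
  const rest = kept.filter((s) => !preferredTitles.has(s.title));
  return [...top, ...rest];
}

// Hypothetical data.
const preferred = new Set(["Title A"]);
const arranged = arrange(
  [
    { title: "Title B", text: "今日は暑いね。" },
    { title: "Title A", text: "（笑い声）" },
    { title: "Title A", text: "本当に暑い。" },
  ],
  preferred
);
console.log(arranged.map((s) => `${s.title}: ${s.text}`));
```

Because non-preferred titles are moved down rather than removed, a word that never appears in the preferred anime still gets example sentences.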

6 Likes

Can I make a donation? I love this

wish there was a plugin that automatically plays a sentence after entering an answer

Seems there’s a new WaniKani update today which separates the lesson quiz info into tabs:


As you can see, Anime Sentences is in its own unnamed tab rather than the Context tab. Just a heads up as I would think it would make sense for it to join the Context tab.

4 Likes

After updating Item Info Injector last week, I was wondering how long it would take until someone mentioned the change in one of the dependent scripts. :smiley:

I have already created a pull request for fixing the problem with the unnamed tab (but this fix still keeps the Anime Sentences section as a separate tab). @psdcon, if you want to place the Anime Sentences section in the Context tab, the append() has to be changed into appendSubsection(). You now also have the option to change notify() [1] into notifyWhenVisible() [2] – both of them work, but have small differences (currently only during the lesson quiz).


  1. The callback function will be called once the user opens the item info. At this point, the Context tab is still closed and its content is not in the DOM, so when you try to append your section, Item Info Injector outputs a warning in the console that your section could not be injected. However, your section is cached and appended automatically once the user opens the Context tab, so everything should still work.

  2. The callback function will be called once the user expands the Context tab. In this case, appendSubsection() works without problem and you don’t get a warning. However, if the user collapses and then expands the Context tab once more, the callback function is also called again – so your section gets recomputed every time the tab is expanded.

    So in the case of Anime Sentences, the main difference between these two options is: with notify(), Anime Sentences gets loaded even if the user opens the item info without ever looking into the Context tab, while notifyWhenVisible() will delay the loading until the user actually opens the Context tab, but this also means that the user then has to wait until the data has been fetched from Immersion Kit before they can see the sentences.
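The difference in how often the callback runs can be shown with a toy model. The class below is a made-up stand-in that only mimics the timing described above; it is not Item Info Injector's real API, and the method signatures are assumptions.

```javascript
// Toy model only: this is NOT Item Info Injector's real API.
class FakeItemInfo {
  constructor() {
    this.onOpen = [];
    this.onExpand = [];
  }
  notify(cb) { this.onOpen.push(cb); }               // runs when the item info opens
  notifyWhenVisible(cb) { this.onExpand.push(cb); }  // runs on every tab expansion
  openItemInfo() { this.onOpen.forEach((cb) => cb()); }
  expandContextTab() { this.onExpand.forEach((cb) => cb()); }
}

let eagerLoads = 0;
let lazyLoads = 0;
const info = new FakeItemInfo();
info.notify(() => { eagerLoads += 1; });
info.notifyWhenVisible(() => { lazyLoads += 1; });

info.openItemInfo();     // user opens the item info
info.expandContextTab(); // user expands the Context tab
info.expandContextTab(); // user collapses it, then expands it again
console.log({ eagerLoads, lazyLoads }); // { eagerLoads: 1, lazyLoads: 2 }
```

If the recomputation on every expansion is a concern, caching the fetched sentences inside the callback would be one way around it.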

1 Like