I’ve seen a lot of ads for the Migaku extension for video streaming platforms like Netflix.
The idea is brilliant. It takes the platform’s built-in Japanese subtitles and shows them alongside subtitles in your native language, letting you look up words by clicking on them and jump quickly between lines (among other features with Anki …). Please note that it will not work with streaming services that don’t have Japanese subtitles, like Crunchyroll.
The idea is good, but after using it for a few days, I saw a lot, really a lot, of misleading translations.
Here is an example:
ございません is the polite form of ありません, but it was broken into ござい and ません.
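To illustrate how a split like this can happen, here is a minimal sketch of a greedy longest-match tokenizer. This is only a toy: nobody in this thread knows what Migaku actually uses under the hood, and the function name, the lexicons, and the sample sentence are all made up for the example.

```python
def longest_match_tokenize(text, lexicon):
    """Greedy longest-match segmentation over a set of known surface forms."""
    tokens = []
    i = 0
    max_len = max(map(len, lexicon))
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in lexicon or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical lexicons (assumptions for this sketch, not Migaku's dictionary).
small_lexicon = {"ござい", "ません", "時間", "が"}
larger_lexicon = small_lexicon | {"ございません"}

split_result = longest_match_tokenize("時間がございません", small_lexicon)
merged_result = longest_match_tokenize("時間がございません", larger_lexicon)
```

With the smaller lexicon the sentence comes out as 時間 / が / ござい / ません, and the word is only kept whole once ございません itself is listed. Whether a tool shows the split or the whole form then depends on what its dictionary and lookup step do with the pieces.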
I tried it once a few weeks ago, loaded up one of the videos they suggested, and found an error within 2 minutes.
While I get it won’t be perfect all the time because Japanese is difficult, if the ones they themselves suggest have blatant errors, that is not something I want to bother with.
The ございません one is surprising given how basic it is but いる/はいる is a lot trickier to figure out programmatically (alongside all other situations where you can’t completely ascertain the pronunciation of a kanji based on furigana).
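As a rough sketch of why いる/はいる is harder: the written form 入る alone does not decide the reading, so a parser has to look at context. The tables and the function below are hypothetical and tiny, just to show the shape of the problem; a real tool would need far more than one rule.

```python
# Candidate readings per surface form (toy data; first entry is the default).
CANDIDATES = {"入る": ("はいる", "いる")}

# Hypothetical context rules: (preceding text, surface form) -> reading.
# 気に入る is read きにいる, so 入る after 気に should be いる.
CONTEXT_RULES = {("気に", "入る"): "いる"}


def pick_reading(preceding, surface):
    """Pick a reading for `surface` given the text right before it."""
    candidates = CANDIDATES.get(surface)
    if not candidates:
        return None  # unknown word: this sketch has nothing to offer
    for (context, word), reading in CONTEXT_RULES.items():
        if word == surface and preceding.endswith(context):
            return reading
    return candidates[0]  # no rule matched: fall back to the default


default_reading = pick_reading("部屋に", "入る")
idiom_reading = pick_reading("気に", "入る")
```

Here 部屋に入る falls back to はいる, while 気に入る picks up いる from the rule. Any context the table doesn’t cover silently gets the default, which is exactly where a fast parser ends up guessing.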
Beyond that, any machine translation of Japanese will break down in edge cases, so you always need to be careful about blindly trusting any of these tools. It’s only misleading if you let it mislead you.
The first two of those look like LLMs, which are a completely different and much more computationally expensive (and often slower) way to parse text than a traditional parser. Long term they might be the way to go for this kind of task but I’m not sure how economically or time-budget viable they are right now.
Migaku does seem to have a rather subpar parser right now (bad enough that I wouldn’t recommend using the tool based on the examples in this thread), but comparing it to LLM output isn’t an apples-to-apples comparison, I think.
You are right @pm215, LLMs are accurate but slow, so they aren’t a good fit for this situation.
It just looked like a promising replacement for the NflxMultiSubs browser extension, but the results aren’t good enough to justify paying that much for it (the price is really high).
I find Migaku’s parsing to be reasonably accurate most of the time.
There will always be parsing errors when you prioritize fast tokenizing that uses few computer resources. People generally won’t want to wait while their browser parses things.
It’s also worth noting that for Japanese, Migaku has an SRS-backed course that teaches grammar and vocabulary, which can help with learning common vocabulary that might have unfortunate parse errors in various cases.
It can be argued that Migaku should be more upfront about the limits of the technologies they use. It’d also be nice to know what they’re using under the hood, as some people might be willing to contribute to helping improve the tools.
But since the state of parsing Japanese isn’t perfect outside of tools that are expensive (on time, CPU, (V)RAM, and/or database/model storage), and as such certain parsing issues cannot be reasonably avoided, I wouldn’t consider it to be “misleading”.
Disclaimer 1: I have a lifetime Migaku membership.
Disclaimer 2: I have a website that provides manga frequency lists made using OCR, and I don’t yet have a disclaimer about the limits of OCR and its impact on the frequency lists. (But I plan to add such a disclaimer…eventually.)
Is your website paid or is it free? I think the ethical standards for disclosure are higher if you’re asking for donations than if it’s free, and MUCH higher if you’re requiring money to use it.
Also, do you have a frequency list for A Silent Voice / 聲の形?
I tried Migaku 2 months ago when I saw it was on sale, and a day later I bought the lifetime subscription. I think it’s a wonderful program and, more importantly, the devs are very active on Discord. What works best is to start a topic in the ‘problem’ section there and give these examples so they can look at them.
It’s certainly not perfect; sometimes it will break down a word incorrectly, like in the example you give. But in those cases I often find it very obvious, and when making a card for that word I simply correct it manually.
I use Migaku in combination with YouTube, Netflix, epubs and visual novels, and it has recognized very little vocab incorrectly.
So no, it’s not ‘misleading users’, it’s just not perfect.
I think it’s a bit easier for me to be forgiving because Migaku used to be free. As it’s a paid product, having a section on technology limitations would be nice.
Free. I may add donation and subscription options in the future, but if I do, all currently free functionality will always remain free.
As for 聲の形: I thought I had this one, but it seems I don’t. The first two volumes are currently free to read on Kobo, so I can grab them and extract frequency lists from there. I’ll ping you when I have them added to the site.
I think it used to be 400,- last year too, but occasionally they have a sale for 200,-. The last one ran from somewhere around Christmas until early January, and it’s the reason I bought it instead of a monthly subscription. I used Yomitan and a texthooker before Migaku, but despite those being free, I found Migaku so much easier to use that I knew I would use it a lot.
I subscribed to other systems too, like WK and Bunpro, and in hindsight it would have been cheaper for me if I had bought a lifetime subscription for those right at the start too.
I bought a lifetime for WK btw after I got to lvl 60. I still use it a lot to look up kanji and vocab when I’ve forgotten something.
I suspect part of what’s going into the pricing is their support of many different languages. But at the same time, supporting many languages should get them a larger number of subscribers.
Ah ok. Maybe it’s because their team expanded; I believe they now have a team of 15 people working on this and are still looking for more. As Christopher mentions above, you can get it for 200,- now. With those price increases that’s a no-brainer if you think you’ll use it enough.
Or simply go for the free Yomitan/Anki/Texthooker route, but that is more of a pain to set up and use.
I’m sad to see that the price is going up in early February. I was actually thinking about dropping Duolingo and starting either Bunpro or Migaku. At my level I think Bunpro would be a lot more helpful, so I’m hesitant to jump in on a lifetime subscription to Migaku.