Mokuro: Read Japanese manga with selectable text inside a browser

Kazzeon · March 14, 2023, 9:13pm

ChristopherFritz · March 14, 2023, 9:15pm

Thanks for backing me up as not being completely inept at Japanese =D

I knew I had seen it in kana before, so I was confused (but accepting) when my original search for 「めんどう」 (allowing for hiragana or katakana) returned nothing.

I wonder why it’s always めんど when it’s in kana, at least in all the manga I have on hand.

Well, I’m sure that’s a topic for another thread!

HaseebYousfani · March 14, 2023, 9:29pm

It seems that when I tried it again, it worked. I had both Python and pip, but the command wasn’t running the first time around.

Now that I’ve finally run the command, a bunch of text spurts out, and it seems like everything has been downloaded. I do get a warning that I’m using a somewhat outdated version of pip, though. Does that matter?
Aside from that, do I just go on to the manga volume step?

HaseebYousfani · March 14, 2023, 10:04pm

Well, I tried the manga step, but it says “mokuro is not recognized as an internal or external command”.

ChristopherFritz · March 14, 2023, 10:05pm

Does the answer here help? (Let me know if you’re not on Windows.)

For this, it won’t matter.

You should be good to proceed with what Mokuro’s README says.

In the command window, type:

mokuro "/home/chris/Books/Comics/Japanese/日々蝶々/日々蝶々 1"

Paths to process:

/home/chris/Books/Comics/Japanese/日々蝶々/日々蝶々 1

Each of the paths above will be treated as one volume. Continue? [yes/no]

This looks good to me, so I type “yes” and press Enter and off it goes.

For some reason it doesn’t recognize my GPU, so it just runs on the CPU. It takes maybe 15 to 30 minutes depending on the number of pages, though it took a bit longer on my older computer.

HaseebYousfani · March 14, 2023, 10:19pm

Honestly, I’m struggling to figure out how to even apply the answer on that website. Yes, I am using Windows. And I thought I wasn’t so bad with computers…

Would you perhaps be okay with going on a discord call or something while I screenshare? I think that might just save a lot of headache.

ChristopherFritz · March 14, 2023, 10:21pm

You may need to guide me on how to initiate contact on Discord, but here’s what I think is the needed information to reach me:

Gorbit99 · March 15, 2023, 12:32am

Most likely the environment variables needed to be reloaded, which happens when you restart cmd.
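
For anyone hitting the same “not recognized” error, here is a minimal sketch of checking whether the current window can see the `mokuro` command. It uses only the Python standard library. The helper name `find_command` is made up for illustration, and the scripts folder it prints is where pip normally puts launchers for a standard install (a per-user install may use a different folder).

```python
import shutil
import sysconfig


def find_command(name):
    """Return the full path of a command on PATH, or None if it cannot be found."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_command("mokuro")
    if location:
        print(f"mokuro found at: {location}")
    else:
        # pip normally places command-line launchers in this folder.
        # If the folder is not on PATH, or this window was opened before
        # the install, the command will not be recognized.
        print("mokuro is not on PATH in this window.")
        print(f"Scripts folder for this Python: {sysconfig.get_path('scripts')}")
```

If the command is found in a freshly opened window but not in an older one, closing and reopening cmd is usually all that was needed.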

Akashelia · March 15, 2023, 3:36am

That was interesting science indeed, wow! Thanks for asking @Kazzeon !

rafascar · March 15, 2023, 4:34pm

@ChristopherFritz Thank you so much for sharing this, amazing projects.

I might even try to roll my own versions of Mokuro Bookshelf and Manga Text Search, it really inspired me!

Akashelia · March 15, 2023, 6:02pm

I installed Mokuro and tested on 3 pages of Ruri, wow that looks good

I’m a bit puzzled, though: when I open the HTML file in Chrome, my Yomichan shortcut doesn’t work and no lookup happens. Any idea why?

But otherwise very cool, so much searchable data now, so much potential as you say hehehe

ChristopherFritz · March 15, 2023, 6:05pm

Assuming a Chromium-based browser such as Google Chrome:

Click the Extensions icon (the puzzle piece) in the top-right area of the browser.

Then click “Manage extensions” at the bottom of the menu.

Find Yomichan and click its “Details” button.

Enable the “Allow access to file URLs” option.

By default, extensions cannot access files loaded directly from your computer. This process gives an extension local file access.

(If anyone has this issue on Firefox, I can look into how to enable it on there.)

Akashelia · March 15, 2023, 6:08pm

YES now it works, thank you so much

Now on to making a custom version of Manga Text Search! (I’ve never used Ruby before, so I might write it in something else.)

ChristopherFritz · March 15, 2023, 6:11pm

There are so many options and so much potential here.

One could opt to use SQLite rather than text files.

One could store the location of the textbox (from the JSON file) alongside the text, so when they view a result in the HTML file, it displays a div with a border around the textbox (making it easy to see where the text is on the image).

And so on.
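
To make the SQLite and textbox ideas concrete, here is a rough sketch rather than anything Manga Text Search actually does. It assumes each page’s JSON has a `blocks` list whose entries carry a `box` of `[x1, y1, x2, y2]` and a `lines` list of strings; check your own files before relying on that. The table name, function names, volume name, and sample data are all made up for illustration.

```python
import sqlite3


def index_page(conn, volume, page, page_json):
    """Store each text block of one page together with its bounding box."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS textbox (
               volume TEXT, page INTEGER,
               x1 INTEGER, y1 INTEGER, x2 INTEGER, y2 INTEGER,
               text TEXT)"""
    )
    for block in page_json["blocks"]:
        x1, y1, x2, y2 = block["box"]   # assumed layout: [x1, y1, x2, y2]
        text = "".join(block["lines"])  # join the lines of one speech bubble
        conn.execute(
            "INSERT INTO textbox VALUES (?, ?, ?, ?, ?, ?, ?)",
            (volume, page, x1, y1, x2, y2, text),
        )
    conn.commit()


def search(conn, term):
    """Return (volume, page, box, text) for every block whose text contains term."""
    rows = conn.execute(
        "SELECT volume, page, x1, y1, x2, y2, text FROM textbox "
        "WHERE instr(text, ?) > 0 ORDER BY volume, page",
        (term,),
    )
    return [(v, p, (x1, y1, x2, y2), t) for v, p, x1, y1, x2, y2, t in rows]


# Made-up sample standing in for one page's JSON file.
sample = {"blocks": [
    {"box": [820, 110, 930, 400], "lines": ["言っておいて", "やろう"]},
    {"box": [120, 600, 210, 780], "lines": ["めんどくさい"]},
]}

conn = sqlite3.connect(":memory:")
index_page(conn, "Sample Volume 1", 12, sample)
for hit in search(conn, "めんど"):
    print(hit)
```

Each result carries the box coordinates, so a viewer page could use them to position a bordered div over the matching textbox.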

Akashelia · March 15, 2023, 6:18pm

I like those ideas!
From a learning perspective, maybe there would be a way to extract the grammar points if they match those on Bunpro, for example (though it wouldn’t work well without an exact match, so ておく wouldn’t be matched with ておいて).

ChristopherFritz · March 15, 2023, 6:26pm

One option: Use Ichiran to help parse out the grammar.

But I’ve never gotten a setup working to use Ichiran locally.

I’ve considered whether I could take the output of a tool such as Mecab or (my preference) Juman++ and find patterns that match specific grammar points.

For example, consider the following sentence:

言っておくが私はこちゃ紅茶にはうるさいぞっ

Feeding that into Juman++, the output starts with:

言って いって 言う 動詞 2 * 0 子音動詞ワ行 12 タ系連用テ形 14 "代表表記:言う/いう 補文ト"
おく おく おく 接尾辞 14 動詞性接尾辞 7 子音動詞カ行 2 基本形 2 "代表表記:おく/おく"

And using this sentence:

言っておいてやろう

Juman++ opens with:

言って いって 言う 動詞 2 * 0 子音動詞ワ行 12 タ系連用テ形 14 "代表表記:言う/いう 補文ト"
おいて おいて おく 接尾辞 14 動詞性接尾辞 7 子音動詞カ行 2 タ系連用テ形 14 "代表表記:おく/おく"

(My interest was in searching for text by grammar. I concluded it would be too much time and effort for something I’d build on my own to use on my own.)
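
A minimal sketch of that pattern idea, using only the four output lines quoted above. It assumes the space-separated field order seen in those lines (surface form, reading, dictionary form, part of speech, and the conjugation form in the tenth field). Real Juman++ output may contain other line types that this does not handle, and the names `Morpheme`, `parse_line`, and `has_te_oku` are made up for illustration.

```python
from typing import NamedTuple


class Morpheme(NamedTuple):
    surface: str    # text as written, e.g. おいて
    lemma: str      # dictionary form, e.g. おく
    pos: str        # part of speech, e.g. 接尾辞
    conj_form: str  # conjugation form, e.g. タ系連用テ形


def parse_line(line):
    """Split one output line into the fields this sketch cares about."""
    fields = line.split(" ", 11)  # the quoted trailing field stays in one piece
    return Morpheme(fields[0], fields[2], fields[3], fields[9])


def has_te_oku(lines):
    """True if a verb in て-form is directly followed by the suffix おく, in any conjugation."""
    morphemes = [parse_line(line) for line in lines]
    for prev, cur in zip(morphemes, morphemes[1:]):
        if (prev.pos == "動詞" and prev.conj_form == "タ系連用テ形"
                and cur.pos == "接尾辞" and cur.lemma == "おく"):
            return True
    return False


oku = [
    '言って いって 言う 動詞 2 * 0 子音動詞ワ行 12 タ系連用テ形 14 "代表表記:言う/いう 補文ト"',
    'おく おく おく 接尾辞 14 動詞性接尾辞 7 子音動詞カ行 2 基本形 2 "代表表記:おく/おく"',
]
oite = [
    '言って いって 言う 動詞 2 * 0 子音動詞ワ行 12 タ系連用テ形 14 "代表表記:言う/いう 補文ト"',
    'おいて おいて おく 接尾辞 14 動詞性接尾辞 7 子音動詞カ行 2 タ系連用テ形 14 "代表表記:おく/おく"',
]

print(has_te_oku(oku), has_te_oku(oite))
```

Because the check looks at the dictionary form rather than the surface text, ておく and ておいて match the same pattern, which is what a plain text search cannot do.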

Phryne · March 15, 2023, 6:34pm

Thank you so much for putting this together. I have been toying with the idea of dusting off my (minimal, lol) knowledge of JavaScript, and I think this could be an interesting way to kickstart things. I am looking to put something together that can track what words I look up and give me a tally, so I can make informed decisions about what vocab to put into Anki.

Aw man, this is like when they open the curtain on the Wizard of Oz

ChristopherFritz · March 15, 2023, 6:40pm

That’s a feature I expect to see in Migaku (subscription service) in an upcoming release, tracking lookups.

For manga on one’s local computer, specifically (and ignoring lookups on websites), I could see adding something via JavaScript that watches for Yomichan/Migaku being invoked for a lookup and then stores or increments a counter for that word. It’s probably better to use IndexedDB or Web SQL, whichever is the newer technology, rather than Local Storage for this.

I’m just rambling.

Phryne · March 15, 2023, 6:44pm

That might be the point I switch from Japanese.io to Migaku. I’ve been enjoying it a lot for the past years, but unfortunately the features I have been waiting for the most are just not being implemented…

*eyes glaze over* I have a lot to learn. But that’s fine! I am considering a slow career change into programming and the only way I can imagine myself finding the motivation to learn A LOT OF STUFF is if I need it for a Japanese related project

ChristopherFritz · March 15, 2023, 6:49pm

If it makes you feel any better, I don’t know anything about IndexedDB and Web SQL either (or I would have recommended just one).

I only “learned” JavaScript recently, as well as how to use Local Storage. I’ve found Local Storage to be super-simple to use, but it may not be as well-suited to something like tracking words with look-up counts.

That said, I do use Local Storage to track my “known words” for manga I’m reading, and it seems to be doing okay. I just have to be aware of and accepting of its limitations.
