Mokuro: Read Japanese manga with selectable text inside a browser


3 Likes

Thanks for backing me up as not being completely inept at Japanese =D

I knew I had seen it in kana before, so I was confused (but accepting) when my original search for 「めんどう」 (allowing for hiragana or katakana) returned nothing.

I wonder why it’s always めんど when it’s in kana, at least in all the manga I have on hand.

Well, I’m sure that’s a topic for another thread!

2 Likes

Apparently it worked when I tried it again this time. I had both Python and pip, but the command wasn’t running the first time around.

Well, now that I finally ran the command, a bunch of text spurts out, and it seems like it’s all been downloaded. I do get a warning that I’m using a slightly outdated version of pip, though. Does that matter?
Aside from that, do I just go on to the manga volume step?

Well, I tried the manga step, but it says “mokuro is not recognized as an internal or external command”.

Does the answer here help? (Let me know if you’re not on Windows.)


For this, it won’t matter.

You should be good to proceed with what Mokuro’s README says.

In the command window, type:

mokuro "/home/chris/Books/Comics/Japanese/日々蝶々/日々蝶々 1"

Paths to process:

/home/chris/Books/Comics/Japanese/日々蝶々/日々蝶々 1

Each of the paths above will be treated as one volume. Continue? [yes/no]

This looks good to me, so I type “yes” and press Enter and off it goes.

For some reason it doesn’t recognize my GPU, so it just runs on the CPU. It takes maybe 15 to 30 minutes depending on the number of pages, though it took a bit longer on my older computer.

1 Like

Honestly, I’m struggling to figure out how to even apply the answer on that website. I am using Windows, yes. And I thought I wasn’t so bad with computers…

Would you perhaps be okay with going on a discord call or something while I screenshare? I think that might just save a lot of headache.

1 Like

You may need to guide me on how to initiate contact on Discord, but here’s what I think is the needed information to reach me:

image

1 Like

It’s likely the environment variables needed to be reloaded, which happens when you close and reopen cmd.
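
A quick way to check whether this is the problem, as a minimal sketch: the Python snippet below uses the standard library's `shutil.which` to look for `mokuro` on the current `PATH`. The helper names `find_command` and `path_entries` are made up for illustration. If an old command window reports the command as missing but a freshly opened one finds it, a stale environment was probably the cause.

```python
import os
import shutil


def find_command(name, path=None):
    """Return the full path of `name` if it can be found on PATH, else None."""
    return shutil.which(name, path=path)


def path_entries(path=None):
    """Split a PATH-style string into its individual directories."""
    raw = os.environ.get("PATH", "") if path is None else path
    return [entry for entry in raw.split(os.pathsep) if entry]


if __name__ == "__main__":
    location = find_command("mokuro")
    if location:
        print(f"mokuro found at: {location}")
    else:
        # List what was searched so a missing Scripts directory is easy to spot.
        print("mokuro is not on PATH. Directories searched:")
        for entry in path_entries():
            print(f"  {entry}")
```

Run it with the same `python` that pip installed Mokuro for, otherwise the result may not tell you much.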

1 Like

That was interesting science indeed, wow! Thanks for asking @Kazzeon !

2 Likes

@ChristopherFritz Thank you so much for sharing this, amazing projects. :star2:

I might even try to roll my own versions of Mokuro Bookshelf and Manga Text Search, it really inspired me! :smiley:

3 Likes

I installed Mokuro and tested on 3 pages of Ruri, wow that looks good :smiley:

I’m a bit puzzled, though: when I open the HTML file in Chrome, my Yomichan shortcut doesn’t work. It doesn’t do any lookup. Any idea why?

But otherwise very cool, so much searchable data now, so much potential as you say hehehe

2 Likes

Assuming a Chromium-based browser such as Google Chrome:

Click on this icon (the puzzle-piece Extensions icon) on the top-right area of the browser:

image

Then this at the bottom (“Manage extensions”):

image

Find Yomichan and click this button (“Details”):

image

Enable this option (“Allow access to file URLs”):

image

By default, extensions cannot access files loaded directly from your computer. This process gives an extension local file access.

(If anyone has this issue on Firefox, I can look into how to enable it on there.)

8 Likes

YES now it works, thank you so much :heart_eyes:

Now on to making a custom version of Manga Text Search! (I’ve never used Ruby before, so I might use something else.)

There are so many options and so much potential here.

One could opt to use SQLite rather than text files.

One could store the location of the textbox (from the JSON file) alongside the text, so when they view a result in the HTML file, it displays a div with a border around the textbox (making it easy to see where the text is on the image).

And so on.
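
To make the first two ideas a bit more concrete, here is a minimal sketch in Python using the standard `sqlite3` module. It assumes each page's JSON file has a `blocks` list whose entries carry a `box` (`[xmin, ymin, xmax, ymax]`) and a `lines` list of strings; check that against your own files before relying on it. The table layout, the function names, and the sample page are all made up for illustration.

```python
import sqlite3


def create_index(conn):
    """Create the (hypothetical) table holding one row per textbox."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS textboxes (
            volume TEXT, page TEXT, text TEXT,
            xmin INTEGER, ymin INTEGER, xmax INTEGER, ymax INTEGER
        )
        """
    )


def add_page(conn, volume, page, page_data):
    """Store every textbox of one page, with its text and coordinates."""
    for block in page_data.get("blocks", []):
        xmin, ymin, xmax, ymax = block["box"]  # assumed order of coordinates
        text = "".join(block["lines"])  # join the lines of one textbox
        conn.execute(
            "INSERT INTO textboxes VALUES (?, ?, ?, ?, ?, ?, ?)",
            (volume, page, text, xmin, ymin, xmax, ymax),
        )


def search(conn, term):
    """Return every textbox whose text contains `term`, with its location."""
    cursor = conn.execute(
        "SELECT volume, page, text, xmin, ymin, xmax, ymax "
        "FROM textboxes WHERE instr(text, ?) > 0",
        (term,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    # Made-up page data in the assumed shape, not real output.
    sample_page = {
        "blocks": [
            {"box": [100, 50, 180, 300], "lines": ["言って", "おいてやろう"]},
            {"box": [400, 60, 450, 200], "lines": ["めんど"]},
        ]
    }
    conn = sqlite3.connect(":memory:")
    create_index(conn)
    add_page(conn, "volume 1", "001", sample_page)
    for row in search(conn, "めんど"):
        print(row)
```

The coordinates returned by `search` are what a result page would need to draw a bordered div over the matching textbox.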

2 Likes

I like those ideas!
From a learning perspective, maybe there would be a way to extract the grammar points if they match those on Bunpro for example :smiley: (though wouldn’t work great unless there’s an exact match, so ておく wouldn’t be matched with ておいて :frowning: )

1 Like

One option: Use Ichiran to help parse out the grammar.

But I’ve never gotten a setup working to use Ichiran locally.

I’ve considered whether I could take the output of a tool such as MeCab or (my preference) Juman++ and find patterns that match specific grammar points.

For example, consider the following sentence:

言っておくが私は紅茶にはうるさいぞっ

Feeding that into Juman++, the output starts with:

言って いって 言う 動詞 2 * 0 子音動詞ワ行 12 タ系連用テ形 14 "代表表記:言う/いう 補文ト"
おく おく おく 接尾辞 14 動詞性接尾辞 7 子音動詞カ行 2 基本形 2 "代表表記:おく/おく"

And using this sentence:

言っておいてやろう

Juman++ opens with:

言って いって 言う 動詞 2 * 0 子音動詞ワ行 12 タ系連用テ形 14 "代表表記:言う/いう 補文ト"
おいて おいて おく 接尾辞 14 動詞性接尾辞 7 子音動詞カ行 2 タ系連用テ形 14 "代表表記:おく/おく"
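
As a rough sketch of what that pattern matching could look like, the Python below parses the output lines quoted above and flags a verb in タ系連用テ形 followed by a suffix whose dictionary form is おく, so it catches both ておく and ておいて. The field positions are inferred from the sample lines only, and `Token`, `parse_output`, and `find_te_oku` are hypothetical names, so treat this as an illustration rather than a complete Juman++ parser.

```python
from typing import NamedTuple


class Token(NamedTuple):
    surface: str
    reading: str
    lemma: str
    pos: str
    subpos: str
    conj_type: str
    conj_form: str


# The two outputs quoted above, copied as-is.
SAMPLE_OKU = (
    '言って いって 言う 動詞 2 * 0 子音動詞ワ行 12 タ系連用テ形 14 "代表表記:言う/いう 補文ト"\n'
    'おく おく おく 接尾辞 14 動詞性接尾辞 7 子音動詞カ行 2 基本形 2 "代表表記:おく/おく"'
)
SAMPLE_OITE = (
    '言って いって 言う 動詞 2 * 0 子音動詞ワ行 12 タ系連用テ形 14 "代表表記:言う/いう 補文ト"\n'
    'おいて おいて おく 接尾辞 14 動詞性接尾辞 7 子音動詞カ行 2 タ系連用テ形 14 "代表表記:おく/おく"'
)


def parse_output(output):
    """Turn space-separated analysis lines into Token tuples."""
    tokens = []
    for line in output.splitlines():
        # Split at most 11 times so the quoted info field stays in one piece.
        fields = line.split(" ", 11)
        if len(fields) < 11:
            continue  # skip anything that is not a full token line
        tokens.append(
            Token(fields[0], fields[1], fields[2], fields[3],
                  fields[5], fields[7], fields[9])
        )
    return tokens


def find_te_oku(tokens):
    """Return the surface text of each て-form verb followed by おく."""
    matches = []
    for verb, suffix in zip(tokens, tokens[1:]):
        if (verb.pos == "動詞" and verb.conj_form == "タ系連用テ形"
                and suffix.pos == "接尾辞" and suffix.lemma == "おく"):
            matches.append(verb.surface + suffix.surface)
    return matches


if __name__ == "__main__":
    print(find_te_oku(parse_output(SAMPLE_OKU)))
    print(find_te_oku(parse_output(SAMPLE_OITE)))
```

Matching on the dictionary form rather than the surface text is what lets one rule cover every conjugation of おく.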

(My interest was in searching for text by grammar. I concluded it would be too much time and effort for something I’d build on my own to use on my own.)

3 Likes

Thank you so much for putting this together. I have been toying with the idea of dusting off my (minimal, lol) knowledge of JavaScript, and I think this could be an interesting way to kickstart things. I am looking to put something together that can track which words I look up and give me a tally, so I can make informed decisions about what vocab to put into Anki.

Aw man, this is like when they open the curtain on the Wizard of Oz :wink:

1 Like

That’s a feature I expect to see in Migaku (subscription service) in an upcoming release, tracking lookups.

For manga on one’s local computer, specifically (and ignoring lookups on websites), I could see adding something via JavaScript that watches for Yomichan/Migaku being invoked for a lookup and then stores/increments a counter for that word. It’s probably better to use IndexedDB or Web SQL, whichever is the newer technology, rather than Local Storage for this.

I’m just rambling.

That might be the point I switch from Japanese.io to Migaku. I’ve been enjoying it a lot for the past years, but unfortunately the features I have been waiting for the most are just not being implemented…

*eyes glaze over* I have a lot to learn. But that’s fine! I am considering a slow career change into programming and the only way I can imagine myself finding the motivation to learn A LOT OF STUFF is if I need it for a Japanese related project :grin:

1 Like

image

If it makes you feel any better, I don’t know anything about IndexedDB and Web SQL either (or I would have recommended just one).

I only “learned” JavaScript recently, as well as how to use Local Storage. I’ve found Local Storage to be super-simple to use, but it may not be as well-suited to something like tracking words with look-up counts.

That said, I do use Local Storage to track my “known words” for manga I’m reading, and it seems to be doing okay. I just have to be aware of and accepting of its limitations.

1 Like