Thanks for backing me up as not being completely inept at Japanese =D
I knew I had seen it in kana before, so I was confused (but accepting) when my original search for 「めんどう」 (allowing for hiragana or katakana) returned nothing.
I wonder why it’s always めんど when it’s in kana, at least in all the manga I have on hand.
Well, I’m sure that’s a topic for another thread!
Apparently it worked when I tried again this time. I had both Python and pip, but the command wasn’t running the first time around.
Well, now that I’ve finally run the command, a bunch of text spurts out, and it seems like it’s all been downloaded. I do get a warning that I’m using a somewhat outdated version of pip, though. Does that matter?
Aside from that, do I just go on to the manga volume step?
Well, I tried the manga step, but it says “mokuro” is not recognized as an internal or external command.
Does the answer here help? (Let me know if you’re not on Windows.)
For this, it won’t matter.
You should be good to proceed with what Mokuro’s README says.
In the command window, type:
mokuro "/home/chris/Books/Comics/Japanese/日々蝶々/日々蝶々 1"
Paths to process:
/home/chris/Books/Comics/Japanese/日々蝶々/日々蝶々 1
Each of the paths above will be treated as one volume.
Continue? [yes/no]
This looks good to me, so I type “yes” and press Enter and off it goes.
For some reason it doesn’t recognize my GPU, so it just runs it on the CPU. Takes maybe 15 to 30 mins depending on number of pages, but was a bit longer on my older computer.
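If you want to see whether the GPU is visible at all, here is a minimal sketch. It assumes Mokuro is running on PyTorch and that an NVIDIA/CUDA GPU is the one in question. The function name `gpu_status` is made up for illustration. One common cause of a `False` result is having a CPU-only build of PyTorch installed, though that is only a guess for any particular machine.

```python
def gpu_status():
    """Return True/False for CUDA availability, or None if PyTorch is missing."""
    try:
        import torch  # imported here so the script still runs without PyTorch
    except ImportError:
        return None
    return torch.cuda.is_available()


if __name__ == "__main__":
    status = gpu_status()
    if status is None:
        print("PyTorch is not installed in this Python environment.")
    elif status:
        print("PyTorch can see a CUDA GPU.")
    else:
        print("PyTorch is installed but cannot see a CUDA GPU; it will use the CPU.")
```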
Honestly, I’m struggling to figure out how to even apply the answer on that website. I am using Windows, yes. And I thought I wasn’t so bad with computers…
Would you perhaps be okay with going on a discord call or something while I screenshare? I think that might just save a lot of headache.
You may need to guide me on how to initiate contact on Discord, but here’s what I think is the needed information to reach me:
It’s likely that the environment variables needed to be refreshed, which happens when you restart cmd.
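For anyone else hitting the “not recognized” error, here is a rough way to check whether the current session can see the command. It only uses Python’s standard library. The helper names `find_command` and `scripts_directory` are made up for illustration, and the scripts directory it prints assumes a regular (not `--user`) pip install.

```python
import shutil
import sysconfig
from typing import Optional


def find_command(name: str) -> Optional[str]:
    """Return the full path of an executable on PATH, or None if it is not found."""
    return shutil.which(name)


def scripts_directory() -> str:
    """Return the directory where this Python normally installs command-line scripts."""
    return sysconfig.get_path("scripts")


if __name__ == "__main__":
    location = find_command("mokuro")
    if location:
        print(f"mokuro found at: {location}")
    else:
        print("mokuro is not on PATH for this session.")
        print(f"pip normally puts scripts in: {scripts_directory()}")
```

If the command is missing but the scripts directory is already listed in PATH, opening a fresh command window is usually the first thing to try.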
That was interesting science indeed, wow! Thanks for asking @Kazzeon !
@ChristopherFritz Thank you so much for sharing this, amazing projects.
I might even try to roll my own versions of Mokuro Bookshelf and Manga Text Search, it really inspired me!
I installed Mokuro and tested it on 3 pages of Ruri. Wow, that looks good!
I’m a bit puzzled, though: when I open the HTML file in Chrome, my Yomichan shortcut doesn’t work. It doesn’t do any lookup. Any idea why?
But otherwise very cool, so much searchable data now, so much potential as you say hehehe
Assuming a Chromium-based browser such as Google Chrome:
Click on this icon on the top-right area of the browser:
Then this at the bottom:
Find Yomichan and click this button:
Enable this option:
By default, extensions cannot access files loaded directly from your computer. This process gives an extension local file access.
(If anyone has this issue on Firefox, I can look into how to enable it on there.)
YES now it works, thank you so much
Now on to making a custom version of Manga Text Search! (I’ve never used Ruby before, so I might write it in something else.)
There are so many options and so much potential here.
One could opt to use SQLite rather than text files.
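As a minimal sketch of the SQLite idea, using Python’s built-in `sqlite3` module: the table layout, the function names, and the sample lines are all made up for illustration and are not how Manga Text Search stores anything.

```python
import sqlite3


def create_index(connection):
    """Create a table holding one row per line of OCR text."""
    connection.execute(
        """
        CREATE TABLE IF NOT EXISTS lines (
            id     INTEGER PRIMARY KEY,
            volume TEXT NOT NULL,
            page   INTEGER NOT NULL,
            text   TEXT NOT NULL
        )
        """
    )


def add_line(connection, volume, page, text):
    """Store one line of text along with where it came from."""
    connection.execute(
        "INSERT INTO lines (volume, page, text) VALUES (?, ?, ?)",
        (volume, page, text),
    )


def search(connection, term):
    """Return (volume, page, text) rows whose text contains the term."""
    cursor = connection.execute(
        "SELECT volume, page, text FROM lines "
        "WHERE instr(text, ?) > 0 ORDER BY volume, page",
        (term,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")  # use a file path to keep the index
    create_index(db)
    add_line(db, "Sample Volume 1", 12, "めんどうだな")
    add_line(db, "Sample Volume 1", 30, "めんどくさい")
    add_line(db, "Sample Volume 2", 5, "ありがとう")
    for row in search(db, "めんど"):
        print(row)
```

Using `instr` rather than `LIKE` avoids having to escape `%` and `_` in the search term.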
One could store the location of the textbox (from the JSON file) alongside the text, so when they view a result in the HTML file, it displays a div with a border around the textbox (making it easy to see where the text is on the image).
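A rough sketch of the textbox idea follows. It assumes each page’s JSON has a `blocks` list where every block carries a `box` of `[x_min, y_min, x_max, y_max]` pixel coordinates and a `lines` list of strings; check that against your own files before relying on it. The sample page and both function names are made up for illustration.

```python
import json

# Made-up sample in the assumed layout, not real Mokuro output.
SAMPLE_PAGE = """
{
  "img_width": 1000,
  "img_height": 1500,
  "blocks": [
    {"box": [100, 200, 180, 420], "lines": ["めんどう", "だな"]}
  ]
}
"""


def extract_text_boxes(page_json):
    """Return a list of (text, (x_min, y_min, x_max, y_max)) pairs for one page."""
    page = json.loads(page_json)
    results = []
    for block in page.get("blocks", []):
        text = "".join(block.get("lines", []))
        x_min, y_min, x_max, y_max = block["box"]
        results.append((text, (x_min, y_min, x_max, y_max)))
    return results


def highlight_div(box, colour="red"):
    """Build an absolutely positioned, bordered div covering the given box."""
    x_min, y_min, x_max, y_max = box
    style = (
        f"position:absolute;left:{x_min}px;top:{y_min}px;"
        f"width:{x_max - x_min}px;height:{y_max - y_min}px;"
        f"border:3px solid {colour};pointer-events:none;"
    )
    return f'<div style="{style}"></div>'


if __name__ == "__main__":
    for text, box in extract_text_boxes(SAMPLE_PAGE):
        print(text, highlight_div(box))
```

The div only lines up with the image if it sits inside a container positioned at the image’s top-left corner and the image is shown at its original pixel size; a scaled image would need the coordinates scaled to match.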
And so on.
I like those ideas!
From a learning perspective, maybe there would be a way to extract the grammar points if they match those on Bunpro, for example (though it wouldn’t work well without an exact match, so ておく wouldn’t match ておいて).
One option: Use Ichiran to help parse out the grammar.
But I’ve never gotten a setup working to use Ichiran locally.
I’ve considered whether I could take the output of a tool such as MeCab or (my preference) Juman++ and find patterns that match specific grammar points.
For example, consider the following sentence:
Feeding that into Juman++, the output starts with:
言って いって 言う 動詞 2 * 0 子音動詞ワ行 12 タ系連用テ形 14 "代表表記:言う/いう 補文ト"
おく おく おく 接尾辞 14 動詞性接尾辞 7 子音動詞カ行 2 基本形 2 "代表表記:おく/おく"
And using this sentence:
Juman++ opens with:
言って いって 言う 動詞 2 * 0 子音動詞ワ行 12 タ系連用テ形 14 "代表表記:言う/いう 補文ト"
おいて おいて おく 接尾辞 14 動詞性接尾辞 7 子音動詞カ行 2 タ系連用テ形 14 "代表表記:おく/おく"
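The pattern idea can be sketched in a few lines of Python. This assumes the space-separated layout seen in the output above (surface, reading, dictionary form, part of speech, then ID/name pairs, with the quoted notes last). The rule itself, “a verb in タ系連用テ形 followed by a 接尾辞 whose dictionary form is おく”, is just one guess at how a ～ておく matcher could look. All names here are made up, and a real parser would also need to handle `EOS` lines and the `@` lines Juman++ prints for alternative analyses.

```python
from typing import List, NamedTuple


class Morpheme(NamedTuple):
    surface: str    # text as it appears in the sentence
    reading: str
    lemma: str      # dictionary form
    pos: str        # part of speech
    conj_form: str  # conjugation form name


# The two outputs quoted above, one morpheme per line.
SAMPLE_DICTIONARY_FORM = [
    '言って いって 言う 動詞 2 * 0 子音動詞ワ行 12 タ系連用テ形 14 "代表表記:言う/いう 補文ト"',
    'おく おく おく 接尾辞 14 動詞性接尾辞 7 子音動詞カ行 2 基本形 2 "代表表記:おく/おく"',
]
SAMPLE_TE_FORM = [
    '言って いって 言う 動詞 2 * 0 子音動詞ワ行 12 タ系連用テ形 14 "代表表記:言う/いう 補文ト"',
    'おいて おいて おく 接尾辞 14 動詞性接尾辞 7 子音動詞カ行 2 タ系連用テ形 14 "代表表記:おく/おく"',
]


def parse_line(line: str) -> Morpheme:
    """Split one output line; the quoted notes stay together as the 12th field."""
    fields = line.split(" ", 11)
    return Morpheme(fields[0], fields[1], fields[2], fields[3], fields[9])


def find_te_oku(morphemes: List[Morpheme]) -> List[str]:
    """Return the surface text of every verb-in-て-form + おく pair."""
    matches = []
    for current, following in zip(morphemes, morphemes[1:]):
        if (
            current.pos == "動詞"
            and current.conj_form == "タ系連用テ形"
            and following.pos == "接尾辞"
            and following.lemma == "おく"
        ):
            matches.append(current.surface + following.surface)
    return matches


if __name__ == "__main__":
    for sample in (SAMPLE_DICTIONARY_FORM, SAMPLE_TE_FORM):
        print(find_te_oku([parse_line(line) for line in sample]))
```

Because the check uses the dictionary form rather than the surface text, both おく and おいて count as the same grammar point, which is what a plain text search misses.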
(My interest was in searching for text by grammar. I concluded it would be too much time and effort for something I’d build on my own to use on my own.)
Aw man, this is like when they open the curtain on the Wizard of Oz
Tracking lookups is a feature I expect to see in an upcoming release of Migaku (a subscription service).
I’m just rambling.
That might be the point where I switch from Japanese.io to Migaku. I’ve been enjoying Japanese.io a lot for the past few years, but unfortunately the features I have been waiting for the most are just not being implemented…
*eyes glaze over* I have a lot to learn. But that’s fine! I am considering a slow career change into programming, and the only way I can imagine myself finding the motivation to learn A LOT OF STUFF is if I need it for a Japanese-related project.
If it makes you feel any better, I don’t know anything about IndexedDB and Web SQL either (or I would have recommended just one).
That said, I do use Local Storage to track my “known words” for manga I’m reading, and it seems to be doing okay. I just have to be aware of and accepting of its limitations.