Omg, thanks for the tips!!! That solves the only two things that I thought were so annoying. I thought there might be a better way to do it, but had no idea how to look for it.
This is my first Book Club. I’ve read a little bit of Naruto, and all of tadoku’s graded readers. Looking forward to this one, I really liked the little preview I’ve seen on Amazon.
No problem! If it makes you feel any better about not knowing how to search for it, even if you do, the top results wouldn’t give you the shorter ways. It does give you other keyboard shortcuts, but I discovered the shift+caps one on my own by mistake when using the ones I could find on Google, and then realized that, hey, wait. There’s a much faster way to do this. It’s so much more convenient this way for me. Though, if I used more special characters like other languages have, or didn’t have an American layout, I imagine it wouldn’t be quite as convenient.
I believe the Katakana thing was taught to me by @2OC3aOdKgwSGlxfz a while back when I was still swapping into Katakana directly by using Alt+Caps to get to Katakana, and Ctrl+Caps to force it back to Hiragana. It worked, but it was just extra keypresses, and meant that if I Shift+Caps’d back into Romaji without switching back to Hiragana first, it would default to Katakana when I changed back, and it was just a mild annoyance. (Still loads better than having to swap to the mouse and click in the bottom right to change input methods like I was doing when I first installed the IME).
But anyway, yeah, the ramble is to say that the internet wasn’t very helpful in the IME department. Trial and error and this wonderful community are the only reason I can use it with any sort of speed.
Welcome to the club! And to the community! If you ever have any questions, whether it’s about the way the forums work or language questions, always feel free to ask. Everyone here is friendly and welcoming!
You’ve no idea how much time this saves me, thank you so much! If you know of any other tips and tricks please let me know
Those are the ones I know off the top of my head, but there are tons of people smarter than me here, so if there are extra tips to be had, I’ll be learning them alongside you if they get posted.
That’s my case, but I can just switch keyboards with Windows + Space, so I don’t mind. It was just annoying to have to use the mouse for hiragana / romaji (I don’t switch keyboards if I’m using English only), so now that you’ve fixed that, it’s amazing.
Nice that it was also a tip you got / found out yourself, thanks for paying it forward! Agreed that this community is wonderful!
Hey all, first post, first book club, and first Manga ever! Really excited for this experience.
Hi guys! I’ve been doing WaniKani for a little over two years and I finally decided to take the plunge and participate in something fun like this! Just ordered the book and it looks great. Most of my reading experience just comes from little bits and pieces I can understand, so I’m excited to actually start consuming media.
Hello, my name is Lexi J and I intend to join this reading club. I already read the English version online, but I had too much trouble following the Japanese original, so I hope that I can improve enough here to continue reading in Japanese when the series continues. My hard copy is on its way and I hope it arrives in time.
I will try this in Japanese now, so please bear with me…
こんにちは、レクシージェーと申します、よろしくお願いします。この読書会入したいのですが。この漫画の英語版もう読みましたでも日本語版が難しすぎます。漫画の出版が続けると日本語に読むように自分に上手になったと希望します。私のハードコピーが道中だ間に合ったらいいのね
Answering over here since it’s kind of off-topic in the other thread.
Nice! And if you’re already reading よつばと! - L18 and L21 aren’t that far apart anyway.
And as long as you are using the language actively, don’t worry too much about forgetting burned things. Unless you’re taking a test, occasionally looking up things you “should” know isn’t the end of the world, and then it’ll be part of your memory again. You’ll (hopefully!) learn most of your words outside of SRS anyway!
I searched a bit and couldn’t find one. Not that surprising though, considering the manga came out just last year.
Maybe you could delete your latest post in the other thread and also make it over here, @ChristopherFritz? I’m sure it would be helpful for other people in this thread too!
I don’t know if anyone’s done this, but I do have an auto-generated vocabulary frequency list that can be used with an extension like Yomichan or Migaku to create Anki cards for the most common words in the volume.
Funny thing is, I was this close to posting it here instead, but didn’t, and then you did a reply here. Had me thinking, “Should I move mine over here?”
That’s really cool! I gotta look through the tools needed for that later.
And now I wonder if we should create a modified workflow to auto-generate vocab spreadsheets for book clubs. (Or is there one already?)
edit: Changed “could” to “should”… looking at the tools available, it seems like the hardest parts (i.e. OCR and splitting sentences into words) are already done.
Between that and this, it’s definitely possible.
The main issue is word order. When you OCR a manga page, the sentences won’t necessarily be in order, meaning some of the words will likely be out of order.
There are also issues with word parsing.
Vetting would be needed.
Ah, yeah, I didn’t consider word order on the page. Of course the OCR doesn’t have any idea about that, and any automatic approach would be naive at best. I guess a UI to select speech bubble order would be needed to make this process user-friendly and painless, hmmmm…
Mokuro output is an HTML/JavaScript powered reader, right? It might be doable to add a little bit of JavaScript that adds a clicked speech bubble (if that’s the unit) or line of text (if that’s the unit) to a text box. So, if you go through a page clicking on everything in order, you’d have all sentences in the right order, ready for further processing.
As a lazy person, that sounds annoying, but less annoying than building a vocab sheet by having to type words one by one.
How big are the word parsing issues?
It’s also a JSON file that details the exact positions of the texts and what their contents are. It’s probably easier to make something from the ground up using that than to try to make the original HTML+JS version work, since that has many moving parts.
My experience comes from using Mokuro on the first volume of Kaguya-sama, which is very kanji-heavy (probably overly so, and on purpose even). It struggles a lot with complex, blurry kanji, less so with simple kanji or kana. And even there I could only find a mistake every third page or so.
Also, it doesn’t parse words, it parses sentences, so you would need a method to extract words from that.
If going that route, one could modify Mokuro to track a timestamp each dialogue balloon was clicked on, then you could export that and sort by timestamp to get them in order. But I’d not want to be the one going through all that clicking… (Edit: Although I did populate a whole volume’s vocabulary sheet manually once upon a time.)
Another option to consider, since the data is also stored in JSON files, is to try working out some logic to guess the correct order based on dialogue positions, but that would likely fail in many cases.
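To make the position-based guess concrete, here is a rough sketch of one naive heuristic: split the page into horizontal bands, read the bands top to bottom, and read right to left within a band. The block shape (`box` as `[xmin, ymin, xmax, ymax]`, `lines`, and `img_height`) is an assumption about the JSON layout, so check it against an actual file first. The function name and the band count are made up for illustration.

```javascript
// Hypothetical block shape: { box: [xmin, ymin, xmax, ymax], lines: ["..."] }.
// Verify these field names against your own JSON before relying on them.

// Guess a reading order for one page: upper bands first, and within a band
// right-to-left (the usual direction for Japanese manga).
function guessReadingOrder(blocks, pageHeight, bands = 3) {
  const bandHeight = pageHeight / bands;
  const bandOf = (b) => Math.min(bands - 1, Math.floor(b.box[1] / bandHeight));
  return [...blocks].sort((a, b) => {
    const bandDiff = bandOf(a) - bandOf(b);
    if (bandDiff !== 0) return bandDiff;
    // Same band: the block whose right edge is further right comes first.
    return b.box[2] - a.box[2];
  });
}

// Made-up sample page, only to show the call.
const page = {
  img_height: 1200,
  blocks: [
    { box: [100, 50, 200, 300], lines: ["B"] },
    { box: [600, 80, 700, 320], lines: ["A"] },
    { box: [400, 900, 500, 1100], lines: ["C"] },
  ],
};

const ordered = guessReadingOrder(page.blocks, page.img_height)
  .map((b) => b.lines.join(""));
console.log(ordered); // [ 'A', 'B', 'C' ]
```

Fixed bands ignore the real panel borders, so tall panels, overlapping balloons, and unusual layouts will come out in the wrong order, which is the failure case mentioned above.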
Strings of hiragana can be troublesome. Compound words sometimes get split into two words. Expressions/idioms aren’t always recognized (although I don’t know how often those get manually entered into a vocabulary sheet as an expression).
That’s not even to mention when Mokuro misreads a kanji/character.
For this, I like Juman++ best, but it’s not without (the aforementioned) issues. MeCab and others are also available.
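For anyone who wants to see the sentence-to-words step without installing a parser, here is a minimal sketch using JavaScript’s built-in `Intl.Segmenter`. To be clear, this is a stand-in and not Juman++ or MeCab: it does no dictionary lookup, so conjugated words are counted as they appear, and it will show the same kinds of splitting problems described above. The function names are made up for illustration.

```javascript
// Split Japanese text into word-like segments with the built-in Intl.Segmenter.
// Punctuation and whitespace are dropped via the isWordLike flag.
function segmentWords(text) {
  const segmenter = new Intl.Segmenter("ja", { granularity: "word" });
  return [...segmenter.segment(text)]
    .filter((s) => s.isWordLike)
    .map((s) => s.segment);
}

// Count how often each segment occurs, most frequent first.
function wordFrequencies(sentences) {
  const counts = new Map();
  for (const sentence of sentences) {
    for (const word of segmentWords(sentence)) {
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

console.log(wordFrequencies(["猫が好きです。", "猫は魚が好きです。"]));
```

The exact splits depend on the runtime’s segmentation data, so treat the output as a starting point for vetting rather than a finished vocabulary list.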
Most people following this thread can completely ignore our technical conversations =D
Ah, I don’t mean working with the existing code. I mean injecting a little bit of code into the HTML page. Something simple like “When a div is clicked, export the text of the div.”, or whatever Mokuro actually uses exactly. (And hopefully there’s a nice way to get the page too from that.)
If that can work, it would be rather easy to achieve. Definitely much less work than creating anything from the ground up, I think?
That’s ok, a hybrid approach would probably be best. First you run the output through the algorithm, which would most likely work 95% of the time. Then you go through the book, or just the chapter of the week, looking for inconsistent positioning. If you find one, you correct it. Much less time in general, and you don’t have to do a lot of clicking.
You can have JavaScript append it to an entry in local storage (no need to track timestamps as I considered earlier), then have a button you click to export that local storage to a file. Simple enough!
Edit: And you don’t even need to modify Mokuro’s source code. Just add some JavaScript to the HTML file it generates. Or, generate the HTML file with Mokuro’s option to put the JavaScript in a separate file, so it can easily be reused for multiple HTML files (manga volumes).
Edit 2: Implementation would be: loop through all divs with class “textBox”, adding an onclick handler that calls a function to append to local storage, then add a button to export local storage to a file. Maybe also have something that gives all the divs a visible background color (like yellow) that goes away on click to ensure you get them all. Gamify!
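A minimal sketch of that idea, untested against a real Mokuro page. The “textBox” class is the one mentioned above; the storage key, function names, and export file name are made up for illustration. The storage helpers take the storage object as a parameter so they can be tried outside a browser, and the DOM wiring only runs when a page is present.

```javascript
const STORAGE_KEY = "collectedText"; // hypothetical key name

// Append one clicked text to the stored list. `storage` only needs
// getItem/setItem, so window.localStorage works, and so does a stub.
function appendText(storage, text) {
  const collected = JSON.parse(storage.getItem(STORAGE_KEY) || "[]");
  collected.push(text);
  storage.setItem(STORAGE_KEY, JSON.stringify(collected));
  return collected;
}

// One entry per line, in click order, ready to save as a text file.
function exportText(storage) {
  return JSON.parse(storage.getItem(STORAGE_KEY) || "[]").join("\n");
}

// Browser-only wiring; skipped when there is no DOM.
if (typeof document !== "undefined") {
  document.querySelectorAll(".textBox").forEach((box) => {
    box.style.backgroundColor = "yellow"; // marks boxes not yet clicked
    box.addEventListener("click", () => {
      appendText(window.localStorage, box.textContent.trim());
      box.style.backgroundColor = ""; // highlight goes away once collected
    });
  });

  // Export button: downloads everything collected so far as a text file.
  const button = document.createElement("button");
  button.textContent = "Export";
  button.addEventListener("click", () => {
    const blob = new Blob([exportText(window.localStorage)], { type: "text/plain" });
    const link = document.createElement("a");
    link.href = URL.createObjectURL(blob);
    link.download = "sentences.txt"; // hypothetical file name
    link.click();
  });
  document.body.appendChild(button);
}
```

This would go in a script tag at the end of the generated HTML file. Clicking a box twice records it twice, and nothing clears the stored list between volumes, so a real version would want a reset button as well.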