[Guide] Studying with Native Material: How to Dissect, Learn and Mine!

Important message before we begin:
A similar thread was closed by a moderator recently, as people began to suggest piracy and link to illegal (or at least, less than legal) websites. PLEASE DO NOT DO THIS! This is against the Wanikani forum policy and can get well-intentioned threads shut down. I will only be linking to legal sources here. If you have a question that you know will likely result in a piracy recommendation (e.g. “where can I watch anime for free with Japanese subtitles?”), please refrain from asking it. Likewise, please do not discuss/recommend illegal content. Thank you for following the rules. Now let’s get on with the show!

So, you’ve been studying your grammar and burning through those Wanikani reviews. You feel like you’re learning, but let’s be honest here, you really want to dive into native material already! Be it a manga or light novel, you want to get to the good stuff. But alas, you feel like you must continue slaving away at your textbooks or Bunpro lessons before you can finally consume the material you started learning Japanese for.

What if I told you that you can use that native material to actually study? What if I told you that with this guide, you could learn to dissect native material to learn new grammar and vocabulary?

“I already know that! I’ve been doing it already!”

But have you been doing it effectively? Have you been properly adding every word you come across to your SRS? Have you been “sentence mining”, and truly digging through the native material for the gold that lies within? Have you been putting it off because it sounds like a slog? Well, look no further! I’ll show you how to make the most out of native material, properly!

[GIF]
(You: “Tell me more!”)

This guide is going to get lengthy, so I’ll be separating it into several parts that you can view anytime with the table of contents. Remember to read the “introduction” and “tools” sections before getting started.

Table of Contents

Part I: Introduction

Who is this guide for?

This guide is for upper beginners who have some basic understanding of the language, all the way to more advanced learners who don’t know how to mine native material properly. If you feel uncomfortable touching anything Japanese outside of a textbook without having a dictionary and Google Translate equipped like it’s a sword and shield, then this guide might be great for you. If you are the kind of person who prefers to skip past the things you don’t understand when reading, and hates to interrupt your reading flow, then this guide may not be for you. This guide is meant for people who want to fully understand most of the content they happen upon. If you’re good with only understanding around 50% of content while you read, then that’s okay! This likely isn’t your guide.

This guide is optimized for a PC. I will briefly touch upon how you can replicate some of the steps with your phone, but this guide is made with a PC in mind. Some steps may require you to use your phone though, so make sure you have a functioning handheld device with at least enough storage to download Google Translate and a dictionary app. You will be downloading Chrome extensions, so make sure you have Chrome installed. All the tools used in this guide are free, apart from Netflix, but you can replace that with anything that has Japanese audio.

I’ll mainly be focusing on reading material, as listening comprehension is generally a whole different ballgame, but I will have a section dedicated to listening/watching as well. If you couldn’t tell already, this guide is also written with a light-hearted tone. Just to keep the mood up! I’m not a teacher; I’m a learner just like you, and this guide is what helps me. I’ve never written a guide before, so please do not hesitate to give feedback! The formatting of this guide is inspired by @jprspereira’s Ultimate Wanikani Guide, so plenty of shout-outs go to them!

What is this a guide for?

This guide aims to teach you an effective way to make the most out of the native material (material in Japanese made for a Japanese-speaking audience) you wish to read/watch. Using different tools, you’ll learn how to sentence mine quickly and efficiently from different mediums, and most importantly, how to make the content you learn really stick. We live in a world where other Japanese-language fans have created fantastic, free-to-use tools to help you learn, and this guide will show you how you can integrate them into your studies. This guide won’t teach you how to swim in the endless sea of Japanese content, but it will hopefully teach you how to float.

This guide will focus on teaching this method of sentence dissection I’ve come up with.

Part II: Tools

These are the tools you’ll need for the guide.

PC Downloads

Google Chrome

An E-book reader of your choosing (not needed now, but you might need one down the line)

Apps

Jisho Japanese Dictionary (Dictionary with drawing capabilities) App Store and Google Play
Note, this app can get finicky. If you copy something in Japanese to your clipboard on your phone, the app might open and auto-translate it. This is a built-in feature that is not great. We’re using it solely because of its powerful ability to recognize our poorly-drawn kanji.

Google Translate (only for OCR capabilities)

Chrome Extensions

Yomichan (dictionary)

Japanese.io (Optional for guide, dictionary + adds furigana to kanji)

Copyfish (OCR, free version)

Language Learning with Netflix (awesome Chrome extension for Netflix)

Websites

Main Tools

Bunpro (Grammar look-up)

KameSame (Custom SRS + sentence separation)

Ichi.moe (Sentence parser)

DeepL Translator (translator, significantly better than Google Translate)

Koohi Café (SRS for select anime, manga, novels and VNs) (Note: This site is unfinished at the moment and is replacing the original Flo*Flo site. Flo*Flo still worked when this was written.)

Sources for Native Material

Comico (Many free manga samples, legal)

Ganma.jp (Optional for guide; free manga, legal, though the images are bad quality and don’t work with our OCR most of the time)

NHK Web News (News site) or NHK Web Easy (easy version)

Bookwalker or Amazon (Optional; easy source for e-books)

Aoitori Bunko (Elementary/Middle School level books with tons of free pages)

Netflix (Japanese series with Japanese subtitles) Note: You may need a VPN for JP subtitles

YouTube (JP videos, occasionally there is legally uploaded free anime)

Part 1: Manga

Downloaded your Chrome extensions, downloaded Jisho Japanese Dictionary, and signed up for KameSame? Great! Let’s get studying! Make sure you’ve thoroughly read how Copyfish and Yomichan work first.

I’ll be starting the guide off with some manga. Although it’s not the easiest to mine, the sentences are shorter and there are pictures, making it an easier choice for lower-level learners. For this guide, I’ll be first demonstrating how we can study with the web comics found on Comico, but I’ll show you how you can mine physical manga at the end of this part. However, do note that online manga/manga in e-book form are easier to deal with!

First things first, head on over to Comico and find something you think looks interesting.


I’ll be using this random manga I found, うどんの国の金色毛鞠. (This looks so cute! Let’s hope it’s nothing creepy…) Depending on the manga, you either hit 試し読み to read a free trial chapter, or you click on one of the chapters marked as 無料. For the free trials, you read it as a regular manga. For the manga with 無料 chapters, you scroll down the page like a Webtoon. Since I’ve chosen a manga with a free trial chapter, I’ll be clicking 試し読み.

Once we’re in, we can start dissecting. Already, we see some text bubbles. I’m going to ignore the table of contents. Starting from the first panel, click the Copyfish icon on your browser. Remember to set the input language as Japanese in the extension’s settings, and to enable access to file URLs. You should get this screen if everything’s gone right.

Next make a box over the speech bubble by holding down the mouse button and dragging. It seems counterintuitive, but make sure the box has some room on the sides. If the text takes up too much of the selected area, the OCR might not recognize it. You might have to do some trial and error. You should get this if you’ve succeeded.

If everything worked, the text can be copied to the clipboard now. That’s one step down. Let’s refer back to the flowchart. Say you don’t understand at all how this sentence works. Copy the text to the clipboard, and head to Ichi.moe.

Now we have this. Hopefully you can understand how the sentence is parsed by seeing the different parts. Now, let’s say this entire sentence is composed of words you’ve never seen before. Next, let’s go to KameSame. After clicking “Lessons”, paste the sentence in the box that says “Study words found in content”, then “scan for vocabulary”.

This might take a bit. When it’s finished, rejoice! Tick all the words you don’t understand, then hit “study”. Then, go through the lessons like you would Wanikani! That’s the basics of sentence mining!

If you still don’t understand the sentence, you can paste the whole thing in DeepL Translator. Using translators like this should always be your last resort. Once you’ve translated the sentence, try dissecting it again to understand the grammar.


If all else fails, ask about the sentence here on the Short Grammar Questions thread!

This formula is how we sentence mine manga. However, Copyfish is finicky. If it just can’t pick up the sentence, you can type it manually into wherever you need it. This is why we have the Jisho Japanese Dictionary on our phone; we can use it to draw the kanji we don’t recognize. Its ability to recognize hand-drawn kanji is far more powerful than the one on the Jisho website, so it’s easier to look up words when we don’t know the stroke order or the radicals.


Now, if you have the manga downloaded on your PC, then you need the desktop version of Copyfish. You’ll find it in the Tools section of the guide. After that, you use it just like we’ve practiced. But what if you only have physical manga? No worries, we’ll just use our phone for that! All you need to do is download the Google Translate app. Using the Camera feature, just hover over the speech bubble you want to copy, like this.

Then hit Scan (make sure the first language is set to Japanese, the second one doesn’t matter as we will NOT be using Google Translate to do any actual translating), and finally take the photo.

In my case, there were a few exclamation points missing. Remember to add those in afterwards before you copy this to DeepL (if you plan on using it). Highlight one speech bubble (and in this case, one part of the bubble) at a time. Now, just copy this sentence to wherever you need it, just like last time.
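If you end up pasting a lot of OCR results, a tiny cleanup helper can save some fiddling. The sketch below is not part of any of the tools above; it assumes the OCR output arrives with stray line breaks and spaces (common when a speech bubble is read line by line), and the function name `clean_ocr_text` is made up for illustration. It cannot restore characters the OCR missed, such as those exclamation points, so you still need to check the result against the page.

```python
def clean_ocr_text(raw: str) -> str:
    """Join OCR'd lines into a single sentence with no stray whitespace.

    Assumption: the text is Japanese only, so removing every whitespace
    character (spaces, tabs, newlines, full-width spaces) is harmless.
    Text that mixes in English words would lose its word spacing.
    """
    # str.split() with no arguments splits on any Unicode whitespace,
    # including the full-width space U+3000, and drops empty pieces.
    return "".join(raw.split())


if __name__ == "__main__":
    bubble = "今日は\n いい天気　ですね！\n"
    print(clean_ocr_text(bubble))
```

The cleaned sentence can then be pasted into Ichi.moe, KameSame or DeepL like any other.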

Voila! We’re done! Hopefully you can use this method to mine manga to your heart’s content!

Part 2: The news

Now, say you’re not a manga person. Maybe you’re just in the mood for some good ol’ fashioned news stories. This one is pretty straightforward. Just head on over to a Japanese news site of your choice (I’m using NHK), and just follow the flowchart at the beginning. We don’t need to use any OCR here, but we’ll probably need to use Yomichan if we’re not going to end up importing the whole sentence into Ichi.moe/KameSame.


If you’ve done your homework on Yomichan, you’ll know that you can download multiple dictionaries, and even sound files. That’s what makes Yomichan quite a beast! You can choose what dictionaries you want imported once you’ve downloaded them from the link on the Yomichan options page. It should look like this when you highlight a Japanese word with the shift key now!

Part 3: Novels

Okay, so you’re ready to upgrade from the easy stuff. You want to get into novels now. Reading novels is a lot more difficult, but luckily, our flowchart makes learning and mining them a lot easier. To demonstrate how we’re going to do this, we’ll use a random trial page from Aoitori Bunko.


Once you’ve entered the site, hit これまでにでた本. Press the drop down menu with the date on it to go to a previous month. Do this until you’ve found a book that has a ためし読み button. We’ve learned this word from Comico already, so click away!

I’ve chosen this book, 炎炎ノ消防隊 悪魔的ヒーロー登場. Wait, isn’t that Fire Force, a Shonen manga for older teenagers? Why did they make a light novel version for elementary school students?! Uh, anyway, let’s get mining… I’m just going to skip past the character introductions (the images are low-quality with teeny-tiny text, so our OCR can’t pick up on it anyway. Heck, I can barely pick up on it!) This is the first proper page.

Open up Copyfish, and make a box around the first sentence. The OCR picked it up on the first try for me. How’d it go for you? Just like last time, we can take this sentence and throw it into wherever we need it…

I didn’t recognize this 死因, so I put the sentence into KameSame.

And, oh. That’s, uh, morbid… This is in the elementary school section?!

Part 4: Anime and Drama

With manga, novels and the news, we can practice our reading ability. However, we won’t improve our listening comprehension at all this way. This is where Netflix comes into play. Don’t have Netflix? It may be worth the investment once you see how this extension works. I don’t have Netflix myself at the moment, so I’ll just be linking to this article instead. The sentence mining will work the same way we’ve already learned, so no need to go through all that again! Phew!
Now, there’s another way that you can practice your listening comprehension if you don’t have access to Netflix. (Or even if you do, you can use this method in tandem with the Netflix extension.) This is when we use Koohi Café or Flo*Flo. First things first, head over to their library and find an anime that you can watch legally afterwards.

Note, this site is not finished, so these screenshots might be outdated in a few weeks.


Here we are in the library on Koohi after creating an account. Clicking on a picture that has a little piece of film underneath it means that it’s an anime vocabulary list. (Again, vocabulary list, you can’t watch anything on here.) I’m using Sangatsu no Lion as an example.

Once you’ve clicked the picture and hit “go to vocab list” at the bottom, click “generate list”. You’ll now find a whole list of words that you can add to the site’s SRS!

If you already know the word, hit the trash can to trash it and it won’t appear in other lists, or hit the plus button to add it to your SRS. There’s likely to be a lot of words you don’t know, and the site will only show you 300 at a time. Don’t add everything all at once! Give yourself some time to learn some words first!

Once you’ve learned a lot of words, you can try watching the anime now. Head on over to a LEGAL streaming site of your choice, and hit play. You most likely won’t be able to turn off the English subtitles, so try to avoid them, or put a piece of paper or something in front of them. You’ll likely notice that even though you know the words, you still have trouble making out the sentences. This is 100% normal. You just have to listen and re-watch the episode over and over again, until it sticks. This isn’t an in-depth guide on listening comprehension though, so I’m not going to go into detail on how to improve that particular skill here.

Now, if you want to further mine your anime, you can try writing down what you hear, but this is very challenging. You have been warned! If you’re an absolute beginner, stick with the Netflix extension.

Part 5: Video Games

As much as it pains me to say it, video games are the hardest to mine, as you can’t hit a rewind button, and there are things happening on screen all the time. It’s not impossible though! RPGs usually have a lot of text where you can pause and read, so that’s where you’ll be doing most of your mining. Unless you’re playing on your computer, you have to use your phone for OCR. I find it much more frustrating to get interrupted all the time when playing versus when watching something, so again, I only recommend this option if you’re a bit more advanced, or you have a great deal of patience.


Watching Let’s Plays of Japanese video games is a bit better when mining than playing yourself, as it’s easier to pause and OCR what’s on your PC screen than what’s on your TV, but don’t let that stop you! I might update this section later when I’ve tried this out more myself, but until then, use all the techniques you’ve learned today and mine the ever-loving GOLD out of those games! 頑張って!

Part III: Conclusion

If you’ve followed this guide, you’ve learned how to effectively incorporate the native material you’ve always wanted to read into your studies in a quick and easy way. Maybe now joining one of the book clubs here won’t be so scary? As I’ve mentioned before, this is the first time I’ve ever written a guide, so please give feedback. Do you have a faster, more effective way of mining? Do you like this guide? Why/why not? Do tell! And if you’ve really liked this guide, pass it around to friends or other students you know! Hope you’ve enjoyed it! Huge thank you to @searls for KameSame! Support them on their Patreon if you can. Same goes for @Raionus and their wonderful work on Koohi Café and Flo*Flo!

87 Likes

Wow, this is such an in-depth and thoughtful guide! Glad to hear KameSame has aided you in your journey. You’re using it very similarly to how I do as well

3 Likes

I am here to thank you greatly. I really want to finish my N4 grammar at least before diving, so this will be very useful after that. Thanks a million!

2 Likes

I don’t own a PC, and I don’t know what extensions are. But this looks very useful, thanks for your efforts.

1 Like

If you download Google Chrome on your phone, you might be able to do all the same things anyway. I haven’t tried it myself, but it might work! Extensions are like little apps you can install directly in Google Chrome, so you can do things right in the browser. If it doesn’t work, you can just use Google Translate on your phone anyway. You can take a screenshot of whatever you’re looking at on your phone, and upload the photo to Google Translate in order for the app to recognize all the kanji and everything. It doesn’t work 100% of the time, but it still works pretty well!

No problem! I hope it helps!

Thanks, this guide was really fun to make! Thank you once again for creating KameSame!

Wait until you see the rest of the paragraph. :joy:

5 Likes

Thanks for taking the time to write this up. I didn’t know about Copyfish - was trying to use ABBYY FineReader but that wasn’t working out as well as this free Chrome extension for whatever reason. Using it for sentence mining seems promising.

4 Likes

PFFFTTTT, I didn’t have time to read the whole page when I made the guide, so this is my first time reading it, and I was definitely NOT expecting that from a children’s book… :joy:

2 Likes

The description in your screenshot says mid-elementary school and above, so 10yo+. Your mileage may vary, but I was reading horror stories at the time, so it’s fine I guess? :thinking:

2 Likes

Yeah, it’s probably fine! Kids are tougher than they look. It’s just funny that out of the entire list of kids’ books, I get the morbid one first try! :joy:

2 Likes

Two of the extensions can also be found for Firefox, if you’re like me and don’t use Chrome.

Yomichan

Copyfish

4 Likes

Also I’d just like to throw in that r/visualnovels has a guide on text hooking for visual novels here.

Just in case anyone browsing was, you know, a degenerate.

5 Likes

So I’m fairly new at learning Japanese, I can understand very basic grammar and all the kanji / Vocab up to and including some of the ones on lv. 6 on WK. At what point do people think that I should start to do the things that this guide talks about?

1 Like

I’d say N4 should be a great place to start using this guide with things like manga and beginner children’s novels (like Aoitori Bunko). There will likely be a lot of words you don’t know, but this guide shows you how to make the process of looking up and learning those words faster. If you finish up the N4 course on Bunpro or finish Genki 2, then you should be able to start using this guide effectively. If you haven’t, you should still be able to use this guide with NHK Web Easy, but I’d say wait until maybe N4 before you start using this guide on heavier stuff like manga or novels.

3 Likes

I’m aiming to have finished Genki 2 by sometime near the end of August and hope to be around level 20 on here too! I’ve put a note in the last chapter of Genki 2 to revisit this thread when I get to it :joy: :joy:

1 Like

For listening studies with native material, I sometimes use SuperNative - Level up your Japanese and find it quite entertaining. They use short snippets of drama or anime and let you repeat the dialogue, fill in a missing word or the like. Of course they also have an SRS so you can study vocab directly on their site.

3 Likes

@TrinityBringer, following up on the other thread where I mentioned a high failure rate with Copyfish, I set everything up again, and tried again. Results are much better than the last time I tried this, and it looks like I may be able to make use of it now.

Thank you very much to TrinityBringer for putting the guide together, and to Naphthalene for bringing it back to my attention.

For the curious, here are some of my tests today:

アオハライド

Screenshot_20200711_164834
Missing the first line, but not an issue.

Screenshot_20200711_164930
No issues here. It read one line of furigana and not the other, but I would say that’s expected with furigana.

Screenshot_20200711_165020
The fish ate the bread.

Screenshot_20200711_165126
Looking good here.

ご注文はうさぎですか?

Screenshot_20200711_165756
Minor mistake at the end is inconsequential.

Screenshot_20200711_165844
Lost てんてん is no harm.

Screenshot_20200711_165927
Impressive.

Screenshot_20200711_170043
Very nice.

美少女戦士セーラームーン

Screenshot_20200711_170638
Looks pretty good.

Screenshot_20200711_170729
Sorry Copyfish, I very well knew this wouldn’t be possible.

キラキラ100%

Screenshot_20200711_171044
Another very nice result.

Screenshot_20200711_171126
I wasn’t certain it would even be able to handle this font! A little lost てんてん is no issue for me.

Screenshot_20200711_171211
OY.

一週間フレンズ。

Screenshot_20200711_171710
I think this is a series I tried it on and had it fail for me before. I don’t see any issues here.

Screenshot_20200711_171754
Yeah, I wasn’t expecting this to be recognized. Just wanted to try and see, you know?

Screenshot_20200711_171845
Pretty good.

The reason why I had checked Copyfish before is because I like running stats on kanji in manga. But, I’ve only been able to do so for one single manga volume, because it was a volume I transcribed completely.
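For anyone wondering what kanji stats could look like in practice, here is a minimal sketch in Python rather than a spreadsheet. It simply counts how often each kanji appears in a block of transcribed text. The function names are made up for illustration, and treating "kanji" as the CJK Unified Ideographs block plus Extension A is an assumption that covers everyday manga text but not every rare character.

```python
from collections import Counter


def is_kanji(ch: str) -> bool:
    """Return True if ch falls in the common kanji code point ranges.

    Assumption: CJK Unified Ideographs (U+4E00..U+9FFF) and
    Extension A (U+3400..U+4DBF) are enough for typical manga text.
    """
    code = ord(ch)
    return 0x4E00 <= code <= 0x9FFF or 0x3400 <= code <= 0x4DBF


def kanji_counts(text: str) -> Counter:
    """Count each kanji in text, ignoring kana, punctuation and whitespace."""
    return Counter(ch for ch in text if is_kanji(ch))


if __name__ == "__main__":
    sample = "日本語の本を読む"
    # Print kanji from most to least frequent.
    for kanji, count in kanji_counts(sample).most_common():
        print(kanji, count)
```

Feeding it the text of a whole volume gives a frequency list that can be sorted or exported however you like.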

Now all I need is to set something up to monitor the clipboard and paste its contents to a file whenever something new is copied… Then I could copy all the text from a volume to a text file, glance over the result versus the manga pages to look for errors (fixing as they’re found), and finally drop them into my spreadsheet for coming up with kanji stats.
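One possible shape for that clipboard monitor, sketched in Python. This is only an illustration under some assumptions: the clipboard is checked by polling, and the function that actually reads the clipboard is passed in by the caller, since clipboard access differs per operating system. The name `log_clipboard_changes` and its parameters are hypothetical.

```python
import time
from typing import Callable, Optional


def log_clipboard_changes(
    read_clipboard: Callable[[], str],
    out_path: str,
    polls: Optional[int] = None,
    interval: float = 0.5,
) -> int:
    """Append each newly copied text to out_path, one entry per line.

    read_clipboard is any zero-argument function returning the current
    clipboard text (an assumption: you supply one for your system).
    polls limits how many times the clipboard is checked; None means
    keep going until interrupted. Returns the number of entries written.
    """
    # Whatever is already on the clipboard at startup is not logged.
    last = read_clipboard().strip()
    written = 0
    checked = 0
    with open(out_path, "a", encoding="utf-8") as out:
        while polls is None or checked < polls:
            text = read_clipboard().strip()
            # Only log non-empty text that differs from the previous value.
            if text and text != last:
                out.write(text + "\n")
                out.flush()
                last = text
                written += 1
            checked += 1
            if interval > 0:
                time.sleep(interval)
    return written
```

If the third-party pyperclip package is installed, its `paste` function could be passed as `read_clipboard`. One limitation of polling like this: copying the exact same text twice in a row is only recorded once.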

2 Likes

Glad I could help, and I hope it’s usable now! Feel free to @ me again if you have any other problems with it.

Unfortunately, I don’t have enough programming experience to help with this myself! :sweat_smile: (Still a noobie in that department!) But maybe these posts can put you on the right track:

1 Like

I should have checked here after dinner rather than jumping right in. I was halfway there on a bash solution (using clipnotify and xsel), but was hitting some issues, so I came to the forums for a brief break. Your second link is essentially what I was trying to do. I updated my script as per theirs, and it looks like it’s working. Now to test it…

Mm, yes, it does seem this will be satisfactory for my potential usage.

2 Likes

What makes you so sure?

Then where is it >:c

1 Like