Using Anki to Practice Listening to Audiobooks

I kinda stumbled upon an interesting way to ease the transition into audiobooks using Anki.

How to do this:

First of all, you need the book as mp3 files (mine came that way).

You then load this into Audacity and use the Analyze → Sound Finder feature to split it into discrete phrases.

These phrases can then be used to create individual cards in Anki.
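
For anyone curious what the splitting step boils down to, here is a minimal sketch in Python. It is not how Audacity's Sound Finder is implemented; it only illustrates the general idea of cutting wherever the narrator pauses. The function name, the `threshold` and `min_silence` parameters and the assumption that the audio is already decoded into a list of floats between -1.0 and 1.0 are all mine.

```python
def split_on_silence(samples, rate, threshold=0.02, min_silence=0.5):
    """Return (start, end) sample-index pairs, one per non-silent phrase.

    samples: sequence of floats in [-1.0, 1.0]; rate: samples per second.
    A pause ends a phrase once at least `min_silence` seconds of
    consecutive samples stay below `threshold` in absolute value.
    """
    min_gap = int(min_silence * rate)
    phrases = []
    start = None  # index where the current phrase began, if any
    quiet = 0     # length of the current run of quiet samples
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_gap:
                # The phrase ended where the quiet run started.
                phrases.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # The recording ended in the middle of a phrase or a short pause.
        phrases.append((start, len(samples) - quiet))
    return phrases
```

Each returned pair could then be cut out and saved as its own sound file. Short pauses inside a sentence are kept, so raising `min_silence` gives longer phrases.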

Why I think this is something worth creating a thread about:

There are basically two reasons that I really liked this:

  1. You can easily press the play again button to repeat a phrase as many times as you like. This is a lot easier than having to rewind a preset 10 or so seconds every time.
  2. You can flag the sentences you didn’t understand. If you can also find a text version of the book, you can then create proper flash cards from them with the audio on the front side and the text on the back. You can even optionally sort these using morphman.

Even just 1) alone is a big help I think, and by using 2) you can also use anki to practice vocabulary you might otherwise not be able to guess.

Note though that creating these audio+text flash cards takes a bit of time, while you can easily listen to hundreds of sound files in a single go. Therefore I usually limit myself to making just ten or so each day, meaning I can listen at any pace I want without forcing hours of workload upon my poor future self.

Well, that’s it, really. Hope anyone finds this interesting!


While still a bit cumbersome, I think you could give this a go and see if it makes the process more automated… the process is still in a somewhat experimental phase :sweat_smile:

In my case I managed to get the first script to work, but I was only able to get the final deck set up correctly after getting a lot of help.

Probably soon there will be a better alternative for making use of those ebook / audiobook pairs though. :+1:


What is it about this that takes a lot of time? I would be interested, but I don’t really want to spend a lot of time on it, so if I could figure out how to make it easier I’d consider it.

What I meant above was mostly that you’ll need to listen to each card you want to add text to, find the appropriate sentence in the book and copy-paste it… maybe it takes me longer since I also want to look up and add translations for any words I don’t know, and maybe merge it with some other sentences to make sure I can understand it properly.

My actual process for getting it into anki feels a bit inelegant at the moment as well.

Specifically I have the following issues:

Firstly, Audacity (at least on my computer) will freeze the computer if you export too many parts of the same sound file at once, so I first split each chapter up into ten minute chunks and then do the sound detection process on each.
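
If you wanted to script the chunking instead of doing it by hand, working out the cut points is the easy part. A small sketch (the function name and the 600-second default are just my choices):

```python
def chunk_ranges(total_seconds, chunk_seconds=600):
    """Return (start, end) offsets in seconds that cover the whole recording."""
    ranges = []
    start = 0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        ranges.append((start, end))
        start = end
    return ranges
```

Note that fixed cut points can land in the middle of a sentence, so the phrase on either side of each boundary may need to be merged by hand.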

Secondly, I’ve yet to find an elegant way to get the sentences, in order, into anki.

What’s probably the most user-friendly is to use the media import plugin and then sort them by file name, but because of Audacity’s naming of these files, this means you need to make sure each file is only split into at most 99 chunks.
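
One possible way around the 99 limit, assuming the problem is that unpadded numbers sort alphabetically (so 100 lands before 11), would be to zero-pad the numbers in the file names before importing. A sketch, where the `clip-…` names are made up and not Audacity's actual naming:

```python
import os
import re

def zero_pad_name(name, width=4):
    """Pad every run of digits in the file stem so plain sorting keeps the order."""
    stem, ext = os.path.splitext(name)  # keep the "3" in ".mp3" untouched
    padded = re.sub(r"\d+", lambda m: m.group().zfill(width), stem)
    return padded + ext

names = ["clip-10.mp3", "clip-2.mp3", "clip-100.mp3"]
print(sorted(zero_pad_name(n) for n in names))
```

The actual renaming on disk could then be done with `os.rename`, ideally on a copy of the files first.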

What I do, since I have a Mac, is drag all the files into a TextEdit document. This triggers the usually completely useless and unwanted feature of adding all their file names as text. I’ll then do some tweaking to this file (add the requisite [sound:…] around the names, and add an index column) and import it as CSV.
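
The TextEdit step could probably be replaced by a few lines of Python. This is only a sketch under my own assumptions: the function names are made up, the file names are expected to contain a running number, and the output has two columns (index, then the sound tag).

```python
import csv
import io
import re

def natural_key(name):
    """Sort key that compares digit runs as numbers, so 2 comes before 10."""
    return [int(p) if p.isdigit() else p.lower() for p in re.split(r"(\d+)", name)]

def build_anki_csv(file_names):
    """Return CSV text with an index column and a [sound:...] column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for index, name in enumerate(sorted(file_names, key=natural_key), start=1):
        writer.writerow([index, f"[sound:{name}]"])
    return buffer.getvalue()

print(build_anki_csv(["clip-10.mp3", "clip-2.mp3"]))
```

In practice the list of names would come from something like `os.listdir` on the export folder, and the sound files themselves still need to be copied into Anki's `collection.media` folder for the tags to play.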

To me this hasn’t been that much of a problem, and I’m sure there is a less clunky way to do it… and frankly I’m kinda hoping that if other people try this they can find it :slight_smile:

Oh also, if anyone wants to try this without buying a whole book, you can (at least in Chrome) actually download the 5 minute sample from Audible:

Go to the Audible page for a book, open the developer console (right-click the page, select Inspect), click the Network tab, click the “Sample” button on the Audible page and then double-click the .mp3 file that now appears in the Network tab list.

Thanks for the tip!

When I have some coding time, I’ll give the youtube thing a go. I’ll probably just write my own scripts though :slight_smile:

I kinda like the split being done based on the narrator’s pauses, rather than the text though, and I guess I’ll have to see how consistently youtube manages to cut correctly so as not to mangle the sentences.

For auto-alignment I personally found aeneas, which I installed, but it seems you need to actually write a program in python in order to use it for Japanese? I didn’t get it working from the command line with Japanese at any rate. I’ve been thinking I’ll check it out when I feel like learning python :slight_smile:

Yeah, while super useful now I’m using the deck, I don’t see it becoming a widely used resource tbh :man_shrugging:.

For one thing, audiobooks aren’t cheap. Then there’s the DRM removal for both the ebook and the audiobook in order to get them into workable file formats… and only then comes the actual process of aligning the two so that they add something to the experience.

Anyway, I’m interested to see what use other people can make of this combination of resources.
For now I’m finding it quite handy for checking relative frequency for new words on the spot in Anki, and then for having a sentence line that’s actually nice to hear while reviewing (contrary to the robotic sound of the TTS one).

Having used this method for a while now, I’d say the real benefit is the ease of replaying sentences and the fact that it pauses between each one.

Even using none of the flash card functionality, this alone makes me able to follow the story pretty well, which is still impossible when listening to the book as-is!


I was just thinking through a problem and in the end, via Google, I ended up back here in the WaniKani forums… and found everything here that I was looking for :slight_smile:

I own the text and now, after reading this thread and liking the idea, also the audiobook for コンビニ人間.

I’ll try and use the approach that @Ncastaneda linked above and create synced subtitles using YouTube Studio (what a brilliant workaround…). I probably won’t bother installing those scripts and will just write my own instead, but as long as YouTube Studio does what it is supposed to do, that is the most important thing. Afterwards I’ll create a subs2srs deck with the content.
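
For the "write my own" part, a rough starting point could be a small parser that turns a subtitle file into timed cues. This sketch assumes the subtitles are downloaded in plain .srt form with nothing after the end timestamp; the function names are made up.

```python
import re

TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")

def to_seconds(stamp):
    """Convert an SRT timestamp such as 00:01:02,500 to seconds."""
    h, m, s, ms = map(int, TIME.fullmatch(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000

def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, text) cues."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 2 or "-->" not in lines[1]:
            continue  # skip anything that does not look like a cue
        start, end = lines[1].split("-->")
        cues.append((to_seconds(start), to_seconds(end), "\n".join(lines[2:])))
    return cues
```

Each cue gives the sentence text plus the time range to cut from the audio, which is the pairing a text + audio card needs.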

I always like having text + audio, as it makes things more memorable. And I like having text in Anki because then I can add additional information to it. Simple things such as furigana, but potentially also more complex stuff like an automatically generated dictionary, or WaniKani and my own mnemonics for all the kanji in a sentence. This should be a fun Sunday project :slight_smile:

Thank you @crihak for the inspiration and @Ncastaneda for that helpful link!

