Preview: Durtle Daily

mikeyb1 · November 8, 2024, 8:41am

Hey all,

I was smashing the treadmill at the gym the other day and I really wanted to do some Wani practice but that’s a bit hard to do while sweating profusely, so I came up with this idea.

I wrote a web app that connects to WK with your API token, grabs your reviews for the day and generates a kind of mock podcast, mashing together AI voice synthesis and the WK-provided pronunciation samples. It reads out each meaning and gives you a few moments to say it aloud or in your head, then moves on to the next one.
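For anyone curious what "grabs your reviews for the day" might look like, here is a minimal sketch and not the app's actual code. It assumes the WaniKani v2 API's `assignments` endpoint with the `immediately_available_for_review` filter; `buildReviewQueueRequest` and `subjectIdsOf` are hypothetical helper names.

```typescript
// Hypothetical helpers, not taken from the app described above.
interface RequestSpec {
  url: string;
  headers: Record<string, string>;
}

const WK_BASE = "https://api.wanikani.com/v2";

// Describes the request for assignments that can be reviewed right now.
function buildReviewQueueRequest(apiToken: string): RequestSpec {
  return {
    url: `${WK_BASE}/assignments?immediately_available_for_review=true`,
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Wanikani-Revision": "20170710",
    },
  };
}

// Assumed slice of the response: each assignment points at one subject.
interface Assignment {
  data: { subject_id: number; subject_type: string };
}

// Collects the subject ids to look up, dropping repeats but keeping order.
function subjectIdsOf(assignments: Assignment[]): number[] {
  return assignments
    .map((a) => a.data.subject_id)
    .filter((id, index, all) => all.indexOf(id) === index);
}
```

The `url` and `headers` can be handed to `fetch`. The response is paginated, so a real client would also need to follow `pages.next_url` until it is null.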

Here is a sample of the output. Download MP3

I started making this 2 days ago, so it’s not ready for anyone else to use yet. It has no customisation at all and takes several minutes to generate, though it caches all the data it can, so after the first run it’s quick and only new items need to be generated.
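The "only new things will be generated" behaviour can be sketched as a small cache keyed by the text being spoken. This is an illustration under assumptions: `AudioCache` and `synthesize` are hypothetical names, and the real generation step would be asynchronous and return audio data rather than a string.

```typescript
// Stand-in for whatever turns text into audio (assumed, synchronous for brevity).
type Synthesize = (text: string) => string;

class AudioCache {
  private store: Record<string, string | undefined> = Object.create(null);
  private synthesize: Synthesize;
  public misses = 0; // how many times audio actually had to be generated

  constructor(synthesize: Synthesize) {
    this.synthesize = synthesize;
  }

  get(text: string): string {
    const cached = this.store[text];
    if (cached !== undefined) {
      return cached; // seen before: no generation needed
    }
    this.misses += 1;
    const audio = this.synthesize(text);
    this.store[text] = audio;
    return audio;
  }
}
```

On a second run over the same review items every `get` is a hit, which matches the "quick after the first run" observation.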

I wonder if anyone else would find this useful, or has any thoughts, suggestions or tips.

Many thanks

EDIT: Turned into a web app, take a look at https://dd-staging.mikeyb.tech/ and let me know how it goes

mikeyb1 · November 9, 2024, 11:22am

I used this for the first time today, had it on a loop and went through it 4 times. It worked pretty well!

There are a few things I want to tweak, and I may add some more information to each vocabulary item, like whether it’s a noun or an adverb, and maybe read out other meanings too.

I’d like to give a bit more time but some are easy to remember and some are hard so there is no good amount of time to pause for…

Might be interesting to describe the kanji it’s made up of and their readings also.

I’d make this all fully configurable too, so you can decide what works for you.

Inserio · November 11, 2024, 7:31pm

The idea is not bad, and the implementation seems solid for a first draft.

I noticed via the sample that you’re taking a KaniWani approach, that is, testing your recall of the Japanese word when prompted by the English one.
If your goal is simply to add in some supplemental studying, then this seems fine, but just be aware that it’s practicing something fundamentally different than WaniKani, and so your results may vary.

You’re also going to have issues once you have multiple words that have the same primary meaning—doubly so if they are in the same review queue. For example, 理由 (7) vs 理性 (14) vs 故 (26) vs 訳 (32).

Perhaps you could do something like a “spelling bee” approach, by grabbing one of the context sentences along with the word. It might drastically increase the time needed, but it might help differentiate the synonyms or homophones. And you could then also consider making it so that you could either do it EN → JP or JP → EN. I would find myself a lot more interested in something that did this (particularly with the prompt being the Japanese word & sentence).

Also, if you wanted considerations for what you could remove to trim down the total time slightly, it might not be desirable/necessary for some people to have the item number mentioned at the beginning of each one, as long as there’s something else that sufficiently separates the items, like the pause or maybe a short tone or something.

Anyway, those are just my first thoughts. Certainly an interesting idea, though I don’t know if I’d find myself using it at this point.

mikeyb1 · November 11, 2024, 8:38pm

Thanks for taking a look!

I hadn’t really realised I’d gone KaniWani; I just implemented it initially without too much thought. Things can be swapped around once I’ve solved all the major problems!

I did consider the issue of duplicates but haven’t thought about it too much yet. The second time I used it, I had Father ちち then Father おとうさん. Kind of annoying, and I’ll work on it. In my head while using it I just thought either would do; it’s not like your thoughts count toward actual reviews, it’s just practice and help with memorising. I must say that doing my reviews just now, I found it a ton easier as I’d gone over them a bunch of times with the audio.
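One possible way to spot the duplicate-prompt problem before generating audio is to group the queue by primary meaning and flag any meaning shared by more than one item. A rough sketch, with `ReviewItem` and `findAmbiguousPrompts` as made-up names and the Father pair above as sample data:

```typescript
// Hypothetical shape of a queued item; only the fields needed here.
interface ReviewItem {
  japanese: string;
  primaryMeaning: string;
}

// Returns each primary meaning that maps to more than one item in the queue.
function findAmbiguousPrompts(items: ReviewItem[]): Record<string, string[]> {
  const byMeaning: Record<string, string[] | undefined> = Object.create(null);
  for (const item of items) {
    const key = item.primaryMeaning.trim().toLowerCase();
    const group = byMeaning[key] || [];
    group.push(item.japanese);
    byMeaning[key] = group;
  }
  const clashes: Record<string, string[]> = Object.create(null);
  for (const key of Object.keys(byMeaning)) {
    const group = byMeaning[key];
    if (group && group.length > 1) {
      clashes[key] = group;
    }
  }
  return clashes;
}

const clashes = findAmbiguousPrompts([
  { japanese: "ちち", primaryMeaning: "Father" },
  { japanese: "おとうさん", primaryMeaning: "Father" },
  { japanese: "犬", primaryMeaning: "Dog" },
]);
// clashes has a single entry: "father" with ちち and おとうさん
```

Flagged items could then get an extra hint (a secondary meaning or a context sentence), or be spaced apart in the playlist.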

I can definitely get rid of the “number X…” between items, but to me it feels like a natural flow. It’s easy to add configuration for these kinds of things though, each to their own.

Adding context sentences would be good, I think at my level (4) I don’t know half the words they appear next to, so it’s not super useful to me yet but definitely a good thing to have.

Same with JP → EN / EN → JP, definitely on the radar! I could make that configurable: mix them both together, or do one or the other.

My struggle at the moment is finding a way to generate reasonable-sounding Japanese from text, as the AI models I use don’t support Japanese.

I’m starting to think that using AI models in the browser, while technically cool, is a losing battle.

It eats memory (>4 GB) and takes a long time (on my 24-core CPU). Definitely not something you could run on a mobile device, even though it is technically possible to do so.

The alternative is to use an external service like Google TTS, but that costs money and I’d want to offer this for free. Then again, there is a limited number of items to synthesise and I’d cache the data, so maybe it wouldn’t cost me much.

It’s not for everyone, since everyone has different learning styles and life constraints, but I’m finding it useful already. Thanks for your feedback and honesty!

Inserio · November 11, 2024, 8:59pm

Just to make a quick note about the Japanese TTS generation, particularly if you’re looking at caching it.
Somewhat surprisingly, the Edge browser offers a native-sounding TTS as one of the built-in options. It’s essentially Microsoft’s updated Narrator options, but they’ve somehow figured out how to lock it down to only be listed as an option when using Edge. (Not sure whether there are any legal liabilities there, but I doubt people would care as long as it’s not being used for profit.)

mikeyb1 · November 12, 2024, 8:52am

The browser’s TTS is exposed through a standard API (the Web Speech API) across different browsers, but each browser implements it in its own way. This means that while Edge has a good one, relying on it would give varying results depending on which browser you use.
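To illustrate why results vary: with the Web Speech API the page can only choose from whatever voices the current browser and OS expose. A minimal sketch of that selection step, where `VoiceLike` mirrors two fields of the browser's `SpeechSynthesisVoice` and `pickVoice` is a hypothetical helper:

```typescript
// The two SpeechSynthesisVoice fields this sketch relies on.
interface VoiceLike {
  name: string;
  lang: string; // language tag such as "ja-JP"; some platforms report "ja_JP"
}

// Extracts the primary language subtag, e.g. "ja" from "ja-JP".
function primaryLanguage(tag: string): string {
  const separator = tag.search(/[-_]/);
  return (separator === -1 ? tag : tag.slice(0, separator)).toLowerCase();
}

// Picks a voice for the language, trying preferred names first.
// Returns undefined when no matching voice is installed, so the caller can fall back.
function pickVoice(
  voices: VoiceLike[],
  language: string,
  preferredNames: string[] = []
): VoiceLike | undefined {
  const candidates = voices.filter((v) => primaryLanguage(v.lang) === language);
  for (const preferred of preferredNames) {
    for (const voice of candidates) {
      if (voice.name.indexOf(preferred) !== -1) {
        return voice;
      }
    }
  }
  return candidates[0];
}
```

In a browser the input would be `speechSynthesis.getVoices()`. That list can be empty until the `voiceschanged` event has fired, and its contents differ between browsers and operating systems, which is exactly the inconsistency described above.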

I’m looking at the Google TTS service. It’s free for something like 4 million characters per month, which would most likely cover the vast majority of WK subjects already. If I cache the data on my servers, it’ll only have to be done once for each bit of material. I’m trying to replace the AI model with this at the moment.
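A back-of-the-envelope check of that reasoning could look like the sketch below. The 4,000,000 figure is just the rough number quoted above, not a verified quota, and `estimateQuota` is a hypothetical helper. It counts each distinct, not-yet-cached text once.

```typescript
// Assumption: the rough monthly free allowance mentioned in the post.
const FREE_CHARACTERS_PER_MONTH = 4000000;

interface QuotaEstimate {
  billableCharacters: number;
  withinFreeTier: boolean;
}

// Sums the characters that still need synthesising, skipping cached and repeated texts.
function estimateQuota(texts: string[], alreadyCached: string[]): QuotaEstimate {
  const seen: Record<string, boolean | undefined> = Object.create(null);
  for (const text of alreadyCached) {
    seen[text] = true;
  }
  let billableCharacters = 0;
  for (const text of texts) {
    if (seen[text]) {
      continue; // cached earlier, or already counted in this batch
    }
    seen[text] = true;
    // String length counts UTF-16 code units; a rough proxy for billed characters.
    billableCharacters += text.length;
  }
  return {
    billableCharacters,
    withinFreeTier: billableCharacters <= FREE_CHARACTERS_PER_MONTH,
  };
}
```

How the service actually counts characters may differ, so treat the result as an estimate only.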

mikeyb1 · December 16, 2024, 11:45pm

I’ve built an alpha version. You can find it here https://dd-staging.mikeyb.tech/

It mostly outputs the same as the MP3 I attached, but there are a few things you can configure, like the pause time between question and answer, and what separates reviews (number X, pause, gong noise). I’ve also made it configurable whether you hear Japanese or English first.

You can choose to output it as an MP3 or just use the built-in player on the site. One benefit of using the built-in player is that you can easily skip forward or backward between reviews.
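As a rough picture of how those options could drive the output, here is a sketch that turns a list of cards into an ordered list of audio segments. All names (`PodcastOptions`, `Segment`, `buildSegments`) and the one-second separator pause are assumptions for illustration, not the site's real settings.

```typescript
type Separator = "number" | "pause" | "gong";
type Direction = "en-first" | "jp-first";

interface PodcastOptions {
  answerPauseSeconds: number; // thinking time between question and answer
  separator: Separator; // what plays between reviews
  direction: Direction; // which language is the prompt
}

interface Card {
  english: string;
  japanese: string;
}

type Segment =
  | { kind: "speech"; lang: "en" | "ja"; text: string }
  | { kind: "silence"; seconds: number }
  | { kind: "sound"; name: "gong" };

function buildSegments(cards: Card[], options: PodcastOptions): Segment[] {
  const segments: Segment[] = [];
  cards.forEach((card, index) => {
    // Separator before each review.
    if (options.separator === "number") {
      segments.push({ kind: "speech", lang: "en", text: `Number ${index + 1}` });
    } else if (options.separator === "gong") {
      segments.push({ kind: "sound", name: "gong" });
    } else {
      segments.push({ kind: "silence", seconds: 1 }); // assumed pause length
    }
    const english: Segment = { kind: "speech", lang: "en", text: card.english };
    const japanese: Segment = { kind: "speech", lang: "ja", text: card.japanese };
    const enFirst = options.direction === "en-first";
    // Question, thinking time, then answer.
    segments.push(enFirst ? english : japanese);
    segments.push({ kind: "silence", seconds: options.answerPauseSeconds });
    segments.push(enFirst ? japanese : english);
  });
  return segments;
}
```

Keeping the playlist as data like this would also make it straightforward to skip between reviews in a player, since each review starts at its separator segment.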

There are still many things I want to add: more things to configure, a tidier design (frankly, I am not a designer and the styling of a bunch of stuff is ugly), and the option of contextual sentence examples (which isn’t actually too hard with the way I’ve built it).

All voice synthesis happens in the cloud now, so there is no heavy lifting for your browser (unless you choose MP3, which does the encoding in the browser).

I think the voice is as good or better, and I can apply effects to it, like slowing down the reading, so it may be worth adding an option to say the Japanese more slowly.

Anyway, I mostly built this for me as I love a little side project to distract me, but if anyone wants to try it out, feedback would be much appreciated! If you notice anything broken or any issues preventing it working, let me know!
