Voice Input using Web Speech API
This userscript lets you do reviews and lessons hands-free using dictation. It relies on the Web Speech API, which only works in some browsers. I’ve only tested it in Google Chrome, so it may not work elsewhere.
I’ve been using this for a few months now, but it’s not perfect and likely has some bugs. If you run into issues, please let me know here. I hope someone finds it useful.
Features
- dictate in English for meanings and in Japanese for readings (automatically detects language)
- works in reviews and lesson quizzes
- support for user synonyms
- built-in (optional) lightning mode
- commands for marking wrong and going to the next flashcard
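As a rough illustration of the automatic language switching in the first feature, here is a minimal sketch. It assumes the script knows whether the current card asks for a meaning or a reading and sets the recognizer's `lang` property to match. The names `QuestionType`, `recognitionLangFor` and `applyLanguage` are hypothetical, and the `en-US` locale is my assumption; the real script may do this differently.

```typescript
// Hypothetical names throughout; a sketch, not the script's actual code.
type QuestionType = "meaning" | "reading";

// BCP 47 language tags of the kind SpeechRecognition.lang accepts.
const RECOGNITION_LANG: Record<QuestionType, string> = {
  meaning: "en-US", // assumption: the script may use a different English locale
  reading: "ja-JP",
};

function recognitionLangFor(questionType: QuestionType): string {
  return RECOGNITION_LANG[questionType];
}

// The only part of the recognizer this sketch needs.
interface RecognizerLike {
  lang: string;
}

// Point the recognizer at the language the current card expects.
function applyLanguage(recognizer: RecognizerLike, questionType: QuestionType): void {
  recognizer.lang = recognitionLangFor(questionType);
}
```

In a browser, `recognizer` would be the Web Speech API recognition object, and `applyLanguage` would run each time the flashcard changes.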
How to install
- install Tampermonkey or a similar userscript manager
- (optional) install the WaniKani Open Framework (WKOF). This is only required to customize the script
- install the Voice Input script via this GitHub link
How to use
Once it’s installed and enabled, when you start a review or a lesson quiz, your browser should ask for permission to use your microphone. If you allow this, you can then dictate in English or Japanese as appropriate for the flashcard. Dictate exactly what you would type in.
If you have not used speech recognition before, please be patient. It’s not perfect technology and it has a learning curve. For kanji flashcards especially, you may need to repeat yourself.
Commands
There are two commands that make it possible to complete a whole review session hands-free. Simply say one of these words to trigger the behavior.
- Mark a card incorrect: wrong, incorrect, mistake, 不正解, ふせいかい, 間違い, まちがい, だめ
- Advance to the next card: next, つぎ, 次, ねくすと
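To show how the command words above could be recognized, here is a minimal sketch that compares a whole transcript against the two word lists. The word lists are the ones from this post; the names `Command`, `COMMAND_WORDS` and `matchCommand` are hypothetical, and exact whole-phrase matching is an assumption about how the script behaves.

```typescript
// Hypothetical names; the word lists are taken from the post.
type Command = "markWrong" | "next";

const COMMAND_WORDS: Record<Command, string[]> = {
  markWrong: ["wrong", "incorrect", "mistake", "不正解", "ふせいかい", "間違い", "まちがい", "だめ"],
  next: ["next", "つぎ", "次", "ねくすと"],
};

// Returns the command the transcript names, or null if it is not a command.
// Assumption: the whole transcript must equal a command word.
function matchCommand(transcript: string): Command | null {
  const heard = transcript.trim().toLowerCase();
  for (const command of Object.keys(COMMAND_WORDS) as Command[]) {
    if (COMMAND_WORDS[command].indexOf(heard) !== -1) {
      return command;
    }
  }
  return null;
}
```

Matching the whole transcript rather than a substring keeps an answer that merely contains a command word from triggering the command.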
Troubleshooting
There is a live transcript that should give you visual feedback. By default, this is black text on a gold background (colors are customizable) that appears at the top of the screen.
If you suspect that it’s not working, this is the first thing to check. Try speaking a longer phrase to confirm that the script can actually hear you and that speech recognition is working. If nothing appears, make sure your microphone works in other software, and try this Web Speech API demo.
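If no transcript appears at all, it can also help to confirm that the browser exposes the Web Speech API in the first place. The sketch below looks for the standard `SpeechRecognition` constructor and falls back to the prefixed `webkitSpeechRecognition` that Chrome provides. The helper name `findSpeechRecognition` and the `SpeechGlobals` interface are hypothetical and are not part of the script.

```typescript
// The two global names a browser may expose the recognizer under.
interface SpeechGlobals {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

// Returns the recognizer constructor if the browser has one, otherwise null.
function findSpeechRecognition(globals: SpeechGlobals): unknown {
  return globals.SpeechRecognition ?? globals.webkitSpeechRecognition ?? null;
}

// In a browser you would pass the window object, for example:
//   const ctor = findSpeechRecognition(window as unknown as SpeechGlobals);
//   if (ctor === null) { /* this browser cannot run the script */ }
```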
If you can see the live transcript, but it’s not matching the flashcard, it’s possibly a bug or one of the many situations where it’s difficult to match speech to the expected answer. Here are some known examples:
- Punctuation. A meaning such as 屁理屈 “Far-Fetched Argument” should work, but there may be other words or phrases where punctuation causes issues.
- Kanji readings. These are often only fragments of real Japanese words, so speech recognition is not always good at identifying them, especially if they are short or have long vowels (e.g. しょう vs しょ).
- Other words and phrases that happen to be absent from the dictionaries I’m using, or that are homonyms for random loan words, proper nouns, etc. I have been collecting a small set of homonyms (e.g. speech recognition hears “EC2” instead of 遺失・いしつ), and the script accounts for those I have found, but please feel free to let me know of others you find.
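The punctuation and homonym cases above can be pictured with a small matching sketch. It assumes answers are compared after lowercasing and stripping punctuation, and that known mishearings are looked up in a table before comparing. Only the “EC2” entry comes from this post; the names `HOMONYMS`, `normalize` and `matchesAnswer` are hypothetical, and the real script's matching may well differ.

```typescript
// Known mishearings mapped to what was actually said.
// Only the "EC2" entry is from the post; other entries would be added the same way.
const HOMONYMS: Record<string, string> = {
  ec2: "いしつ",
};

// Lowercase, turn hyphens into spaces, drop other punctuation, collapse spaces.
function normalize(text: string): string {
  return text
    .toLowerCase()
    .replace(/[-‐–]/g, " ")
    .replace(/[.,!?'"“”’]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// acceptedAnswers would hold the card's answers plus any user synonyms.
function matchesAnswer(transcript: string, acceptedAnswers: string[]): boolean {
  const heard = normalize(transcript);
  const candidates = [heard];
  if (Object.prototype.hasOwnProperty.call(HOMONYMS, heard)) {
    candidates.push(HOMONYMS[heard]);
  }
  for (const answer of acceptedAnswers) {
    if (candidates.indexOf(normalize(answer)) !== -1) {
      return true;
    }
  }
  return false;
}
```

With this approach, a spoken “far fetched argument” matches the answer “Far-Fetched Argument”, and a misheard “EC2” matches いしつ, while しょ still does not match しょう.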
Customizing
If you have WKOF installed, you can customize this script via the gear icon. The following features are customizable:
- Live transcript on or off
- Live transcript position
- Live transcript text color and background color
- Lightning mode (auto advance on correct answer)
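For reference, the options above could be modeled roughly like this. It is a sketch only: the field names, the `"bottom"` position value and the lightning mode default are my assumptions for illustration, while the transcript defaults (on, top of the screen, black text on gold) follow the description in this post.

```typescript
// Hypothetical settings shape; field names are illustrative only.
interface VoiceInputSettings {
  transcriptEnabled: boolean;
  transcriptPosition: "top" | "bottom"; // "bottom" is an assumed alternative
  transcriptTextColor: string;
  transcriptBackgroundColor: string;
  lightningMode: boolean;
}

const DEFAULT_SETTINGS: VoiceInputSettings = {
  transcriptEnabled: true,
  transcriptPosition: "top",
  transcriptTextColor: "black",
  transcriptBackgroundColor: "gold",
  lightningMode: false, // assumption: the post only says lightning mode is optional
};

// Fill in any option the user has not customized.
function withDefaults(saved: Partial<VoiceInputSettings>): VoiceInputSettings {
  return { ...DEFAULT_SETTINGS, ...saved };
}
```

Merging saved values over the defaults means a user who only turns on lightning mode still gets the default transcript appearance.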
Because of the way the Web Speech API streams results to the script, and the way the script switches language modes automatically when a flashcard changes, you may find the built-in lightning mode more reliable than using lightning mode from another script like Double Check. If you run into any bugs related to this, please let me know, but also try disabling other lightning modes and turning on the built-in one.