ChristopherFritz's Study Log

polv · October 16, 2022, 9:14am

I used Subtitle Edit's Audio to Text feature in batch mode (so one run covers a whole series), and it seems to run its model on the CPU. See: 🔉 🎙 Listen Every Day Challenge (Summer Edition) 🏖 - #823 by polv

Whisper seems to need the medium model or larger for Japanese, which means at least 5 GB of GPU RAM, and a reasonably fast GPU as well.
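For anyone wanting to try this, here is a rough sketch of picking a Whisper model by available GPU RAM and transcribing Japanese audio. The VRAM figures are the approximate ones from the openai/whisper README, and `pick_model` and `transcribe_japanese` are just names I made up for illustration, not part of Whisper itself.

```python
# Approximate VRAM needs in GB, as listed in the openai/whisper README.
# Rough guides only; real usage varies with audio length and settings.
APPROX_VRAM_GB = {"tiny": 1, "base": 1, "small": 2, "medium": 5, "large": 10}


def pick_model(vram_gb: float) -> str:
    """Hypothetical helper: largest model whose approximate need fits in vram_gb."""
    fitting = [name for name, need in APPROX_VRAM_GB.items() if need <= vram_gb]
    # The dict is ordered smallest to largest, so the last fit is the biggest.
    return fitting[-1] if fitting else "tiny"


def transcribe_japanese(audio_path: str, vram_gb: float) -> str:
    """Hypothetical wrapper: transcribe one file with the model that fits."""
    import whisper  # pip install openai-whisper; imported lazily on purpose

    model = whisper.load_model(pick_model(vram_gb))
    # Passing language="ja" skips auto-detection and forces Japanese.
    result = model.transcribe(audio_path, language="ja")
    return result["text"]
```

With 5 GB this picks `medium`; with less it drops to `small`, which is where the quality for Japanese seems to fall off.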

I have tried both Subtitle Edit and Whisper (small model + GPU), and neither is exactly satisfactory. Whisper makes more transcription errors, but its output reads as better language.