I'm not looking for STT->AI->TTS, I'm looking for truly good voice-to-text experience* on Linux (and others). Siri/iOS-Dictation is truly good when it comes to understanding the speech. Something at this level on Linux (and others) would be great. Yes, it could be always listening and maybe send the data somewhere, but give me the UX: hidden latency, optimizing for the first characters recognized, a good (virtual) input device.
Have you tried https://handy.computer ?
> I'm not looking for STT->AI->TTS, I'm looking for truly good voice-to-text experience
Umm, ah, wait no, uhh yes you are. Unless, hang on, you are possessed with greater umm speech capabilities than most, wait nevermind start over. Unless you never make a mistake while talking, you want AI to take out the "three, wait no four" and just leave the output with "four" from what you actually spoke. Depending on your use case.
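To make the "three, wait no four" point concrete, here is a minimal rule-based sketch of that cleanup step. It is not how any dictation product works, and it swaps in plain regexes where a real pipeline would use a language model; the filler list, the correction phrases, and the name `clean_transcript` are all assumptions made for illustration.

```python
import re

# Assumed filler words; a real system would likely learn these, not hardcode them.
FILLERS = re.compile(r"\b(?:u+h+|u+m+|a+h+|e+r+)\b[,.]?\s*", re.IGNORECASE)

# "<word>, wait no <word>" style self-corrections: keep only the second word.
# The set of correction phrases is an assumption and far from complete.
CORRECTION = re.compile(
    r"\b\w+,?\s+(?:wait,?\s+no|no,?\s+wait|I mean),?\s+(\w+)",
    re.IGNORECASE,
)

def clean_transcript(text: str) -> str:
    """Strip filler words and collapse simple one-word self-corrections."""
    text = FILLERS.sub("", text)
    text = CORRECTION.sub(r"\1", text)
    # Collapse any double spaces left behind by the removals.
    return re.sub(r"\s{2,}", " ", text).strip()

print(clean_transcript("I need three, wait no four tickets"))
```

This only handles one-word corrections; anything like "start over" or a multi-word rephrase is exactly where rules break down and a model is needed.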
> Siri/iOS-Dictation is truly good when it comes to understanding the speech.
What...? It is terrible, even compared to Whisper Tiny, which was released years ago under an MIT license, so Apple could have adopted it instantly and integrated it into their devices. The bigger Whisper models are far better, and Parakeet TDT V2 (English) / V3 (Multilingual) are quite impressive and very fast.
I have no idea what would make someone say that iOS dictation is good at understanding speech... it is so bad.
For a company that talks so much about accessibility, it is baffling to me that Apple continues to ship such poor-quality speech-to-text with their devices.