Hacker News

lifestyleguru, today at 4:44 PM (2 replies)

Last time I tried Whisper, it hallucinated an elaborate conversation from sounds of slapping and moaning, and it took minutes to spit out every single line of it.


Replies

3eb7988a1663, today at 6:00 PM

Parakeet has been trained to detect non-voice sounds and exclude them from transcription, so you might have better luck with that family of models.

dotancohen, today at 8:43 PM

If I remember correctly, the Whisper documentation actually recommends trimming non-speech portions, as the models hallucinate heavily during those portions.
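
For anyone who wants to try the trimming idea, here is a rough sketch of dropping low-energy frames before handing audio to a transcriber. It is not Whisper's or Parakeet's own preprocessing; the function names, the 400-sample frame length, and the 0.02 RMS threshold are all assumptions picked for illustration. An energy gate like this only removes near-silence, so loud non-speech noise (such as slapping) would still pass through, and a real voice activity detector would be needed for that.

```python
import math

def frame_rms(samples, frame_len):
    """Return the RMS energy of each full, non-overlapping frame."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def keep_loud_frames(samples, frame_len=400, threshold=0.02):
    """Keep only frames whose RMS is at or above `threshold`.

    `frame_len` and `threshold` are hypothetical defaults, not values
    taken from any library. A trailing partial frame is discarded.
    """
    kept = []
    for idx, rms in enumerate(frame_rms(samples, frame_len)):
        if rms >= threshold:
            start = idx * frame_len
            kept.extend(samples[start:start + frame_len])
    return kept

# Synthetic demo at 16 kHz: 1 s of silence, 1 s of a 440 Hz tone, 1 s of silence.
sr = 16000
silence = [0.0] * sr
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
audio = silence + tone + silence

trimmed = keep_loud_frames(audio)
print(len(audio), len(trimmed))
```

In the demo, only the tone survives the gate; the silent seconds on either side are dropped. In practice the threshold would need tuning per recording.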