Ekaros, today at 6:13 AM

How do these LLM summarizations work? Do you feed the raw waveform data to the model and have it transcribe that directly?

Or do they use traditional voice recognition algorithms for that part and then just have the LLM "fix" the result so it looks plausible? With good-quality recognition output that might not be much, but with bad output it can be absolutely everything.

If it is the latter, it seems to me that issues will absolutely happen.