Why does the voice need to be sent to the server? Why not perform speech-to-text on-device? Is the p...

solatic • today at 7:57 AM • 1 reply

Why does the voice need to be sent to the server? Why not perform speech-to-text on-device? Is the p10 phone/laptop not capable of this yet, despite every "dictation" feature I see in every modern OS?

Replies

omcnoe • today at 8:20 AM

An eventual goal is likely to allow interacting with the LLM directly via audio tokens for input and output, skipping TTS and STT completely.
