Hacker News

solatic today at 7:57 AM

Why does the voice need to be sent to the server? Why not perform speech-to-text on-device? Is the p10 phone/laptop not capable of this yet, despite every "dictation" feature I see in every modern OS?


Replies

omcnoe today at 8:20 AM

An eventual goal is likely to allow interacting with the LLM directly via audio tokens for input and output, skipping TTS and STT completely.