Hacker News

d4rkp4ttern, yesterday at 1:25 PM (3 replies)

Also, in general I don't get what the appeal of a 7B full-duplex (speech-to-speech) model is: 7B can't be very intelligent on its own, and for anything useful you'd need tool calls, which speech-to-speech models can't do. This is also why ChatGPT voice mode is annoying: it never does a web search or reads a link (in fact it pretends to search or read, outright makes stuff up, and when pushed admits it can't really read web pages or do web searches).

There are probably use cases for this though; I'm open to being educated on those.


Replies

satvikpendem, yesterday at 2:26 PM

Why can't a speech-to-speech model do tool calls? Others like Gemini Live do it just fine.

water-drummer, yesterday at 2:05 PM

The Gemini Live API and the Grok voice API can make tool calls, and they're speech-to-speech models.

WhitneyLand, yesterday at 2:30 PM

Yes. Is there a basic chat app for iOS that prioritizes full intelligence over full duplex?

Agreed, ChatGPT's advanced voice mode is so bad in terms of the quality of the actual responses. Old model, no reasoning, little tool use.

I just want hands-free conversations with SOTA models and don’t care if I have to wait a couple of seconds for a reply.
