Hacker News

TrainedMonkey, yesterday at 7:09 PM

We'll need to wait for real benchmarks, but based on OpenAI's marketing, Instant is their latency-optimized offering. For a voice interface, you don't actually need high tok/s because speech is slow; time to first token matters much more.
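A rough back-of-envelope sketch of that argument, in Python. Every number here is an assumption for illustration (a speaking rate of 150 words per minute, about 1.3 tokens per English word, and two made-up model profiles), not a benchmark of any real model. The helper names are hypothetical too.

```python
# Back-of-envelope numbers for the tok/s vs. time-to-first-token argument.
# All constants are illustrative assumptions, not measurements.

WORDS_PER_MINUTE = 150   # assumed conversational speaking rate
TOKENS_PER_WORD = 1.3    # assumed average for English text


def tokens_per_second_for_speech(wpm=WORDS_PER_MINUTE, tpw=TOKENS_PER_WORD):
    """Token rate needed to generate text as fast as it is spoken."""
    return wpm / 60 * tpw


def time_until_nth_token(ttft_s, tok_per_s, n):
    """Seconds until the n-th token arrives.

    Assumes the first token arrives at ttft_s and later tokens
    arrive at a steady 1 / tok_per_s interval.
    """
    return ttft_s + (n - 1) / tok_per_s


if __name__ == "__main__":
    needed = tokens_per_second_for_speech()
    print(f"tok/s needed to keep up with speech: {needed:.2f}")

    # Hypothetical profiles: assume speech can start after 10 tokens.
    low_latency = time_until_nth_token(ttft_s=0.2, tok_per_s=30, n=10)
    high_throughput = time_until_nth_token(ttft_s=1.0, tok_per_s=200, n=10)
    print(f"low-latency model starts speaking after {low_latency:.2f}s")
    print(f"high-throughput model starts speaking after {high_throughput:.2f}s")
```

Under these assumptions, a few tokens per second is already enough to keep pace with speech, so the wait before audio starts is dominated by time to first token rather than by throughput.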