eli • yesterday at 7:04 PM • 2 replies

I'm skeptical of what "up to" 750t/s really means. Maybe if they make it extremely expensive, so that it frees up enough capacity?

GPT‑5.3‑Codex‑Spark currently runs on Cerebras chips and it's giving me around 150t/s. That's still very fast, but nowhere near the 1,000t/s they claimed at launch. (Also, it's not a very good model.)

That said, I'm super bought in to faster models being better for most use cases than smarter models.

Replies

aurareturn • today at 5:56 PM

If it's 150 t/s, that's barely faster than Nvidia GPUs, which batch a lot more and are a lot more cost-effective. Add in the Groq piece and Nvidia claims it can do 400 tokens/s.

beering • yesterday at 8:30 PM

Soon the bottleneck will be how fast your laptop can grep for a string.
