I'm very worried for both.
Cerebras requires a $3K/year membership to use APIs.
Groq's been dead for about 6 months, even pre-acquisition.
I hope Inception is going well, it's the only real democratic target at this. Gemini 2.5 Flash Lite was promising but it never really went anywhere, even by the standards of a Google preview
You can call Cerebras APIs via OpenRouter if you specify them as the provider in your request fyi. It's a bit pricier but it exists!
I don't think it's a good comparison given Inception work on software and Cerebras/Groq work on hardware. If Inception demonstrate that diffusion LLMs work well at scale (at a reasonable price) then we can probably expect all the other frontier labs to copy them quickly, similarly to OpenAI's reasoning models.
What do you mean by Grow is dead since about 6 months ago? Not refuting your point, but I’m curious.
Cerebras are on OpenRouter.
I am currently using their APIs on a paygo plan, I think it might just be a capacity issue for new sign ups.
Once again, it's a tech that Google created but never turned into a product. AFAIK in their demo last year, Google showed a special version of Gemini that used diffusion. They were so excited about it (on the stage) and I thought that's what they'd use in Google search and Gmail.
Taalas is interesting. 16,000 TPS for Llama on a chip.
https://taalas.com/