zargon • today at 4:08 AM • 2 replies

The Flash version is 284B-A13B (284B total parameters, 13B active) in mixed FP8/FP4, and the full native-precision weights total approximately 154 GB. The KV cache is said to take 10% as much space as V3's. This looks very accessible for people running "large" local models. It's a nice follow-up to the Gemma 4 and Qwen3.5 small local models.
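As a rough sanity check on those numbers, here is a back-of-the-envelope sketch of what 154 GB over 284B parameters implies for the FP8/FP4 mix. It assumes decimal gigabytes (1 GB = 1e9 bytes) and ignores any overhead from quantization scale factors or tensors kept in higher precision, so treat the result as an estimate, not a statement about the actual checkpoint layout. The function names are made up for illustration.

```python
# Estimate the average bits per weight and the implied FP8 share.
# Assumptions (not from the model card): 1 GB = 1e9 bytes, and every
# weight is stored as either 8-bit or 4-bit with no scale-factor overhead.

def avg_bits_per_weight(total_params: float, size_bytes: float) -> float:
    """Average storage cost per parameter, in bits."""
    return size_bytes * 8 / total_params


def high_precision_fraction(avg_bits: float, hi_bits: int = 8, lo_bits: int = 4) -> float:
    """Share of weights that must be hi_bits wide to reach avg_bits overall."""
    return (avg_bits - lo_bits) / (hi_bits - lo_bits)


TOTAL_PARAMS = 284e9  # "284B" from the comment above
SIZE_BYTES = 154e9    # "approximately 154 GB" from the comment above

avg = avg_bits_per_weight(TOTAL_PARAMS, SIZE_BYTES)
frac = high_precision_fraction(avg)
print(f"{avg:.2f} bits/weight, roughly {frac:.0%} of weights in FP8")
```

Under those assumptions the average works out to a bit over 4 bits per weight, which would mean the large majority of the weights are FP4.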

Replies

regularfry • today at 11:10 AM

I'm going to blow my bandwidth allowance again this month, aren't I.

sbinnee • today at 4:31 AM

The price is appealing to me. I have mainly been using Gemini 3 Flash for chat, so I may give it a try.

input/output: $0.14/$0.28 (whereas Gemini is $0.50/$3)

Does anyone know why output prices have such a big gap?
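To put the gap in numbers, here is a small sketch comparing the output-to-input price ratio for the two price pairs quoted above. It assumes both pairs are in the same unit (commonly USD per million tokens, though the comment does not say), and the workload size is a made-up example. The dictionary keys are just labels for this sketch.

```python
# Compare output/input price ratios and the cost of a sample workload.
# Assumption: prices are USD per 1M tokens; the comment gives no unit.

PRICES = {
    "flash": (0.14, 0.28),          # (input, output) as quoted above
    "gemini_3_flash": (0.50, 3.00), # (input, output) as quoted above
}


def output_input_ratio(input_price: float, output_price: float) -> float:
    """How many times more an output token costs than an input token."""
    return output_price / input_price


def workload_cost(input_price: float, output_price: float,
                  input_tokens: int, output_tokens: int) -> float:
    """Total cost for a workload, given per-million-token prices."""
    return (input_price * input_tokens + output_price * output_tokens) / 1e6


ratios = {name: output_input_ratio(*p) for name, p in PRICES.items()}

# Hypothetical workload: 1M input tokens and 1M output tokens.
costs = {name: workload_cost(*p, 1_000_000, 1_000_000) for name, p in PRICES.items()}

for name in PRICES:
    print(f"{name}: output is {ratios[name]:.1f}x input, sample cost ${costs[name]:.2f}")
```

So the quoted prices put output at about 2x input for one model and 6x for the other. The sketch only quantifies the gap; it does not explain it.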
