Hacker News

mrinterweb · today at 5:08 PM

I think two recent advances make your statement more true. The new Qwen 3.5 series has shown relatively high intelligence density, and Google's new TurboQuant could yield dramatically smaller, more efficient models without the usual quantization accuracy trade-off.
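
To make that trade-off concrete, here's a minimal sketch of naive symmetric int8 post-training quantization (illustrative only; this is not TurboQuant's method, just the baseline it would be improving on). Shrinking weights 4x introduces reconstruction error, which is exactly what better quantizers try to suppress:

    import numpy as np

    # Fake weight matrix standing in for one layer of a model.
    rng = np.random.default_rng(0)
    w = rng.normal(0.0, 0.02, size=(1024, 1024)).astype(np.float32)

    # Naive symmetric int8 quantization: one scale for the whole tensor.
    scale = np.abs(w).max() / 127.0
    w_q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

    # Dequantize and measure how much information was lost.
    w_hat = w_q.astype(np.float32) * scale
    err = np.abs(w - w_hat).mean() / np.abs(w).mean()
    print(f"4x smaller, mean relative error ~{err:.1%}")

The per-tensor scale is the weak point: a single outlier weight inflates it and crushes precision for everything else, which is why real quantization schemes use per-channel or per-group scales (and why accuracy normally degrades at all).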

I would expect consumer inference ASICs to emerge once model development starts plateauing and "baking" a highly capable, dense model onto a chip makes economic sense.


Replies

fauigerzigerk · today at 7:30 PM

Who will fund state-of-the-art local models going forward? AI models are never done or good enough; they will have to be trained on new data and eventually with new model architectures. It will remain an expensive exercise.

I could be wrong, since I'm not following this too closely, but the open-weights future of both Llama and Qwen looks tenuous to me. Yes, there are others, but I don't understand the business model.