Hacker News

danielhanchen yesterday at 11:54 AM

Oh, I didn't expect this to be on HN, haha. But yes: for our new Qwen3.5 benchmarks, we devised a slightly different approach to quantization, which we plan to roll out to all new models from now on!


Replies

nnx yesterday at 1:19 PM

Can you describe what this slightly different approach is and why it should work on all models?

hedora yesterday at 6:26 PM

Nice! Your stuff ran LLMs extremely well on < $500 boxes (24-32GB RAM) with iGPUs before this update.

I’m eager to try it out, especially if 16GB is viable now.