logoalt Hacker News

AstroBentoday at 5:10 PM1 replyview on HN

This doesn't look accurate to me. I have an RX9070 and I've been messing around with Qwen 3.5 35B-A3B. According to this site I can't even run it, yet I'm getting 32tok/s ^.-


Replies

misnometoday at 5:25 PM

It seems to be missing a whole load of the quantized Qwen models, Qwen3.5:122b works fine in the 96GB GH200 (a machine that is also missing here....)