Hacker News

intothemild today at 9:59 AM

You should enable MTP now that it's available.

llama.cpp has had some massive updates in the last week or so.


Replies

npodbielski today at 2:05 PM

Yes, Qwen 3.6 MoE is hitting around 80-90 t/s on Strix Halo. On an R9700 I got around 170 t/s. It was not possible to keep up with it. But the MoE model goes in circles very often. When that happens I switch to the dense model, which gets only 20-30 t/s but is able to solve quite a lot of tasks.
