> Qwen3.6 35b a3b is still my local champion but I may use this for auto complete and small tasks.
I second this! I'm using the Unsloth Q6 quant (I forget the exact name) with forgecode (with zsh) on my Strix Halo, and it's surprisingly good. I'd say it's somewhat similar to Haiku 4.5, plus additional privacy, minus speed. It's surprisingly fast for the hardware, given the speculative decoding, though PP (prompt processing) is still on the slow side.
If you use it for agentic coding and often get bottlenecked on PP, there's something wrong with your harness, IMO.
Out of interest, what are you seeing for token generation - especially as the context fills?