Hacker News

unleaded, today at 11:23 AM

Qwen3.6-35B-A3B-UD-Q4_K_M runs at about 11 tokens/second on my poor old 1060. Absolutely nuts how far we've come.


Replies

piyh, today at 2:33 PM

Every model I've tried running on my 1070 instantly crashes my old tower. Probably time to get off Windows and run Linux on it.

broodbucket, today at 11:38 AM

Mind sharing your llama.cpp settings for that?
