Hacker News

lzaborowski, today at 7:50 PM (1 reply)

One thing I’ve noticed with local models is that people tolerate a lot more trial-and-error behavior. When a hosted model wastes tokens, it feels expensive, but when a local model loops a bit, it just feels like it’s “thinking.”

If models like Qwen can get good enough at coding tasks to run locally, the real shift might be economic rather than purely a matter of capability.


Replies

trvz, today at 10:09 PM

Wasted tokens are preferred for local models: I need the GPU mainframe in my bedroom to heat it, as I live in a third-world country with unreliable heating (Switzerland).