Hacker News

yogthos, today at 4:39 PM

My expectation is that local models will be the default for coding within a year or two. You can already run Qwen 3.6 with MTP at a pretty reasonable speed without needing a huge amount of VRAM. And while it's not as good as current frontier models, it's already quite competent for a lot of tasks.

And there's no sign that people are running out of ideas for how to optimize models further. You see a bunch of papers come out literally every few weeks right now. So it's entirely plausible to me that within a year or two we'll see models that run on your machine and are superior to current frontier ones.

Once we get to that point, I don't think it's even going to matter to most people if frontier models keep improving. Being able to run the model on your machine, using it as much as you want in any way you want, not having to worry about it changing from under you or the company changing pricing, and not having to send all your data to the vendor are going to be the deciding factors.

At some point the models are just good enough to do what you need to do. On top of that, I expect tooling around models and coding patterns will evolve as well. That could compensate significantly for gaps in the model's capabilities. We already see this happening, with two prime examples here:

https://github.com/itigges22/ATLAS

https://arxiv.org/abs/2509.16198