Hacker News

twodave · yesterday at 2:06 PM

> I have been consistently skeptical of LLM coding but the latest batch of models seems to have crossed some threshold.

It’s refreshing to hear I’m not the only one who feels this way. I went from using almost none of my Copilot quota to burning through half of it in 3 days after switching to Sonnet 4.6. I’m about to have to start lobbying for more tokens or buy my own subscription, because it’s just that much more useful now.


Replies

ACS_Solver · yesterday at 8:02 PM

Yes, Sonnet 4.6 has been the most impressive inflection point for me as well. I've generally found Anthropic's models to be the best (even before this, Sonnet 3.7 was the only model that produced reasonable results for me), but Sonnet 4.6 is the first one that feels genuinely useful. It seems to have resolved Claude's tendency to "fix" test failures by changing the tests to expect the current output; it does a good job planning features; and I've been impressed by it telling me not to do things. For example, it will say something like: we could save 50 lines of code in this module, but the result would be much harder to read, so it's better not to. Previous models, in my experience, all suffered from constantly wanting to make more changes, and more, and more.

I'm still not ready to sing the praises of how awesome LLMs are, but after two years of incremental improvements since the first ChatGPT release, these late-2025 models feel like the first substantial qualitative leap.