Yes, for me as well, Sonnet 4.6 is the most impressive inflection point. I've generally found Anthropic's models to be the best; even before this, Sonnet 3.7 was the only model that produced reasonable results for me, but Sonnet 4.6 is the first one I'd call genuinely useful. It seems to have resolved Claude's tendency to "fix" test failures by changing the tests to expect the current output, it does a good job planning features, and I've been impressed that this model also tells me when not to do things - for example, it might say we could save 50 lines of code in this module, but the resulting code would be much harder to read, so it's better not to. In my experience, previous models all suffered from constantly wanting to make more changes, and more, and more.
I'm still not ready to sing the praises of how awesome LLMs are, but after three years of incremental improvements since the first ChatGPT release, these late-2025 models feel like the first substantial qualitative improvement.