Curious if anyone around here stayed on 4.6 (despite having the option to use 4.7)?
I have stuck with 4.6. I fully believe 4.7 can be smarter for truly complex and long running agentic use. But I prefer the more direct, literal mechanistic style and 4.6 seems to be peak Opus for that.
Stay with 4.6 if you can; afaik it is disabled in the VS Code Claude Code extension.
4.7 IMO is around 10-20% worse at understanding the intention behind your prompt. You need to put more effort into explaining your intention clearly so it doesn't go off track.
I still use 4.6 if I need Opus. It's mostly GPT-5.5 for me. Only if I know it cannot do something like push without running the tests (because AGENTS.md said so), I switch to 4.6.
Although GPT's been acting weird since Thursday...
Switched back when 4.7 had an issue last week and it was wayyy faster. I assume mostly because a lot of people have moved off but might consider using it more just for the speed boost.
4.7 turned out to be a disaster in multilingual settings, so I have stuck with 4.6 so far. 4.7 seems to have been optimized for (a very specific slice of) coding at the expense of everything else.
I’ve stayed on 4.6. Just today I was thinking of trying 4.7, though. Still, I did not jump on it on day one.
I don't want to change from 4.6 because I'm finding it so good (I could change).
I've spent the last couple of days building Swift bindings to a monster CPP lib and I've actually had fun.
i use 4.6 and i've configured the advisor to be on 4.7, so when something's more complex the advisor can help. at least that's how i do it with claude code, not sure if the others have implemented the concept of advisors.
I went to 4.7, didn't have a choice, found it unsatisfactory, then Claude quietly added in the option to use 4.6, so I'm back on 4.6, and I'm not the only one in my company.
I had far more hallucinations with 4.7 than 4.6.
I'll try it again after they've had a few more months to get it right, but 4.6 is what changed my mind on LLMs as a tool, and 4.7 felt like a step backwards. So for now I'm sticking with something that has delivered me value, instead of arguing with an ostensibly better model that was making shit up 1-2 times a day. It was really disappointing.
I can give examples if needed, I screenshotted the most aggravating ones, but what worries me is which ones I didn't recognise.