Seconded. After disabling adaptive thinking and defaulting to a higher thinking level, I finally got the quality I'm looking for out of Opus 4.6, and I'm pleased with what I see so far in Opus 4.7.
Whatever their internal evals say about adaptive thinking, they're measuring the wrong thing.
As far as I understand, Opus 4.7 disregards the flag that disables adaptive thinking. So if you're seeing it perform well, perhaps their evals are in line?