Iv'e been using GPT-5.1, 5.1-codex and 5.1-codex-max and gpt-5.2 the last few weeks. Then I got tipped off about opus, and that it was supposed to be awesome. The problem is I can clearly see old patterns of "Oooh, I found the issue!" in the middle of the stream long before it has found the real issue I was asking about, and not very good results. The GPT family to me is better.
I was especially impressed by 5.1-codex-max for a webapp, but that is ofc where these model in general shine. But it was freak, never had 15-20 iterations (with 100s of lines added each time) before where I did not have to correct anything.