Not sure how Claude and CC have become the de facto best model given gpt 5.3 codex and 5.4 exist. This space moves so fast that you should be testing your workflows on different models at least once every quarter, or once a month to be prudent.
I've been an avid fan of codex for the last few months but finally hit the weekly limit, so I wanted to try out claude code before biting the bullet and going for the 200 dollar codex sub.
Obviously, in hindsight, it would be unfair to Anthropic to judge them on an unstable day, so I'll leave those complaints aside, but I hit the session limit way too fast. I planned out 3 tasks and it couldn't even finish the first plan completely. For that implementation task it had seen a grand total of 1 build log and hadn't run any tests, and that was already enough to push it into the red territory of the context circle.
It was even asking me during planning which endpoints the new feature should use to hook into the existing system. codex would never ask this; it would simply look these up during planning, and whenever it encountered ambiguity it would either ask straight away or list it as an open question. I have to wonder if they're limiting this behavior to keep the context as small as possible and prevent even earlier session limits.
Maybe codex's limits are not sustainable in the long run and I'm very spoiled by them, but at this point CC (sonnet) and Codex (5.4) are simply not in the same league when comparing the two 20 dollar subscriptions.
I will also clearly state that both of these tools are absolutely worth it at these price points; it's just that codex's value/money ratio is much better.
Checking different models once every quarter is exactly what made me move to claude code.
We've got access to opus 4-6, gpt 5.4, gemini pro and a few others through corporate. I have customized agents on claude, gpt and gemini, since we tend to run out of tokens for x model by the end of a month. Out of all of them I've consistently been using sonnet for most tasks. Opus functions mainly as a hand-off agent and reviewer. In my anecdotal experience Claude is miles ahead of the other models and has been for a long while... when it comes to writing code the way we want it. Which is explicit, no-abstraction, no-external-packages, fail-fast defensive programming. I imagine you'd get different mileage with different models and different coding styles.
The rest of the organisation, which is not software development or IT related, mainly uses GPT models. I just wish I hadn't taught risk management about claude code so they weren't wasting MY tokens.