I was initially excited by 4.7, as it does a lot better in my tests, but their reasoning/pricing is really weird and unpredictable.
Apart from that, in real-life usage, gpt-5.3-codex is ~10x cheaper in my case, mainly because of the cached-input discount (even without it, it would still be around 3-4x cheaper).
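
For anyone who wants to sanity-check this kind of gap on their own workload, here's a rough sketch of the arithmetic. Everything in it is an assumption: the `Pricing` class, the `request_cost` helper, the token counts, and especially the prices, which are made-up placeholders and not the real rates for either model. It only shows how a big cached-input discount can turn a ~3-4x difference into a ~10x one when most of the input is cache hits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per 1M tokens. Hypothetical structure, not any vendor's API."""
    input: float         # price for uncached input tokens
    cached_input: float  # price for input tokens served from cache
    output: float        # price for output tokens


def request_cost(p: Pricing, input_tokens: int, output_tokens: int,
                 cached_fraction: float) -> float:
    """Return the cost in USD when `cached_fraction` of the input hits the cache."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    total = fresh * p.input + cached * p.cached_input + output_tokens * p.output
    return total / 1_000_000


# Placeholder prices, chosen only for illustration (NOT real rates).
model_a = Pricing(input=5.0, cached_input=5.0, output=25.0)   # no effective cache discount
model_b = Pricing(input=1.5, cached_input=0.15, output=10.0)  # 90% off cached input

# Hypothetical agentic workload: large, mostly repeated context, short outputs.
IN_TOKENS, OUT_TOKENS = 1_000_000, 20_000

cost_a = request_cost(model_a, IN_TOKENS, OUT_TOKENS, cached_fraction=0.9)
cost_b_no_cache = request_cost(model_b, IN_TOKENS, OUT_TOKENS, cached_fraction=0.0)
cost_b_cached = request_cost(model_b, IN_TOKENS, OUT_TOKENS, cached_fraction=0.9)

ratio_without_cache = cost_a / cost_b_no_cache
ratio_with_cache = cost_a / cost_b_cached

print(f"without cache hits: {ratio_without_cache:.1f}x cheaper")
print(f"with 90% cache hits: {ratio_with_cache:.1f}x cheaper")
```

Plug in your own prices and cache hit rate; the ratio is very sensitive to `cached_fraction`, which is why the gap depends so much on the workload.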