You're hitting on something really important that barely gets discussed. For instance, notice how opus 4.5's speed essentially doubled, bringing it right in line with the speed of sonnet 4.5? (sonnet 4.6 got a speed bump too, though closer to 25%).
It was the very first thing I noticed: it looks suspiciously like they just rebranded sonnet as opus and raised the price.
I don't know why more people aren't talking about this. Even on X, where the owner directly competes in this market, it's rarely brought up. I strongly suspect there is a sort of tacit collusion between competitors in this space. They all share a strong motivation to kill any deep discussion of token economics, even about each other, because transparency only arms the customers. By keeping the underlying mechanics nebulous, they can all justify higher prices. Just look at the subscription tiers: every single major player has settled on the exact same pricing model, a $20 floor and a $200 cap, no exceptions.
opus 4.6 was going to be sonnet 5 up until the week of release. The price bump is even bigger than you realize, because they don't let you run opus 4.6 at full speed unless you pay them an extra 10x for the new "fast mode".
> a $20 floor and a $200 cap, no exceptions
Google caps at $250
It's quite plausible to me that the difference is inference configuration. This could be done through configurable depth, MoE experts, layers, etc. Even beam decoding changes can make a substantial performance difference.
Train one large model, then down-configure it for different pricing tiers.
Doubling speed can likely come from MoE optimizations such as reducing the number of active parameters.
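To put rough numbers on the active-parameter idea, here's a back-of-envelope sketch. It assumes decoding is memory-bandwidth bound, so tokens per second scale inversely with the weights read per token. Every figure in it (shared parameter count, expert size, bandwidth, bytes per parameter) and both function names are made up for illustration; none of it describes any vendor's actual model.

```python
# Back-of-envelope model of MoE decode speed. All numbers are hypothetical.

def active_params(shared_params, params_per_expert, experts_per_token):
    """Parameters read for each generated token in an MoE model:
    the always-on shared weights plus the experts routed to that token."""
    return shared_params + params_per_expert * experts_per_token


def decode_tokens_per_second(active, bytes_per_param, mem_bandwidth_bytes_per_s):
    """Assumes decoding is memory-bandwidth bound, i.e. each token requires
    streaming every active weight from memory once."""
    return mem_bandwidth_bytes_per_s / (active * bytes_per_param)


SHARED = 20e9          # hypothetical shared (attention, embeddings) parameters
PER_EXPERT = 5e9       # hypothetical parameters per expert
BANDWIDTH = 3e12       # hypothetical aggregate memory bandwidth, bytes/s
BYTES_PER_PARAM = 1    # hypothetical 8-bit weights

full = active_params(SHARED, PER_EXPERT, experts_per_token=8)
reduced = active_params(SHARED, PER_EXPERT, experts_per_token=4)

full_tps = decode_tokens_per_second(full, BYTES_PER_PARAM, BANDWIDTH)
reduced_tps = decode_tokens_per_second(reduced, BYTES_PER_PARAM, BANDWIDTH)
speedup = reduced_tps / full_tps

print(f"active params: {full:.0f} -> {reduced:.0f}")
print(f"tokens/s: {full_tps:.1f} -> {reduced_tps:.1f} (speedup {speedup:.2f}x)")
```

With these made-up numbers, halving the routed experts gives a 1.5x speedup rather than 2x, because the shared weights don't shrink. Under this simple model, a true doubling needs the total active parameter count to be cut in half.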
Isn’t Gemini $300 a month for their most expensive plan? That includes stuff like Genie and all that though.
It kind of makes sense. At least a year or so ago, I know $20 unlimited plans were costing these companies ~$250, averaged out. They're still lighting money on fire at $200, but probably not nearly as badly. That said, I'm not sure whether costs have gone up with changes in models; the agentic tooling seems more expensive for them (hence why they're pushing anyone they can to pay per token).
These AI companies are all in the same boat. At current operating costs and profit margins they can't hope to pay back the investment, so they have to pull tricks like rebranding models and downgrading offerings silently. There's no oversight of this industry. The consumer protection dept in the US was literally shut down by the administration, and even if they had not been, this technology is too opaque for anyone to really be able to tell if today they're giving you a lower model than what you paid for yesterday.
I'm convinced they're all doing everything they can in the background to cut costs and increase profits.
I can't prove that Gemini 3 is dumber than when it came out because of the non-deterministic nature of this technology, but it sure feels like it.