It's not gonna stay that way. Token cost is being massively subsidized right now. Prices will have to start increasing at some point.
Seems like the real costs and numbers are very hidden right now. It’s all private companies and secret info how much anything costs and if anything is profitable.
You can run Qwen3 Coder today - on expensive hardware - but fairly cheaply on a token by token basis. It's no Opus, but you can get things done.
This is hard to say definitively. The new Nvidia Vera Rubin chips are 35-50x more efficient on a FLOPS/ megawatt basis. TPU/ ASICS/ AMD chips are making similar less dramatic strides.
So a service ran at a loss now could be high margin on new chips in a year. We also don’t really know that they are losing money on the 200/ month subscriptions just that they are compute constrained.
If prices increase might be because of a supply crunch than due to unit economics.