We don’t know the models’ sizes, hardware requirements, or optimisations, but we could make a rough guess from the infrastructure costs of the largest open-weight alternatives, which perform slightly worse.
In my opinion, it’s a profitable kind of service. They probably don’t pay the public prices for the cloud GPUs though.
In my opinion, it seems like a very unprofitable service, propped up by investor money in an attempt to capture market share.
Or, as I would say if I were Bugs Bunny, “Duck Season”
Just looking at infra cost is not enough. If the token price doesn't cover all of their costs, they are losing money and will eventually have to raise prices further.
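To make the distinction concrete, here is a rough back-of-the-envelope sketch. Every input is a made-up placeholder, not a real figure for any provider, and the function names and the `overhead_ratio` parameter are hypothetical, introduced only for illustration. It estimates the hardware-only cost per million tokens, then the price needed once non-infra costs (training, salaries, and so on) are included.

```python
def infra_cost_per_million_tokens(gpu_hourly_usd, num_gpus, tokens_per_second):
    """Hardware-only cost (USD) of generating one million tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd * num_gpus / tokens_per_hour * 1_000_000


def break_even_price(infra_cost, overhead_ratio):
    """Price per million tokens that also covers non-infra costs.

    overhead_ratio is an assumed figure: non-infra spend expressed as a
    multiple of infra spend (1.5 means 1.5x the infra bill on top of it).
    """
    return infra_cost * (1 + overhead_ratio)


# Placeholder inputs only; swap in your own guesses.
infra = infra_cost_per_million_tokens(
    gpu_hourly_usd=2.0, num_gpus=8, tokens_per_second=2000
)
total = break_even_price(infra, overhead_ratio=1.5)
print(f"infra only: ${infra:.2f} per 1M tokens")
print(f"break-even: ${total:.2f} per 1M tokens")
```

If the listed token price sits between those two numbers, the service covers its GPU bill but still loses money overall.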