You don't pay for capacity, you pay for an interface. Paying for capacity is what API keys are for.
Similarly, on a home internet connection you might pay for a given size of pipe, but most residential ISPs don't allow running publicly accessible servers on your connection because you'll typically use way more of the bandwidth.
This is probably one of the worst analogies you could have brought up in this context.
The business model of an ISP involves fixed capital investments in infrastructure, with constant opex and very little variable cost.
The marginal cost of sending a gigabyte is basically zero. The limited resource here is bandwidth and ISPs split their tiers based on bandwidth.
The problem is that some users may consume local bandwidth that is shared with other users. More bandwidth requires more investment in infrastructure. So using bandwidth doesn't in itself cost the ISP anything; it is the maximum bandwidth capacity that costs money.
Hence, oversubscription is a viable business as long as neighbors aren't impacted by power users.
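To put rough numbers on the oversubscription point, here is a minimal sketch. Every figure in it (link capacity, tier size, subscriber count) is a made-up assumption for illustration, not data from any real ISP.

```python
# Hypothetical figures for one shared neighborhood link (assumptions, not real data).
LINK_CAPACITY_MBPS = 10_000  # bandwidth that physically exists on the shared link
TIER_MBPS = 1_000            # bandwidth sold to each subscriber
SUBSCRIBERS = 50             # subscribers sharing the link


def oversubscription_ratio() -> float:
    """Bandwidth sold divided by bandwidth that actually exists."""
    return SUBSCRIBERS * TIER_MBPS / LINK_CAPACITY_MBPS


def is_congested(active_fraction: float) -> bool:
    """True if simultaneous demand exceeds the shared link's capacity."""
    demand_mbps = SUBSCRIBERS * active_fraction * TIER_MBPS
    return demand_mbps > LINK_CAPACITY_MBPS


if __name__ == "__main__":
    print(oversubscription_ratio())  # bandwidth is sold several times over
    print(is_congested(0.1))         # a few users maxing out at once: fine
    print(is_congested(0.5))         # half the neighborhood maxing out: neighbors suffer
```

The ISP's costs are the same in both cases; only the neighbors' experience changes, which is why the limit is on peak usage rather than on total gigabytes.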
This doesn't apply to LLMs. Tokens have the same economics as steel. There is high capex to get started, but the real killer is the variable cost per unit of steel.
You can't sell steel on an oversubscribed subscription model. It's nonsensical.
If the subscription is more expensive than buying what you need, nobody is going to pay for the subscription unless they consume all of it.
Hence the subscription must contain a subsidy to make it competitive.
However, the people who consume the full subscription are still there, and each token they request adds to your electricity bill.
Ergo, the subscription must be more expensive than the API, but with a smart billing limit that removes the cognitive burden of using your service with pay-as-you-go billing.
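A small sketch of that argument, using invented prices. The API price, per-token cost, subscription price, and cap below are all hypothetical assumptions, chosen so the subscription is a discount relative to the API for anyone who uses the full cap.

```python
# Hypothetical numbers, chosen only to illustrate the argument (not real pricing).
API_PRICE_PER_MTOK = 10.0  # pay-as-you-go price per million tokens
COST_PER_MTOK = 6.0        # provider's variable cost per million tokens
SUB_PRICE = 100.0          # flat monthly subscription price
SUB_CAP_MTOK = 20.0        # monthly usage cap included in the subscription


def user_picks_subscription(usage_mtok: float) -> bool:
    """A rational user subscribes only if it beats paying per token."""
    return SUB_PRICE < usage_mtok * API_PRICE_PER_MTOK


def provider_margin(usage_mtok: float) -> float:
    """Provider profit on one user, given the plan that user would choose."""
    if user_picks_subscription(usage_mtok):
        served = min(usage_mtok, SUB_CAP_MTOK)
        return SUB_PRICE - served * COST_PER_MTOK
    return usage_mtok * (API_PRICE_PER_MTOK - COST_PER_MTOK)


if __name__ == "__main__":
    for usage in (5.0, 12.0, 20.0):
        print(usage, user_picks_subscription(usage), provider_margin(usage))
```

Under these assumptions the light user stays on the API, the moderate user subscribes and is still profitable, and the user who exhausts the cap costs more to serve than the subscription brings in. Unlike the ISP case, the loss grows with every unit consumed.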
If that same internet provider has caps on how much bandwidth I can use every 5 hours and total on a weekly basis, then yes, I pay for capacity.
That argument would have been valid in the beginning, when the 5-hour blocks were unlimited.