How would it be a money grab? If the new tokenizer requires more tokens to encode the same information, it costs them more money for inference. The point of charging per token is that the cost is proportional to the number of tokens. That's my understanding anyway
Not necessarily, because of speculative decoding. Whitespace would be trivial to predict, so they would pretty much keep using the same amount of compute as before.
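To make the point concrete, here's a toy sketch of greedy speculative decoding. The "models" here are hypothetical stand-ins, not a real LLM API: a cheap draft model proposes a few tokens, and the expensive target model verifies the whole batch in one pass. Trivially predictable tokens like whitespace get accepted in bulk, so the target model runs far fewer forward passes than the number of tokens it emits.

```python
# Toy sketch of greedy speculative decoding with deterministic stand-in
# models (hypothetical, for illustration only).
TEXT = list("def f(x):\n        return x\n")  # whitespace-heavy output

def target_next(context):
    # Stand-in for the expensive target model: deterministic next token.
    return TEXT[len(context)] if len(context) < len(TEXT) else None

def draft_next(context):
    # Stand-in for the cheap draft model; assumed perfect on easy tokens
    # like whitespace (here it simply agrees with the target).
    return target_next(context)

def speculative_decode(k=4):
    context = []
    target_calls = 0
    while len(context) < len(TEXT):
        # The draft model cheaply proposes up to k tokens.
        proposal = []
        ctx = list(context)
        for _ in range(k):
            t = draft_next(ctx)
            if t is None:
                break
            proposal.append(t)
            ctx.append(t)
        # One target "forward pass" verifies the whole proposal; keep the
        # longest agreeing prefix, plus one bonus token if all matched.
        target_calls += 1
        for t in proposal:
            if target_next(context) == t:
                context.append(t)
            else:
                break
        else:
            bonus = target_next(context)
            if bonus is not None:
                context.append(bonus)
    return context, target_calls

out, calls = speculative_decode()
print(len(out), "tokens in", calls, "target passes")  # 27 tokens, 6 passes
```

So even if a whitespace-heavy tokenization inflates the billed token count, the expensive model's compute scales with verification passes, not raw tokens, when the draft model nails the easy ones.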
I don't think that's their primary motive for doing this, but it is a side effect.
Because everyone burns through their limits much faster, forcing them to upgrade to higher limits or new tiers.