I’m not understanding. If cost per token hits the floor that does not mean that you want a model that uses tokens.
If the Chinese are optimizing for token usage, that’s also speed.
Why use more token if few do trick?