The number of tokens seen per model on OpenRouter is not a good measure of quality.
There are many plausible explanations for why a particular model does or does not rank in the top 10 by this metric.
Maybe people using OpenAI models are happy enough that they don't care about other models and have no need for OpenRouter. Maybe OpenAI models produce fewer tokens per response, or cost more per token.
Your conclusion might be correct, but the number of tokens seen by OpenRouter is weak evidence for it.
If ChatGPT 5.2 were actually superior, developers wouldn't be overwhelmingly routing traffic to Gemini 3.1 Pro just 6 days after release.
I use openrouter.ai as the benchmark because it's the foundational API layer for innovator apps that are always the quickest to adopt new tech.