I hope the industry starts competing more on the highest scores with the fewest tokens, like this. It's a win for everybody: the model is more intelligent, inference is more efficient, and the end user pays less.
So much bench-maxxing is just giving the model a ton of tokens so it can inefficiently explore the solution space.
The premise of the trillion dollars in AI investments is not that it’ll stay as good as it currently is, only cheaper. It’s AGI or bust at this point.