I wonder how much of Uber blowing their AI budget and MSFT pulling their claude code licenses can be attributed to "tokenmaxxing".
When Meta announced token leaderboards and others followed, I could see this being the logical conclusion. That whole trend is so dumb because it leads to this.
Company announces they will measure developer performance by how many tokens they burn and constantly talks about how the best developers burn the most tokens. Developers see the message and start burning tokens. And then the company acts surprised when their bills go through the roof.
I personally use my OpenAI subscription pretty heavily, 2-3 agents running practically all day on various tasks but I never even get close to running into limits while I hear about others blowing through limits on multiple accounts in the same time period. I'm convinced that most of those folks and their elaborate workflows aren't really for productivity but for bragging rights about how much they use AI.
> I personally use my OpenAI subscription pretty heavily, 2-3 agents running practically all day on various tasks but I never even get close to running into limits
Same. But if I was working for an organization that measured token usage, you can bet I would be doing things like creating a cron job that uses claude to create a customized bespoke report update of the current status of all my open assigned tickets and message that to myself 4 times a day... token burn for zero purpose whatsoever.
> I'm convinced that most of those folks and their elaborate workflows aren't really for productivity but for bragging rights about how much they use AI.
This is quite the reductive, charged statement. Can I ask what subscription plan you're using?
My personal experience is nothing like this. I work on ever-expanding codebases, so I can easily burn tokens. Not to mention, structured agentic coding with adversarial reviews & task organization is not token-efficient. Additionally, for the problems I'm working on, only xhigh or high reasoning gives me worthwhile results while saving time. There are definitely configurations where default consumption doesn't work.
For reference, I used 15 billion tokens (most of it cached) last month on my day job's enterprise plan. That doesn't include my personal plans' usage.
I make like 2 prompts a week to gemini flash on the web and get more done than all the people that are exhibiting literal manic behavior in the way they use LLMs.
I really wish the management behind these dumb ideas couldn't just quietly pretend they never did it once it goes out of fashion.
The fact that somebody established a leaderboard for tokenmaxxing ought to follow you around like a black cloud for the rest of your career once the collective hallucination lifts and people realize just how monumentally stupid it was.
Alas, they do all these stupid things together, which makes it seem more defensible, and then everybody forgets.
Same here: I haven't come close to hitting any of my CC limits. Even though I'm more productive than I've ever been (as measured by finished, valuable tasks running in production) and I'm clearing out months of backlog, I come to one of two conclusions when I hear about others who suggest they need more:
1. I'm doing it wrong. Apparently I'm supposed to give it a vague paragraph about what the business does, and I can run off and sip margaritas and wake up to a fully fleshed-out business.
2. They don't know what they're doing, and they're sending the LLM off on a wild goose chase that it does a reasonable job of working its way out of, so they consider it success despite the waste.