Hacker News

Pro Max 5x quota exhausted in 1.5 hours despite moderate usage

719 points · by cmaster11 · yesterday at 1:15 PM · 639 comments

Comments

10keane · yesterday at 1:55 PM

This same pattern seems to occur every time a new model is about to release. I didn't notice the usage problem (I'm on 20x), but Opus 4.6 feels significantly dumber for some reason. I can't quantify it, but it failed on everyday tasks where it used to complete perfectly.

jubilanti · yesterday at 2:47 PM

I'm also hitting the limits in a day when they used to last the entire week. The service is effectively worth 4x to 6x less. Imagine going to my favorite restaurant and paying the same for 1/5th of the food. Bye bye; you have to vote with your wallet.

heyitsaamir · yesterday at 2:50 PM

It would be really nice to have better transparency into token usage and throttling, IMO.

tiku · yesterday at 3:52 PM

Went with Kimi and z.ai a while back; no regrets yet. When I started using them, the limit was far away, but Anthropic keeps moving the goalposts. I tried to get my money back, but they rejected the refund. Lesson learned: never buy a full year.

catketch · yesterday at 9:34 PM

Stuff is getting goofy. I can blow through Claude's session limit on Sonnet; I don't even bother with Opus now. The same prompts and code on Codex hardly put a dent in the quota ($200/yr Claude vs $20/mo Codex). This is not with any crazy parallel agents, MCPs, or skills; pretty much vanilla installs, with some projects using Beads.

I don't have the receipts, but I think the two were somewhat closer in Jan/Feb.

mortsnort · yesterday at 2:46 PM

I think this comes from Anthropic's recent move to automatic routing of model effort. You can set effort manually with /effort in Claude Code.

This new routing does seem to be worse for the consumer in terms of both code quality and token usage.

bad_haircut72 · yesterday at 2:18 PM

They also need to fix the 30-second lag between submitting a request and actually starting to get tokens back. It used to be instant, and it still is at work, where we use Anthropic models via an enterprise Copilot subscription.

armchairhacker · yesterday at 2:25 PM

Make an AI usage tracker like https://marginlab.ai/trackers/codex/. These hearsay anecdotes prove nothing.

oybng · yesterday at 3:28 PM

Cancelled my subscription after repeatedly hitting ridiculously low limits. Unfortunately, since off-peak free usage was increased, there are way more timeouts and failed requests. But hey, at least it's free.

dr_dshiv · yesterday at 2:32 PM

"Hey Claude, can you help me create a strategy to optimize my token use so I don't run into limits so often?" That worked for me! I had two $200 plans before, and now I'm fine despite all-day use.

paweduda · yesterday at 2:59 PM

50 days ago I wrote this [1], when the world seemed high on AI and it gave me crypto-bubble vibes.

Since then I've seen increasing criticism of Anthropic in particular (several front-page posts on HN, especially in the past few days), either for models being nerfed or for them simply eating up usage quota (which matches my personal experience). It appears we're once again being hit by enshittification of sorts.

These days I rely on LLMs daily for architecture and writing code, but I'm glad the majority of my experience came from the pre-AI era.

If you use these tools, make sure you don't let them atrophy your software engineering "muscles". I'm positive LLMs are here to stay in the long run; the jump in what you can self-host or run on consumer hardware is huge, year after year. But if your abilities depend on one vendor, what happens when you come to work one day, find yourself locked out of your Swiss Army knife, and can no longer outsource your thinking?

[1] https://news.ycombinator.com/item?id=47066701

elthor89 · yesterday at 3:12 PM

Are there local models dedicated to programming that are any good yet? That could be a way to deal with Anthropic and others flip-flopping on token usage and limits.

niklasd · yesterday at 2:20 PM

We also hit our Claude limits much earlier than before during the last two weeks, to the point where we thought it must be a bug.

docheinestages · yesterday at 2:53 PM

Anthropic paved the way for agentic coding, and their pricing made it possible for masses of people to discover and experiment with this new style of development. Their Claude Code plans subsidized model usage so heavily that I'm sure they ran negative margins for quite some time. But now that they have acquired a substantial user base, it makes sense for them to dial back and get greedier. The quiet, weird changes to Claude's behavior in recent weeks must be due to both that increased greed and their struggles with scaling.

What I wish for now is for open-weight models and hardware companies (looking at you, Apple) to make it possible to run local models with Opus 4.6-level intelligence.

@Anthropic, I've cancelled my subscription. Good luck :)

ozozozd · yesterday at 4:03 PM

Pretty sure OpenCode is not subsidizing anything, and across Codex 5.x (always on xhigh), Claude Opus 4.6 on high effort, and a bunch of Chinese models, I only burned about $50 over the last month.

I don't understand why people insist on these subscriptions and CC.

The fanboyism is a bit too hardcore at this point; Apple fanboys look extremely prudent by comparison.

spiderfarmer · yesterday at 1:29 PM

That's why I switched to Codex. It's so much more generous and, in my experience, just as good. Also, optimizing your setup for working with agents can easily make a 5x difference.

delduca · yesterday at 2:48 PM

I noticed the same over the last few weeks. I canceled my Max 5x and subscribed to Copilot (with Opus 4.6).

Now it's hard to hit the limit...

sdevonoes · yesterday at 1:59 PM

I guess it's better to step down now, while we can, rather than wait until it becomes impossible (Stockholm syndrome).

No FOMO.

semiquaver · yesterday at 3:52 PM

As an anecdote, I use the Pro Max 5x plan heavily for coding and have almost never hit a limit.

lforster · yesterday at 1:46 PM

Lol, imagine how much overcharging is going on for enterprise tokens. This is just the beginning.

peterpanhead · yesterday at 2:01 PM

I don't understand Anthropic. Be consistent. Why do the models deteriorate to shit? This is not good for workflows or for trust. What, Opus 4.7 is gonna come out and it's the same thing again? Come on.

gessha · yesterday at 1:50 PM

I'm processing some images (custom board-game images -> JSON) with a common layout and basic structure, and I exhausted my quota after just 30 images (pleb Pro account). I have 700 images to process...

What I did instead was tune the prompt for gemma 4 26b on a 3090. Worked like a charm. Sometimes you have to run the main prompt and then a refinement prompt, or split the processing into cases, but it's doable.

Now I'm waiting for anyone to put up some competition against NVIDIA so I can finally afford a workstation GPU for less than the price of a kidney.
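The two-pass approach described above (a main extraction prompt, with a refinement pass only when the first output fails validation) can be sketched roughly like this. Everything here is an assumption for illustration: the endpoint URL is the OpenAI-compatible server that tools like Ollama or llama.cpp expose locally, the model name is a placeholder, and a real run would attach the image as a content part rather than a text stand-in.

```python
import json
import urllib.request

ENDPOINT = "http://localhost:11434/v1/chat/completions"  # placeholder local server
MODEL = "local-model"  # placeholder model name

def ask(prompt: str) -> str:
    """Send one chat request to the local OpenAI-compatible endpoint."""
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        ENDPOINT, body, {"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

def parse_or_none(raw: str):
    """Accept a first-pass answer only if it is valid JSON."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None

def extract(card_text: str):
    """Main prompt first; fall back to a refinement pass on bad output."""
    first = ask(f"Extract the board layout as JSON only:\n{card_text}")
    result = parse_or_none(first)
    if result is None:
        fixed = ask(f"Rewrite this as valid JSON, nothing else:\n{first}")
        result = parse_or_none(fixed)
    return result
```

Splitting the processing into cases, as the comment mentions, would just mean choosing between several main prompts before calling `extract`.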

bit1993 · yesterday at 1:57 PM

You know Emacs still works.

jLaForest · today at 11:32 AM

After last week I canceled my Claude subscription and bought the GitHub Copilot subscription ($40/mo tier). So far I've been very happy; I haven't hit any usage limits yet, and at this rate it seems like I never will.

softwaredoug · yesterday at 2:38 PM

So glad I just pay by the token.

qwertyforce · yesterday at 1:51 PM

That's exactly why I prefer Codex.

Achshar · yesterday at 1:53 PM

I feel like I'm living in a bubble; no one seems to mention Antigravity in these discussions, and I haven't had any issues with the Ultra subscription yet. It seems to go on forever, and the interface is so much better for dev work compared to CC (though admittedly my experience with CC is limited).

I strongly believe Google's legs will let it sustain this influx of compute without the rug-pull that OAI and Anthropic will be forced into as more people come onboard for the code-gen use case.

behole · yesterday at 3:02 PM

I shred my Max 5x in 2 hours on the reg this week! GLM, here I come!

iLoveOncall · yesterday at 2:48 PM

It's easy to calculate the actual cost, given that they list the exact tokens used. Taking the AWS Bedrock pricing for Opus 4.6 with 1M context (because Anthropic's own API is subsidized and sold at a loss), here's what each category costs:

Cache reads: $0.31

Cache writes: $105.00

Input tokens: $0.04

Output tokens: $28.75

The total spent in the session is $134.10, while the Pro Max 5x subscription is $100.

Even at Anthropic's own API pricing, we arrive at $80.58. Below the subscription price, but not by much.

It's just the end of the free tokens, nothing to see here. It's easy to feel like you're doing "moderate" or even "light" usage because you use so few input tokens, but those "agentic workflows" are simply not financially viable.
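For anyone who wants to check the arithmetic, the per-category figures in this comment do sum up as claimed; a minimal script, taking the dollar amounts above as given:

```python
# Per-category session costs quoted above (USD), at the AWS Bedrock
# list rates for Opus 4.6 with 1M context that the comment assumes.
costs = {
    "cache_reads": 0.31,
    "cache_writes": 105.00,
    "input_tokens": 0.04,
    "output_tokens": 28.75,
}

session_total = round(sum(costs.values()), 2)
subscription = 100.00  # Pro Max 5x monthly price

print(f"session total: ${session_total:.2f}")  # session total: $134.10
print(f"vs subscription: {session_total / subscription:.2f}x")  # vs subscription: 1.34x
```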

stavros · yesterday at 1:48 PM

It's crazy. A few weeks ago the limits would comfortably last me all week; this week I've used up half the limit in a day.

tiahura · yesterday at 1:48 PM

Also on Pro Max 5x, and I hit the quota for the first time yesterday.

gavinray · yesterday at 2:12 PM

Codex is the only CLI I've had purely positive experiences with. Take that for what you will.

jedisct1 · yesterday at 1:31 PM

GPT-5.4 works amazingly well.

I’ve moved away from Claude and toward open-source models plus a ChatGPT subscription.

That setup has worked really well for me: the subscription is generous, the API is flexible, and it fits nicely into my workflow. GPT-5.4 + Swival (https://swival.dev) are now my daily drivers.

Traubenfuchs · today at 12:24 AM

Seems like the math ain't mathing for anything but Anthropic's pay-per-token API plan.

Try it out and you'll quickly see how much money they'd really like for your excessive usage.

jandrese · yesterday at 1:57 PM

I mean, this is expected, is it not? These companies burned unimaginable amounts of investor cash to get set up, and now they have to start turning a profit. They can't make up the difference with volume because the costs are high, so the only option is to raise prices.

x86hacker1010 · yesterday at 4:09 PM

I'm sorry, but I finally have to cancel; it's gotten abysmal.

dboreham · yesterday at 3:37 PM

Random data point: I beat on Claude pretty much every day and have never run into limits of any kind.

TheRealPomax · yesterday at 2:40 PM

And in classic Anthropic fashion at this point, their issue tracker appears to be just for show. No one triages the issues, and no one responds to them.

desireco42 · yesterday at 2:34 PM

I don't use Claude, so this doesn't affect me directly, but I worry it will spoil the fun for me, for the following reason.

Their tools have burned inflated amounts of tokens from day one; remember all the pointless research and reports Claude always wanted to do, no matter what you asked it. Other tools are much smarter about this, so it's not such a big deal there.

More importantly, these moves tend to reverberate through the industry, so I expect others will clamp down on usage too, and that will spoil my joy of using AI without counting every token.

Burning tokens doesn't just waste your allotment; it also wastes your time. This gave rise to the turbo offering, where you get responses faster but burn 2x the tokens.

nprateem · yesterday at 2:34 PM

I've seen ridiculously fast quota usage on Antigravity too: sometimes lots of work is possible, then it all goes within literally 4 questions.

Probably a combination of it being vibe-coded shit and something in the backend, I expect.

lvl155 · yesterday at 1:42 PM

Constant complaints about Anthropic, not much about OAI/Codex. It seems people should just use OAI and come back when they realize compute isn't free elsewhere either.

Rekindle8090 · yesterday at 2:52 PM

I put this in a reply, but I'm also posting it as a general comment:

Please unsubscribe from these services and see how they perform:

"Maybe if I spend more money on the Max plan it will be better" -> no, it will be the same.
"Maybe if I change my prompt it will work" -> no, it will be the same.
"Maybe if I try it via this API instead of that API it will improve" -> no, it will be the same.

Claude, ChatGPT, Gemini: all of these SOTA models are carefully trained, with platforms carefully designed to get you to pay more for "better" output, or to try different things instead of switching to a different product.

It's to keep you in the ecosystem and keep you exploring. There's a reason you can't see the layers upon layers of scaffolding they have. And there's a reason why, two weeks after a major update, the model is suddenly "bad" and "frustrating". It's the same reason it's done with A/B testing: when you complain, someone else has no issues; when they complain, you have no issues. It muddies the water intentionally.

None of this is because you're doing anything wrong. It's not a skill issue; it's a careful strategy to extract as much engagement and money from customers as possible. It's the same reason Call of Duty gives people who buy new gun skins easier matchmaking for their first couple of games.

Stop paying more; stop buying these Pro Max plans hoping it will get better. It won't; that's not what makes them money. Making people angry, wasting their time while others have no issues, and keeping them exploring and trying different things for longer (so they can show investors how long people use these AI tools) is what makes them money.

When competitors have a better product, these issues go away. When a new model is released, these issues don't exist.

I was paying a ton of money for Claude. Once I cancelled my subscription entirely, suddenly Sonnet 4.6 performs like Opus, and I no longer have prompts eating 10% of my quota in one message despite being the same complexity.

holoduke · yesterday at 1:45 PM

I spend the full 20x weekly quota in less than 10 hours. How is that possible? Well, try mass-translating texts into 30 languages and you'll hit the limits extremely quickly.
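The fan-out is easy to underestimate: every source text produces output in every target language, so output tokens (the expensive kind) scale linearly with the language count. A back-of-the-envelope sketch, where the text count and average length are purely assumed and only the 30-language figure comes from the comment:

```python
texts = 200          # source texts to translate (assumed)
out_tokens = 800     # average output tokens per translation (assumed)
languages = 30       # target languages, from the comment above

total_output = texts * out_tokens * languages
print(f"{total_output:,} output tokens")  # 4,800,000 output tokens
```

A modest-looking batch job quietly becomes millions of output tokens, which is exactly the kind of workload that empties a weekly quota in hours.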

rdevilla · yesterday at 1:34 PM

Bubble's bursting, get in.

bakugo · yesterday at 2:10 PM

This is your regular friendly reminder that these subscriptions do not entitle you to any specific amount of usage. That "5x" is utterly meaningless, because you don't know what it's 5x of.

This is by design, of course. Anyone who has been paying even the slightest bit of attention knows these subscriptions are not sustainable and prices will have to go up over time. Quietly reducing usage limits that were never specified in the first place is much easier than raising the prices of the individual subscription tiers, with the same effect.

If you want to know what kind of prices you'll be paying to fuel your vibe-coding addiction in a few years, try API pricing for a bit, and try not to cry when your $100 credit is gone in 2 days.

mannanj · yesterday at 1:34 PM

So basically, the Anthropic employee who responded says those 1-hour cache writes were almost never read back, so a silent switch to 5-minute caches is in our best interest and saves cost (justifying why they made the change silently).

However, that response gaslights us, because the math in the OP's opening post demonstrates this is not true: it shows reads at 26x writes, so at least in his case the cache is not behaving the way the Anthropic employee describes.

Clearly we are being charged for less optimization here, and the message (from my perspective, coming from Anthropic) is: if you're in a special situation, your needs don't matter, and we will close your thread without really listening.
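Whether the silent 1-hour-to-5-minute switch saves anyone money hinges on the read/write ratio and on how often the shorter cache expires mid-session. Using Anthropic's published caching multipliers as assumptions (5-minute writes at 1.25x the base input rate, 1-hour writes at 2x, reads at 0.1x), a quick sketch with the roughly 26:1 read-to-write ratio reported here:

```python
# Assumed multipliers from Anthropic's prompt-caching price list.
# All costs are in units of the base input price per token.
WRITE_5M, WRITE_1H, READ = 1.25, 2.00, 0.10

def session_cost(write_tokens, read_tokens, write_mult, n_writes=1):
    """Relative cost of a session that re-writes the cache n_writes times."""
    return n_writes * write_tokens * write_mult + read_tokens * READ

# Reported ratio: reads ~26x writes.
w, r = 1.0, 26.0
one_hour = session_cost(w, r, WRITE_1H)                # one long-lived write
five_min_1 = session_cost(w, r, WRITE_5M)              # 5m cache, never expires
five_min_2 = session_cost(w, r, WRITE_5M, n_writes=2)  # one mid-session expiry

print(round(one_hour, 2), round(five_min_1, 2), round(five_min_2, 2))
```

On this toy model the 5-minute cache is cheaper only while it never expires mid-session; a single expiry-and-rewrite already makes it more expensive than the 1-hour cache, which is why the read/write ratio the OP measured matters.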

