Claude with Sonnet medium effort just used 100% of my session limit, some extra dollars, thought for 53 minutes, and said:
API Error: Claude's response exceeded the 32000 output token maximum. To configure this behavior, set the CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable.
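For anyone wondering what "set the environment variable" looks like in practice, here is a minimal Python sketch that builds an environment with `CLAUDE_CODE_MAX_OUTPUT_TOKENS` set before launching the tool. The helper name `env_with_output_limit` and the value 64000 are assumptions for illustration only; whether a given model or plan accepts a higher limit is not something the error message confirms.

```python
import os


def env_with_output_limit(max_tokens, base_env=None):
    """Return a copy of the environment with CLAUDE_CODE_MAX_OUTPUT_TOKENS set.

    `env_with_output_limit` is a hypothetical helper, not part of any Claude tooling.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    # Copy so the caller's (or the process's) environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    # Environment variable values must be strings.
    env["CLAUDE_CODE_MAX_OUTPUT_TOKENS"] = str(max_tokens)
    return env


# Assumed value; pick whatever limit your setup actually supports.
env = env_with_output_limit(64000)
```

The resulting `env` dict can be passed to `subprocess.run(..., env=env)` when starting the tool, which keeps the change scoped to that one process instead of your whole shell.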
I don't think I'd let it think for more than 5 minutes without killing the process.
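A hard cap like that can be enforced mechanically. This is a rough sketch, assuming the agent is started as an ordinary child process; `run_with_deadline` is a made-up name, and the command you pass in is whatever you normally use to launch the tool.

```python
import subprocess


def run_with_deadline(cmd, timeout_s):
    """Run cmd and kill it if it runs longer than timeout_s seconds.

    Returns the exit code, or None if the deadline was hit.
    `run_with_deadline` is a hypothetical helper for illustration.
    """
    try:
        # subprocess.run kills the child and waits for it when the timeout expires.
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None
```

With a 5 minute budget that would be `run_with_deadline(your_command, 300)`, where `your_command` is a list of arguments. A `None` result means the process was killed at the deadline.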
I hope this doesn't come out wrong, but when this happens, do agentic/vibe coders message their boss and say "sorry, can't work until tomorrow"?
Just copy and paste the error back to Claude and you will be able to continue. I have seen this many times over the past few months. I thought it was related to AWS Bedrock, which I have been using, but probably not.
Just curious, what version of Max are you on: 5x or 20x?
You're using it within their high-usage window. In case you weren't aware, if you use it outside that window it's supposed to use less. Still, it does seem a little odd that Sonnet uses so much, even on Medium.
And on the seventh day, API Error: Claude's response exceeded the 32000 output token maximum