Yeah, my colleague recently said "hey I've burnt through $200 in Claude in 3 days". And he was prompting. Max 8hrs/day Imagine what would happen if AI was prompting.
As I like this allegory really much, AI is (or should be) like and exoskeleton, should help people do things. If you step out of your car putting it first in drive mode, and going to sleep, next day it will be farther, but the question is, is it still on road
Burnt through 4 Max x20 in a week here. Throughput isn't the bottleneck anymore. Review quality is. The 1-in-5 error rate in this thread matches my experience. More agents overnight just means more review tomorrow morning.
What moved the needle: capturing architectural context (ADRs, structured system prompts, skill files) that agents reference before making changes. Each session builds on prior decisions. The agent improves because the context compounds. Better context beat more parallelism every time.