I use Claude Code a lot, but I don't understand these "I'm doing 10X the work" comments.
I spend a lot of time reviewing any code that comes out of Claude Code. Even using Opus 4.6 with max effort, there is almost always something that needs to be changed, often dramatically.
I can see how people go down the path of thinking "Wow, this code compiles and passes my tests! Ship it!" and start handing trust over to Opus, but I've already seen what this turns into 6 months down the road: projects get mired in so much complexity and LLM spaghetti that the codebase becomes fragile. Everyone is sidetracked restructuring messy code from the past, then fighting the bugs those changes introduce.
I can believe some of the more recent studies showing LLMs can accelerate work by circa 20% (1.2X) because that's in the same range that I and others are seeing with careful use.
When someone comes out and claims 10X more output, I simply cannot believe they're doing careful engineering work instead of just shipping the output after a cursory glance.
That's part of why I don't get AI for directly writing code at all. If I am going to be reviewing anything that comes out of it (and I will) then I might as well just write it myself. It's easier and faster, although writing it myself does also make it easier to fall victim to my own blind spots.
I find that it's relative to the amount of planning time you spend... I feel like I get around 5x the output using Claude Code w/ Opus compared to what I would get done by myself... That said, I'm probably spending about 3x as much time planning as I would when just straight coding for/by myself. And that's generally the difference.
I can use the agent to scaffold a lot of test/demo frameworks around the pieces I'm working on pretty cleanly and have the agent fill them in. I still spend a lot of time validating the tests and the code being completed though.
The errors I tend to get from the agent are roughly similar to what I might see from a developer/team that works remotely... you still need to verify. The difference is that the turnaround seems to be minutes rather than days. You're also able to observe the work as it happens rather than simply review it afterward... When I see a bad path, I can usually abort/cancel, revert to the last commit and try again with more planning.