Now that code is cheap, I ensured my side project has unit/integration tests (will enforce 100% coverage), Playwright tests, static typing (it's in Python), and scripts for all tasks. Will learn mutation testing too (yes, it's overkill). Now my agent works in loops for up to 1 hour and emits concise code I don't have to edit much.
Totally get it, and I think we’re describing the same control loop from different angles.
Where I differ slightly is: “100% coverage” can turn into productivity theatre. It’s a metric that’s easy to optimize while missing the thing you actually care about: do we have machine-checkable invariants at the points where drift is expensive?
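To make that concrete, here is a minimal sketch in Python (since that's what your project uses). Everything in it is made up for illustration: `split_cents` and both tests are hypothetical, not code from my system. The first test gets the function to 100% line coverage while asserting nothing; the second checks the invariant that actually matters when you split money.

```python
def split_cents(total_cents: int, parts: int) -> list[int]:
    """Split an amount into `parts` shares that differ by at most one cent."""
    base, remainder = divmod(total_cents, parts)
    # The first `remainder` shares each absorb one leftover cent.
    return [base + 1 if i < remainder else base for i in range(parts)]


def test_coverage_only() -> None:
    # Executes every line of split_cents, so line coverage is 100%,
    # but a wrong result would still pass because nothing is asserted.
    split_cents(1000, 3)


def test_invariant() -> None:
    # Checks the properties that matter: no cent is created or lost,
    # and the shares are as even as possible.
    for total in range(0, 500):
        for parts in range(1, 8):
            shares = split_cents(total, parts)
            assert len(shares) == parts
            assert sum(shares) == total
            assert max(shares) - min(shares) <= 1
```

If an agent "refactors" `split_cents` into plain integer division and drops the remainder, the coverage number stays at 100% and only the second test fails.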
The harness that’s paid off for me (on a live payments system) is:
Then refactors become routine, because the tests will make breakage explicit.
So yes: “code is cheap” -> increase verification. Just be careful not to replace engineering judgement with an easily gamed proxy.