Tests can't tell you if the design of the code is fit for purpose, or about requirements you completely missed or punted on, or that a core new piece that's going to be built upon next is barely-coherent, poorly-performing slop that "works" but is going to need to be actually designed while being rewritten by the next person instead, or that you skipped trying to understand how the feature should work or thinking about the performance characteristics of the solution before you started and just let the LLM drive, so you never designed anything, arriving at something which "works" on your machine and passes the tests which were generated for it, but will hammer production under production loads. Neither will running it on your own machine or in Dev.
No amount of telling the LLM to "Dig up! Make no mistakes!" will help with non-designed slop code actively poisoning the context, but you have to admire the attempt when you see comments added while removing code, referring to the code that's being removed.
It's weird to see tickets now effectively go from "ready for PR" to 0% progress, but at least you're helping that person meet whatever the secret AI* usage quota is for their performance review this year.
> Tests can't tell you if the design of the code is fit for purpose, or about requirements you completely missed or punted on
This is what acceptance tests are for. Does it do the thing you wanted it to do? Design a test that makes it do the thing, and check the result matches what you expect. If it's not in the test, don't expect it to work anywhere else. Obviously this isn't easy, but that's why we either need a different design or different tests. Before, that would have been a tremendous amount of work, but now it's not.
(Making this work requires learning how to make it work right. This is a skill with brand-new techniques which 99.999% of people will need over a year to learn)
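To make the acceptance-test point concrete, here is a rough sketch in Python. The feature (`apply_discount`), its signature, and the discount rules are all made up for illustration; the point is only that each requirement you care about, including the easy-to-miss ones like expiry, gets its own test.

```python
import unittest
from datetime import date


# Hypothetical feature under test: name, signature, and rules are assumptions.
def apply_discount(total_cents: int, code: str, today: date) -> int:
    """Return the total in cents after applying a discount code."""
    codes = {"SPRING10": (10, date(2025, 6, 30))}  # code -> (percent, last valid day)
    if code not in codes:
        return total_cents
    percent, last_valid_day = codes[code]
    if today > last_valid_day:
        return total_cents
    return total_cents - total_cents * percent // 100


class DiscountAcceptanceTest(unittest.TestCase):
    """Each test states one requirement in terms of observable behaviour."""

    def test_valid_code_reduces_total(self):
        self.assertEqual(apply_discount(2000, "SPRING10", date(2025, 6, 1)), 1800)

    def test_expired_code_is_ignored(self):
        # The kind of requirement that gets missed if nobody writes it down.
        self.assertEqual(apply_discount(2000, "SPRING10", date(2025, 7, 1)), 2000)

    def test_unknown_code_is_ignored(self):
        self.assertEqual(apply_discount(2000, "NOPE", date(2025, 6, 1)), 2000)
```

Saved as a module, this runs with `python -m unittest`. If the expiry test did not exist, nothing here would tell you whether expired codes still work.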
> or that a core new piece that's going to be built upon next is barely-coherent, poorly-performing slop that "works" but is going to need to be actually designed while being rewritten by the next person instead
This is the "human" part I mentioned being irrelevant now. AI does not care if the code is slop or maintainable. AI can just rewrite the entire thing in an hour. And if the tests pass, it doesn't matter either. Take the human out of the loop.
(Concerned about it "rewriting tests" to pass them? You need independent agents, quality gates, determinism, feedback loops, etc. New skills and methods designed to keep the AI on the rails, like a psychotic idiot savant that can build a spaceship if you can keep it from setting fire to it)
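One possible shape for a quality gate against rewritten tests, as a minimal sketch: hash the test files before the coding agent runs, hash them again afterwards, and fail the gate if anything changed. The function names and the `*.py` file pattern are assumptions for illustration, not part of any particular tool.

```python
import hashlib
from pathlib import Path


def snapshot(test_dir: Path) -> dict[str, str]:
    """Map each test file's relative path to a SHA-256 digest of its contents."""
    return {
        str(p.relative_to(test_dir)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(test_dir.rglob("*.py"))
        if p.is_file()
    }


def changed_tests(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return paths that were added, removed, or modified between two snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

Usage would be something like taking `before = snapshot(Path("tests"))`, letting the implementing agent run, then rejecting the change if `changed_tests(before, snapshot(Path("tests")))` is non-empty. A separate agent or a human would own the tests.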
> or that you skipped trying to understand how the feature should work or thinking about the performance characteristics of the solution before you started and just let the LLM drive, so you never designed anything
This is not how AI-driven coding works. You have to give the AI very specific design instructions. If you do it right, it will make what you want. Sadly, this means most programmers today will be irrelevant because they can't design their way out of a wet paper bag.
(You know how agile eschews planning and documentation, telling developers and product people to just build "whatever works right now" and keep rewriting it indefinitely as they meet blockers they never planned for? AI now encourages the planning and documentation.)