Tests are vanity in agentic engineering
They do nothing to keep an AI on track in comparison to the aspects that simulate a product manager
And the AI just will correct the test when it fails as opposed to correct the code, because the code didn't miss anything the specification changed
My protip: just write tickets or have the AI write those too. that and the commits and the PRs will function as the AI’s memory better than any client side markdown file masquerading as a soul
My agent / skill files always tell it to trust neither the code or the test and to reason about the test failure which seems to work pretty well.
In another project without my rules I’ve noticed I have to tell it to set up data for playwright tests instead of skipping if none exists.