That's part of the reason I like red/green TDD - you make the agent show that the test fails before the implementation exists and passes once it's written.
It can still cheat, but it's less likely to.
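
Here's a rough sketch of that check in Python, in case you'd rather verify it yourself than trust the agent's transcript. The names (`slugify`, `slugify_stub`, `red_green`) are made up for illustration and aren't from any particular tool; it only assumes the standard-library `unittest` module.

```python
import unittest


def slugify_stub(text):
    # Red phase: the function exists but has no implementation yet.
    raise NotImplementedError


def slugify(text):
    # Green phase: the implementation the agent wrote (hypothetical example).
    return "-".join(text.lower().split())


def make_suite(func):
    # Build the same test against whichever version of the function we pass in.
    class SlugifyTest(unittest.TestCase):
        def test_spaces_become_hyphens(self):
            self.assertEqual(func("Hello World"), "hello-world")

    return unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)


def passes(func):
    # Run the suite quietly and report whether every test succeeded.
    result = unittest.TestResult()
    make_suite(func).run(result)
    return result.wasSuccessful()


def red_green(stub, impl):
    red = not passes(stub)  # the test must fail before the implementation
    green = passes(impl)    # and pass afterwards
    return red and green


if __name__ == "__main__":
    print(red_green(slugify_stub, slugify))
```

This only rules out a test that passes no matter what. It won't catch an implementation that hard-codes the expected value, so it narrows the room for cheating without removing it.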