Hacker News

8note, yesterday at 8:37 PM

outsourcing testing to the AI also gets its code connected to deterministic results, and lets the agent interact with the code to speculate about what it should do and check those expectations against the actual code.

it could still speculate wrong things, but it won't speculate that the code is supposed to crash on the first line
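
a rough sketch of the idea, not anyone's actual setup: `parse_price` is a made-up function standing in for the code under test, and `check_expectation` is a hypothetical helper. the agent writes each guess down as an expectation, runs it against the real code, and gets a deterministic pass/fail back.

```python
def parse_price(text: str) -> float:
    """Hypothetical code under test: turn a string like '$1,234.50' into a float."""
    return float(text.replace("$", "").replace(",", ""))


def check_expectation(description, func, args, expected):
    """Run one speculated expectation against the real code and report the outcome."""
    try:
        actual = func(*args)
    except Exception as exc:  # a crash is a deterministic result too
        return {"description": description, "ok": False, "actual": repr(exc)}
    return {"description": description, "ok": actual == expected, "actual": actual}


# Guesses about behavior (all illustrative); the last one is deliberately wrong.
expectations = [
    ("strips the dollar sign", ("$5",), 5.0),
    ("handles thousands separators", ("$1,234.50",), 1234.5),
    ("wrong guess: empty string is zero", ("",), 0.0),
]

results = [check_expectation(d, parse_price, a, e) for d, a, e in expectations]
for r in results:
    print(r)
```

the third guess is wrong, and the run says so: the real code raises a `ValueError` on an empty string, so the speculation gets corrected by the code instead of standing unchecked.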