This is a solved problem; plenty of tools already do this. Can't believe YC would fund this in 2026.
Try the OSS alternative - https://github.com/vostride/agent-qa