I built something very similar for my company internally. The idea was that that the maintenance of the code is on the agent and the code is purely an optimization. If it breaks the agent runs it iteratively, fixes the code for next time. Happy to replace my tool with this and see how it does!
Super cool! Please let me know how it goes. Since agents are so good at writing code, we think letting the agent rewrite/test the code on failure is better than just using a prompt at runtime