I'm working on something a little similar but mines more a dev tool vs process automation but I love where yours is headed. The biggest issue I've run into is handling retries with agents. My current solution is I have them set checkpoints so they can revert easily and when they can't make an edit or they can't get a test passing, they just restart from earlier state. Problem is this uses up lots of tokens on retries how did you handle this issue in your app?
Generally I've found agents are capable of self correcting as long as they can bash up against a guardrail and see the errors. So in optio the agent is resumed and told to fix any CI failures or fix review feedback.