Like many of you, I've been juggling several Claude Code/Codex sessions at a time, managing multiple lines of work and worktrees across multiple repos. I wanted a way to manage those streams more easily and reduce the amount of input I need to give, letting the agents remove me as a bottleneck from as much of the process as possible. So I built an orchestration tool for AI coding agents:
Optio is an open-source orchestration system that turns tickets into merged pull requests using AI coding agents. You point it at your repos, and it handles the full lifecycle:
- Intake — pull tasks from GitHub Issues, Linear, or create them manually
- Execution — spin up isolated K8s pods per repo, run Claude Code or Codex in git worktrees
- PR monitoring — watch CI checks, review status, and merge readiness every 30s
- Self-healing — auto-resume the agent on CI failures, merge conflicts, or reviewer change requests
- Completion — squash-merge the PR and close the linked issue
The key idea is the feedback loop. Optio doesn't just run an agent and walk away — when CI breaks, it feeds the failure back to the agent. When a reviewer requests changes, the comments become the agent's next prompt. It keeps going until the PR merges or you tell it to stop.
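The loop described above could be sketched roughly like this (a hypothetical illustration in TypeScript; the names and types are mine, not Optio's actual API):

```typescript
// Sketch of a PR feedback loop: map the PR's latest status to the
// agent's next prompt. Illustrative only, assuming a simple status model.
type PrState =
  | "ci_failed"
  | "changes_requested"
  | "merge_conflict"
  | "mergeable"
  | "pending";

interface PrStatus {
  state: PrState;
  feedback: string; // CI logs, review comments, or conflict details
}

// Returns the prompt to resume the agent with, or null if there is
// nothing to feed back (PR is mergeable or checks are still running).
function nextPrompt(status: PrStatus): string | null {
  switch (status.state) {
    case "ci_failed":
      return `CI failed. Fix the following:\n${status.feedback}`;
    case "changes_requested":
      return `A reviewer requested changes:\n${status.feedback}`;
    case "merge_conflict":
      return `Resolve this merge conflict:\n${status.feedback}`;
    default:
      return null;
  }
}
```

The poller would call something like this every cycle and resume the agent whenever it gets a non-null prompt back.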
Built with Fastify, Next.js, BullMQ, and Drizzle on Postgres. Ships with a Helm chart for production deployment.
FWIW, a "cheaper" version of this is triggering Claude via GitHub Actions and `@claude`-ing your agents that way. If you run your CI on Kubernetes (ARC), it ends up looking pretty much the same.
These kinds of systems don't really work in practice without a human between tasks to keep things on track.
The parallel execution model makes sense for independent tickets but I'm wondering what happens when agent A is halfway through a PR touching shared/utils.py and agent B gets assigned a ticket that needs the same file. Does the orchestrator do any upfront dependency analysis to detect that, or do you just let them both run and deal with the conflict at merge time?
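One cheap form of the upfront check being asked about could be a file-overlap test before assigning a ticket (a sketch under my own assumptions; nothing in the post says Optio does this):

```typescript
// Hypothetical pre-assignment check: given the files each in-flight PR
// touches, flag a new ticket whose predicted file set overlaps one of them.
function overlappingFiles(
  activePrFiles: Map<string, Set<string>>, // PR id -> files it touches
  predictedFiles: Set<string>              // files the new ticket will likely touch
): Map<string, string[]> {
  const conflicts = new Map<string, string[]>();
  for (const [prId, files] of activePrFiles) {
    const shared = [...predictedFiles].filter((f) => files.has(f));
    if (shared.length > 0) conflicts.set(prId, shared);
  }
  return conflicts; // empty map => safe to run the ticket in parallel
}
```

An orchestrator could queue (rather than start) any ticket that comes back with a non-empty conflict map, instead of resolving the collision at merge time.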
Looks cool, congrats on the launch. Is there any sandbox isolation from the k8s platform layer? Wondering if this is suitable for multiple tenants or customers.
Hey @jawiggins, would you consider using https://github.com/redwoodjs/agent-ci?
I'm working on something a little similar, though mine's more of a dev tool vs. process automation, and I love where yours is headed. The biggest issue I've run into is handling retries with agents. My current solution is to have them set checkpoints so they can revert easily: when they can't make an edit or can't get a test passing, they just restart from an earlier state. The problem is this burns a lot of tokens on retries. How did you handle this in your app?
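For what it's worth, one way to cap the token cost of checkpoint retries is a hard retry budget (a minimal sketch with hypothetical names, not code from either tool):

```typescript
// Hypothetical token budget for checkpoint-based retries: the caller
// records tokens spent per attempt and stops retrying once the budget
// is exhausted, rather than reverting and retrying indefinitely.
class RetryBudget {
  private spent = 0;

  constructor(private readonly maxTokens: number) {}

  // Record usage for one attempt; returns false once the budget is
  // blown, signaling the caller to escalate to a human instead.
  charge(tokens: number): boolean {
    this.spent += tokens;
    return this.spent <= this.maxTokens;
  }

  remaining(): number {
    return Math.max(0, this.maxTokens - this.spent);
  }
}
```

The loop then becomes "revert to checkpoint and retry while `charge()` is true," which bounds the worst case instead of letting a stuck agent spin.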
Hot take: You should want to review your agents' output and progress.
I wonder, based on your experience, how hard would it be to improve your system to have an AI agent review the software and suggest tickets?
Like, can an AI agent use a browser, attempt to use the software, find bugs and create a ticket? Can an AI agent use a browser, try to use the software and suggest new features?
Is the pod per repo or per task?
Why K8s? Is there a way I could run it without it?
What’s the most complicated, finished project you’ve done with this?
And what stops it from making total garbage that wrecks your codebase?
I love k8s, but having it as a requirement for my agent setup is a non-starter. Kubernetes is one way to run this, not the centerpiece.
The misaligned columns in the Claude-made ASCII diagrams in the README really throw me off. Why not fix them?
I’ve come to the realization that these kinds of systems don’t work, and that a human in the loop is crucial for task planning; the LLM’s role is to identify issues, communicate the design/architecture, etc. before the work is handed off. Otherwise the LLM always ends up doing not quite the right thing.
How is that part tackled when all you have is GH issues? Doesn’t this only work for the most trivial issues?