Agent-to-agent pair programming

121 points • by axldelafosse • today at 1:47 AM • 57 comments

Comments

I think they're trying to implement every management fad with AI agents and see if it improves performance.

Personally, I have tried pair programming, and it hasn't really felt like something that works, for various reasons. The main one is that I (and my partner) have complex thought processes in our heads that are difficult and cumbersome to articulate, so to an onlooker it looks like I'm randomly changing code.

yesensm • today at 8:04 AM

I’m curious whether anyone has measured this systematically. Right now most of the evidence for multi-agent setups still feels anecdotal.

dgb23 • today at 10:40 AM

If this approach turns out to be valuable, it's unlikely that it has anything to do with having multiple actual agents, but rather that it's valuable to have two configurations (system prompt, model, temp, context pruning, toolset etc.) inside the same agent that are swapped back and forth.
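
To make the "two configurations, one agent" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `AgentConfig`, `run_swapping`, and the `complete` callback are hypothetical names, and `fake_complete` stands in for a real model call. The point it shows is that both configurations read and extend the same transcript, and only the settings change from turn to turn.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class AgentConfig:
    """One configuration of the agent (all fields are illustrative)."""
    name: str
    system_prompt: str
    model: str
    temperature: float
    tools: Tuple[str, ...] = ()


# Hypothetical completion callback: (config, transcript) -> reply text.
Complete = Callable[[AgentConfig, List[str]], str]


def run_swapping(configs: Sequence[AgentConfig], complete: Complete,
                 task: str, turns: int) -> List[str]:
    """Alternate configurations over one shared transcript."""
    transcript = [f"task: {task}"]
    for turn in range(turns):
        # Same loop, same context; only the active configuration rotates.
        config = configs[turn % len(configs)]
        reply = complete(config, transcript)
        transcript.append(f"{config.name}: {reply}")
    return transcript


def fake_complete(config: AgentConfig, transcript: List[str]) -> str:
    """Stand-in for a model call so the sketch runs offline."""
    return f"{config.model}@{config.temperature} saw {len(transcript)} messages"


if __name__ == "__main__":
    writer = AgentConfig("writer", "Write the code.", "model-a", 0.8, ("edit",))
    reviewer = AgentConfig("reviewer", "Find problems.", "model-b", 0.0, ("read",))
    for line in run_swapping([writer, reviewer], fake_complete, "add a cache", 4):
        print(line)
```

Whether the second configuration runs in a separate process or in the same loop is then an implementation detail; the sketch only assumes that the transcript is shared.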

edf13 • today at 6:49 AM

Nice - I do something similar in a semi manual way.

I do find Codex very good at reviewing work marked as completed by Claude, especially when I get Claude to write up its work with a why, where & how doc.

It’s very rare Claude has fully completed the task successfully and Codex doesn’t find issues.

ramon156 • today at 2:21 PM

I've always wondered what it would be like if we reversed the roles. I remember people claiming they had gotten better results if an agent started asking the questions.

What if we had an agent-to-agent network that contacted a human as a source of truth whenever it needed one? Keep a list of employees who are experts in a given skill, then let each of them answer 1-2 questions.
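
As a rough sketch of that expert list, assuming a simple in-memory directory: `Expert`, `ExpertDirectory`, and the budget of two questions per person are all hypothetical choices made for illustration, not part of any existing tool. An agent asks for someone with a given skill and gets back a person who still has budget left, or `None` if it has to proceed on its own.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class Expert:
    name: str
    skills: Set[str]
    questions_left: int = 2  # assumed cap: each human answers at most 2 questions


class ExpertDirectory:
    """Routes an agent's question to a human who knows the skill."""

    def __init__(self, experts: Iterable[Expert]) -> None:
        self.experts: List[Expert] = list(experts)

    def route(self, skill: str) -> Optional[Expert]:
        candidates = [e for e in self.experts
                      if skill in e.skills and e.questions_left > 0]
        if not candidates:
            return None  # nobody available: the agent must decide by itself
        # Prefer whoever has been interrupted least so far.
        chosen = max(candidates, key=lambda e: e.questions_left)
        chosen.questions_left -= 1
        return chosen


if __name__ == "__main__":
    directory = ExpertDirectory([
        Expert("ana", {"sql", "billing"}),
        Expert("bo", {"sql"}),
    ])
    for _ in range(5):
        expert = directory.route("sql")
        print(expert.name if expert else "no expert available")
```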

Or are we speeding up our replacement like this?

sibtain1997 • today at 11:51 AM

The PLAN.md question is the one worth pulling on. Once the plan lives in git or the PR it's already downstream of intent and whoever defined what to build has already handed off. The harder problem is giving agents access to the original intent, not just the implementation plan derived from it. When there's drift between what was planned and what got built, a git-resident PLAN.md makes it hard to trace back to why the decision was made in the first place.

cadamsdotcom • today at 3:56 AM

The vibes are great. But there’s a need for more science on this multi-agent thing.

alienreborn • today at 3:14 AM

I have been trying a similar setup since last week using https://rjcorwin.github.io/cook/

divan • today at 10:04 AM

You can also create a skill for reviewing (which calls gemini/codex as a command line tool) and set instructions on how and when to use it. Very flexible.

etothet • today at 1:58 PM

"Letting the agents loop can result in more changes than expected, which are usually welcome..."

If "more changes than expected" means "out of scope", then I disagree. Those types of changes are exactly one of the things that's best to avoid whether code is being written by a person or an LLM.

rsafaya • today at 9:15 AM

I think the A2A space is wide open. Great to see this approach using App Server and Channels. I tried building something similar (at a high level) for a more B2C use case, for OpenClaw users: https://github.com/agentlink-dev/agentlink. Currently I think the major Agents have not fully owned the "wake the Agent" use case. Regardless, this is a very cool approach. All the best.

woadwarrior01 • today at 2:00 PM

This is very reminiscent of the review-loop Claude Code plugin.

https://github.com/hamelsmu/claude-review-loop

vessenes • today at 3:11 AM

I prefer claude for generation / creativity, codex for bull-headed, accurate complaining and audit. Very rarely claude just doesn't "get it" and it makes sense to have codex edit directly. But generally I think codex is happiest and best used complaining.

ph4rsikal • today at 5:49 AM

JDS wrote about this https://jdsemrau.substack.com/p/pair-programming-superbill-w...

shreyssh • today at 7:41 AM

This is interesting for code, but I'm curious about agent-to-agent coordination for ops tasks — like one agent detecting a database anomaly and another auto-remediating it

bradfox2 • today at 4:05 AM

Multi-turn review, with code written by cc and reviewed by codex, works pretty well. It's been one of the only ways to deliver larger-scoped features without constant bugs. I've seen them do 10-15 rounds of fix and review until complete.

Also implemented this as a gh action; it works well for a sentry to gh to auto-triage to fix-PR flow.
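
The fix-and-review rounds described above boil down to a bounded loop. This is only a sketch under assumptions: `fix_and_review`, `Writer`, and `Reviewer` are hypothetical names, the toy functions stand in for the two agents, and the cap of 15 simply mirrors the upper end of the rounds mentioned.

```python
from typing import Callable, List, Tuple

Writer = Callable[[str, List[str]], str]  # (code, issues) -> revised code
Reviewer = Callable[[str], List[str]]     # code -> issues; empty list means approved


def fix_and_review(code: str, write: Writer, review: Reviewer,
                   max_rounds: int = 15) -> Tuple[str, int, bool]:
    """Alternate review and fix until approval or the round cap.

    Returns (final code, number of fix rounds performed, approved?).
    """
    for fixes_done in range(max_rounds):
        issues = review(code)
        if not issues:
            return code, fixes_done, True
        code = write(code, issues)
    # Cap reached: one last review decides whether the result is acceptable.
    return code, max_rounds, not review(code)


def toy_review(code: str) -> List[str]:
    """Toy reviewer: complains while any TODO marker remains."""
    return ["unresolved TODO"] if "TODO" in code else []


def toy_write(code: str, issues: List[str]) -> str:
    """Toy writer: resolves one TODO per round."""
    return code.replace("TODO", "done", 1)


if __name__ == "__main__":
    print(fix_and_review("TODO TODO TODO", toy_write, toy_review))
    print(fix_and_review("TODO TODO TODO", toy_write, toy_review, max_rounds=2))
```

The cap matters in practice: without it, a reviewer that always finds something keeps the loop running indefinitely, so the caller gets the last attempt back with `approved` set to `False` instead.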

zombot • today at 4:26 PM

Is there a prize yet for the most absurd application of AI? Pair programming seems a fair first step in the quest for this holiest of grails. How about an agentic implementation of the House of AI Lords?

jedisct1 • today at 3:09 AM

I systematically use reviewer agents in Swival: https://swival.dev/pages/reviews.html

Even with the same model (--self-review), that makes a huge difference, and immediately highlights how bad the first iterations of an LLM output can be.

dude250711 • today at 10:20 AM

The circle of slop.
