It's more of a black box with Claude; at least with this you see the proof strategy and the mistakes the model makes when it decomposes the problem. Instead of Ralph looping you get something that is top-down. If models were smarter and context windows bigger, I'm sure complex tasks like this one would be simpler, but breaking it down into subagents and having a collective "we already tried this strategy and it backtracked" intelligence is a nice way to scope a limited context window to an independent subproblem.
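For concreteness, the shared "already tried" memory could be as simple as a failure log keyed by subgoal that each subagent consults before burning context on a dead end. A minimal sketch (all names here are made up, not from the project):

```python
from dataclasses import dataclass, field

@dataclass
class SharedMemory:
    """Collective record of strategies that already failed, keyed by subgoal.

    Hypothetical schema: `subgoal` is some stable string form of the goal,
    `strategy` a short description like "induction on n".
    """
    failed: dict[str, set[str]] = field(default_factory=dict)

    def record_failure(self, subgoal: str, strategy: str) -> None:
        self.failed.setdefault(subgoal, set()).add(strategy)

    def already_tried(self, subgoal: str, strategy: str) -> bool:
        return strategy in self.failed.get(subgoal, set())

# A subagent checks the shared memory before re-attempting a known dead end.
memory = SharedMemory()
memory.record_failure("n + 0 = n", "simp only []")
assert memory.already_tried("n + 0 = n", "simp only []")
assert not memory.already_tried("n + 0 = n", "induction n")
```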
The Lean angle here is really interesting: most multi-agent demos dodge hard verification, but tying each agent's output to Lean verification makes the feedback loop objective. Curious how you're handling goal-claim conflicts/duplication when two agents find competing tactic sequences for the same subgoal: do you keep both in memory with some ranking signal (time-to-verify, proof term size, etc.)?
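I'm imagining something like keeping every verified candidate and only ordering them by those signals, roughly (field names are my invention, not your schema):

```python
from dataclasses import dataclass

@dataclass
class CandidateProof:
    """A verified tactic sequence for a subgoal, plus ranking signals."""
    subgoal: str
    tactics: list[str]
    time_to_verify_s: float  # wall-clock time Lean took to check it
    proof_term_size: int     # size of the elaborated proof term

def rank_key(p: CandidateProof) -> tuple[float, int]:
    # Prefer proofs that verify fast, break ties on smaller proof terms;
    # both candidates stay in memory, this only orders them.
    return (p.time_to_verify_s, p.proof_term_size)

candidates = [
    CandidateProof("n + 0 = n", ["simp"], 0.42, 118),
    CandidateProof("n + 0 = n", ["induction n", "rfl", "simp_all"], 0.31, 64),
]
best = min(candidates, key=rank_key)
print(best.tactics)  # -> ['induction n', 'rfl', 'simp_all']
```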