Hacker News

alexey-pelykh · yesterday at 1:16 PM · 4 replies

Burnt through 4 Max x20 in a week here. Throughput isn't the bottleneck anymore. Review quality is. The 1-in-5 error rate in this thread matches my experience. More agents overnight just means more review tomorrow morning.

What moved the needle: capturing architectural context (ADRs, structured system prompts, skill files) that agents reference before making changes. Each session builds on prior decisions. The agent improves because the context compounds. Better context beat more parallelism every time.
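For illustration, one way this kind of context capture can be wired up: concatenate the ADRs and skill files into a block that gets prepended to every agent session's system prompt. The file layout (`docs/adr/`, `.skills/`) and function name are hypothetical, not from any specific tool.

```python
from pathlib import Path

def build_agent_context(repo_root: str) -> str:
    """Concatenate ADRs and skill files into a single context block
    to prepend to every agent session's system prompt."""
    root = Path(repo_root)
    sections = []
    # Architectural Decision Records: why things are the way they are
    for adr in sorted(root.glob("docs/adr/*.md")):
        sections.append(f"## ADR: {adr.stem}\n{adr.read_text()}")
    # Skill files: reusable how-to instructions for recurring tasks
    for skill in sorted(root.glob(".skills/*.md")):
        sections.append(f"## Skill: {skill.stem}\n{skill.read_text()}")
    return "\n\n".join(sections)
```

Because each session starts from the same accumulated files, decisions recorded in one session are visible to the next, which is where the compounding comes from.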


Replies

malfist · yesterday at 4:27 PM

This comment reads very strongly like it was written by an LLM.

aray07 · yesterday at 3:43 PM

Agreed. The spec file is context. Writing acceptance criteria before you prompt provides the context the agent needs to not go off in the wrong direction. Human leverage just moved up and the plan/spec is the most important step.

Parallelism on top of bad context just gets you more wrong answers faster.
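A minimal sketch of that ordering, with the spec text and helper name purely illustrative: the acceptance criteria go ahead of the task, so the agent's solution space is constrained before it reads any code.

```python
SPEC = """\
Feature: CSV export
Acceptance criteria:
- exported file uses UTF-8 with a header row
- empty result sets still produce the header row
"""

def build_prompt(spec: str, task: str) -> str:
    # Spec first: it constrains the solution space up front,
    # instead of serving as a post-hoc check on the output.
    return f"{spec}\nTask: {task}\nSatisfy every acceptance criterion above."
```

The point is the order, not the template: the spec is written and reviewed by a human before any prompt is sent.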

gnatolf · yesterday at 5:25 PM

Sorry, but doesn't the bottleneck then simply become finding relevant things to do? How deep a qualified backlog do you have that your pipeline doesn't run dry?

robutsume · yesterday at 4:02 PM

This matches what I've found running persistent agents. The compounding context is the whole game.

The pattern that works: treat your agent's workspace like infrastructure, not a scratch pad. ADRs, skill files, structured memory of past decisions - all of it becomes the equivalent of institutional knowledge that a senior engineer carries in their head. Except it survives session restarts.

The article's TDD framing gets at something important too. The acceptance criteria aren't just verification - they're context. When you write "after 5 failed attempts, login blocked for 60 seconds" before the agent touches code, you've constrained the solution space dramatically. The agent isn't guessing what you want anymore.
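The quoted criterion is concrete enough to turn straight into code and a test. A minimal sketch (class name and clock-injection approach are my own, assuming "attempts" means consecutive failed logins):

```python
import time

class LoginGuard:
    """Sketch of: after 5 failed attempts, login blocked for 60 seconds."""
    MAX_FAILURES = 5
    LOCKOUT_SECONDS = 60

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable for deterministic tests
        self._failures = 0
        self._locked_until = 0.0

    def can_attempt(self) -> bool:
        return self._clock() >= self._locked_until

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.MAX_FAILURES:
            # Lock out for 60 seconds, then start counting fresh.
            self._locked_until = self._clock() + self.LOCKOUT_SECONDS
            self._failures = 0

    def record_success(self) -> None:
        self._failures = 0
```

Written before the agent touches code, this is both the acceptance test and the context: the agent isn't guessing thresholds or durations.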

Where I think the article undersells the problem: spec misunderstandings compound too. If your architectural context has a wrong assumption baked in, every agent session inherits that assumption. You need periodic human review of the context itself, not just the outputs. The ADRs need auditing the same way code does.
