This matches my experience, especially "don’t draw the owl" and the harness-engineering id...

EastLondonCoder • yesterday at 9:07 PM • 3 replies • view on HN

This matches my experience, especially "don’t draw the owl" and the harness-engineering idea.

The failure mode I kept hitting wasn’t just "it makes mistakes", it was drift: it can stay locally plausible while slowly walking away from the real constraints of the repo. The output still sounds confident, so you don’t notice until you run into reality (tests, runtime behaviour, perf, ops, UX).

What ended up working for me was treating chat as where I shape the plan (tradeoffs, invariants, failure modes) and treating the agent as something that does narrow, reviewable diffs against that plan. The human job stays very boring: run it, verify it, and decide what’s actually acceptable. That separation is what made it click for me.

Once I got that loop stable, it stopped being a toy and started being a lever. I’ve shipped real features this way across a few projects (a git like tool for heavy media projects, a ticketing/payment flow with real users, a local-first genealogy tool, and a small CMS/publishing pipeline). The common thread is the same: small diffs, fast verification, and continuously tightening the harness so the agent can’t drift unnoticed.

Replies

protocolture • today at 12:45 AM

>The failure mode I kept hitting wasn’t just "it makes mistakes", it was drift: it can stay locally plausible while slowly walking away from the real constraints of the repo. The output still sounds confident, so you don’t notice until you run into reality (tests, runtime behaviour, perf, ops, UX).

Yeah I would get patterns where, initial prototypes were promising, then we developed something that was 90% close to design goals, and then as we try to push in the last 10%, drift would start breaking down, or even just forgetting, the 90%.

So I would start getting to 90% and basically starting a new project with that as the baseline to add to.

ricardobeat • yesterday at 11:57 PM

No harm meant, but your writing is very reminiscent of an LLM. It is great actually, there is just something about it - "it wasn't.. it was", "it stopped being.. and started". Claude and ChatGPT seem to love these juxtapositions. The triplets on every other sentence. I think you are a couple em-dashes away from being accused of being a bot.

These patterns seem to be picking up speed in the general population; makes the human race seem quite easily hackable.

bdangubic • yesterday at 9:20 PM

This is the most common answer from people that are rocking and rolling with AI tools but I cannot help but wonder how is this different from how we should have built software all along. I know I have been (after 10+ years…)

➕ show 1 reply

alt Hacker News

Replies