logoalt Hacker News

ghshephardyesterday at 11:27 PM3 repliesview on HN

I think many of the people who have become "AI Pilled" (I'll include myself here) had it happen in the last 3 months. Even over the Christmas break, when the Wiggums loop got so much coverage - I still wasn't that blown away going into January/February- 50%+ of the time I'd just write the code myself. I like coding.

But - I don't know if it was April, or May - but very recently - the coding harnesses paired with decent SOTA models like Opus 4.8/GPT 5.5 - just started showing a lot more consistency, and completeness, and sometimes downright clever behavior - that they started to become way more useful.

Just one out of hundred+ examples - I gave Claude Code (Opus 4.8 High) a complex task that involved consul, vault - but I had neglected to give it sandbox permission to download from hashicorp.com. So - it created a entire test harness that simulated both the behavior of Vault and Consul - created all it's test cases, verified that they passed - and when I came back 40 minutes later said that it was all done.

It's test harnesses so accurately simulated the behavior of Vault/Consul - that on first try - no refactoring whatsoever - all of the protobuf/AESGCM/API behavior (that has varied significantly between versions) - worked.

This was something that would have taken me, someone super super familiar with the code and tools and APIs - a minimum of 3 solid days of work - and that would likely involve hundreds of attempts and refactors as I unwound all the weird encryption and packaging layers. It zero-shotted a full solution without having an API to test against

If these agents actually have an actual test-harness - It's honestly hard to imagine what they can't do - subject only to imagination and budget at this point.

Speaking personally - something changed Between January and, Let's say May - in which instead of seeing these things as mostly interesting technology demonstration, in which the flaws outweighed the benefits - I now genuinely think they are the future of programming. I'm dubious that I'll write much software manually in the future - beyond what I do for personal pleasure.


Replies

Jagerbizzleyesterday at 11:49 PM

Fantastic post. This sums up my experience perfectly with a near identical time frame to yours.

fragmedetoday at 1:35 AM

Asked to write a driver for macOS for some thing that didn't have macOS support, GPT-55 found Linux OS firmware on the vendors site, downloaded it, ran binwalk, extracted out the driver, got halfway to reimplementing it on macOS with barely any help from me. I did need to dive into it somewhat to get it across the line, but it showed some ingenuity along the way.