Agentic coding notes from Galapagos Island

158 points • by gm678 • today at 4:37 AM • 77 comments

Comments

I'd like to highlight a different part of the article:

> In general, when I talk to software folks about testing, I'm coming from such a different place that they immediately look at me like I'm an alien, so let's talk about how we tested at this hardware company I worked for, Centaur, which informs my biases about how I like to work. Some of the things that we did that were or are unorthodox in the software world are:

> - Hired dedicated QA / test engineers, with testing being a first-class career path on par with being a developer
> - No code review by default
> - Virtually no hand-written tests
> - Constant testing via what programmers sometimes called property based testing, randomized testing, fuzzing, etc., although we just called those tests (hand-written tests were called "hand tests")
> - Large regression test suite (3 months wall clock to execute on compute farm)
> - No unit tests

Anybody here tried that (or a similar) approach? Especially going all-in on property based testing and fuzzing with no unit tests.

I tried that approach somewhere before and the initial results were promising, but it ran into political issues, so the idea was canned.
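
For readers who haven't seen the randomized style the quote describes, here is a minimal sketch of the idea, not anything taken from the article or from Centaur's tooling. It uses only Python's standard library; `insertion_sort` is a hypothetical stand-in for the code under test, and the two properties checked are illustrative choices. Instead of hand-picking cases, the test generates many random inputs and checks that properties hold for all of them.

```python
import random


def insertion_sort(xs):
    """Hypothetical implementation under test (a stand-in for real code)."""
    out = []
    for x in xs:
        i = len(out)
        # Walk left past every element larger than x.
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def check_sort_properties(trials=1000, seed=0):
    """Generate random inputs and check properties instead of fixed cases."""
    rng = random.Random(seed)  # fixed seed so a failing run can be replayed
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        got = insertion_sort(xs)
        # Property 1: the output agrees with a trusted reference (the oracle).
        assert got == sorted(xs), f"mismatch for input {xs!r}"
        # Property 2: sorting an already sorted list changes nothing.
        assert insertion_sort(got) == got, f"not idempotent for {xs!r}"
    return trials


if __name__ == "__main__":
    print(f"{check_sort_properties()} random cases passed")
```

Printing the failing input in the assertion message matters more here than in a hand-written test, since nobody chose that input in advance.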

martey • today at 5:56 AM

OP's alt text makes it clear that by "Galapagos Island" they mean Vancouver. I assumed that this was some sort of local nickname, but all of the references to "Galápagos of Canada" I could find are talking about Haida Gwaii instead.

bob1029 • today at 5:35 AM

A lot of the crazy ideas seem to have melted away in the face of massive context sizes. Today, I can put roughly a megabyte of UTF-8 text into my system prompt before things start to get weird.

That is a massive amount of information even if we are being sloppy with it. You can read The Hobbit and the first Harry Potter book cover-to-cover and still have room to spare. I would deeply struggle to develop a world model this detailed for any business. Anything that needs to get more specific than these narratives can be a SQL query tool into the data warehouse, grep over the codebase, a Microsoft Graph API lookup, etc.

Giving the business a balanced way to collaborate over this one shared model of the world is a new challenge I am beginning to engage with. I've also noticed that the world model will compound on itself in terms of self-detection of update opportunities. The more constraints there are, the more likely we appear to violate one.

nasretdinov • today at 12:15 PM

I can agree with Dan on two things: LLMs do often produce incorrect results, and they're still useful for productivity when used in moderation. For me the wrong results actually cause some kind of ragebait response, so I become much more motivated to learn more about the subject and actually generate a correct response. Once I've learnt the subject area well enough, I find I'm better off having an LLM review my code instead of writing it.

I haven't even begun to try to comprehend how to use fuzz testing to improve the ability to find bugs, but it sounds really interesting. I've seen mutation testing be very useful for finding gaps in tests, so I can only imagine that fuzzing + LLMs might produce insane results.

zapnuk • today at 9:43 AM

There is a reason we use left and right margin/padding.

This blog is quite unreadable on 27/32" monitors.

gwern • today at 6:26 AM

URL typo: "hange how he works](/productivity-velocity/)". (I make this kind of Markdown syntax error all the time and set up a lint for '](/'.)
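
The comment doesn't say how that lint is implemented, so the following is only a guess at its general shape: a sketch that assumes the check runs over rendered page text, where a surviving `](/` suggests a Markdown link that failed to render. The function name `find_unrendered_links` and the `marker` parameter are hypothetical.

```python
def find_unrendered_links(text, marker="](/"):
    """Return (line_number, line) pairs for lines that still contain the marker.

    Assumption: `text` is rendered output, so any leftover "](/" is Markdown
    link syntax that was not converted into a link.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if marker in line:
            hits.append((number, line.strip()))
    return hits


sample = (
    "hange how he works](/productivity-velocity/)\n"
    "A line with no leftover link syntax."
)

if __name__ == "__main__":
    for number, line in find_unrendered_links(sample):
        print(f"line {number}: {line}")
```

Run over raw Markdown source instead, the same check would flag every valid relative link, so the choice of input matters.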

You should talk to https://www.mechanize.work/ for sponsorship/credits and about environments.

cognitiveinline • today at 11:13 AM

It really comes down to "Do we want to subscribe to a human at a salary in the tens of thousands of dollars?" or "Do we spend hundreds of dollars on an AI subscription?"

Even with their issues, the latest models are going to disrupt labor economics.

brcmthrowaway • today at 4:58 AM

This seems like the beginnings of AI psychosis, tbh.

joshka • today at 12:35 PM

TW;DR (too wide didn't read) ;)

foobarbecue • today at 5:40 AM

It's "Galapagos" or "Galápagos," not "Galapogos."

zarzavat • today at 5:00 AM

Fable changes the game yet again, because it's API-only.

You're not likely to want to run Fable in a loop any more than you want to take a bunch of dollar bills and light them on fire. Every invocation of Fable has to be intentional, its context carefully managed. I feel like a babysitter.
