merlindru • yesterday at 6:48 PM • 4 replies

I like biasing it towards assuming that there is a bug, so it can't just say "no bugs! all good!" without looking into it very hard.

Usually I ask something like this:

"This code has a bug. Can you find it?"

Sometimes I also tell it that "the bug is non-obvious"

Anecdotally, I've found this has a higher success rate than just asking for a spot check.
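For anyone who wants to script this, here is a minimal sketch of the two prompt styles as plain string builders. The function names and the wording of the neutral baseline are assumptions made for illustration; only the two quoted sentences come from the comment above, and nothing here calls a model.

```python
def build_bug_hunt_prompt(code: str, non_obvious: bool = False) -> str:
    """Build a review prompt that presupposes a bug exists (hypothetical helper)."""
    lines = ["This code has a bug. Can you find it?"]
    if non_obvious:
        # Optional nudge quoted in the comment above.
        lines.append("The bug is non-obvious.")
    lines.append("")  # blank line between the instruction and the code
    lines.append(code)
    return "\n".join(lines)


def build_spot_check_prompt(code: str) -> str:
    """Neutral baseline to compare against; this wording is an assumption."""
    return "Can you spot-check this code?\n\n" + code
```

Sending both prompts for the same snippet to the same model and comparing the answers is one way to check whether the biased wording helps in your own setup.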

Replies

majormajor • yesterday at 11:33 PM

Do you not run into too many false positives along the lines of "ah, this thing you used here is known to be tricky, the issue is..."?

I've seen that when prompting it to look specifically for concurrency issues, versus saying something more like "please inspect this rigorously to look for potential issues..."

Nition • yesterday at 9:01 PM

Just in case you didn't read the full article, this is how they describe finding the bugs in the Linux kernel as well.

Since it's a large codebase, they go even more specific and hint that the bug is in file A, then try again with a hint that the bug is in file B, and so on.
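A rough sketch of that per-file loop, under stated assumptions: `ask` is a hypothetical callable you supply that sends a prompt to whatever model you use and returns its reply, and the hint wording is a guess, not the article's actual prompt.

```python
from typing import Callable, Dict, Iterable


def hint_prompt(path: str) -> str:
    """Prompt that asserts a bug and points at one file (wording is an assumption)."""
    return f"This codebase has a bug. The bug is in {path}. Can you find it?"


def hunt_per_file(paths: Iterable[str], ask: Callable[[str], str]) -> Dict[str, str]:
    """Ask once per file, each time hinting that the bug lives in that file."""
    results: Dict[str, str] = {}
    for path in paths:
        # `ask` is caller-supplied; no particular model API is assumed here.
        results[path] = ask(hint_prompt(path))
    return results
```

Each reply still needs a manual check, since a prompt that asserts a bug exists can also produce confident false positives.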

kgwxd • today at 5:33 AM

> so it can't just say "no bugs! all good!"

If anyone, or anything, ever answers a question like that, you should stop asking it questions.

jiggawatts • today at 12:16 AM

As a meta activity, I like to run different codebases through the same bug-hunt prompt and compare the number found as a barometer of quality.

I was very impressed when the top three AIs all failed to find anything other than minor stylistic nitpicks in a huge blob of what to me looked like “spaghetti code” in LLVM.

Meanwhile at $dayjob the AI reviews all start with “This looks like someone’s failed attempt at…”
