LLMs work best when the user defines their acceptance criteria first

415 points • by dnw • yesterday at 1:17 AM • 370 comments

Comments

jswelker • yesterday at 4:38 PM

I also write plausible code. Not much of a moat.

seba_dos1 • yesterday at 4:34 PM

s/code/stuff/

maremmano • yesterday at 4:25 PM

this won't age well.

JasonHEIN • yesterday at 10:24 AM

Bro, you're basically saying "oh, an LLM can't do in 10 days what a few people spent decades on." Get a life, bro: applaud it and change the title to "it can do xyz" instead of adding the "critical and critical" ...

akoboldfrying • yesterday at 12:26 PM

The following paragraph appears twice:

> Now 2 case studies are not proof. I hear you! When two projects from the same methodology show the same gap, the next step is to test whether similar effects appear in the broader population. The studies below use mixed methods to reduce our single-sample bias.

nprateem • yesterday at 7:14 AM

In the last month I've done 4 months of work. My output is what a team of 4 would have produced pre-AI (5 with scrum master).

Just like you can't develop musical taste without writing and listening to a lot of music, you can't teach your gut how to architect good code without putting in the effort.

Want to learn how to 10x your coding? Read design patterns, read and write a lot of code by hand, review PRs, hit stumbling blocks and learn.

I noticed the other day how I review AI code in literally seconds. You just develop a knack for filtering out the noise and zooming in on the complex parts.

There are no shortcuts to developing skill and taste.

bamboozled • yesterday at 5:55 AM

I'm sure this is because they are pattern-matching masters: if you program them to find something, they are good at that. But you have to know what you're looking for.

riffraff • yesterday at 6:52 AM

To be fair, people do too.

mentalgear • yesterday at 8:24 AM

> I write this as a practitioner, not as a critic. After more than 10 years of professional dev work, I’ve spent the past 6 months integrating LLMs into my daily workflow across multiple projects. LLMs have made it possible for anyone with curiosity and ingenuity to bring their ideas to life quickly, and I really like that! But the number of screenshots of silently wrong output, confidently broken logic, and correct-looking code that fails under scrutiny I have amassed on my disk shows that things are not always as they seem.

Same experience, but the hype bros only need a shiny screengrab to proclaim that the age of "gatekeeping" SWE is over and get their click fix from the unknowing masses.

cat_plus_plus • yesterday at 2:16 AM

That's very impressive. Your LLM actually wrote correct code for a full relational database on the first try. Sure, it takes 2.5 seconds to insert 100 rows, but it stores them correctly and select is pretty fast. How many humans can do this without a week of debugging? I would suggest you install some profiling tools and ask it to find and address hotspots. How long, and how many people, did SQLite take to get to where it is?

user3939382 • yesterday at 4:46 AM

I have great techniques to fix this issue but not sure how it behooves me to explain it.

mmaunder • yesterday at 2:19 AM

But my AI didn't do what your AI did.

Cherry-picked AI fail for upvotes. Which you’ll get plenty of here and on Reddit from those too lazy to go and take a look for themselves.

Using Codex or Claude to write and optimize high-performance code is a game changer. Try optimizing CUDA using nsys, for example. It’ll blow your lazy little brain.

satvikpendem • yesterday at 3:55 PM

Oftentimes, plausible code is good enough, which is why people keep using AI to generate code. This is a distinction without a difference.

serious_angel • yesterday at 2:22 AM

Holy gracious sakes... Of course... Thank you... thank you... dear katanaquant, from the depths... of my heart... There's still belief in accountability... in fun... in value... in effort... in purpose... in human... in art...

- <http://archive.today/2026.03.07-020941/https://lr0.org/blog/...> (I'm not consulting an LLM...)

- <https://web.archive.org/web/20241021113145/https://slopwatch...>
