csbartus • today at 5:01 AM • 1 reply

It's a gut feeling.

We _know_ LLMs can't be _as_ good as they are promoted to be.

I've spent the last 6 months creating a production-grade app from scratch with Claude, without writing a single line of code myself. I reviewed the code and it looked good, almost completely following my templates, workflows, and skills.

Now I've started to make minor manual updates and I'm horrified. Claude has no idea why those templates and instructions were in place. It followed them blindly without grasping their spirit. The end result is like a very junior dev copy-pasting answers from Stack Overflow into the codebase. No consistency, chaotic application of different conventions, duplicated code, ghost code (does nothing), and perhaps more as I'm digging in.

The pros: The code works, all tests pass (43% code / 57% tests, roughly a 1:1.3 ratio), the UI looks good with visible glitches.

The cons: I'll have to rewrite most of the code in the long run to make it fit together and be easy to maintain.

The verdict: I wouldn't have started this project alone. Claude got me through to v0.1.0 / MVP, where I focused solely on the product: technologies, architecture, functionality, and usability. Now it's easier to refactor everything for v0.2.0 manually, without Claude.

So this might be our gut feeling: we know it's something good, but not as good as the stakeholders might promote. We know it helps in some ways but it's a nightmare in other ways.

We are not anti-AI, but rather pragmatic: just not the AI enthusiasts we are expected to be.

Replies

hn_throw2025 • today at 8:42 AM

> No consistency, chaotic application of different conventions, duplicated code, ghost code (does nothing), and perhaps more as I'm digging in.

I didn’t understand this part. You said you reviewed the code and it was looking good, so how did the cruft creep in? Were you reviewing every diff, or taking an occasional sample?
