Hacker News

zarzavat · yesterday at 4:06 PM

Perhaps I've missed a few weeks' worth of progress, but I don't think AIs have become more trustworthy; the errors are just more subtle.

If the code doesn't compile, that's easy to spot. If the code compiles but doesn't work, that's still somewhat easy to spot.

If the code compiles and works, but it does the wrong thing in some edge case, or has a security vulnerability, or introduces tech debt or dubious architectural decisions, that's harder to spot but doesn't reduce the review burden whatsoever.

If anything, "truthy" code is more mentally taxing to review than just obviously bad code.
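To make that concrete, here is a hypothetical illustration (mine, not the commenter's): a small Python helper that runs, passes a casual review, and is still wrong in an edge case.

    # Hypothetical example: checks whether an hour falls inside a
    # maintenance window. It reads correctly and works for the common case...
    def in_maintenance_window(hour: int, start: int, end: int) -> bool:
        """Return True if `hour` is inside the window [start, end)."""
        return start <= hour < end

    assert in_maintenance_window(3, 2, 4)  # 02:00-04:00 window: fine

    # ...but silently gives the wrong answer when the window wraps past
    # midnight. 23:00 IS inside a 22:00-02:00 window, yet:
    assert in_maintenance_window(23, 22, 2) is False  # wrong answer, no crash

    # Handling the wraparound is exactly what a careful review has to catch:
    def in_maintenance_window_fixed(hour: int, start: int, end: int) -> bool:
        if start <= end:
            return start <= hour < end
        return hour >= start or hour < end

    assert in_maintenance_window_fixed(23, 22, 2)

Nothing here fails to compile or crashes; the bug only surfaces on inputs a reviewer has to think to try, which is why this kind of code doesn't lighten the review load.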


Replies

xantronix · yesterday at 4:40 PM

I know there are good uses of LLMs out there. I do. But.

The current fever-pitch mandates from above seem to want it applied liberally, and pushing back against that is so discouraging, and so often career-limiting, as to wear the fabric of one's psyche threadbare. For every obvious problem being pointed out, there are just as many workarounds; and these workarounds, as is often revealed shortly thereafter, have their own problems, which beget new solutions, ad infinitum.

At some point it genuinely seems like all this work is for the sake of the machine itself. I suppose that is true: the real goal has become so obscured at many firms today that all that remains is the LLM. Are the people betting the farm, and those helping implement their visions, guaranteed a soft exit to cushion them from the consequences, or is rationality really being discarded altogether?

Sure, sound engineering principles can help work around these problems, but what efficiency is truly gained, in terms of cognitive load, developer time, money, or finite resources? Or were those ever an earnest concern?

hintymad · yesterday at 9:58 PM

> I don't think that AIs have become more trustworthy, the errors are just more subtle.

Honest question: what about the counter-argument that humans make subtle mistakes all the time, so why should we treat AI any differently?

A difference, to me, is that when we write code by hand, we reason about it carefully and with a purpose. Yes, we make mistakes, but those mistakes are grounded within a certain range. In contrast, AI-generated code produces errors that don't follow common sense. That said, I don't feel this distinction is strong enough, and I don't have data to back it up.

asdfman123 · yesterday at 9:49 PM

You can direct LLMs to do test-driven development, though: write several tests, then make sure the code passes them. And also make sure the agent organizes the code correctly.
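For what it's worth, a minimal sketch of that loop in Python (the slugify helper and its spec are hypothetical, purely for illustration): the human writes and reviews the tests first, which narrows the agent's job to making them pass.

    import re

    # Step 1: human-authored tests, written and reviewed before any
    # generation happens. (slugify and its spec are made up for this sketch.)
    def test_slugify_basic():
        assert slugify("Hello, World!") == "hello-world"

    def test_slugify_collapses_whitespace():
        assert slugify("  a   b  ") == "a-b"

    def test_slugify_empty_input():
        assert slugify("") == ""

    # Step 2: the agent's task is reduced to "make these pass", which is
    # far easier to review than open-ended generation.
    def slugify(text: str) -> str:
        """Lowercase, drop punctuation, join word runs with hyphens."""
        words = re.findall(r"[a-z0-9]+", text.lower())
        return "-".join(words)

Run under pytest, the tests act as the acceptance gate: the review shifts from reading generated code line by line to checking that the tests actually pin down the behavior you care about.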

sanderjd · yesterday at 11:18 PM

Yeah, I relate to this. I think working in smaller chunks helps a lot. (Just as it does for work done by humans!)

christoff12 · yesterday at 4:35 PM

This has generally been the case, though. As mentioned in the post, "You want solutions that are proven to work before you take a risk on them" remains true, and that will be the place where the edges are found.
