I don't think this is particularly hypocritical on the part of a developer holding such views. Typing code has never been the bottleneck; building the mental model has. You need the mental model to know how the domain and the actual model will interact, and to understand the limitations of the system, which is what lets you preempt what tests you need, what QA you need to do, and so on. You can try to hash this out in a specification, but all specifications eventually meet the domain head on, often with catastrophic consequences, and you still need to do this sort of work when writing the specification anyway.
Fundamentally, LLMs do not construct a consistent mental model of the codebase (this is evident if you, uh, actually read LLM-generated code), and this is Bad for a lot of reasons. It's bad for long-term maintainability, it's bad for accurately modelling the code and its behaviour as a system, it's bad for testing and verifying it, etc. Pretty much all of the tasks around program design require you to have that mental model.
You can absolutely get an LLM to show you a mental model of the code, but absolutely nothing can guarantee that that's the model it's actually using. Proof of this is in how they summarise documents, how inaccurate a lot of the documentation they generate is, and how inaccurate a lot of their code summaries are. Those would be accurate if the LLM were forming a mental model while it worked. It's a program that statistically generates plausible text; the fact that we got such a program to do more than that in the first place is very interesting and may imply a lot of things, but at the end of the day, whatever you ask of it, it will generate text. There is no guarantee around the accuracy of that text, and effectively there never can be.
One of the core problems we have in software engineering is the longstanding philosophical problem of creating cohesive, consistent, objective mental models of inherently subjective concepts like identifying a person, a place, etc. Look at the endless lists of "falsehoods programmers believe" about almost any topic.
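To make that concrete, here's a toy sketch (my own example, not taken from any particular falsehoods list) of the kind of assumptions a naive data model quietly bakes in about something as "simple" as a person's name:

```python
from dataclasses import dataclass

# A naive model encoding several classic falsehoods: that everyone has
# exactly one first name and one last name, and that the pair uniquely
# and permanently identifies a person.
@dataclass
class Person:
    first_name: str   # falsehood: everyone has a first name
    last_name: str    # falsehood: everyone has a last name

    def key(self) -> str:
        # falsehood: (first, last) is unique and never changes
        return f"{self.first_name} {self.last_name}".lower()

# Real-world cases this model can't represent cleanly: a mononymic
# person ("Teller"), a name with no first/last split (毛泽东), or two
# distinct people who both happen to be "John Smith".
```

None of this is exotic; it's exactly the sort of domain knowledge you only acquire by building the mental model, which is the work the LLM isn't doing for you.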
You’re right that LLMs specifically have no guarantees about the accuracy or veracity of the text they generate, but I'd posit the same is true of people, especially once filtered through the socialization process. The difference lies in the kinds of errors machines make compared to the ones humans make.
It’s frustrating that we use anthropomorphic concepts like "hallucination" to describe LLM behaviors, when the fundamental units of computation, and thus the failure modes, are so different at every level.