> Each move from one layer of the tech stack to a higher one involved a function:
> f(x) -> y
> Given a specific x, you always get a specific y as the artifact being generated.

Not at all. If this were true, then the Python code in question would generate a deterministic binary. Of course that's not what happens. The Python runs through an interpreter that may change behavior on different runs. It may change behavior version to version. It may even change behavior during multiple invocations of a function in the same running instance. Because all of that is abstracted away.
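A tiny illustration of that run-to-run point (just a sketch, assuming a stock CPython 3 interpreter with the default hash randomization):

    # Same source, observably different behavior across runs:
    # CPython 3.3+ randomizes string hashes by default (PYTHONHASHSEED),
    # so hash values and the iteration order of string sets can change
    # from one invocation of the interpreter to the next.
    print(hash("abstraction"))
    print(list({"f", "x", "y", "z"}))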
Same for the C code. You give up control and some determinism for the higher abstraction. You might get the same output between compilations on the same version, but that's not actually guaranteed, and version-to-version consistency certainly isn't.
Moving to a higher layer of abstraction very often results in less constrained behavior.
I don't feel that this piece explains its title very well (to me), though the idea expressed by the title is spot-on.
I've gone through hand-coding HTML, CGI, CMSes, web frameworks, and CMSes built with web frameworks. Each is (roughly) a layer of abstraction on top of lower layers.
People talk about LLMs as an extension of this layering but they're not. With the layers of abstraction I've listed you can go down to the layers underneath and understand them if you take the time.
LLMs are something different. They're a replacement for or a simulation of the thinking process involved in programming at various layers.
I wouldn't agree that LLMs are a higher level of abstraction, but I've found they do help me think at a higher level of abstraction, by temporarily outsourcing cognitive load.
With changes like substantial refactors or ambitious feature additions, it's easy to exceed the infamous "seven things I can remember at once":
* the idea for the big change itself
* my reason for making the change
* the relevant components and how they currently work
* the new way they'll fit together after the change
* the messy intermediate state when I'm half finished but still need a working system to get feedback
* edge cases I'm ignoring for now but will have to tackle eventually
* actual code changes
* how I'm going to test this
Good lab notes, specs, etc. can help, but it's a lot to keep in mind. In practice these often turn into multi-person projects, and communication is hard, so that often means delay or drift. Having an agent temporarily worry about
* wiring a new parameter through several layers
* writing a test harness for an untested component
* experimentally adding multibyte character support on a branch
frees up my mental bandwidth for the harder parts of the problem.

The main benefit is to defer the concern until I have a mostly working system. Then I come back and review its output, since I'm still responsible for what it delivers, and I want better than "mostly working".
It's orthogonal to whether LLMs can be a useful abstraction layer, but ...
I have a feeling that if LLMs were built on a deterministic technology, a lot of the current AI-is-not-intelligent crowd would be saying "These LLMs can only generate one answer given a question, which means they lack human creativity and they'll never be intelligent!"
When people say things like that they mean it as a rough mental model.
It's a bit like when people say "it's like riding a bike": they're not actually talking about bicycle riding being the exact same activity.
Coming back with this in response:
> f(x) -> P(y) ∪ P(z1) ∪ P(z2) ∪ ... P(zN)
is a failure in human communication, not a disagreement about what LLMs are or aren't.
I don't think they fit in as a layer of abstraction, but instead are outside of it. An abstraction simplifies away the inner workings of what is being abstracted. The LLM exists outside of your code. It is not part of it, thus, it is not abstracting it away. If this were the case, a coworker would be an abstraction over the code they own (you could argue this, but I think it erodes the meaning of abstraction). LLMs behave like program synthesizers rather than another layer of abstraction. They take natural language as input, and using fancy math produce a (hopefully) relevant and useful output based on that input. They can produce layers of abstraction, but are not part of a program's abstraction stack.
However, they can abstract away the need to understand implementation, similar to a coworker. They can summarize behavior, be queried for questions, etc, so you don't have to actually understand the inner workings of what is going on. This is a different form of abstraction than the typical abstraction stack of a program.
I don't agree with this take. Determinism is a nice property for abstractions to have, but it isn't necessary to be an abstraction.
And LLMs can handle very abstract concepts that could not possibly be encoded in C++, like the user's goal in using software.
There was an article on database UX that compared the expectations of a database user and a user working with a search engine. It's interesting, because both are searching, right? Yet the database user expects the found set to be complete, or expects an explanation of why this record is in it and that one is not. A search engine user does not expect things like that and will put up with false positives and negatives if their number is not too big.
LLMs are not inherently non-deterministic during inference. I don't believe non-determinism implies lack of abstraction. Abstraction is simply hiding detail to manage complexity.
I agree, but I think it's for a different reason than what the author says: LLMs are a very leaky abstraction compared to other levels, meaning it's much harder to convey the true intent of logic you are trying to encode through natural language, and often by doing so you are just relying on the LLM to "get it right", which is inherently messy business. Oftentimes, that leakiness just doesn't matter that much. Other times, it does.
I always think the determinism discourse on LLMs misses the point. The elephant in the room is semantic preservation. Compilers can most often preserve semantics across abstractions, while LLMs most often cannot.
For sure the problem isn't that clear-cut, for the siren's call of AI coding is to induce a system out of prompts with ambiguous semantics. It's hardly surprising that you get unpredictable outcomes when giving ambiguous commands to human collaborators, and that in the case of LLMs they resolve the ambiguity with probabilistic approximation.
The claim is that compilers were f(x) -> y, and LLMs are f(x) -> P(y | z1 | z2 | ... | zN).
But how were various combinations of popular programming languages, operating systems and hardware platforms not effectively f(x) -> P(y | z1 | z2 | ... | zN)? Suppose you were quick on the take and were writing in Unix and C in the early 80s and found yourself porting your program from a PDP-something to an 8088 PC, or to a 68k Mac, dealing with DOS extenders, printer drivers, different versions of C (remember K&R style?) or C++? Remember MFC? The evolution of the STL?
LLMs are similar to that maelstrom, just on a faster timescale.
You're right, but the reality is that the people who are excited about LLMs don't care about determinism. They are happy to hand off the thinking to a third party, even if it will give wrong answers they don't notice.
Tangential to the subject matter but has anyone else noticed that night time tends to have more people arguing that LLMs are intelligent and the daytime tends to have more arguing that they aren’t?
Really anything can (and must) be written to justify delegated thought. See: replies to this thread.
"AI is an abstraction" only makes sense if working with a contractor to write your app is also an abstraction. An absurd dilution of the word "abstraction" to the point of meaninglessness.
LLMs are deterministic, the same model under the same conditions will produce the same output, unless some randomness is purposefully injected. Neural networks in general can be thought of as universal function approximators.
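A toy sketch of that point, with made-up logits rather than any real model: the forward pass hands you a distribution, and determinism comes down to how you pick from it.

    import numpy as np

    logits = np.array([2.0, 1.0, 0.5, -1.0])       # hypothetical next-token scores
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax

    # Greedy decoding: same logits in, same token out, every time.
    greedy_token = int(np.argmax(probs))

    # Sampled decoding: randomness is injected on purpose.
    rng = np.random.default_rng()
    sampled_token = int(rng.choice(len(probs), p=probs))

    print(greedy_token, sampled_token)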
This is absurd. The author misrepresents the type of "abstraction" that people mean. This abstraction ladder goes as follows:
- contributing individually
- contributing as a tech lead
- contributing as a technical manager
- leaving the occupation to open a vanity business, such as a gastropub or horseshoeing service

The exact code might not be deterministic, but the behavior can be if your spec uses something like Dafny or TLA+ and is detailed enough.
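Not Dafny or TLA+, but a property-style check in the same spirit (a rough sketch using the hypothesis library; my_sort is a stand-in for whatever implementation gets produced): the behavior is pinned down even if the exact code that satisfies it varies.

    from collections import Counter
    from hypothesis import given, strategies as st

    def my_sort(xs):
        # Stand-in implementation; could be hand-written, generated, or LLM-produced.
        return sorted(xs)

    @given(st.lists(st.integers()))
    def test_sort_behavior(xs):
        out = my_sort(xs)
        # The "spec": output is ordered and is a permutation of the input.
        assert all(a <= b for a, b in zip(out, out[1:]))
        assert Counter(out) == Counter(xs)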
There are a few things being confused here because people are having to learn/re-learn/re-discover basic computer science, but both formal specifications and informal specifications - such as pseudocode (I balk imagining how many AI users might not know this term) or natural language documentation - are all forms of abstraction. Programming languages and underlying models of computation all enable varying degrees of hiding details or emphasizing important ideas/information. Human thought and language, and mathematics, are already examples of abstraction in general. LLMs thus also purport to provide a higher kind of abstraction (via a computational model alternative to the Turing machine); the debate is whether it is a good one, whether its hallucinations make it unreliable, etc.
In other words, LLMs are probabilistic, not deterministic.
This makes sense, but you need to understand that you're ignoring the compiler once you're past the machine code level, which isn't an abstraction, right? It's the root. So, ignoring that part of the missive: going from C to Python, different compilers do emit different machine code.
C and Python have a bunch of different compilers, so if you take the same code, the f' output can be different. There's determinism within the same compiler. Add in different architectures, and the machine code output is definitely more varied than presented.
But that's still manageable; then what if you add in all the dependencies? Well, you get a more florid complexity.
So really, it's a shitty abstraction rather than an inaccurate analogy. If you lined them up in levels, there could be some universe where they are a valid abstraction. But it's not the current universe, because we know the models function on non-determinism.
I'd posit if there was a 'turtles all the way down' abstraction for the LLM, it's simply coming from the other end, the one where human mind might start entering the picture.
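To see the "different compilers, different output" point above without even leaving Python: the bytecode CPython emits for the same function changes across interpreter versions, even though its observable behavior doesn't (a sketch; the exact instructions you see depend on your CPython version).

    import dis

    def add_one(x):
        return x + 1

    # The disassembly differs across CPython versions (newer releases
    # use different instructions for the same source), while add_one
    # itself behaves identically everywhere.
    dis.dis(add_one)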
I'm not sure why people struggle with the fact that an abstraction can be built on top of a non-deterministic and stochastic system. Many such abstractions already exist in the world we live in.
Take sending a packet over a noisy, low SNR cell network. A high number of packets may be lost. This doesn't prevent me, as a software developer, from building an abstraction on top of a "mostly-reliable" TCP connection to deliver my website.
There are times when the service doesn't work, particularly when the packet loss rate is too high. I can still incorporate these failures into my mental model of the abstraction (e.g. through TIMEOUTs, CONN_ERRs…).
Much of engineering and reliability history revolves around building mathematical models on top of an unpredictable world. We are far from solving this problem with LLMs, but this doesn't prevent me from thinking of LLMs as a new level of abstraction that can edit and transform code.
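In that spirit, a minimal sketch of the pattern (flaky_call is a hypothetical stand-in for the unreliable layer): the substrate is lossy, but retries and a timeout present a much simpler contract to the caller.

    import random
    import time

    def flaky_call():
        # Stand-in for an unreliable operation (lossy network, LLM call, ...).
        if random.random() < 0.4:
            raise ConnectionError("packet lost")
        return "payload"

    def reliable_call(retries=5, backoff=0.1):
        # The abstraction: the caller sees either a result or one clear failure.
        for attempt in range(retries):
            try:
                return flaky_call()
            except ConnectionError:
                time.sleep(backoff * (2 ** attempt))
        raise TimeoutError("gave up after %d attempts" % retries)

    print(reliable_call())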