Hacker News

ChrisGreenHeur yesterday at 3:58 AM (9 replies)

I may have bad news for you on how compilers typically work.


Replies

sarchertech yesterday at 4:06 AM

The difference is that what most languages compile to is much much more stable than what is produced by running a spec through an LLM.

A language or a library might change the implementation of a sorting algorithm once in a few years. An LLM is likely to do it every time you regenerate the code.

It’s not just a matter of non-determinism either, but of how chaotic LLMs are. Compilers can produce different machine code from slightly different inputs, but that’s nothing compared to how wildly LLM output varies with very small differences in input. Adding a single word to your spec file can change the final code far more, to the point of being unrecognizable, than adding a new line to a C file does.

If you are only checking in the spec, which is the logical conclusion of “this is the new high-level language”, then every time you regenerate your code, all of the thousands upon thousands of unspecified implementation details will change.

Oops, I didn’t think I needed to specify what is going to happen when a user tries to do C before A but after B. Yesterday it didn’t seem to do anything, but today it resets their account balance to $0. But after the deployment 5 minutes ago it seems to be fixed.

Sometimes users dragging a box across the screen will see the box disappear behind other boxes. I can’t reproduce it though.

I changed one word in my spec and now there’s an extra 500k LOC to implement a hidden asteroids game on the home page that uses 100% of every visitor’s CPU.

This kind of stuff happens now, but the scale at which it will happen if you actually use LLMs as a high-level language is unimaginable. The chaos of all the little unspecified implementation details constantly shifting is just insane to contemplate as a user or a maintainer.

hndc yesterday at 4:06 AM

Deterministic compilation, aka reproducible builds, has been a basic software engineering concept and goal for 40+ years. Perhaps you could provide some examples of compilers that produce non-deterministic output along with your bad news.

jcranmer yesterday at 4:20 AM

Compilers aim to be fully deterministic. The biggest source of nondeterminism when building software isn't the compiler itself, but build systems invoking the compiler nondeterministically (because iterating the files in a directory isn't necessarily deterministic across different machines).
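
A minimal sketch of the directory-iteration point, not taken from any real build system: `os.listdir` returns entries in arbitrary order, so a build script that wants the same compiler invocation on every machine can sort the list first. The helper name `collect_sources` and the `.c` suffix are assumptions made up for illustration.

```python
import os
import tempfile


def collect_sources(directory, suffix=".c"):
    """Return matching file names in a stable order (hypothetical helper)."""
    # os.listdir yields entries in arbitrary, filesystem-dependent order,
    # so sort explicitly before handing the list to a compiler or linker.
    names = [name for name in os.listdir(directory) if name.endswith(suffix)]
    return sorted(names)


# Small demonstration with throwaway files created in a temporary directory.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("zeta.c", "alpha.c", "mid.c", "notes.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    print(collect_sources(demo_dir))
```

Sorting costs almost nothing and removes one machine-dependent input from the build; it does not address other sources of variation such as embedded timestamps.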

csmantle yesterday at 5:06 AM

If you are referring to timestamps, build IDs, compile-time environments, hardwired optimization heuristics, or even compiler bugs, those are not the same kind of non-determinism as in LLMs. The former can be mitigated by long-standing reproducible-build practices, while the latter is intrinsic to LLMs if they are meant to be more useful than a voice recorder.
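
As one illustration of the timestamp case, here is a rough sketch assuming the build honors the `SOURCE_DATE_EPOCH` environment variable, the convention from the Reproducible Builds project for pinning embedded dates. The helper names `build_timestamp` and `format_banner` are hypothetical.

```python
import os
import time


def build_timestamp(environ=os.environ):
    """Return the Unix time to embed in an artifact (hypothetical helper)."""
    # SOURCE_DATE_EPOCH holds seconds since the Unix epoch; when the build
    # environment sets it, every rebuild embeds the same date.
    epoch = environ.get("SOURCE_DATE_EPOCH")
    if epoch is not None:
        return int(epoch)
    # Fallback: the current time, which makes the output differ per build.
    return int(time.time())


def format_banner(environ=os.environ):
    """Format the embedded build date in UTC so the host time zone is irrelevant."""
    stamp = time.gmtime(build_timestamp(environ))
    return time.strftime("built %Y-%m-%d %H:%M:%S UTC", stamp)


print(format_banner({"SOURCE_DATE_EPOCH": "0"}))
```

With the variable set, two builds of the same source embed the same string; without it, the banner changes on every run, which is the kind of variation reproducible-build practices are meant to remove.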

rezonant yesterday at 5:34 AM

You'll need to share with the class because compilers are pretty damn deterministic.

leptons yesterday at 5:05 AM

Compilers are about 10 orders of magnitude more deterministic than LLMs, if not more.

JackSlateur yesterday at 2:35 PM

Reproducible builds are a thing (and they are used in many, many places).

r0b05 yesterday at 4:03 AM

Elaborate please

Applejinx yesterday at 8:52 AM

I love the 'I may have' :)
