
GuB-42 today at 11:37 AM

It has been shown that LLMs don't know how they work. Researchers asked an LLM to perform computations and to explain how it got to the result. The LLM's explanation is typical of how we do it: add the numbers digit by digit, with carry, etc. But looking inside the neural network shows that the reality is completely different and much messier. None of it is surprising.
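
Roughly, the procedure the LLM claims to follow is the schoolbook one. A minimal Python sketch of that claimed algorithm (of the explanation, not of whatever the network actually does internally):

    def add_by_digits(a: int, b: int) -> int:
        """Schoolbook addition for non-negative ints: walk digits right to left with a carry."""
        da, db = str(a)[::-1], str(b)[::-1]   # reversed digit strings
        carry, digits = 0, []
        for i in range(max(len(da), len(db))):
            x = int(da[i]) if i < len(da) else 0
            y = int(db[i]) if i < len(db) else 0
            total = x + y + carry
            digits.append(str(total % 10))     # keep the ones digit
            carry = total // 10                # carry the rest
        if carry:
            digits.append(str(carry))
        return int("".join(reversed(digits)))

    assert add_by_digits(14, 17) == 31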

Still, feeding it back its own completely made-up self-reflection could be an effective strategy; reasoning models kind of work like this.


Replies

wongarsu today at 2:16 PM

Which should be expected, since the same is true for humans. The "add digit by digit with carry" method works well on paper, but it's not an effective way to do math in your head, and it's certainly not how I calculate 14+17. In fact, I can't really tell you how I calculate 14+17, since that's not in the "inner monologue" part of my brain, and I have little introspection into any of the other parts.

Still, feeding humans their completely made-up self-reflection back can be an effective strategy.

phpnode today at 12:46 PM

The explanation becomes part of the context, which can lead to more effective results in the next turn. It does work, but it does so in a completely misleading way.
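
Mechanically it's just this (a sketch assuming a generic chat-style interface; call_model is a placeholder for whatever client you actually use): the explanation gets appended to the history, so the next turn conditions on it whether or not it reflects the real computation.

    from typing import Callable

    Message = dict[str, str]  # {"role": ..., "content": ...}

    def two_turn_with_self_explanation(
        call_model: Callable[[list[Message]], str],  # placeholder for a real chat client
        question: str,
    ) -> str:
        history: list[Message] = [{"role": "user", "content": question}]

        # Turn 1: answer plus self-explanation (which may be confabulated).
        answer = call_model(history)
        history.append({"role": "assistant", "content": answer})

        # Turn 2: the explanation is now part of the context and steers the
        # next completion, regardless of whether it describes what actually happened.
        history.append({"role": "user", "content": "Using your reasoning above, double-check the result."})
        return call_model(history)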

FireBeyond today at 6:55 PM

Right. Last time I checked, this was easy to demonstrate with word logic problems:

"Adam has two apples and Ben has four bananas. Cliff has two pieces of cardboard. How many pieces of fruit do they have?" (or slightly more complex, this would probably be easily solved, but you get my drift.)

Change the wording to something entirely random, i.e. something unlikely to be found in the LLM's corpus, like walruses and skyscrapers and carbon molecules, and the LLM will give you a suitably nonsensical answer, showing that it is incapable of handling even simple substitutions that a middle schooler would recognize.
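
One quick way to test this yourself (hypothetical call_model placeholder; any chat client works): keep the problem structure fixed, swap only the nouns, and compare the answers across variants.

    TEMPLATE = ("Adam has two {a} and Ben has four {b}. "
                "Cliff has two {c}. How many {group} do they have altogether?")

    variants = [
        # Familiar schoolbook wording: {c} is the distractor, expected answer 6.
        {"a": "apples", "b": "bananas", "c": "pieces of cardboard", "group": "pieces of fruit"},
        # Same logical structure, out-of-distribution nouns: the answer should still be 6.
        {"a": "walruses", "b": "skyscrapers", "c": "carbon molecules", "group": "walruses and skyscrapers"},
    ]

    for v in variants:
        prompt = TEMPLATE.format(**v)
        print(prompt)
        # reply = call_model(prompt)  # hypothetical chat-client call; compare replies across variants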