Hacker News

teiferer · today at 4:06 PM

Why would that be desirable?

If we take the human brain as an example, it's pretty bad at computation. Multiplying two 10-digit numbers takes forever, despite the enormous size of its neural network. It's not the right tool for the job - a few deterministic logic gates could do that much more efficiently. That same circuit can't do much else, but multiplying, oh boy, it's good at that! Why do we think that artificial neural nets would be the right tool for that job? What's wrong with letting the LLM reach out to an ALU to do the calculation, just like a human would? It's surely going to be quicker and require less energy.
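To make the "reach out to an ALU" idea concrete, here's a minimal sketch of the tool-call pattern this comment describes. All names (`calculator_tool`, `run_with_tools`, the `CALL calculator` convention) are illustrative, not any real API: the runtime intercepts a tool invocation the model emits and routes it to exact integer arithmetic instead of asking the weights to approximate it.

```python
def calculator_tool(expression: str) -> str:
    """Deterministic arithmetic 'ALU': exact and cheap, compared to
    approximating multi-digit multiplication inside neural weights."""
    a_str, op, b_str = expression.split()
    a, b = int(a_str), int(b_str)
    results = {"*": a * b, "+": a + b, "-": a - b}
    return str(results[op])

def run_with_tools(model_output: str) -> str:
    """If the model's output is a tool call, execute it deterministically
    and return the exact result; otherwise pass the text through."""
    prefix = "CALL calculator "
    if model_output.startswith(prefix):
        return calculator_tool(model_output[len(prefix):])
    return model_output

# A 10-digit multiplication of the kind a transformer does unreliably:
print(run_with_tools("CALL calculator 9876543210 * 1234567890"))
```

The point of the sketch is the division of labor: the model only has to learn *when* to emit the call, not *how* to multiply.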


Replies

soerxpso · today at 4:14 PM

The embedded programs can be connected to the other weights during training, in whatever way the training process finds useful. It doesn't just have to be arithmetic calculation. You can put any hard-coded algorithm in there, make the weights for that algorithm static, and let the training process figure out how to connect the other trillion weights to it.
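A toy sketch of that idea, framework-agnostic and purely illustrative (the class names and the one-weight "layers" are invented for this example): a hard-coded exact algorithm sits inside the network as a frozen module, and only the surrounding weights are ever exposed to the optimizer, which is free to learn how to route signals in and out of it.

```python
class FrozenMultiplier:
    """Hard-coded exact multiplication; conceptually static weights."""
    trainable = False

    def forward(self, a, b):
        return a * b

class LinearAdapter:
    """Learned glue around the frozen module (toy one-weight layer)."""
    trainable = True

    def __init__(self, w=1.0):
        self.w = w

    def forward(self, x):
        return self.w * x

class Net:
    def __init__(self):
        self.pre = LinearAdapter(0.5)   # learned: routes inputs in
        self.alu = FrozenMultiplier()   # static: the embedded algorithm
        self.post = LinearAdapter(2.0)  # learned: routes the result out

    def forward(self, a, b):
        return self.post.forward(self.alu.forward(self.pre.forward(a), b))

    def trainable_params(self):
        # The training loop only ever sees the adapters' weights;
        # the frozen module is skipped entirely.
        return [m for m in (self.pre, self.alu, self.post) if m.trainable]

net = Net()
print(net.forward(4.0, 3.0))        # 2.0 * ((0.5 * 4.0) * 3.0) = 12.0
print(len(net.trainable_params()))  # 2: the frozen module is excluded
```

In a real framework this corresponds to marking the embedded module's parameters as non-trainable (e.g. excluding them from the optimizer), while gradients still flow *through* it to the surrounding weights.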

OneDeuxTriSeiGo · today at 4:48 PM

One of the big appeals of this is that it gives a mechanism for "teaching" models geometric intuition and better spatial reasoning.

Not necessarily pure number crunching, but bridging the boundary between rote algorithms and the fuzzy, intuition-based models that humans in particular excel at.

pegasus · today at 4:20 PM

> Why would that be desirable?

If we never try, we'll never know. I wouldn't be surprised if there is something to gain from a form of deterministic computation that is still integrated with the NN architecture. After all, tool calls have their own non-trivial overhead.
