> here’s how LLMs actually work
But how is that useful in any way?
For all we know, LLMs are black boxes. We really have no idea how the ability to have a conversation emerged from predicting the next token.
I thought the interview where Hinton talks to Jon Stewart gives a rough idea of how they work. Hinton got the Turing Award and a Nobel Prize for inventing some of this stuff: https://youtu.be/jrK3PsD3APk?t=255
> We really have no idea how the ability to have a conversation emerged from predicting the next token.
Uh, yes, we do. It works in precisely the same way that you can walk from "here" to "there" by taking a step towards "there", and then repeating. The cognitive dissonance comes when we conflate this way of "having a conversation" with the way two people converse, and assume that because the two produce similar outputs they must be "doing the same thing"; from there it's hard to see how an LLM could be doing it.
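To make the "one step at a time" point concrete, here's a minimal sketch of an autoregressive generation loop. The lookup table is a toy stand-in for a model (made-up tokens and probabilities, not anything real); an actual LLM conditions on the whole context and computes the distribution with a neural network, but the loop has the same shape.

```python
import random

# Toy "model": maps the last token to a probability distribution over
# next tokens. Purely illustrative; a real LLM conditions on the entire
# context so far, not just the previous token.
NEXT_TOKEN_PROBS = {
    "<start>": {"Hello": 0.7, "Hi": 0.3},
    "Hello":   {",": 1.0},
    ",":       {"how": 1.0},
    "how":     {"are": 1.0},
    "are":     {"you": 1.0},
    "you":     {"?": 1.0},
    "Hi":      {"there": 1.0},
    "there":   {"!": 1.0},
}

def generate(start="<start>", max_tokens=10):
    tokens = [start]
    for _ in range(max_tokens):
        dist = NEXT_TOKEN_PROBS.get(tokens[-1])
        if dist is None:  # no known continuation: stop
            break
        # Each step predicts exactly one token, then the loop repeats;
        # the whole utterance is just many such steps chained together.
        choices, weights = zip(*dist.items())
        tokens.append(random.choices(choices, weights=weights)[0])
    return " ".join(tokens[1:])

print(generate())  # e.g. "Hello , how are you ?"
```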
Sometimes things seem unbelievable simply because they aren't true.
> We really have no idea how the ability to have a conversation emerged from predicting the next token.
Maybe you don't. To be clear, this benefits massively from hindsight (just as, if I didn't know that combustion engines worked, I probably wouldn't have dreamed up how to make one), but the emergent conversational capabilities of LLMs are pretty obvious after the fact. In a massive dataset of human writing, the answer to a question is by far the most common thing to follow a question. A normal conversational reply is the most common thing to follow a conversation opener. While impressive, these things aren't magic.
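As a rough illustration of that statistical claim (toy data, nothing like a real training corpus): count what follows each opener in a pile of conversations, and "predicting what comes next" already produces sensible replies.

```python
from collections import Counter

# Tiny stand-in for "a massive dataset of human writing":
# (opener, reply) pairs pulled from conversations.
corpus = [
    ("How are you?", "I'm fine, thanks."),
    ("How are you?", "I'm fine, thanks."),
    ("How are you?", "Pretty good."),
    ("What time is it?", "It's three o'clock."),
    ("What time is it?", "Almost noon."),
]

# Count which reply follows each opener.
followers = {}
for opener, reply in corpus:
    followers.setdefault(opener, Counter())[reply] += 1

# "Predicting what comes next" = picking the most common follower.
prompt = "How are you?"
reply, count = followers[prompt].most_common(1)[0]
total = sum(followers[prompt].values())
print(f"{prompt!r} -> {reply!r} ({count}/{total} occurrences)")
```

An LLM does this at the token level with a learned model that generalizes, rather than a literal lookup, but the reason a question gets an answer is the same: that's what the data says comes next.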