Yes they do…? Who cares if they just predict the next token? The outcome is that they can invent new abstractions. You could claim that the invention of this new idea is the work of an LLM combined with a harness, but that combination can still solve logic puzzles and invent abstractions. If a really large spinning wheel could produce proofs for previously unsolved problems, that would be a wildly amazing spinning wheel. I view LLMs similarly. It is just fancy autocomplete, but look what we can do with it!
Said differently, what is prediction but composition projected forward through time/ideas?
"Who cares if they just predict the next token?"
Exactly. I also only write one word at a time. Who knows what is going on behind the scenes to come up with that word?
Ask an LLM to invent a new word and post it here. I will be waiting. You will see that it simply combines words already in the training data.