LLMs fundamentally work by predicting the next token (roughly, the next word or word fragment). But that fact should not be used to dismiss their potential capabilities. It's like saying that human brains "just predict (or produce) the next electrical impulse": fundamentally correct, but it says nothing about the emergent capabilities of scaled-up systems that work that way.
The emergent properties of a complex system should not be dismissed just because its underlying operating principle is simple.
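
To make "predicting the next token" concrete, here is a minimal sketch of that loop using a toy bigram counter instead of a neural network. Everything in it (the `train_bigram` and `generate` names, the tiny corpus, the greedy choice) is an illustrative assumption, not how any real LLM is implemented. The point is only that the outer loop really is this simple; whatever capability a real model has comes from the predictor inside it.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Toy 'model': count which token follows each token in the corpus."""
    follows = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        follows[prev][nxt] += 1
    return follows

def generate(follows, start, max_new_tokens):
    """Greedy autoregressive loop: predict one token, append it, repeat."""
    out = [start]
    for _ in range(max_new_tokens):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # the last token was never seen with a successor
        # Pick the most frequent successor; ties break alphabetically.
        next_token = max(sorted(candidates), key=candidates.__getitem__)
        out.append(next_token)
    return out

# Hypothetical toy corpus, just for illustration.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)
print(" ".join(generate(model, "the", 4)))
```

A real LLM keeps the same outer loop but replaces the lookup table with a large neural network conditioned on the whole preceding context, and usually samples from the predicted distribution rather than always taking the top choice.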