Fun play on words. But yes, LLMs are Large Language Models, not Large World Models. This matters because (1) the world cannot be modeled anywhere close to completely with language alone, and (2) language only somewhat models the world (much in language is convention, wrong, or not concerned with modeling the world, but other concerns like persuasion, causing emotions, or fantasy / imagination).
It is somewhat complicated by the fact LLMs (and VLMs) are also trained in some cases on more than simple language found on the internet (e.g. code, math, images / videos), but the same insight remains true. The interesting question is to just see how far we can get with (2) anyway.
Large Language Models is a misnomer- these things were originally trained to reproduce language, but they went far beyond that. The fact that they're trained on language (if that's even still the case) is irrelevant- it's like claiming that student trained on quizzes and exercise books are only able to solve quizzes and exercises.
1. LLMs are transformers, and transformers are next state predictors. LLMs are not Language models (in the sense you are trying to imply) because even when training is restricted to only text, text is much more than language.
2. People need to let go of this strange and erroneous idea that humans somehow have this privileged access to the 'real world'. You don't. You run on a heavily filtered, tiny slice of reality. You think you understand electro-magnetism ? Tell that to the birds that innately navigate by sensing the earth's magnetic field. To them, your brain only somewhat models the real world, and evidently quite incompletely. You'll never truly understand electro-magnetism, they might say.
> This matters because (1) the world cannot be modeled anywhere close to completely with language alone
LLMs being "Language Models" means they model language, it doesn't mean they "model the world with language".
On the contrary, modeling language requires you to also model the world, but that's in the hidden state, and not using language.