The amount of faith a person has in LLMs getting us to, e.g., AGI is a good implicit test of how much that person (incorrectly) thinks most thinking is linguistic (and, to some degree, conscious).
Or at least, this is the case if we mean LLM in the classic sense, where the middle "L" refers to natural language. Also note GP carefully mentioned the importance of multimodality, which, if you include e.g. images, audio, and video, starts to look much closer to the majority of the kinds of inputs humans learn from. LLMs can't go too far, for sure, but VLMs could conceivably go much, much farther.
And the latest large models are predominantly LMMs (large multimodal models).