Reasoning leaves a pattern embedded in the token stream, and LLMs imitate reasoning by learning those symbolic patterns rather than by actually performing it.
The very fact that it memorized the Caesar cipher in its ROT13 form is because rot13 is a Linux command, so the training data is full of examples of text shifted by exactly 13 letters. Ask it to work out a different shift and it struggles.
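To see why that's damning, here's a minimal sketch (my own illustration, not anything the model does internally) of a general Caesar shift. ROT13 is just the shift=13 case; every other shift is the same trivial algorithm with a different parameter, which is exactly what a system that memorized the ROT13 pattern fails to generalize to.

    def caesar(text: str, shift: int) -> str:
        """Shift each letter by `shift` positions, preserving case."""
        result = []
        for ch in text:
            if ch.isalpha():
                base = ord('A') if ch.isupper() else ord('a')
                result.append(chr((ord(ch) - base + shift) % 26 + base))
            else:
                result.append(ch)
        return "".join(result)

    print(caesar("Hello", 13))  # "Uryyb" -- the memorized ROT13 case
    print(caesar("Hello", 7))   # "Olssv" -- same algorithm, different parameter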
Now compound that across every intelligent-reasoning problem in the entirety of human existence and you'll see why we will never have enough data to reach AGI with this architecture and training paradigm.
But we will get higher- and higher-fidelity maps of symbolic reasoning patterns as these models suck up all the agent usage data from knowledge-work tasks. Hopefully your tasks fall outside the distribution of the median training data.