In high school math class, our teacher swapped out all the symbols in the epsilon-delta definition of limits and asked us what the equation expressed. Many students struggled to interpret it.
I don't think this test shows that an LLM doesn't "understand". It shows more that it has failure modes similar to those of humans.
Well, first of all, I think there is more implicit data encoded in the symbols of the epsilon-delta definition of limits. In the Mealy example they are really just labels for arbitrary sets. The LLM actually failed a much simpler relabeling exercise. Setting that aside, I still think the analogy is flawed.
The student is mid learning process, and it's entirely reasonable for them to be relying on pattern recognition until they have fully internalized the subject. The model is fully trained and should thus have internalized its understanding of the subject.
Additionally, the student can update their understanding when pattern recognition fails. The model is fully cooked and will never do more than pattern recognition.