Humans have the ability to ignore things, to largely forget them after a short scan, to prioritize what's actually important, and so on. But to an LLM, a token is a token.
There are attempts at doing something similar with analysis passes over the context - roughly what things like auto-compaction are doing - but I'm sure anyone who has used the current generation of those tools will tell you they're very much imperfect.
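For what it's worth, the core idea behind those compaction passes fits in a few lines. This is a toy sketch (whitespace-split "tokens", a stand-in `summarize` callback I'm inventing for illustration), not how any real tool implements it:

```python
def compact(messages, budget, summarize):
    """Naive auto-compaction: if the transcript exceeds the token budget,
    fold the oldest messages into one summary message and keep the rest."""

    def tokens(msgs):
        # crude stand-in for a real tokenizer: count whitespace-separated words
        return sum(len(m.split()) for m in msgs)

    if tokens(messages) <= budget:
        return messages

    # keep the newest messages that fit in half the budget
    kept = []
    for m in reversed(messages):
        if tokens(kept) + len(m.split()) > budget // 2:
            break
        kept.append(m)
    kept.reverse()

    # everything older gets collapsed into a single summary message
    old = messages[: len(messages) - len(kept)]
    return [summarize(old)] + kept
```

The imperfection the parent describes lives entirely in `summarize`: whatever the analysis pass drops is gone for good, and it has to guess at relevance before the model ever needs it.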
Isn't the purpose of self-attention exactly to recognize the relevance of some tokens over others?
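Mechanically, yes: each token's query is scored against every key, and a softmax turns the scores into relevance weights. Here's a minimal sketch of scaled dot-product attention weights with toy hand-picked vectors (plain Python, not a real model):

```python
import math

def softmax(xs):
    # numerically stable softmax: subtract the max before exponentiating
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    # scaled dot-product scores for one query against all keys
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    return softmax(scores)

# toy example: the query lines up with the second key,
# so that key gets the largest share of the weight
q = [1.0, 0.0]
ks = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
w = attention_weights(q, ks)
```

The catch relative to human skimming: the weights are soft and sum to 1, so "irrelevant" tokens are down-weighted per layer, not discarded - they still occupy context and still cost compute on every pass.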
The “a token is a token” effect makes LLMs really bad at some things humans are great at, and really good at some things humans are terrible at.
For example, I quickly get bored looking through long logfiles for anomalies, but an LLM can highlight those super quickly.