So humans can get pen and paper and sleep and rest, but LLMs can't get files and context resets?
Give the LLM a tool that records instructions to files and looks them back up, instead of holding everything in the context window, plus the ability to actively manage its own context (write out a summary and start fresh), and I think you'd find it could probably do the job about as reliably as a human.
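Concretely, I'm picturing something like this rough sketch (the tool names, the file layout, and the way the reset is wired in are my own assumptions for illustration, not any particular framework's API):

```python
# Sketch of "notes + context reset" tools an agent loop could expose to the model.
from pathlib import Path

NOTES_DIR = Path("agent_notes")  # hypothetical on-disk scratch space
NOTES_DIR.mkdir(exist_ok=True)

def save_note(name: str, text: str) -> str:
    """Record instructions/findings to a file instead of keeping them in context."""
    (NOTES_DIR / f"{name}.md").write_text(text)
    return f"saved {name}"

def read_note(name: str) -> str:
    """Look instructions back up when they're needed again."""
    path = NOTES_DIR / f"{name}.md"
    return path.read_text() if path.exists() else f"no note named {name}"

def reset_context(summary: str) -> list[dict]:
    """The 'sleep' step: persist a handoff summary, then seed a fresh conversation with it."""
    save_note("handoff", summary)
    return [{"role": "system",
             "content": "Resuming after a context reset. Prior state:\n" + summary}]
```

The point isn't the specific code, just that "pen and paper" for an LLM is a couple of file tools, and "sleep" is writing a handoff note and starting a fresh context from it.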
Context is basically "short-term memory". Why do you set the bar higher for LLMs than for humans?