Pseudocode for LLM inference:
while (sampled_token != END_OF_TEXT) {
probability_set = LLM(context_list)
sampled_token = sampler(probability_set)
context_list.append(sampled_token)
}
LLM() is a pure function. The only "memory" is context_list. You can change it any way you like and LLM() will never know. It doesn't have time as an input.
As opposed to what? There are still causal connections, which feel sufficient. A presentist would reject the concept of multiple "times" to begin with.