> statistically most likely token.
Statistically most likely in what context, given which preconditions? Because each prompt sequence is unique so the probability of any token following it is unknown.
It’s not unknown because that’s what the model computes. It’s matrix multiplication just like shaders.
It’s not unknown because that’s what the model computes. It’s matrix multiplication just like shaders.