Don't LLMs create the same outputs based on the same inputs if the temperature is 0? Maybe I'm just misunderstanding.
Unfortunately not. Various implementation details like attention are usually non-deterministic. This is one of the better blog posts I'm aware of:
https://thinkingmachines.ai/blog/defeating-nondeterminism-in...
Unfortunately not. Various implementation details like attention are usually non-deterministic. This is one of the better blog posts I'm aware of:
https://thinkingmachines.ai/blog/defeating-nondeterminism-in...