> I think it has more to do with LLMs being statistical models than with human creativity lacking in the input. The creativity and millions of voices and tones may be there, but since these models tend to go for the most likely next words, polishing this away becomes a feature.
I have always thought this is a rather misguided view of what LLMs do, and indeed of what statistical models are. When people describe something as 'just statistics', I feel like they have a rather high-school-ish view of what statistics represents and are transferring this simplistic view to what is going on inside an LLM. Notably, these models do not find the most probable next word. They find the probability of every word that could come next. That is a far richer signal than most people imagine.
And ultimately it's like saying that human brains are just chemical bonds changing and sometimes triggering electrical pulses that cause some more chemicals to change. Complex arrangements of simple mechanisms can produce human thought. Pointing at any simple internal mechanism of an entity, without taking the structural complexity into account, would force you to conclude that both AI and humans are incapable of creativity.
Transformers are essentially multi-layer perceptrons with an attention mechanism attached to move information to where it is needed.
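The "full distribution" point is easy to see in code. This is a toy sketch with made-up logits and a four-word vocabulary, not any real model's output; the shape of the idea is the same, though:

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability, then normalize to sum to 1.
    z = np.exp(logits - np.max(logits))
    return z / z.sum()

# Made-up scores over a tiny vocabulary, standing in for a model's output layer.
vocab = ["cat", "dog", "the", "ran"]
logits = np.array([2.0, 1.5, 0.5, -1.0])

probs = softmax(logits)
# The model does not merely identify the argmax ("cat"); it assigns a
# probability to every candidate token. That whole vector is the signal.
for token, p in zip(vocab, probs):
    print(f"{token}: {p:.3f}")
```

A real model does this over a vocabulary of tens of thousands of tokens at every step, so "just picks the likeliest word" undersells what is being computed.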
> They find the probability of every word that could come next.
If we're being pedantic, they find a* probability for every token (which is sometimes a word) that could come next.
What actually ends up being chosen depends on what the rest of the system does, but generally it will just choose the most probable token before continuing.
* Saying *the* probability would be giving a bit too much credit. And calling it a probability at all, when most systems would be choosing the same word every time, is a bit of a misnomer as well. During inference the number generally functions as a priority, not a probability.
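The distinction above can be sketched concretely. Under greedy decoding the scores act as priorities (the top one always wins), while under temperature sampling they act as a distribution. This is a minimal illustration with made-up logits; real systems typically add further filtering such as top-k or top-p:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    # Numerically stable softmax over a vector of scores.
    z = np.exp(logits - np.max(logits))
    return z / z.sum()

# Toy scores over a tiny vocabulary (illustrative numbers only).
vocab = ["cat", "dog", "the", "ran"]
logits = np.array([2.0, 1.8, 0.5, -1.0])

# Greedy decoding: the score is a priority, and the top token always wins,
# so the "probability" of everything else never matters.
greedy = vocab[int(np.argmax(logits))]

# Temperature sampling: treat the scores as a distribution and draw from it.
def sample(logits, temperature=1.0):
    return vocab[rng.choice(len(vocab), p=softmax(logits / temperature))]

print(greedy)          # deterministic: always the top-scored token
print(sample(logits))  # stochastic: usually the top tokens, sometimes others
```

With greedy decoding the output is identical every run, which is why calling the numbers "probabilities" feels off; sampling is what actually treats them as one.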