Hacker News

DavidSJ — yesterday at 9:37 AM

Yes, the actual LLM returns a probability distribution, which gets sampled to produce output tokens.

[Edit: but to be clear, for a pretrained model this probability means "what's my estimate of the conditional probability of this token occurring in the pretraining dataset?", not "how likely is this statement to be true?" And for a post-trained model, the probability really has no simple interpretation other than "this is the probability that I will output this token in this situation".]
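The mechanism being described can be sketched in a few lines: the model's final layer produces a vector of logits, softmax converts those into a probability distribution over the vocabulary, and the next token is drawn from that distribution. This is a minimal illustration, not any particular model's API; the logits and 4-token vocabulary are made up.

```python
import math
import random

def softmax(logits):
    """Turn raw logits into a probability distribution."""
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(logits, temperature=1.0, rng=random):
    """Sample one token index from the distribution over the vocabulary."""
    probs = softmax([x / temperature for x in logits])
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical logits over a tiny 4-token vocabulary.
logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax(logits)          # sums to 1.0
token = sample_next_token(logits)
```

Lower temperatures sharpen the distribution toward the highest-logit token; temperature 0 (in practice, argmax) makes the output deterministic.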


Replies

mr_toad — yesterday at 12:51 PM

It’s often very difficult (intractable) to come up with a probability distribution of an estimator, even when the probability distribution of the data is known.

Basically, you’d need a lot more computing power to come up with a distribution of the output of an LLM than to come up with a single answer.
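The cost asymmetry can be illustrated with a toy stand-in for a stochastic generator: one call yields one answer, but characterizing the distribution over answers requires many repeated calls. The `noisy_model` function and its weights here are entirely hypothetical.

```python
import random
from collections import Counter

def noisy_model(rng):
    # Stand-in for one stochastic LLM generation (made-up weights).
    return rng.choices(["yes", "no"], weights=[0.7, 0.3], k=1)[0]

rng = random.Random(0)

# A single answer costs one call...
one_answer = noisy_model(rng)

# ...but an empirical distribution over answers costs many.
empirical = Counter(noisy_model(rng) for _ in range(1000))
```

With 1,000 calls the empirical frequencies approximate the underlying 70/30 split; a real LLM would need that many full forward passes (or more) per prompt to get the same picture.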

podnami — yesterday at 9:41 AM

What happens before the probability distribution? I’m assuming things like alignment or other factors would influence it?
