Hacker News

podnami (yesterday at 9:41 AM)

What happens before the probability distribution? I'm assuming alignment or other factors would influence it?


Replies

DavidSJ (yesterday at 9:47 AM)

In microgpt, there's no alignment step. It's all pretraining (learning to predict the next token). Production systems, by contrast, go through post-training, often with some form of reinforcement learning, which modifies the model so that it produces a different probability distribution over output tokens.

But the model's "shape" and computation graph don't change as a result of post-training. All that changes are the weight values in the matrices.
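A minimal sketch of that last point (this is illustrative numpy, not microgpt's actual code): a fixed computation graph maps a hidden state to a softmax over tokens, and a "post-training" update nudges the weight values toward preferring one token. The preference signal here is a hypothetical stand-in for RL; the thing to notice is that the weight matrix keeps its shape and the forward pass is byte-for-byte the same function, yet the output distribution shifts.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d = 5, 8

# "Pretrained" weights: one linear layer from hidden state to logits.
W = rng.normal(size=(d, vocab))

def next_token_probs(h, W):
    # The computation graph: identical before and after post-training.
    logits = h @ W
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

h = rng.normal(size=d)                 # some fixed hidden state
p_before = next_token_probs(h, W)

# "Post-training": gradient ascent on log p(token 0), a toy stand-in
# for an RL / preference signal that rewards one output.
target = np.eye(vocab)[0]
for _ in range(200):
    p = next_token_probs(h, W)
    W += 0.1 * np.outer(h, target - p)  # grad of log-prob w.r.t. W

p_after = next_token_probs(h, W)

assert W.shape == (d, vocab)   # shape unchanged: only the values moved
print("before:", p_before.round(3))
print("after: ", p_after.round(3))
```

Running it shows the probability mass concentrating on token 0 while `W.shape` and the forward function are untouched, which is exactly the "only the weights change" claim above.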