I think people are misunderstanding the relationship between reward functions and LLMs.
LLMs don't actually have a built-in reward system the way some other ML setups (e.g. an RL agent with an explicit reward signal) do.
They're trained with one, and in the case of DPO you can even say the trained policy encodes an implicit one.
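For context on that last point: the DPO paper derives that the trained policy corresponds to a reward of the form r(x, y) = β · log(π_θ(y|x) / π_ref(y|x)), up to a term that cancels when comparing two responses to the same prompt. A minimal sketch of computing it, assuming you've already summed the response-token log-probs under the policy and a frozen reference model (the function name and β value here are just illustrative):

```python
import torch

def dpo_implicit_reward(policy_logprob: torch.Tensor,
                        ref_logprob: torch.Tensor,
                        beta: float = 0.1) -> torch.Tensor:
    """Implicit reward from DPO: r(x, y) = beta * log(pi_theta(y|x) / pi_ref(y|x)).

    Inputs are the summed log-probabilities of the response tokens under the
    trained policy and the frozen reference model. The additive partition-
    function term is omitted since it cancels in pairwise comparisons.
    """
    return beta * (policy_logprob - ref_logprob)
```

So the reward isn't a separate component sitting inside the model; it's something you can recover by comparing the policy's log-probs against the reference model's.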