Hacker News

fauria, yesterday at 7:03 PM

RLHF: Reinforcement learning from human feedback - https://en.wikipedia.org/wiki/Reinforcement_learning_from_hu...