Hacker News

RLHF from Scratch

61 points by onurkanbkrc today at 11:39 AM | 2 comments

Comments

fauria today at 7:03 PM

RLHF: Reinforcement learning from human feedback - https://en.wikipedia.org/wiki/Reinforcement_learning_from_hu...

alansaber today at 2:11 PM

Looks good. I am a big advocate for these hands-on demos as the best way for beginners to learn ML.