Hacker News

cold_harbor, today at 11:17 AM

what's wild is they accidentally solved it: pretraining IS unsupervised learning at scale, RLHF IS reinforcement learning. they just didn't know the recipe yet


Replies

jmalicki, today at 12:06 PM

pretraining isn't unsupervised, it's self-supervised, meaning it is moderately more scale-limited.
