I think you are assuming training from scratch, which I doubt is happening here. Fine-tuning and RL,...

slashdave • yesterday at 5:49 PM • 0 replies • view on HN

I think you are assuming training from scratch, which I doubt is happening here. Fine-tuning and RL, especially based on synthetic feedback (coding skill, in particular) can be ongoing and is where these models obtain truly useful abilities.

alt Hacker News