Hacker News

slashdave, yesterday at 5:49 PM

I think you are assuming training from scratch, which I doubt is happening here. Fine-tuning and RL, especially when based on synthetic feedback (coding skill, in particular), can be ongoing, and that is where these models obtain their truly useful abilities.