Hacker News

jrmg · yesterday at 7:00 AM

I have the same worry about LLMs in general - I know that ‘model collapse’ seems to be an unfashionable idea, but when the internet’s just full of garbage (soon?…), what are we going to train these things on?


Replies

tehjoker · yesterday at 2:26 PM

They've moved away from relying on raw web text and now also train on verifiable synthetic data (e.g., math, games, code) to improve general reasoning.
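
The core idea behind "verifiable" synthetic data is that correctness can be checked mechanically, so no human labels (and no scraped web text) are needed. A toy sketch of that filtering loop, assuming a hypothetical model stand-in (`noisy_model_answer` and the other names here are illustrative, not any lab's actual pipeline):

```python
import random

def make_problem(rng):
    # Generate an arithmetic problem whose ground-truth answer
    # is known exactly by construction.
    a, b = rng.randint(2, 99), rng.randint(2, 99)
    return f"{a} * {b} = ?", a * b

def noisy_model_answer(truth, rng):
    # Stand-in for a model's proposed answer; wrong some of the time.
    return truth if rng.random() < 0.7 else truth + rng.randint(1, 9)

def build_dataset(n, seed=0):
    # Keep only (question, answer) pairs that pass a deterministic
    # check -- the verifiability that makes this data safe to train on.
    rng = random.Random(seed)
    dataset = []
    for _ in range(n):
        question, truth = make_problem(rng)
        proposed = noisy_model_answer(truth, rng)
        if proposed == truth:  # cheap, exact verification step
            dataset.append((question, proposed))
    return dataset

print(len(build_dataset(100)), "verified training pairs kept")
```

Because every kept pair is machine-checked, a pipeline like this sidesteps the garbage-in problem the parent comment worries about, at least for domains with an objective notion of correctness.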