logoalt Hacker News

stingraycharlestoday at 6:10 AM0 repliesview on HN

Underlying data changes all the time, as do training methodologies / preferences.

You do realize that these LLMs are trained with a metric ton of synthetic examples? You describe the kind of examples / behavior you want, let it generate thousands of examples of this behavior (positive and negative), and you feed that to the training process.

So changing this type of data is cheap to change, and often not even stored (one LLM is generating examples while the other is training in real-time).

Here's a decent collection of papers on the topic: https://github.com/pengr/LLM-Synthetic-Data