> There's precious little training material left that isn't generated by LLMs themselves.
> Percentage-wise this is quite exaggerated.
How exaggerated?
a) The percentage is not static, but continuously increasing.
b) Even if it were static, you only need a few generations for even a small percentage to matter.
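The compounding in (b) can be made concrete with a toy mixing model (my own illustration, not something claimed in the thread): assume each new training corpus mixes in a fixed fraction `p` of freshly LLM-generated text, so the human-origin share decays geometrically.

```python
# Toy model (an assumption for illustration): each generation's corpus
# mixes in fraction p of synthetic text, so the human-origin share
# after n generations is (1 - p) ** n.

def human_fraction(p: float, generations: int) -> float:
    """Share of human-origin text left after `generations` mixing steps."""
    return (1.0 - p) ** generations

# Even a modest 10% synthetic share per generation roughly halves the
# human-origin fraction within 7 generations:
for n in (1, 3, 7, 14):
    print(n, round(human_fraction(0.10, n), 3))
```

Under this (deliberately simplistic) assumption, a "small percentage" per generation still dominates the corpus after only a handful of generations, which is the point being made in (b).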
> You consider the above factor to lead to model collapse? You’ve only mentioned one factor here; that isn’t enough. I’m aware of the GIGO factor, yes. Still, there are at least ~5 other key factors needed to make a halfway decent scaling prediction.
What are those other factors, and why isn't GIGO sufficient for model collapse?