Hacker News

smallerize, today at 11:00 AM

I think it means most of the training data is short. And a lot of the long-context examples are conversations where the middle turns are less important.