
logicprog, yesterday at 10:46 PM

Also, that exact same study (why not cite the actual study? It's quite readable) found that the LLMs couldn't recite more than a small fraction of many other books, often ones just as well known[0]. In fact, the bar charts in the very news article you cited make it pretty clear that Sonnet 3.7 was a massive outlier, and so was Harry Potter and the Sorcerer's Stone, so that pairing really seems like an extremely unrepresentative example. If, outside that one outlier pairing, the LLMs couldn't recite even a small fraction of the other books, despite those being widely reproduced classics, why would we expect LLMs to regurgitate regularly, especially a relatively unknown open source project that probably hasn't been separately reproduced that many times?

Not to mention that, as the other commenters point out, regurgitation appears to just... not have happened at all in this case, so it's a moot point.

[0]: https://arxiv.org/pdf/2601.02671