logoalt Hacker News

sajithdilshantoday at 3:11 PM1 replyview on HN

What would have been more interesting is if LLMs were tested with questions where the direct solutions are not publicly available (so not in training data). In that case I wonder how much of hallucinations would happen or if it tries to connect dots with what’s available publicly and come up with a direct solution


Replies

christianstumptoday at 3:47 PM

I don't understand why you expect that an answer known to the researcher but which has never been published should be in the training data. You possibly missunderstand what these problems look like -- we made them all publicly available on the website, so please have a look: https://math.sciencebench.ai/benchmarks/benchmarks-in-leipzi...

show 1 reply