
nghnam, today at 9:04 AM

I’d be careful about reading too much into these numbers. The test only looks at cases where the model doesn’t know the answer, so it doesn’t show how often users will actually see hallucinations.
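
To put rough numbers on that, here is a quick sketch. Every figure in it is made up for illustration, not taken from the benchmark, and the function name is hypothetical. It also assumes, as a simplification, that hallucinations only happen on questions the model can't answer. Under that assumption, what users see is the benchmark rate scaled by how often they ask something the model doesn't know.

```python
def user_facing_hallucination_rate(p_unknown: float, p_halluc_given_unknown: float) -> float:
    """Fraction of all queries that end in a hallucination.

    Simplifying assumption: the model only hallucinates on questions
    it does not know the answer to.
    """
    return p_unknown * p_halluc_given_unknown


# Hypothetical benchmark result: the model hallucinates on 60% of the
# questions it does not know (this is the conditional rate the test reports).
benchmark_rate = 0.60

# Hypothetical shares of real-world queries the model cannot answer.
for p_unknown in (0.02, 0.10, 0.30):
    rate = user_facing_hallucination_rate(p_unknown, benchmark_rate)
    print(f"{p_unknown:.0%} of queries unknown: {rate:.1%} of all queries hallucinated")
```

The same benchmark score gives very different user-facing rates depending on the share of unknown questions, which the test doesn't measure.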