
simianwords yesterday at 8:23 PM

There's something off with this because Haiku should not be that good.


Replies

camgunz today at 6:55 AM

Hallucination benchmarks accept "I don't know" as an answer, which Haiku gave at least some of the time. Here are other benchmarks that corroborate this: https://suprmind.ai/hub/ai-hallucination-rates-and-benchmark...

rattray today at 1:36 AM

I've been very curious about that too. I wonder if it's actually much better at admitting when it doesn't know something, because it thinks it's a "dumber model". But I haven't played with this at all myself.

jwpapi yesterday at 8:37 PM

The hallucination benchmark is hallucinating