"Extraterrestrial life exists somewhere in the universe."
GPT-5.4: Misleading
Opus 4.7: Misleading
Gemini 3: FALSE
Gemini 3 (Retrieval): FALSE
Sonar Pro: FALSE
It's a weird fact claim, because the ground truth is "nobody knows for sure" and that's not one of the available options.
Of the available options, "Misleading" is probably the best, since something that is most likely true but unproven is presented as fact.
But "unknown or undecidable" should have been a category.
Looks like an ongoing theme and a very poor benchmark. Not at all the claims I expected.
I would think ‘false’ is the only correct answer, as there’s no evidence to prove the claim, so the claim is safely assumed false.
Then again maybe that’s why I’m an atheist, not an agnostic?
I would argue FALSE is the correct answer, since this is not a fact you can know for sure. The logical inverse is also FALSE.
> It's a weird fact claim, because the ground truth is "nobody knows for sure" and that's not one of the available options.
It's even weirder to suggest that the disagreement is indicative of a problem. If you asked five humans who are very knowledgeable on this subject to select the correct answer on a multiple-choice questionnaire, their answers would almost certainly vary significantly more than those of these 5 LLMs.
Not to say that hallucination isn't a problem, but this is a lousy way to test it.
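For what it's worth, the amount of disagreement among the five answers quoted at the top can be put into a number. This is only a rough sketch: the labels are copied from the thread, but `pairwise_agreement` is a made-up helper, and pairwise agreement is just one simple way to measure this, not something the benchmark itself reports.

```python
from collections import Counter
from itertools import combinations

# Labels exactly as quoted at the top of the thread.
labels = {
    "GPT-5.4": "Misleading",
    "Opus 4.7": "Misleading",
    "Gemini 3": "FALSE",
    "Gemini 3 (Retrieval)": "FALSE",
    "Sonar Pro": "FALSE",
}

def pairwise_agreement(answers):
    """Hypothetical helper: fraction of rater pairs that chose the same label."""
    pairs = list(combinations(answers, 2))
    return sum(a == b for a, b in pairs) / len(pairs)

counts = Counter(labels.values())
agreement = pairwise_agreement(list(labels.values()))

print(counts.most_common())  # label tallies, most frequent first
print(f"pairwise agreement: {agreement:.2f}")  # prints 0.40 (4 of 10 pairs match)
```

Running the same calculation on answers from five human experts would show whether they really do vary more, but that needs data nobody in this thread has.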