
conception today at 2:01 PM

This reminded me of an inadvertent benchmark I made between the big three AIs. I had them all research an old BBS I used to use, hunting for a DOOR game I played on it. I gave them the details about the BBS that I could remember and a pretty well-defined description of the game, and threw it at the deep researchers to see what they could find.

ChatGPT gave me about a ten-page report on who ran the BBS and the name of the game. When I looked into it, the game was totally different and the guy named had nothing to do with the BBS. “Since these were both popular items at the time, I just inferred.” But it had fabricated the entire report. Nothing in it was true.

Gemini did the same thing but the report was about twenty pages. 100% hallucinated.

Claude said it couldn’t find any information.

Best advertisement I’ve ever lived.

I still hunt for the door game today….


Replies

SubiculumCode today at 2:56 PM

The research reports can be useful where there is a lot of information, but they'll straight up paper over a lack of information with phrases like "emerging trends" while rephrasing a hypothesis as evidence.

wizzwizz4 today at 3:02 PM

You could ask on Retrocomputing Stack Exchange. These questions are on-topic: https://retrocomputing.stackexchange.com/tags/identify-this-....