Hacker News

arm32, today at 4:51 PM, 3 replies

The title got me, I'll admit it—except that the benchmark is a game where the models are told to lie.


Replies

TheMrZZ, today at 5:15 PM

Disclaimer: I work at Kradle.

They were never told to lie: one AI is given more information than the others, and the goal of the experiment is to understand how they're gonna leverage that advantage.

Indeed, the selfish (optimal?) strategy is to lie, yet some decide to tell the truth anyway. That's why it's an interesting benchmark! More info in the research article: https://kradle.ai/research/four-bridges (released before Fable)

forgot-my-pw, today at 5:06 PM

Had to Google this to learn more. For those who are interested: https://kradle.ai/research/four-bridges

peesem, today at 5:03 PM

it's unclear to me whether they were actually told to lie or just told to survive / convince others. either way it is somewhat coerced but i think there is still a difference
