Hacker News

isomorphic_duck | yesterday at 11:13 PM

If Claude Mythos and Fable 5 are the same underlying model, just with different safeguards, I fail to see how TerminalBench has them at different scores.


Replies

sothatsit | today at 12:05 AM

Refusals, presumably.