
throwa356262 yesterday at 6:59 PM

According to people who have access to Mythos, it is slightly worse than GPT-5.5-xhigh, at least for security tasks.

Hold on, I think this claim needs some hard data. Here you go gentlemen:

https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5...



aesthesia yesterday at 7:26 PM

See the later post testing a newer Mythos checkpoint, though: https://www.aisi.gov.uk/blog/how-fast-is-autonomous-ai-cyber...

ACCount37 yesterday at 7:24 PM

That claim keeps getting contradicted hard by other parties, who say Mythos beats 5.5 resoundingly on both autonomous search and discovery and the creation of complex exploit chains.

There might be a harness difference, but also, this CTF-type benchmark might not capture the capability difference fully.
