Those miss-rate numbers are genuinely eye-opening: dropping from 40% to 10% in what sounds like a single generation is no joke. That said, it's worth taking any vendor-adjacent benchmark with a grain of salt until the broader security community kicks the tires.
First you need to get through the safety net. I've had many productive gpt5.4 sessions hit an "ethicality" roadblock and pollute the context with multiple rounds of trying to convince the model to continue.
These plots are terrible. Why is categorical data connected across categories with lines? Why not just use bar plots?
For example, in the "Web Vulns in OSS" plot, white-box data for Opus 4.7 is not available, but the absurd linear interpolation across categories implies it should be near 60.
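A bar plot sidesteps the interpolation artifact entirely: a missing category just renders as an absent bar instead of an implied value on a line. A minimal sketch with matplotlib; the category names and numbers here are made up for illustration, not taken from the report:

```python
import matplotlib
matplotlib.use("Agg")  # headless rendering, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical detection rates per vuln category; None marks a
# condition that wasn't reported (e.g. no white-box run for one model).
categories = ["XSS", "SQLi", "SSRF", "Path Traversal"]
white_box = [72, 65, None, 58]
black_box = [55, 48, 40, 35]

x = np.arange(len(categories))
width = 0.35

fig, ax = plt.subplots()
# Only plot categories that actually have data: the missing value
# becomes a visible gap rather than an interpolated point.
wb_x = [xi for xi, v in zip(x, white_box) if v is not None]
wb_y = [v for v in white_box if v is not None]
ax.bar([xi - width / 2 for xi in wb_x], wb_y, width, label="white box")
ax.bar(x + width / 2, black_box, width, label="black box")
ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.set_ylabel("detection rate (%)")
ax.legend()
fig.savefig("vulns_bar.png")
```

Because each bar is tied to one category, there is simply nothing to draw where the data is absent, so no reader can mistake a gap for a ~60% score.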
why does this read like an openai ad?
Wasn't it already confirmed that small open-weight models were able to detect most of the same headline vulns as Mythos? How is this any different?
They say it's Mythos-like without actually comparing it to Mythos (fair enough, it's not public), but the bar for a model to be Mythos-like has to be producing as many novel, high-severity security vulns as those outlined in the Mythos red-team blog. I haven't seen any other lab produce a report like that yet. The proof is in the pudding.