samuelknight • yesterday at 11:58 PM • 0 replies

I don't know about Mythos, but the chart understates the capability of current frontier models. GPT and Claude models available today are capable of web app exploits, C2, and persistence in well under 10M tokens if you build a good harness.

The benchmark might be a good apples-to-apples comparison, but it does not show capability in an absolute sense.
