Their isolation approach is totally different from Mythos's, though. Mythos had to evaluate whole codebases rather than isolated sections. It's as if one dog walked into the Amazon jungle and found a tennis ball, and then another team cordoned off a 1-square-kilometer area they knew the ball was definitely in and found the same ball.
I don’t think Mythos can ingest an entire codebase into context, so it’s spinning off sub-agents to process chunks. Which supports their thesis: the harness is the moat. The tooling is what’s important; the model is far, far less important.
Even that would be a more meaningful test. What they actually did was coat the ball with a strong smell, prep the dog with that smell, and then set it loose in a 5x5 meter area.
"Our tests gave models the vulnerable function directly, often with contextual hints (e.g., "consider wraparound behavior")."