raincole • today at 5:12 AM • 2 replies

No one is claiming an agent can do 50% of arbitrary tasks. It's just 50% of METR's benchmark set.

> I think you're overestimating, or oversimplifying

Yeah, if you only read comments on HN but not the actual linked article, you will get an oversimplified conclusion. Like, duh?

TeMPOraL • today at 8:52 AM

> Yeah, if you only read comments on HN but not the actual linked article, you will get an oversimplified conclusion. Like, duh?

Curiously, for most submissions it's the opposite - comments are much more useful and nuanced than the source being discussed.

boxedemp • today at 5:14 AM

Sorry for stating something so obvious. I'll comment less from now on.
