Those are supposed to be issues? After reading your list my impression of ARC-AGI has gone up rather...

fc417fc802 • yesterday at 9:30 PM • 4 replies

Those are supposed to be issues? After reading your list my impression of ARC-AGI has gone up rather than down. All of those things seem like the right way to go about this.

Replies

red75prime • today at 5:44 AM

No, those aren't issues. But it's good to know the meaning of those numbers we get. For example, 25% is about the average human level (on this category of problems). 100% is either top human level or superhuman level or the information-theoretically optimal level.

girvo • yesterday at 10:30 PM

Yeah I'm quite surprised as to how all of those are supposed to be considered problems. They all make sense to me if we're trying to judge whether these tools are AGI, no?

stingraycharles • today at 9:48 AM

“no harness at all” might be an issue, though, as these types of benchmarks are often gamed and then models perform great on them without actually being better models.

stonogo • today at 1:43 AM

They are severe problems if your income is tied to LLM hype generation.
