Hacker News

dyauspitr yesterday at 11:09 PM

If anything, this makes it much harder for LLMs to score well on the test, which makes the scores they’re getting all the more impressive.


Replies

daveguy today at 2:13 PM

The scores they're getting are on the order of 0-1% for this ARC-AGI-3 benchmark.