If you read the charter of the eval (or any eval, really), this statement is pretty silly.
The whole point of each eval version is to identify a chunk of challenges that humans do well but AI can't. When AI gets to ~80, you move to the next chunk. When you run out of challenges, you have AGI.