Hacker News

tinthedev, today at 8:21 AM

You're misreading "test" here as a programming test, when it means a test of the model's capabilities.