Hacker News

today at 2:02 PM

"...Between April 1 and May 15, 2026, a group of 49 mathematicians compiled a dataset of research-level mathematics questions with known answers... We present the resulting collection of 100 questions....We evaluated these questions in three stages: a single attempt by five state-of-the-art LLMs....we concluded Stage 3 with only 2 unsolved questions. This demonstrates that the mathematical reasoning capabilities of LLMs are becoming impressive..."


Replies

rabidvermin, today at 2:16 PM

"mathematics questions with known answers..."

... that are therefore liable to be in the training data?
