> the spicy autocomplete can solve difficult open math problems
No it can't. It can't even solve my son's 4th grade math homework. (This is a real use case for me, not a dumb benchmark.)
You just know nothing about math and are happy to parrot bullshit AI salesmen are selling you.
> You just know nothing about math and are happy to parrot bullshit AI salesmen are selling you.
Not the parent poster here. I do know things about math. I wrote a few papers related to the unit distance problem (https://arxiv.org/abs/2311.10069, https://arxiv.org/abs/2406.15317) and spent quite some time trying to solve it. I had no chance of coming up with the proof that the spicy autocomplete came up with. Dumb benchmark, sure.
I would genuinely be interested in knowing what you're doing that led you to this conclusion.
I would be shocked if I was unable to solve 4th grade math homework with any of the contemporary frontier models. I spend most days using them to do significantly more complex things than that.
We're already long past that threshold.
Reasoning models with access to Python have been able to solve 4th grade math homework for over a year now. Prove me wrong: show me a 4th grade math problem they can't handle.
Terence Tao disagrees with what you're saying. I think he's in a slightly better position to speak on the subject.