> exaggeration and spin made by corporate marketing.
Corporate marketing spins and hypes, but this is ultimately a pretty academic and mathematical field. The loud LinkedIn promoters are not building these systems.
> if one looks into the details and think critically about what is being claimed, one can see that consumers are jumping to false conclusions.
well then help us out here: can you be specific? To me it sounds a lot like goalpost moving. You're telling me that in 2020 if I showed you a system that can solve an Erdos problem or disprove a conjecture (just recently showed up) you wouldn't be blown away?
> That said, it is very cool how an LLM helped human mathematicians in the recent specific Erdos problem solution announced by OpenAI. Just don't jump to the conclusion that anybody can input any Erdos problem into an LLM and a solution will come out the other end.
Woah woah, that's not the conclusion I'm jumping to. That's not at all how these headlines happen. Solving problems like this is almost prohibitively expensive today, and more often than not the attempts lead nowhere. The point I'm making is that today, 4 years since ChatGPT, we have systems that can solve them and have solved them. First we had things like AIME and IMO benchmarks, then people said "well those are just cheats in the training data, wait for it to solve a real math problem" -- ok but now we're solving real math problems.
> well then help us out here: can you be specific?
In "Remarks on the disproof of the unit distance conjecture" (https://arxiv.org/abs/2605.20695) I think Melanie Matchett Wood's remark is the most informative: "It is easy to jump to hasty conclusions, but what we can learn about humans, AI, and mathematics from this development is somewhat subtle. I believe if the level and type of human expertise that is represented on this note had been assembled to find a counterexample to this conjecture a month ago, and those people put in similar amounts of time working on it than they did to reading and thinking about ChatGPT’s solution, the mathematicians would have found a counterexample. However, without the claimed proof by ChatGPT, there is no particular reason anyone would have tried to look for a counterexample, assembled a group of experts with the appropriate expertise, or that the experts would have agreed to turn their attention to this problem."
Some readers might find some of the other remarks more appealing or more informative. I encourage folks to read these remarks rather than the OpenAI marketing video and spin.
> To me it sounds a lot like goalpost moving. You're telling me that in 2020 if I showed you a system that can solve an Erdos problem or disprove a conjecture (just recently showed up) you wouldn't be blown away?
I'm not sure what goalpost you are talking about. Regarding 2020: it depends on the framing, how much I know about the conjecture, and the details of the computer system. I can easily imagine not being blown away. But I don't really see the relevance of our emotional reactions to computers doing new things we've never seen computers do before. If the goalpost is being "blown away" by what computers can do, then I think that happened around 1990, when I heard a computer program generate a vaguely human-sounding voice. In math, I think it happened around 2000, when I saw Mathematica simplify a huge, nasty, complicated algebraic expression. I've been "blown away" by new things computers can do many times over the past 36 years.
> Woah woah, that's not the conclusion I'm jumping to. ... ok but now we're solving real math problems.
Sounds like we are in agreement then that (1) LLMs cannot solve just any given Erdos problem and (2) computers are solving more real math problems than they were before.
I honestly do think what the OpenAI group did with an LLM recently is a new milestone worthy of attention if one is interested in computer-assisted mathematics. I don't mean to diminish the LLM feat. I just mean to throw shade on the corporate marketing, language, slick video, and spin.