logoalt Hacker News

therobots927today at 2:26 AM6 repliesview on HN

It’s a well known problem in higher mathematics that even if you’ve solved a problem, often the proofs are incredibly long and complex and require an extensive amount of time spent by peers to review it.

It would be great if someone could explain to me how AI improves this situation. Even if AI thinks it’s solved a problem, unless the proof is incredibly efficient and well explained, it will be difficult to verify the correctness. One hallucination in 300 steps of logic is enough to destroy the entire proof.


Replies

hilbertseriestoday at 3:12 AM

In 2012 Mochizuki claimed to have proved the abc conjecture by developing a new branch of mathematics. He was a respected mathematician, but the theories he had developed were so complex no one could determine if he was correct. It took six years until two number theorists dissected the proof and found a fatal flaw in it.

jstanleytoday at 5:21 AM

This is what Lean is for: https://lean-lang.org/

If you have the LLM generate Lean code, and it compiles, then the proof is correct and you don't need to bother checking its working. (You still need to check that it is proving the theorem you asked it to prove).

show 1 reply
jonahxtoday at 3:01 AM

> It would be great if someone could explain to me how AI improves this situation.

It's main utility is in the search step, not the verification step. The search is the bulk of the work and creativity. Separately, as the sibling commenter pointed out, it will likely get better at the verification step as well, with integrations of tools like Lean.

> One hallucination in 300 steps of logic is enough to destroy the entire proof.

The situation with human mathematicians is not much different. Eg, Wiles original proof of Fermat's Last Theorem contained errors found by reviewers, which he later repaired.

show 1 reply
skipkeytoday at 2:49 AM

I would imagine that in the future AI will be doing proofs in Lean or whatever the successor to it, which gives you a pretty good confidence it’s correct.

throw310822today at 1:08 PM

> even if you’ve solved a problem, often the proofs are incredibly long and complex and require an extensive amount of time spent by peers to review it.

I hope mathematicians have the self-respect of not establishing rules such as "if a proof is longer than X, break it up in smaller ones, otherwise it'll be outright rejected".