Someday, human mathematicians might end up doing proofs for proofs. When a codebase gets too large...

jdw64 • today at 5:43 AM • 7 replies • view on HN

Someday, human mathematicians might end up doing proofs for proofs.

When a codebase gets too large, you eventually can't understand all of it. Even code I wrote myself, I can't fully grasp it.

In those cases, we usually write tests.

But when tests get too big, we end up writing tests for the tests.

Eventually, it feels like we're heading into an era of proofs for proofs.

For me, this problem usually unfolds like this:

1.I can't trust SDKs or Stack Overflow code.

2.So I write tests.

3.But I can't trust the tests either.

4.So I use test coverage, mutation testing, property testing, and fuzzing.

5.If that's still not enough, I add formal verification.

6.And then the problem becomes: can I trust the verifier?

That's how it ends up. Wouldn't human work shift toward verifying the verification systems themselves?

Replies

bodzioney • today at 6:13 AM

Interactive theorem provers are what verifies (or proves) the proof here. This means you don’t even have to look at the actual proof to check its correctness. You just have to make sure the theorem definition is what you wanted (not to say this is trivial) and that no nonsense axioms were defined etc. Now for verifying the theorem prover itself, this is kind of a chicken and the egg problem. We know the mathematical foundations are solid. But to check the implementation of said foundations would require… another theorem prover. In practice, most theorem provers try to make their kernel as small as possible so it can be reasoned about by humans. Coming back to the original topic, mathematics isn’t all proofs though. Someone has to come up with new theories and models. I guess this is what mathematicians would be doing in the future if AI becomes better at proving things than humans. But I could see it narrowing the field, just like it’s doing in others.

practal • today at 12:04 PM

I think 6) is a very good point. The simple reaction to it is, well, I just define a small verification kernel that I trust, and the rest is just scaffolding that does not need to be trusted in order to have full confidence in the verification. Of course, that is not really true, because how do you know that the data arrives properly at the kernel, and is properly read off the kernel? In practice though, the small verification kernel idea works very well. I don't think that this is the final stage of how these systems are designed, though. I think we need to model the full system inside the system, and verify it this way. We can trust this verification because we are verifying it with a system with a small verification kernel, but afterwards, we can replace the small kernel with the modules we have proven to be correct now.

mockerell • today at 6:00 AM

That’s an interesting way to think about it. While tests don’t satisfy mathematicians‘ standards for rigor one could instead look at interactive proofs from complexity theory. These are of interest if a problem doesn’t allow for short proofs, i.e. when the problem is not in NP [1]. In your scenario an adapted AI-assisted theorem prover would be the prover, and a mathematician the verifier.

[1] https://en.wikipedia.org/wiki/Interactive_proof_system

➕ show 1 reply

captainbland • today at 12:15 PM

I guess the question becomes about if we are meaningfully capable of doing this kind of verification any better than just adding another layer of automation

anon-3988 • today at 6:24 AM

What _should_ happen is that we create abstraction. We create a proof for a concept or algorithms. Then we can move on. We need to develop artifacts that can carry proof. We cannot keep writing piles and piles of texts and Python code to build infrastructure. We would be rewriting the same thing 1000x.

teiferer • today at 7:31 AM

> proofs for proofs.

What does that even mean? Sorry, this is just a nonsensical term.

The issue is not that the proofs could be wrong. It's that humans don't understand them even if they are verifibly correct.

In contrast, with software you don't know if it's correct. That's what you have a test for. Even if you understand it, there could be a bug. And tests could have bugs too, so you can have tests for tests. But proofs that are verified are correct. That's it.

Imagine you have a program that is formally proved to be correct. You don't need a test for it. However you might not understand it. Having a test or not does not change that.

TL;DR: Correctness and understandability are (mostly) independent properties.

➕ show 2 replies

Rekindle8090 • today at 6:33 AM

[dead]

alt Hacker News

Replies