I wonder if AI systems like ChatGPT will be helpful in this area. Something that...

Taikonerd · on April 10, 2023

I've had the same thought. Formally proven code can give powerful security assurances (cf. Project Everest[1]), but it's also very, very labor-intensive. I've heard rules of thumb like, "100x as much time to prove the software correct as to write it in the first place."

If LLM systems are going to give us a virtual army of programmers, I think formally proving more systems software (device drivers, browser engines, etc.) would be a great use of their time.

[1]: https://www.microsoft.com/en-us/research/blog/project-everes...

AlotOfReading · on April 10, 2023

I've had extremely poor results attempting to get current LLMs to generate proofs, for code or otherwise.

ogogmad · on April 10, 2023

Combining it with AutoGPT (or the ideas it's based on) and a formal prover like Lean might be the answer. Have you tried?

Basically, if you don't allow GPT to iteratively write, execute, criticise and then correct its code, you won't get good results.
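
To make that loop concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `refine_until_checked`, `toy_generate`, and `toy_check` are hypothetical names, the generator stands in for an LLM call, and the checker stands in for running a real prover such as Lean. It is not how AutoGPT or any particular tool is implemented.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces for illustration; not a real library API.
Generator = Callable[[str, Optional[str]], str]  # (goal, feedback) -> candidate proof
Checker = Callable[[str], Tuple[bool, str]]      # candidate -> (accepted, error message)


def refine_until_checked(
    goal: str,
    generate: Generator,
    check: Checker,
    max_rounds: int = 5,
) -> Optional[str]:
    """Ask for a candidate, check it, and feed the error back until it passes."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        candidate = generate(goal, feedback)
        accepted, message = check(candidate)
        if accepted:
            return candidate
        feedback = message  # the checker's complaint becomes the next prompt's context
    return None  # give up after max_rounds failed attempts


def toy_generate(goal: str, feedback: Optional[str]) -> str:
    # Stand-in for an LLM call: the first guess is wrong, the retry is right.
    return "by simp" if feedback is None else "by rfl"


def toy_check(candidate: str) -> Tuple[bool, str]:
    # Stand-in for invoking a prover: accepts exactly one fixed proof script.
    if candidate == "by rfl":
        return True, ""
    return False, f"tactic failed: {candidate}"


result = refine_until_checked("2 + 2 = 4", toy_generate, toy_check)
print(result)
```

The point of the design is that only the checker decides what is accepted, so the generator can be as unreliable as it likes without a bad proof slipping through.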

rqtwteye · on April 10, 2023

I think it probably needs to be a specialized combination of LLM and hardcoded knowledge.

deterministic · on April 14, 2023

CompCert is a formally verified C compiler, proven correct by a very small team in a few years. It is used by Airbus for avionics software. GCC, after 25 years of work, still has bugs.

markusde · on April 10, 2023

Yeah, I'm cautiously excited about how AI and formal methods (FM) might work together. I don't think LLMs can ever be trusted to verify programs themselves, but anything that can reduce the annotation overhead for programmers is a super useful thing!

rqtwteye · on April 10, 2023

I don't think a pure LLM approach will work (or maybe it will, who knows?). I'm thinking more of a hybrid that combines an LLM or other AI with some hardcoded knowledge for this domain.

cartoonfoxes · on April 10, 2023

Most definitely. I've been playing with using ChatGPT to generate proof texts in Isabelle/HOL, since it lets me verify the correctness of the output before code generation.

stfutechbros · on April 10, 2023

ChatGPT is basically the opposite of formal verification...

It itself is not verified, and its results are so inaccurate that you need to verify them anyway.