I wonder if AI systems like ChatGPT will be helpful in this area. Something able to track all of a system's requirements and understand the code well enough to validate it against them.
I've had the same thought. Formally proven code can give powerful security assurances (cf. Project Everest[1]), but it's also very, very labor-intensive. I've heard rules of thumb like, "100x as much time to prove the software correct as to write it in the first place."
If LLM systems are going to give us a virtual army of programmers, I think formally proving more systems software (device drivers, browser engines, etc.) would be a great use of their time.
CompCert is a C compiler proven correct by a very small team in a few years. It's used by Airbus for avionics software. GCC, after 25 years of work, still has bugs.
Yeah, I'm cautiously excited about how AI and formal methods might work together. I don't think LLMs can ever be trusted to verify programs themselves, but anything that can reduce the annotation overhead for programmers is a super useful thing!
I don't think a pure LLM approach will work (or maybe it will, who knows?). I'm thinking more of a hybrid that combines LLMs or other AI with some hardcoded knowledge for this domain.
Most definitely. I've been playing with using ChatGPT to generate proof texts in Isabelle/HOL, since it lets me verify the correctness of the output before code generation.
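This is the key property that makes the workflow safe: the proof assistant's kernel checks every proof, so an LLM-generated proof that's wrong simply fails to check instead of slipping through. A toy sketch of the idea in Lean (rather than Isabelle, just for illustration):

```lean
-- Specification: addition on naturals is commutative.
-- Even if this proof script came from an LLM, Lean's kernel
-- only accepts it if every step is actually valid.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A proof by induction; a hallucinated tactic or wrong
-- rewrite here would be rejected at check time.
theorem my_zero_add (n : Nat) : 0 + n = n := by
  induction n with
  | zero => rfl
  | succ n ih => rw [Nat.add_succ, ih]
```

The division of labor is the same as in the Isabelle/HOL workflow described above: the model proposes proof text, and the trusted checker decides whether to accept it.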