Hacker News new | past | comments | ask | show | jobs | submit login

I wonder if AI systems like ChatGPT will be helpful in this area. Something that’s able to track all requirements of a system and understands the code enough to validate it against the requirements.



I've had the same thought. Formally proven code can give powerful security assurances (cf. Project Everest[1]), but it's also very, very labor-intensive. I've heard rules of thumb like, "100x as much time to prove the software correct as to write it in the first place."

If LLM systems are going to give us a virtual army of programmers, I think formally proving more systems software (device drivers, browser engines, etc.) would be a great use of their time.

[1]: https://www.microsoft.com/en-us/research/blog/project-everes...


I've had extremely poor results attempting to get current LLMs to generate proofs, for code or otherwise.


Combining it with AutoGPT (or the ideas it's based on) and a formal prover like Lean might be the answer. Have you tried?

Basically, if you don't allow GPT to iteratively write, execute, criticise and then correct its code, you won't get good results.


I think it probably needs to be a specialized combination of LLM and hardcoded knowledge.


CompCert is a proven correct C compiler proven correct by a very small team in a few years. Used by Airbus for avionics software. GCC after 25 years of work still has bugs.


Yeah, I'm cautiously excited about how AI and FM might work together. I don't think LLM's can ever be trusted to verify programs itself, but anything which can reduce the annotation overhead for programmers is a super useful thing!


I don't think a pure LLM approach will work (or maybe it will, who knows?). I am more thinking of a hybrid that combines LLM or other AI with some hardcoded knowledge for this domain


Most definitely. I've been playing with using ChatGPT to generate proof texts in Isabelle/HOL, since it lets me verify the correctness of the output before code generation.


ChatGPT is basically the opposite of formal verification...

It itself is not verified, and its results are so inaccurate that you need to verify them anyway.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: