
I've had the same thought. Formally proven code can give powerful security assurances (cf. Project Everest[1]), but it's also very, very labor-intensive. I've heard rules of thumb like, "100x as much time to prove the software correct as to write it in the first place."

If LLM systems are going to give us a virtual army of programmers, I think formally proving more systems software (device drivers, browser engines, etc.) would be a great use of their time.

[1]: https://www.microsoft.com/en-us/research/blog/project-everes...


I've had extremely poor results attempting to get current LLMs to generate proofs, for code or otherwise.
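
(For concreteness, here is the scale of thing I mean: a textbook Lean 4 lemma, with the statement, name, and proof below chosen purely as an illustration. Models routinely come back with invented lemma names or tactic syntax that doesn't parse.)

    -- Toy target: commutativity of Nat addition, proved by induction.
    -- Illustrative only; uses core lemmas (Nat.add_zero, Nat.succ_add, ...).
    theorem my_add_comm (a b : Nat) : a + b = b + a := by
      induction b with
      | zero => rw [Nat.add_zero, Nat.zero_add]
      | succ n ih => rw [Nat.add_succ, Nat.succ_add, ih]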


Combining the LLM with AutoGPT (or the ideas it's based on) and a formal prover like Lean might be the answer. Have you tried?

Basically, if you don't allow GPT to iteratively write, execute, criticise and then correct its code, you won't get good results.
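
A minimal sketch of that loop, assuming a hypothetical llm_complete() wrapper (standing in for whatever model API you use) and the lean binary as the checker, which exits nonzero when a file fails to elaborate:

    # Sketch of a write/check/criticise/correct loop for Lean proofs.
    import subprocess
    import tempfile

    def llm_complete(prompt: str) -> str:
        raise NotImplementedError  # hypothetical: call your LLM here

    def prove(goal: str, max_rounds: int = 5) -> str | None:
        prompt = f"Write a complete Lean 4 file proving:\n{goal}\n"
        for _ in range(max_rounds):
            candidate = llm_complete(prompt)
            with tempfile.NamedTemporaryFile(
                    mode="w", suffix=".lean", delete=False) as f:
                f.write(candidate)
                path = f.name
            # `lean <file>` elaborates the file; errors mean a bad proof.
            check = subprocess.run(["lean", path],
                                   capture_output=True, text=True)
            if check.returncode == 0:
                return candidate  # the proof type-checked
            # The "criticise" step: show the model the checker's output.
            prompt = (f"This Lean file failed to check:\n{candidate}\n"
                      f"Errors:\n{check.stdout}{check.stderr}\n"
                      f"Reply with a corrected file only.")
        return None

The key property is that the prover, not the model, is the source of truth: a candidate only counts as a proof if the checker accepts it.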


I think it probably needs to be a specialized combination of LLM and hardcoded knowledge.


CompCert is a C compiler that was proven correct by a very small team in a few years. It's used by Airbus for avionics software. GCC, after 25 years of work, still has bugs.
