Indeed, database vendors should aim to have their software "Jepsen certified".

cheeseprocedure · on April 22, 2015

Agreed!

As an example, HashiCorp included Jepsen test results in their Consul documentation: https://consul.io/docs/internals/jepsen.html

takeda · on April 22, 2015

As Kyle himself mentioned (I think it might even be in relation to this document), Jepsen cannot prove that your software is safe, only prove that it isn't.

He even called them out for modifying timeout value to pass the tests[1].

[1] https://aphyr.com/posts/316-call-me-maybe-etcd-and-consul

GordyMD · on April 22, 2015

> Jepsen cannot prove that your software is safe, only prove that it isn't.

This doesn't just apply to Jepsen but to all software tests. You only ever test a finite set of scenarios, so you can't really ever guarantee your software is 'safe'/bug-free - only that it does not fail in common/expected scenarios.

tel · on April 22, 2015

Depending on how large "software tests" is in your mind this can easily be not the case.

Kyle often mentions the tool TLA+ which enables complete and total formal analysis of certain systems. This can be a test which is complete and therefore a positive proof of correctness.

It's also not difficult to test smaller components of your system if you note that the state space here is small enough to be exhausted. This is a very important thing to look in to in serious, important components of a software system.

Finally, though it's futuretech today still, dependent types offer a general system for embedding proofs of correctness directly into code thus elevating positive proof to a computational artifact just like any other.

hibikir · on April 22, 2015

Type systems and proofs have their limitations, if just because for a proof to be worth anything, I need a perfect description of what I want, and that doesn't make any sense outside of trivial cases.

I once worked with a relatively well known, prove everything developer that you might know by name. He's written books and everything. He built an algorithm to make a distributed system of equal peers figure out when it needed to start more nodes, or could shut some of them down. He wrote a proof, in code. He wrote a paper. When in production, the system would not work as advertised, and he blamed it on other pieces, because the algorithm was proven correct! So the problem stayed there for months.

After he left the company, I decided to figure the problem out, so I read through the proofs, the paper, and the code: All the single letter variables you could possibly want. I figured out that yes, the algorithm was flawless, as long as every operation in the system was atomic and instantaneous. Instead of the proof, I built a small simulator that didn't have such flawed assumptions, and got the exact same behavior as the production system. So the proof was perfect, as long as we made assumptions that are impossible in our universe. And the entire algorithm was less than 200 lines of code.

So whenever we have a reality that is difficult to model (and let me tell you, distributed systems fit the bill), dependent types will not save you, haskell or no haskell. Proofs will always be limited by your assumptions.

So while the tools you mention are nice. They hit the same limits as everything else we build. Whether to write a proof in idris, use generative testing, or just some example testing, is really all a tradeoff, but you will never escape from bad specification, as all specifications are bad.

tel · on April 22, 2015

Sure, never indicated otherwise I hope!

But that said, the style of analysis is totally different in each case. It's not true that you can never prove correctness. It's just a certain option with certain tradeoffs.

And there are success stories! Tons of older ones back when proof was a major component of compute programming. More modern ones like validated C kernels and compilers.

throwaway832975 · on April 24, 2015

I don't know of any case in industry where TLA+ has been used to prove a spec correct. AFAIK it's only been used for model checking. Read the "Formal Methods at AWS" paper for details.