The question boils down to "what can an attacker learn by owning the server" and...

zeeboo · on March 27, 2019

You can know exactly what the attacker can learn because you can see ALL of the information that your client passes to the server by auditing ONLY the client. Your argument applies equally well to every single router or middle box on the internet, and it's just as wrong there.

You prove that it doesn't matter by assuming that keybase is running the most malicious code possible, auditing your client, and deciding that the system is still secure. This is what auditing the client means.

Additionally, to bring up this fact again because it has only been hand-waved away: Even if the server was open source, there is no guarantee they are running that code. Thus, there is no benefit to security until systems exist (somehow?) to prove the server is running the code you expect.

Double additionally: even IF you can prove that the server is running what you expect, how do you know that some box, after https is peeled off, but before the request makes it to the server, is not sending the same request off to some other, malicious, server?

josh2600 · on March 27, 2019

I am going to say this one more time because I think it's a real point and I think you're dismissing it out of hand is unreasonable: there are things that can be learned from the server.

It's one thing to tell people that you aren't logging anything. It's another thing to show everyone you're not logging anything except the account creation date and last access date by open sourcing the software and then show exactly that in a response to a national security letter: https://www.aclu.org/open-whisper-systems-subpoena-documents.

zeeboo · on March 27, 2019

Did you notice how your proof rests entirely on the NSA letter, and not the source code of the server at all? Isn't a world conceivable where they open sourced the server, with no logging, and then sent an NSA letter that contained information that wasn't logged in the open source code? If this is somehow impossible, please explain how.

josh2600 · on March 27, 2019

Did you notice how you can compare the NSA letter to the source code and realize the effect is that they're the same?

If you didn't have the NSA letter, would you be able to verify the source code? If another project got an NSA letter and responded to it, would it tell you anything about the source code?

This is simple: Having the source code means you get to learn more from the other signals, no pun intended, of how that source code is used.

Again, as we move to a world where servers have more verifiable code running on them, the value of having open source code will increase.

zeeboo · on March 27, 2019

I don't understand your points about the NSA letters, which makes me think that my point was missed. I am saying that the NSA letter claiming that only some information was logged is fully independent of the open source code of the server. Assuming the NSA letter reflects the truth, there could be more information or less information that what appears to be collected from the open source server code because, once again, the server does not have to be running the open source code, and even if it were, that does not preclude other systems from running against the same information the server has access to. Hence, open sourcing the server does not affect the security of the system at all. If the system is insecure without knowledge of how the server works, then the system is insecure. Period.

I think you're trying to argue that open source is good, and I agree with you. Open sourcing the server has many benefits. The only point I have consistently been trying to make is that open sourcing it does not help with determining the security of the system, whatsoever.

edit:

> If you didn't have the NSA letter, would you be able to verify the source code?

No, but even if the code was open sourced, you would not be able to verify the code that is running.

> If another project got an NSA letter and responded to it, would it tell you anything about the source code?

It would tell you something about the code they are running, yes, but nothing about the code they open sourced.

> This is simple: Having the source code means you get to learn more from the other signals, no pun intended, of how that source code is used.

This is equally simple: the source code that is open may have nothing to do with the source code that is running, and you must assume that they are not equal when auditing the security of the system.

josh2600 · on March 27, 2019

Just to be extra clear, the chances of someone lying to the NSA in a letter are really, really low. Given that we can compare the response to the NSA to what is expected and it matches, we can make some inferences that the software running on the servers is as presented.

In contrast, if you received an NSA letter for keybase and they delivered similar information, you couldn't make any suppositions about the server's code.

To be extra, extra clear, to me, the future of the private internet is further verifiability of remote systems. That begins with Open Source. I concede that we aren't there for most parts of the systems we use today, but we are getting better (see attested contact discovery in Signal as one example).

zeeboo · on March 27, 2019

Why would I not be able to make inferences about the software the servers are running if the chances on lying on the letter is low? I haven't read Signal's source code, and yet I believe with just as much confidence that they aren't logging extra information as if keybase had sent the same NSA letter. To me, Signal's source code is effectively closed, and reading it wouldn't increase my belief. (Have you read all of their server's source code? If not, why do you justify your belief?)

The article on attested contact discovery states "Of course, what if that’s not the source code that’s actually running? After all, we could surreptitiously modify the service to log users’ contact discovery requests. Even if we have no motive to do that, someone who hacks the Signal service could potentially modify the code so that it logs user contact discovery requests, or (although unlikely given present law) some government agency could show up and require us to change the service so that it logs contact discovery requests.", which is exactly the point I'm making. They choose to solve it by signing code and ensuring that exactly that code is running (seems like they just move the trust to Intel. Hopefully SGX never has any bugs like https://github.com/lsds/spectre-attack-sgx or issues with the firmware, as noted by the Intel SGX security model document), which is fine, but an equally valid way to do this is to make it so that the secure operation of the system does not depend on what code the server is running.

Doing that has some tradeoffs: there's usually overhead with cryptography, or an algorithm you need may not even be possible (Signal disliked those tradeoffs for this specific algorithm), but for some algorithms, it's entirely possible to do. For example, one can audit OpenSSL's code base, and determine, regardless of what the middle boxes or routers do, that the entire system is secure. Just replace OpenSSL with keybase's client, and middle boxes with keybase's servers, and do the auditing. Hence, open sourcing the server is not necessary for security. Would it be great if more systems could be audited? Absolutely. Is it always necessary for security? Absolutely not.

edit: Another quote from the article: "Since the enclave attests to the software that’s running remotely, and since the remote server and OS have no visibility into the enclave, the service learns nothing about the contents of the client request. It’s almost as if the client is executing the query locally on the client device." Indeed, open sourcing the code running in the secure enclave is effectively open sourcing more code in the client.

josh2600 · on March 28, 2019

Just to be clear, code running on a remote server is not code running in the client. Just because the server attests to the client doesn’t mean the client is running that code. You still have to do all of the threat modeling for the attested code differently from the threat modeling for the client.

I’m not yet prepared to publicly get into all of the nuances of SGX, but I think it’s worth noting that there’s something very interesting happening there. I look forward to being able to discuss my team’s technical findings on the subject in public.

To summarize why this is so interesting: the attack surface is the whole system. Enclaves let us extend parts of our trust model to systems we don’t own. That is a real change and, if it works, it’s going to change how systems are designed at a deep level. The problem is that there aren’t very many working implementations of sgx in the Wild (signal is the only one I know of).

We’ll see where the wind blows.

zeeboo · on March 28, 2019

Enclaves are interesting, and I also look forward to all of the new things they allow. But all of that has nothing to do with open sourcing the server being important for security if given the ability to audit the client, and the client is not designed to require a cooperating server.

I'm tired of trying to get you to understand this point and have you respond with red-herrings and FUD. Please be intellectually honest when asking keybase to open source their server in the future, and don't claim that it's relevant to the security of the system.

josh2600 · on April 3, 2019

The openness of the code is always relevant to the security of the system. To pretend otherwise is dishonest.

Good day.

zeeboo · on April 3, 2019

I'll believe you once you tell me how the openness of a core internet router is important to the security of visiting a website over https. Good job keeping up the FUD!

josh2600 · on April 9, 2019

... because https only protects the payload in flight, not at rest, and not the origin and destination ip. It depends on your threat model.

Please don’t come back and tell me “oh but you can use a vpn”. We have yet to build a real vpn that can stand up to serious scrutiny.

Edit: to be clear, we might be able to build routers that actually are blind routing oracles, but we haven’t done that yet.