
I can elaborate on these points.

The service acts more like a key-value store (this is a simplified explanation, but for your questions it will do).

You give it a value, it gives you back a token, which you can later exchange for the original value.

This means the real value is stored in the encryption service, not in the receiving application's database. This gives us the flexibility to perform key rotation (and even upgrade our ciphers as the crypto landscape evolves) at any time without having to worry about where the encrypted value is being used, as the only data stored outside the service are opaque tokens.
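A minimal Python sketch of that token-exchange idea, for illustration only. It is not the actual service: `TokenVault`, `tokenize`, `detokenize` and `rotate_key` are hypothetical names, the store is an in-memory dict, and the XOR "cipher" is a toy placeholder standing in for a real authenticated cipher.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy XOR stream built from SHA-256. A placeholder only, NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    # zip() stops at the shorter input, so extra keystream bytes are ignored.
    return bytes(a ^ b for a, b in zip(data, stream))


class TokenVault:
    """Hypothetical stand-in for the encryption service described above."""

    def __init__(self) -> None:
        self._key = secrets.token_bytes(32)
        self._store: dict[str, bytes] = {}  # token -> encrypted value

    def tokenize(self, value: str) -> str:
        # The token is random, so it reveals nothing about the value and
        # the same value yields a different token on every call.
        token = secrets.token_urlsafe(16)
        self._store[token] = _keystream_xor(self._key, value.encode("utf-8"))
        return token

    def detokenize(self, token: str) -> str:
        return _keystream_xor(self._key, self._store[token]).decode("utf-8")

    def rotate_key(self) -> None:
        # Re-encrypt everything under a new key. Tokens held by other
        # applications are untouched, so callers never notice.
        new_key = secrets.token_bytes(32)
        for token, ciphertext in list(self._store.items()):
            plaintext = _keystream_xor(self._key, ciphertext)
            self._store[token] = _keystream_xor(new_key, plaintext)
        self._key = new_key
```

In this sketch a caller would store only the string returned by `tokenize`, and a token issued before `rotate_key()` still resolves to the same value afterwards.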

As for de-anonymizing, the service is not designed to take an encrypted value and return its token. If that were possible, we wouldn't have done a very good job encrypting it ;)



For de-anonymizing, the idea is to give the encryption service the plaintext and get a matching token. But then that would be more of a hash. If you are encrypting where all the tokens are different, you can't do a join or analysis. You can't, for instance, count how many unique phone numbers you have. And if a user is using your app, how do they see their PI data?


> If you are encrypting where all the tokens are different, you can't do a join or analysis.

That would hopefully be part of the reason for doing it this way.

I once worked on a system where we encrypted most customer data on registration and took it entirely offline once a day (so new data was in encrypted form online for a day, and then was air-gapped permanently).

The fact that marketing etc. had to request reports to be run manually on the air-gapped customer database was an important barrier that made them think about how they could meet their needs without it.

Sometimes, of course, they had genuine needs that required access to the unencrypted data, but it was rare.

I'm a big fan of making it take extra effort to do these things - time and resources seem to be a far stronger barrier than requiring authorization.


You're correct that it does make certain kinds of analysis more difficult.

However, that doesn't mean we can't ever get access to the original data. Most of our current BI needs can be met using the unencrypted data, but if we did want to answer your phone number question, for example, we could craft a special-purpose program to perform the analysis without compromising user privacy:

1. Select all phone number tokens

2. Decrypt

3. Produce counts (total unique, etc)

Said program would have to go through normal code review and approvals, and then be deployed into the secure zone (so it could access the encryption service).
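The three steps above could be sketched roughly like this in Python. This is an assumption-laden illustration, not the real program: `phone_number_counts` is a hypothetical name, `detokenize` stands for whatever call the encryption service actually exposes, and `fake_vault` is a plain dict used only so the example runs on its own.

```python
from typing import Callable, Iterable


def phone_number_counts(
    tokens: Iterable[str], detokenize: Callable[[str], str]
) -> dict[str, int]:
    """Decrypt each token and return only aggregate counts."""
    total = 0
    unique: set[str] = set()
    for token in tokens:          # 1. select all phone number tokens
        total += 1
        unique.add(detokenize(token))  # 2. decrypt (inside the secure zone)
    # 3. produce counts; the plaintext numbers never leave this function
    return {"total": total, "unique": len(unique)}


# Hypothetical stand-in for the encryption service: token -> phone number.
fake_vault = {"tok_a": "555-0100", "tok_b": "555-0101", "tok_c": "555-0100"}

counts = phone_number_counts(fake_vault.keys(), fake_vault.__getitem__)
print(counts)  # {'total': 3, 'unique': 2}
```

Only the counts are returned, so the report can leave the secure zone without carrying any decrypted values with it.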



