The Rise of Fully Homomorphic Encryption (acm.org)
285 points by yarapavan on Sept 29, 2022 | 123 comments



> Today, conventional wisdom suggests that an additional performance acceleration of at least another 1 million times would be required to make FHE operate at commercially viable speeds. At the moment, Cornami is the only commercial chip company to announce a forthcoming product that can meet and exceed that performance level.

Are there any comparative performance benchmarks for these Cornami chips on real-world algorithms? The data given by https://cornami.com/fully-homomorphic-encryption-fhe/ doesn't really help me.


Keep in mind that this article was written by Cornami, so I would take any assertions about Cornami solving all the problems with a huge heap of salt.


Does the ACM have submission standards anymore? They wrote the article and refer to themselves in the third person.


What an amazing coincidence.


I don't know anything about Cornami's products or where they are in the manufacturing stage. However, I do work in FHE.

To give you a sense of performance: today you can multiply 2 encrypted 8192-bit values in BFV with typical (not optimal) scheme parameters in 17ms on a single core of an M1 MacBook Air. This is the most expensive operation by a wide margin. The ciphertexts for these parameters are about 0.5MB and the keys are maybe a meg or two.

The algorithm you want to make fast for most schemes is the Number Theoretic Transform (NTT), which is basically a Fast Fourier Transform (FFT) over finite fields. This algorithm has only O(n log n) operations, so the computational intensity relative to memory accesses is fairly low. This stands in contrast to something nice like matrix multiplication, where the matrices hold O(n^2) entries but require O(n^3)[1] computation. Unfortunately, due to Amdahl's law, you have to make not just the NTT fast, but also all the other boring O(n) operations the schemes need to do.
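For intuition, here is a minimal, unoptimized Python sketch of the butterfly structure that gives the NTT its O(n log n) cost. The modulus p = 17 and root of unity are toy values chosen purely for illustration, nowhere near real FHE parameters:

    def ntt(a, p, root):
        """Iterative in-place NTT of a length-n list (n a power of two) mod p,
        where `root` is a primitive n-th root of unity mod p."""
        n = len(a)
        # Bit-reversal permutation
        j = 0
        for i in range(1, n):
            bit = n >> 1
            while j & bit:
                j ^= bit
                bit >>= 1
            j |= bit
            if i < j:
                a[i], a[j] = a[j], a[i]
        # log2(n) stages of butterflies -- this is where the O(n log n) work is
        length = 2
        while length <= n:
            w_len = pow(root, n // length, p)
            for start in range(0, n, length):
                w = 1
                for k in range(start, start + length // 2):
                    u = a[k]
                    v = a[k + length // 2] * w % p
                    a[k] = (u + v) % p
                    a[k + length // 2] = (u - v) % p
                    w = w * w_len % p
            length <<= 1
        return a

    # Example: p = 17, n = 8; 9 is a primitive 8th root of unity mod 17.
    print(ntt([1, 2, 3, 4, 0, 0, 0, 0], 17, 9))

Note how each element is touched only O(log n) times, which is why memory bandwidth rather than arithmetic tends to dominate.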

If you want to make FHE fast enough to justify an ASIC, you'll have to avoid data movement and basically keep everything in on-chip SRAM. Waiting 400 clock cycles for data is a non-starter. For algorithms with bootstrapping, your bootstrapping key might be 100MB, so you'll probably want a chip with like 512MB of on-chip memory to hold various keys, ciphertexts, etc. You then need to route and fan-out that data as appropriate.

You then also need to pack in a ton of compute units that can quickly do NTTs on-chip, but are also versatile enough to do all the other "stuff" you need to do in FHE, which might include multiplication and addition modulo some value, bit decomposition, etc. And you'll probably be doing operations on multiple ciphertexts concurrently as you traverse down an arithmetic or binary circuit (FHE's computational model). Figuring out the right mix of what an ALU is and how programmable it needs to be is tricky business.

For larger computations, maybe you stream ciphertexts in and out of DRAM in the background while you're computing other parts of the graph.

Making an FHE accelerator is neither easy nor cheap (easily a $50-100M+ investment), but I think it is possible. My SWAG is that you might be able to turn the 17ms latency into something like 50-100us, but with way more throughput to execute large circuit graphs.

[1]: Strassen algorithm git out of here


I feel like the real problem is that, much like QC people, HE people are selling it as "you can do some arbitrary computation [faster/completely securely]". Both QC and HE as currently theorized are extraordinarily limited in the kinds of computations that can be performed. In the case of HE, I cannot see how any of the schemes could be made to perform their quintessential example of an encrypted query applied to an encrypted database producing an encrypted result. Equivalently, in QC land you have Grover's algorithm making DB queries magically faster.

Old man shakes fist at clouds.


I understood about 20% of that but I really appreciate the comment.

Are there any companies doing pioneering work on this now? What aspects of FHE does your employer do? How would you say the future is looking for FHE?


We're building an FHE compiler[1] and an accompanying zero-knowledge proof (ZKP) library for proving things about encrypted quantities. As with much of today's cutting-edge crypto, we're targeting Web3 applications, as this is an area where we see immediate use cases. However, our compiler and ZKP libraries are stand-alone, so you can definitely use them in other applications.

My impression is that there are many parallels with computing at large, where custom hardware is becoming more and more prevalent. You can run arbitrary C programs with FHE by building a CPU out of binary gates and running on that, but it will run at 1Hz[2] emulated on 8 GPUs. So, computing fibonacci(5) takes like 16 minutes. Conversely, you could create an arithmetic circuit that does it in like 16us. However, working with circuits is hard, let alone with the additional requirements FHE imposes.
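For intuition on the circuit-vs-CPU gap, here's a trivial plaintext sketch of fibonacci(5) as a fixed, branch-free sequence of additions, which is the shape of computation (an arithmetic circuit with a public loop bound) that FHE schemes can evaluate directly; emulating a general CPU out of encrypted binary gates has to pay for every gate of the processor instead:

    # fibonacci as a fixed addition circuit: no data-dependent branches,
    # just one addition "gate" per unrolled step (shown here on plaintext ints).
    def fib_circuit(n):
        a, b = 0, 1
        for _ in range(n):   # n is public, so the loop fully unrolls into a circuit
            a, b = b, a + b
        return a

    print(fib_circuit(5))    # 5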

Today, our compiler lets you write Rust code that turns into an arithmetic circuit in the BFV scheme. It also manages parameter selection, which is another annoying part of FHE: choose parameters too small and decryption will fail due to excessive noise, but larger parameters slow down the computation and make ciphertexts larger.

Overall, FHE has a ton of promise, but it is currently in the chicken-and-egg phase: there isn't much commercially available because there isn't a market, because there isn't anything available. We're trying to be an egg and grow along with a market. FHE is a big area and there's a ton to explore, like multi-key encryption[3]. FHE is currently nascent, but I believe its future is bright where it can be appropriately used.

[1]: https://docs.sunscreen.tech/

[2]: https://www.usenix.org/conference/usenixsecurity21/presentat...

[3]: https://eprint.iacr.org/2020/180


If no one answers here, you might try the FHE.org Discord; loads of researchers there have probably written a paper on exactly that.


The subreddit /r/crypto is good quality as well (some researchers there).


Any chance of an IRC or Matrix bridge?


thanks! will try there


This feels more like a press release than an actually insightful article.

Would practical FHE be interesting? Sure. Is it happening? Doesn't seem like it is any time soon.


Our team has been working on making FHE practical. Performance has come a long way in the past few years so FHE can indeed be "practical" for certain applications.

If you'd like to check it out yourself, feel free to take a look at our team's FHE compiler and playground [0].

[0]: https://playground.sunscreen.tech/


I don't think it was just a press release; the linked PDF had a nice overview of how we got here and some advances in the last decade. A decent little review-type article with some hyperbole!

I think the title is maybe a little too optimistic / vague in saying it's "near" without indicating what else is needed to get there or when it might happen ;).


The byline is "Mache Creeger, Cornami Inc.". Cornami is a company that sells FHE accelerator chips. Many of the talking points seem to be very similar to the Cornami website, such as the over-emphasis on post-quantum crypto despite it being very irrelevant in context.


Somehow missed it, read before caffeine, good call


Doesn't fully homomorphic encryption have the Tux image problem cited in block cipher discussions?

With a symmetric cipher, I could figure out the blood type of every employee pretty easily. With an asymmetric cipher, I could figure out everyone who has my blood type, and the blood types of anyone who reveals that information.

If the point is to filter data when you aren’t allowed to know what the data is, then the act of being in the filter or not reveals some of that information. It’s just a game of twenty questions.


I think you are mixing it up with order-preserving encryption and other stuff related to encrypted databases.

In the FHE model, the assumption usually is - you have some data, someone else does some encrypted calculations, you get the encrypted answer back, you decrypt the answer and read it. The adversary cannot play 20 questions because they only calculate the encrypted answers, they are not allowed to see what the answers are.


Ah, right, I'd forgotten that bit. Dumb server, smart clients.


I think you are thinking of how ECB ends up with identical plaintext blocks having identical ciphertext, since each block is encrypted deterministically under the same key with no per-block randomness. I don't think this is a requirement for all forms of FHE.


In particular, most FHE schemes inherently add randomness to encryptions as an artifact of using Ring Learning with Errors (RLWE) for hardness. This means that Enc(pk, m) != Enc(pk, m) if you run the algorithm twice; each key and message pair can produce many different ciphertexts.
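To make that concrete, here is a toy (insecure, non-ring) LWE-style sketch in Python showing how fresh randomness makes two encryptions of the same bit under the same key come out different; real schemes like BFV do the analogous thing over polynomial rings with secure parameters:

    import random

    q = 2**16   # ciphertext modulus (illustrative)
    n = 8       # secret dimension (far too small to be secure)
    s = [random.randrange(q) for _ in range(n)]   # secret key

    def enc(bit):
        a = [random.randrange(q) for _ in range(n)]   # fresh randomness each call
        e = random.randrange(-4, 5)                   # small noise
        b = (sum(ai * si for ai, si in zip(a, s)) + e + bit * (q // 2)) % q
        return (a, b)

    def dec(ct):
        a, b = ct
        val = (b - sum(ai * si for ai, si in zip(a, s))) % q
        return 1 if q // 4 < val < 3 * q // 4 else 0

    c1, c2 = enc(1), enc(1)
    print(c1 != c2)           # True: same key, same message, different ciphertexts
    print(dec(c1), dec(c2))   # both still decrypt to 1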


It's always nice to see a new field that has managed to learn not to make every single classic blunder without having to experience it firsthand. So there is something akin to a salt in the data that keeps identical records from being searchable; that's good to know.

Do you by chance have a simple way to explain how the search works then? Because superficially it seems like you might assume that you're looking for Enc(pk, m') == Enc(pk, m) and apparently that does not work.


By search, I assume you mean how you would do the kind of database search with FHE referred to in the article. A simple example of private information retrieval is as follows:

Suppose Bob has an array of data he arranges into an mxn matrix, A. This data is not encrypted, but is encoded appropriately. Note that many FHE schemes allow you to compute ciphertext-plaintext operations.

Alice can send him 2 vectors x and y encrypted under her key, where x and y are all zeros except for a single 1. Bob homomorphically computes Ax = b. Since x is all zeros except for element i, the operation Ax effectively selects the ith column of A. Bob then computes dot(b, y). Since y is all zeros except for a 1 at element j, the dot product effectively selects the jth element of b. Bob sends the dot product back to Alice, which due to FHE is still encrypted under her key.

Alice decrypts the result and has looked up the (j, i)th element of A without Bob learning Alice's query or which data was involved in processing her search.
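Here is a plaintext simulation (Python/numpy) of just the selection arithmetic described above; in the real protocol x and y are encrypted under Alice's key and Bob evaluates the same products homomorphically via ciphertext-plaintext operations, never learning i or j:

    import numpy as np

    A = np.arange(12).reshape(3, 4)   # Bob's 3x4 database, plaintext but encoded
    i, j = 2, 1                       # Alice's secret column and row indices

    x = np.zeros(4); x[i] = 1         # one-hot column selector (encrypted in real PIR)
    y = np.zeros(3); y[j] = 1         # one-hot row selector (encrypted in real PIR)

    b = A @ x                         # Ax picks out column i of A
    result = b @ y                    # dot(b, y) picks out element j of that column

    assert result == A[j, i]          # Alice decrypts and reads A[j][i]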

The default program on the Sunscreen[1] compiler playground shows this exact algorithm.

Disclaimer: I am an employee of Sunscreen.

[1]: https://playground.sunscreen.tech/


Looks like some kind of ad that tries to discredit regular encryption by claiming that it's already compromised (it isn't), or that it will be very soon. But lo! Here is the knight in shining armour coming to the rescue (FHE)! Soon. Maybe.


No, the point of FHE isn't that regular encryption is already compromised. It's that you can do processing on encrypted data while it's encrypted, without decrypting it. This opens up many more possibilities. For example, a cloud provider might store your data only in encrypted form and you can still do queries to pick out particular data or do some basic analysis, with the algorithm running on cloud computers, the result delivered to you in encrypted form, which you then decrypt with your private key.

The only problem is that there's a large performance penalty still, though there has been major progress in making it more efficient.


The article does put quite a lot of emphasis on "soon broken" encryption algorithms, so can't blame kebman for that comment. It only mentions what you pointed out almost in passing.

Also not an expert here, but if "Valuable insights through AI (artificial intelligence), big data, and analytics can be extracted from data", then you'd be a fool to believe this will protect your privacy, right? Or am I missing something? I want encryption that protects me from corporations, not encryption that protects the data corporations have from us and ups their surveillance game. I guess it's no coincidence a lot of research seems to be done by M$.


If FHE could be made to work (and we are a long way off), the cloud provider would not be able to see your data even though you're doing processing on the data that remains on the cloud provider's CPUs, so in that sense it would protect you from Amazon/Google/Microsoft/plug in your cloud provider here.


Then how can they extract "insights" from the data if they can't see the data? At what point do the "insights" defeat the purpose of protecting your data from the cloud provider? Or are the claims in the article bogus?


fhe solves a ton of issues in saas products that don't look great under audit - things where we sign off on audit today with fancy contracts called "data privacy agreements." i think it will take some time (20 years?) but i expect zero knowledge for most of your data to be table stakes for saas offerings


SaaS is all about collecting, controlling, and exploiting your data, though. If they can still mine your data and leverage or sell the information they get out of it, that's not really "zero knowledge". I don't expect companies will stop being interested in making money at all costs in 20 years, especially when the costs are mostly to you and your privacy.


Actually saas is about providing services for money.


Money and control is what it's about. That usually means making someone dependent on you in order to access/use their own stuff, making it hard to migrate their data away from your service, and taking every advantage of the data being collected.

I've never seen a saas product that isn't using and/or "sharing" their customer's data for their own benefit somehow. If they exist at all, they're the exception and not the rule.


This is a completely wrong way to view SaaS. It's just about making money; the control part is just so they can try to squeeze more money out of you. Control is not the goal. Money is.


I guess that's fair... ultimately money is everything, but I do think there are absolutely companies who highly value the control aspect as well. It can give them the ability to censor, act as a gatekeeper, and nickel and dime.

SaaS seems a lot more predatory and risky than most products/services. You hand over money, you hand over control, you hand over your data, and all of it leaves you varying degrees of vulnerable.

I guess I shouldn't expect a pragmatic view of saas to be popular around here (some of you are likely working on your own saas projects after all), but the reasons saas is attractive for companies to offer are the same reasons that make me hesitate to use them.


Money is the goal of the individual Cogs in the machine. Control is the goal of the machine itself, which uses money to incentivize (power) its Cogs. If you're a Big Tech company you essentially have endless free money to leverage compared to your Cogs. The machine itself doesn't care about money. That free money train is coming to an end though over the next decade. Or at least that's the reality I'm planning for. :)


Typically that "service" involves doing something with the data server side (such as displaying it in a slick website).

FHE does have potential applicability here, but I think the potential is a bit overblown because there are a lot of devil-in-the-details issues.


This is a really misleading article. It skims over lots of practical issues with FHE, such as the cost of the extra work which will severely limit its applicability, and more critically, the necessity to use the same key(!!) to encrypt/decrypt every input and output. It also conflates FHE with quantum-resistant encryption, simply because most FHE algorithms currently use lattice-based math, and a 2006 paper observed that no quantum algorithms were known that could outperform traditional computing for lattice math. Not a very strong claim, imo.

It goes on to grossly overstate the extent to which current IT systems are at risk as well as the extent to which FHE would even address actual IT threats. Plus, anytime anyone claims something is “provably secure”, they are leaving out crucial parts of the system, like the interface with humans or key rotation.

And then there’s the part where the author is a VP of business development at a company that makes FHE hardware. Sigh.


Great write-up on the state of the field, but when I last checked, the main problem was performance. I didn't see much on that in the article.

A few years ago there were papers on evaluating simple logic circuits in an FHE context, and it took 2 hours for what was basically 5-6 NOR gates.


Performance is still problematic for many applications (particularly in ML).

Our team has been working on making FHE more accessible to engineers via a compiler; we've found usability to be a much bigger obstacle than performance.

You might be surprised to see how far performance has come! For (an admittedly small example of) matrix-vector multiplication, we can do key generation, encryption, computation, decryption, and compilation in less than 5 seconds on a MacBook [0].

[0]: https://playground.sunscreen.tech/


Fwiw, we have at least some reason to hope in this general context that, between clever systems work and tightening theoretical bounds via additional assumptions and clever reasoning, we might get to practical implementations for some applications.

As (maybe weak) evidence, see the progress on practical implementations of PCPs/SNARGs:

https://dl.acm.org/doi/pdf/10.1145/2641562


It's much, much faster now, and performance is improving 10x every couple of years. With the current trend, FHE will be applicable to 80% of use cases by 2025.


I am still amazed by the number of people who still conflate security and encryption-in-use with privacy. And the truth is that the whole privacy/security industry is doing nothing to change that.

Take this for example

"Valuable insights through AI (artificial intelligence), big data, and analytics can be extracted from data—even from multiple and different sources—all without exposing the data, secret decryption keys, or, if need be, the underlying evaluation code."

FHE gives you NO guarantee about the code that is running on the encrypted data. I can run a leaky AI model or a SELECT * on encrypted data and still get the output. What I can do (and that's assuming there is open-sourced, auditable code) is to make sure that anyone with hypervisor access on that machine cannot dump my data out during processing.

A very powerful concept for remote processing, supply chain security, and overall reducing trust; but completely unrelated to privacy.


> FHE gives you NO guarantee about the code that is running on the encrypted data. I can run a leaky AI model or a SELECT * on encrypted data and still get the output. What I can do (and that's assuming there is open-sourced, auditable code) is to make sure that anyone with hypervisor access on that machine cannot dump my data out during processing.

I might be misunderstanding, but I think this is misleading. Any code can be run, but the person running the code cannot see the results (or any side effects), so they cannot leak data.


Depends on what you mean by "person running the code".

If by person you mean the admin of the machine, then yes. If by person you mean the developer of the FHE-based application, then "maybe". If by person you mean the analyst who would in the end order an AI Python model to be executed through the FHE-based software on a machine, then no: that person will in the end get back human-readable results, be that a model or a table from an SQL DB running in FHE.


The way I have understood FHE is that any algorithm that would operate on the data would, by definition, be unable to produce any result that was intelligible to anyone except the person holding the original key. At no point during the execution of an FHE algorithm is the data decrypted. The amazing thing is exactly that the code running on the data does not understand the data it is consuming nor the data that it is producing.

Maybe someone who has actually studied homomorphic encryption can chime in.


That's true. But it will still produce some data, and that data will eventually be viewed by someone who owns the key to decrypt it. FHE tells you nothing about what this product should be. It could just as well be a full copy of the original data.

For example: I run an ML model using FHE on some data I shouldn't have access to in plaintext. The expected outcome of this workflow is an ML model trained on that data. FHE tells me nothing about the quality of this model. It could just as well be an overfit model that spits out all the sensitive data.


Sorry, but I still fail to see how that would be a problem, since the output of the program (e.g. the ML model parameters) would itself not be intelligible to you. To make _any_ (non-cryptanalytical) inference on the plaintext of the homomorphically encrypted data _necessarily_ requires that the attacker can, at some point, access or execute some classical code on the plaintext. This would obviously violate the "fully" part of FHE.

Edit: Okay so I might now understand you refer to a scenario where the user submits their data in homomorphic form to the cloud, where an AI model is trained on it. The AI model parameters are later returned to the user's device, which then decrypts them with the user's key and executes a classical model with those parameters, and then resubmits the user's data after processing with the said ML model (unencrypted) back to the cloud. It's true the user usually has no way of auditing the code / model that runs on their device, but isn't that rather easily alleviated by opening up the APIs for communicating with the cloud part of the service?


Close. The first part is fair.

A more real-life example.

I am a pharma company and I want to execute a query on some hospital data. The hospital doesn't want to give me the data in plaintext but they are fine with me getting some aggregate insights from their data that are not PII.

Now lets assume I decide to do that using FHE. I can now compute my query on the encrypted hospital data and I never see the plaintext data.

What do we "win" in this scenario? We can do this computation wherever we want because no matter where the computation is done, the data will be encrypted, so no risk for the infra provider to see that data.

What we don't "automatically win" in this scenario:

1. Guarantees that I am indeed running an SQL query on that data and not something else along with it -> that is only possible to guarantee if the FHE software is properly audited (same with any software tbh, but easier with FHE and similar techs because of the integrity guarantees due to encryption).

2. Guarantees that the SQL query I made will not leak patient data in the end (through linking additional data, or diff attacks) (same with any other SQL query).

People who are deep into these technologies will say "yes, of course, that's not an FHE problem." And that is true. But every FHE vendor I've seen blurs that difference by not specifying what kinds of attacks they protect against when they talk about "protecting privacy".

Heck, most of them don't even talk about the attestation process and how their clients can make sure they can trust the software running in encrypted form. Yes, these things hold true for all software, but the point (for me) of encryption in-use is to make sure we hold software to a higher trust standard than today, not just replace one trusted party with another.


This example is confusing because it's unclear who the trusted parties are and who you are trying to protect the data from. Quite frankly, this feels like you are mostly pointing out that FHE won't work if you use it incorrectly. Normal encryption won't work either if you give the bad guy your key.

> But every FHE vendor I've seen blur that difference by not specifying what kind of attacks they protect against when they talk about "protecting privacy".

Agree with you here. FHE is an impractical technology at this stage. I'm pretty sure all commercial FHE vendors are borderline scammers, and have a loose relationship with the truth.


For clarity, let's assume the hospital stores its records in plaintext. For the pharma company, the hospital encrypts the patient records with a secret key. Now they let the pharma company run their homomorphic algorithm and send the values back. The only problem is that the pharma company cannot read those results without having access to the key. FHE is completely redundant in this use case - the hospital could have simply run the pharma company's SQL itself and audited the code and outputs.

What is FHE actually good for then? Let's imagine you are a top secret agent and you get instructions to fly to Bulgaria as a part of your mission. You have other hostile agents constantly monitoring you, trying to understand your next move. But there's a problem - to buy a plane ticket to Bulgaria you need to know the name of its capital city. You can't just type it to Google, because these other agents have actually infiltrated the Google servers and can see everything you search (assume once you actually know the name of the capital, you can somehow buy the actual ticket without "them" knowing..)

Luckily though, CloudCorp offers a public homomorphic query service for all world capitals. This service allows you to send a query for the capital of any country over an intercepted connection and get back the result. Even if the hostile agents had infiltrated CloudCorp and were monitoring all your comms, they would not be able to know which country's capital you just queried. Not even CloudCorp could do that; you are the only person who knows what you asked and what the result was.

How such a service would be implemented is explained in good detail in this tutorial, complete with working code: https://github.com/homenc/HElib/tree/master/examples/BGV_cou...

P.S. The capital of Bulgaria is Sofia.


FHE only reveals information to the person who has the keys for it, not to arbitrary people in the middle. So if you had access to the keys for the input data, then you have access to the keys for the output data.


But the model itself is encrypted. You should assume a model trained on sensitive data at least partially includes the sensitive data and treat it as sensitive, which FHE intrinsically does. If you're saying that once you decrypt the model you need to keep treating it with sensitivity and not give it to untrusted compute in the plain, then yes, you're right. But that has nothing to do with FHE, because you stopped using FHE the moment you decrypted it. Whatever stupidity you do after generations of PhDs protected your data in untrusted compute using FHE is your stupidity alone and says nothing about FHE.

The other way I read what you're saying is that the holder of the model, after they decrypt it, may not be trusted with the model or the original data. But they hold the decryption key to both. So why did you share the key with someone you don't trust? That breaks the model too.


Wouldn't doing the same thing without FHE also result in the same problem?


Yeah, and many more. But I've seen multiple people argue that using FHE will magically solve all their privacy problems, and that's far from true. FHE (and similar technologies) solve a piece of that "puzzle" and most providers somehow gloss over the rest.


What I mean is that the only person who can learn anything about the data is the entity who possesses the decryption key.

In any sane deployment of FHE the key holder is the person who owns the data, not the app developer and not the person "running" the program.


> A very powerful concept for remote processing, supply chain security, and overall reducing trust; but completely unrelated to privacy.

It is still related to privacy, but the privacy "attacker" is the execution place, allowing for outsourcing of computations and storage without running into data leaks or violations of data protection laws.

Maybe you use a different definition of privacy?


You are right. It is related to privacy, but it's not the whole story. That's what I am trying to say. Running FHE will not magically solve the "data leakage" problem of your AI models, and I believe that the people/companies who don't make that distinction are misleading.


This isn't an area I know much about so I'll stay out of the main conversation, but would like to point out that considering you wrote "completely unrelated to privacy" in your first comment, following it with "It is related to privacy, but its not the whole story. Thats what I am trying to say." makes me unsure what your point is or if you actually mean or understand it. Sorry for being blunt.


You are right to be blunt. I didn't want to define the separation of input and output privacy in a comment. In retrospect maybe I should have, but I can't edit anymore. It is, however, common HN practice to pick out the words that suit someone and construct a very specific argument based on those words, often missing the spirit of the comment.

Yes, when I wrote the comment I had in mind "output" privacy while FHE is dealing with "input" privacy. It is related to privacy, but not in the way most people think about it.

If you go to a random person and ask them about privacy, they will not think about the threat model of a cloud provider leaking their data, but they will think of the threat model of a pharma company knowing exactly what drug they bought and when. That notion of privacy is not covered by FHE (alone). And even the first notion of privacy is covered only if the FHE program has a way to attest itself, so you know that what you expect to run is indeed what is running.


> FHE gives you NO guarantee about the code that is running on the encrypted data.

I'm open to correction, but it's my understanding that the strongest forms of FHE allow users to submit an encrypted executable with embedded data as input, which is then processed by an untrusted server. I'd definitely call that an ultimate form of privacy. The computational cost is prohibitively expensive, and conditional branching is impossible in the standard implementation, so it's largely an academic exercise. But last time, someone on HN told me the currently achievable performance on a modern computer is roughly equivalent to a 1970s mainframe, so I guess some niche applications are still possible.

Weaker forms of FHE don't have this level of privacy, and they do not claim to. Nevertheless, the relevant development still represents progress on cryptography and privacy research as a whole.


In the context of cryptographic protocols we sometimes use "privacy" to refer to the notion of "confidentiality". The latter is, I think, a cleaner word that avoids the collision with human notions of privacy.

In this case the real danger is that the availability of "privacy-preserving technologies" like FHE, MPC and Differential Privacy will actually do more to undermine human privacy than all the non-confidential tech that come before. This will mostly occur by allowing corporations to build sophisticated statistical/ML models using data that would previously never have been allowed out of its confidential silo.


Secure multiparty computation (MPC) does help here.

Suppose you have multiple organizations that want to run some computation on their joint data, without revealing their data to each other. Each organization has their own machine that runs the MPC protocol. They have full control over their machine, and can inspect that the code correctly executes the protocol. Only once all organizations agree, will the computation take place, and within the security model of the protocol, it is guaranteed that only the correct computation output is revealed to the designated parties.


> FHE gives you NO guarantee about the code that is running on the encrypted data.

You mean in order to validate the data is authentic? Otherwise the code running on the data is irrelevant, as it can't access the data itself (and thus preserves the before-mentioned privacy).


The code decrypts the data at some point (for use, presentation, etc.). If the code is crappy, insecure, etc., then the data will be exposed, and the data being encrypted won't help at all...


Decrypted on the local computer, not the untrusted remote computer as it were.


If the same company makes the software at the local end, there's still code there that can expose the data or the encryption key...


Big assumption there.


I agree! That's why remote attestation or simply verifiability is such an important feature of these schemes. Semantic attestation ofc means having access to the source code and that IMO makes open source the natural choice. Audits might be an option where OSS is not desired for other reasons (likely business related). Not an expert on FHE, but confidential computing provides that attestation feature and it's just a matter of the software to make use of it.


> run a leaky AI model or SELECT *

That's the point though isn't it? Only the person who wants the results can get them or even see the inputs. That restricts the data available for shitty AI and precludes any Joe Schmo from scanning the whole database.

If your threat model is instead that you don't trust the FHE endpoint, then much how you want HTTPS termination to happen in a place you control you also in this case just encrypt the stuff you care about on your own devices.


Fully homomorphic encryption is a toolset, it's not a specific configuration.

Your scenario has these parties, 1) a patient whose data we're discussing, 2) the hospital they shared it with, and 3) a pharma company looking to use the data. The hospital wants to promote this use without leaking any PII.

You're right that the hospital has no idea about the queries ("the code") but they control the server and which messages it will send in response.

As you point out, the hospital wouldn't run a FHE database capable of full-text extraction specifically because that would amount to simply sending all the data to the pharma company.

Instead they'd run a specialized FHE-DB server which would, for instance, return only row counts. The pharma company would run secret queries, and if the hospital had one or more patients who matched the query, the pharma company would know to contact the hospital; then, once paperwork is signed, they could rerun the query with a signed token from the hospital and finally the query would return the actual PII.


I think the killer app for FHE is an Ethereum-esque globally distributed VM (yes, eye-roll, I hate crypto/blockchain nonsense as well). To me that was always the big interesting concept behind Ethereum: running some sort of code with persistent state. Obviously there are no free lunches, so we've got to pay for that somehow to incentivize people to pay for power on computing equipment they aren't personally utilizing. But somehow "crypto" got stuck on literally the first example of distributed-systems correctness: debit/credit of synchronized accounts.

I feel FHE combined with slightly cheaper cost might enable things like community-run serverless apps that have user state stored and processed by untrusted nodes, with persistent state stored and accessible only by the data owner. E.g. a simple Excel-esque web app which only serves the UI while state and calculations run on this hypothetical system at no cost to the app's creator, with me paying only for my own usage. They provide the code, but no one but me can extract my data and the results of any computations, and for the privilege I pay the system.

I miss the days of upload-and-forget software that just relied on client resources and so required little upfront investment from developers. I feel FHE plus distributed computing could enable this.

I am aware "Web3" claims to want this future as a concept, but the cost and utter lack of confidentiality (I can observe all data to and from a contract as well as the sender's/receiver's identity) make it a super-niche, borderline useless VM. For distributed governance, sure, it's a public ballot box (the preface to the first distributed-systems example, a single account with credits), but for any application/user data it's absolutely unacceptable.


Multiple teams are working on FHE smart contracts, including us, so it’s definitely happening. Adding ZK to the mix would be awesome for scalability and indeed to avoid replicating the FHE computation


I don't really follow how a customer would be able to pay for their usage of a VM globally distributed across untrusted nodes without "crypto/blockchain nonsense" involved. Where in the system would the credit card endpoint be located?


I don't know if FHE and ZKP are related, but it seems to me that privacy is a huge topic in web3 right now.


Non-technical comment to consider the consequences of FHE. This is not to diminish the amazing work that has gone into FHE, and the theoretical use cases for FHE in a few fields I've worked in are significant. The challenge I found in working with people who want the data is that they really do just want the data.

Examples include: government agencies that used made-up in-house encryption schemes to get their data-sharing plan past their legal privacy and security gates, while a small cadre held a secret key that could unscramble the data after it was distributed in the sector. Researchers rejecting synthesized data for uncontrolled test environments because "it was too hard," when really they just wanted the data sets outside the legal controls on them. Rejecting differential privacy queries because they didn't want to come up with or specify their queries first based on metadata, and again just wanted the data. Rejecting identifying the individuals with access to millions of people's health information because, as institutions, they felt entitled to it. Banks and payment firms rejecting zero-knowledge proofs of user attributes because it violated KYC. And these are just a few.

There has been a concerted effort to squeeze the data toothpaste out of the tube when it comes to health information and other types, and so I am ambivalent about FHE use cases because its primary use case is sidestepping rules that protect the privacy of data subjects.

The question I would have is, if data synthesis, legal risk-based de-identification, differential privacy, and cryptographic tokenization protocols were insufficient, what technical improvement in actual accountability does FHE offer to data subjects, and given the size of the data sets this facilitates, what are the consequences of its failure modes?

Given that the entire history of cryptography is defined by one party convincing their targets that a given scheme provides them security, the way that FHE scales to giving data collectors impunity "because it's encrypted!" seems vulnerable to every criticism leveled at blockchains, where just because it's encrypted doesn't mean it isn't laundering.


"The question I would have is, if data synthesis, legal risk-based de-identification, differential privacy, and cryptographic tokenization protocols were insufficient, what technical improvement in actual accountability does FHE offer to data subjects, and given the size of the data sets this facilitates, what are the consequences of its failure modes?"

It doesn't, because it's not aiming to solve those problems. Encryption in-use is aimed at solving the problem of trusting hardware (and maybe code) you don't own. Privacy is a different (IMO more complex) problem.


The whole cloud-provider-using-FHE use case always seemed a bit utopian to me. As you say, most of the time they don't want to provide user privacy; they want your data. Maybe I could imagine some sort of B2B case with strong requirements working out, but I struggle to imagine it for consumer use cases.

Not to mention, if you are outsourcing data computation, presumably it's a lot of computation or you would do it yourself, so the overhead seems extra important in that case.

The most convincing case I've heard is blockchain stuff, where everything is distributed to non-trusted parties. (Normally I hate bitcoin hype, but maybe FHE would let you do something interesting with it.)


Yeah, I don't think this will work at commercial scale, for exactly the same reasons that blockchain is useless for anything outside the illegal niches where you need to avoid the legal banking system.

You may let people store homomorphic data on your servers and even run your algorithms on that data, but you have no way of handling customer complaints or fine-tuning / debugging your service, because you cannot understand ANY of the customer data you are storing.

Similarly, blockchain sounds like a good idea until you need to reverse a transaction: https://www.pcmag.com/news/cryptocom-sues-woman-after-accide...


[dead]


I don't know what you are doing, but if you are using FHE explicitly for its post-quantumness, you are doing something wrong, as there are much, much better choices if you need post-quantum versions of traditional primitives.


Setting aside issues about knowing where the data originates, such encrypted data is really, really difficult to use if no one is providing a way to link the possible calculations performed to observables. If all you have are encrypted inputs and outputs, it is unclear what is being modelled, for example.

I don't think FHE is primarily aimed at privacy use cases anyway, more at ways of cooperating etc. where transparency could be detrimental to some or all parties.


This article skips over the elephant in the room, via a couple of casual references to “performance”.

I did some experiments with HE in 2019 and it involved orders-of-magnitude slowdowns: thousands of times slower than regular computation. I don't see this speeding up either.


General purpose FHE is indeed quite slow. But by focusing on specific subproblems, like private information retrieval (get rows from a database without revealing anything about your queries), it is possible to achieve acceptable performance. There's been lots of recent improvement here: see papers [0] and [1] from this year, which both achieve GB/s throughput for private database lookups.

And for a more tangible demo of FHE, we built open-source webapps that let you privately browse Wikipedia [2] or look up live Bitcoin address balances [3]. This is FHE running in the browser today, returning results in seconds.

[0] https://eprint.iacr.org/2022/368 (disclaimer: this is our paper)

[1] https://eprint.iacr.org/2022/949

[2] https://spiralwiki.com

[3] https://btc.usespiral.com


To make it fast I think we'll need custom silicon. I wonder how many stealth startups exist working on FHE chips?


This isn't a problem that silicon can fix. FHE requires orders of magnitude more operations, and/or more complex operations, to achieve the same results. General-purpose silicon already runs these operations as fast as possible; it's just that there are too many of them, and they are not even parallelizable.


I was gonna say: if you can't make it more parallel, then custom silicon would definitely bottleneck on that. TBF I know nothing about this, but I have done a lot of parallelization and speed-ups of algorithms on FPGAs that are relatively easy to parallelize and seen tremendous speed-ups over general-purpose CPUs.


I wouldn't hold my breath on this. Reducing it by x40 AND making sure that hardware is not leaky.


Wait, how would hardware be possibly leaky? The encryption/decryption can leak data, sure, but that's unrelated to FHE. If your hardware can leak info about the plaintext on the side processing the encrypted data, then you could do the same in software anyway, and the FHE scheme itself is clearly broken...


It's clear v4dok doesn't understand how FHE works, maybe due to confusion with enclaves like SGX. The hardware isn't able to leak the plaintext because it doesn't have the key; it executes on the ciphertext.


The hardware is not relevant if it's simply being used as an accelerator. FHE reveals no information about the underlying data, so the privacy leakage is the same, no matter if you're running the FHE code on ENIAC or on a modern supercomputer.


At least the one that employs the author of this article.


Are the used operations that exotic that special hardware can achieve massive speedups?


Not at all, but imagine having to perform 10,000 multiplications on numbers of 10,000 digits just to encrypt "This is my password".

And that has to be redone, differently, for each access (write and read) to each cell in the table.

Not what you would call feasible right now.


Spiral is an interesting use case of FHE for database fetching that just popped up: https://usespiral.com/

General FHE computation is obviously not likely to be practical or cost-effective any time soon, but there may be some specific use cases, like querying a database, where hiding the information you want to know from the server is advantageous. There are adversarial environments where knowing the information your adversary is interested in provides a competitive edge.

I'm curious to dig more into their implementation.


Happy to answer any questions! Yeah, we're taking a narrow application (fetching from a database) and making it really practical. You can privately check a Bitcoin balance at https://btc.usespiral.com or privately read Wikipedia at https://spiralwiki.com. Our code for the Bitcoin site is at https://github.com/spiralprivacy/clients.


If you want to see a practical use case of Fully Homomorphic Encryption (FHE) with machine learning, check out this post:

> https://www.zama.ai/post/titanic-competition-with-privacy-pr...

Its main ambition is to show that FHE can be used to protect data when using a machine learning model to predict outcomes, without degrading the model's performance.

Disclaimer: I'm working at Zama (cited in the article posted).



There are also use cases listed here:

> https://fhe.org/fhe-use-cases

Also, anyone can contribute and add resources, as it lives in an open-source GitHub repo.


I've just watched a presentation about Cosmian (https://cosmian.com/) and their solution boasts using FHE, at a significant price though: computations and queries are about 1000x slower than on unencrypted data, according to their CTO. I was quite impressed that it even works at all, though :)


There are a few companies in this space. Duality is one of the bigger ones. https://dualitytech.com/


Don’t those sorts of differences suggest that perhaps enclave decryption would be as fast and support more functionality?


Probably but the main objective is to keep data safe in the cloud while keeping everything encrypted (traffic and data), all the time.


That's certainly fair. One of the oldest classic physical security failures I know of, which really stuck with me, is "who did background checks on the janitors?" It doesn't literally have to be janitors, but there are a lot of people in your space that we assume are not threats, even though there is a lot of overlap between "hacker" and "people who work for vendors".

I knew someone who was responsible for delivering backups from a secure data center to a lockbox every couple of days. Unfortunately the bank was only a few blocks from the data center so I'm not sure how much physical separation that really provided. Also this particular person would have been able to do absolutely nothing about being mugged for the disks if someone actually cared. But maybe I'm a little too paranoid.

I recall once having to drop the night's deposit off from the restaurant I worked at. They were down a manager due to illness, it was on the way home, I was a figurative if no longer a literal boyscout, so the math on "shenanigans if hinkley leaves with the money" versus "shenanigans if there is no night manager in the store" apparently leaned toward me. I'm glad they trusted me but I was a nervous wreck for four blocks until that bag went into the night deposit box.

Anything that can move "precious cargo" without a human failure mode is alright in my book.


"Only the paranoid survive" is my motto :)


> To achieve unrestricted homomorphic computation, or FHE, you must choose F to be a set of functions that is complete for all computation (e.g., Turing complete). The two functions required to achieve this goal are bit Addition (equivalent to Boolean XOR) and bit Multiplication (equivalent to Boolean AND), as the set {XOR, AND} is Turing complete.

This - "set {XOR, AND} is Turing complete" - is incorrect. You need to also have "true" constant.


Wait, have any quantum computers performed a real computation faster than a normal computer yet?


No "useful" computation yet, but yes, quantum hardware has been able to sample from some programmable, convoluted, contrived probability distributions from which a classical computer cannot sample efficiently. But that is irrelevant here because:

More importantly, FHE and post-quantum crypto are two completely orthogonal topics.

Homomorphic is the property that (classical or quantum) computation can be performed on the encrypted (classical or quantum) data without decrypting it. Homomorphic encryption of classical data on classical computers is rather difficult and fascinating. Homomorphic encryption on quantum computers is trivially easy (if you already have a standard scalable quantum computer, which does not exist yet).

"Post-quantum" is the property that a classical computer can efficiently perform encryption that can not be broken even by a quantum computer.


> FHE (fully homomorphic encryption) provides quantum-secure computing on encrypted data

Today, if I understand it correctly, that means the encryption can't be broken on a computer with resources < whatever is required to calculate the square root of 16 ;)


Something is obviously not quantum-secure if it's broken on a classical computer. FHE schemes in particular are instantiated with schemes that are believed to offer both classical security and post-quantum security.


It was a joke, built on some comments on a recent Security Now podcast, where they made fun of Quantum computing's current inability to compute the simplest things accurately, and the possibility that Quantum computing never will evolve into something that can surpass legacy computers.


Healthcare is stuck on pre-2000 IT technologies because of privacy concerns. I hope that with FHE, healthcare providers can move to cloud technologies without fear of losing privacy.


They can today with other PETs like confidential computing. But change takes time and a lot of education when it's based on new types of technology.


FHE doesn't solve this at all. If the problem takes little enough compute power to be run with FHE, you can easily run it without cloud compute or FHE.


Well, it's not about compute power and whether you can run things on-prem. It's about sharing, synchronizing, and collaborating on data. That's where PETs open up new opportunities.


>• Opportunities will exist for new data-licensing revenue models that do not risk confidential data disclosure.

for health data, this is a game changer in so many ways


What is the actual value proposition of HE? The purpose of encryption is to hide information; if you are able to do any meaningful comparison between two encrypted records, you have an information leak, and encryption has failed.


The value proposition is that the hosting provider running your computation does not know what was computed.


There is no information leak, because the information is never decrypted. The encrypted operations are performed directly on the encrypted data and the still-encrypted result is returned to the client, to be decrypted at their convenience to view the result. That is the magic of FHE.

Two identical cleartext values would likely not encrypt to the same ciphertext value (for example, you could get around that easily on the client end by simply incrementing any duplicate value by 1 before sending and then decrementing it by 1 again on return, assuming that can be undone given the other operations happening); any comparison operation would also likely be encrypted and thus unknown to the server; so the server couldn't just linearly compare any two encrypted values to make deductions.


Imagine running an inference on a model in the cloud.

Usually the cloud will have access to your model. That poses a problem if your model is highly sensitive. (Imagine the NSA wanting to run a model on North Korean servers. NK would immediately snatch up that model.)

With FHE, you can theoretically avoid that. Someone can upload an encrypted model to the cloud. The cloud can do some computation on it (inference) and deliver an encrypted result. Then you can decrypt the result in the comfort of your own government^Whome.

Obviously this is a bit of a stupid example, but just think of all the scenarios right now where you'd want to offload your computation on someone else, but you don't want to let them see the computation.


Here is a good project using HE:

https://github.com/deroproject/derohe


[dead]


This has nothing to do with FHE...


I know it's low effort, but would an anti-homomorphic-encryptionist be regarded as homophobic?



