Unveiling the true value of privacy
Ask your clients, your friends, or anyone on the street: nobody likes to share their data, especially with people or companies they don't know. We routinely decline cookie pop-ups in our browsers, and you're just as likely to say 'no' to any proposal to share your data to improve a product, even anonymously. People value privacy, even if that somehow contradicts the fact that some of us also share our most private moments on social media.
Does this widespread privacy awareness influence how companies relate to their customers and manage their products and services? Not quite. Most of the time, organizations simply don't implement anything that would truly protect the privacy of their users. They typically say things like "your data is protected" or "we are trusted by big companies". And even where regulations and policies are in place, they don't always force organizations to adopt strong protections, even for very sensitive data like health records or financial information. Why? Is it not possible to do more?
FHE: the future of privacy is already here
Luckily, there is a field of science, and a set of products, dealing with this situation and proposing solutions. Cryptography is an ancient practice (and now, a science) dedicated to securing communications against external actors who threaten the privacy of the information shared. Over the centuries, codes and ciphers have continuously evolved in search of new and improved techniques able to withstand the challenges to data privacy, with potential applications across different technologies (AI, machine learning, blockchain).
In the case of machine learning, one of these solutions is privacy-preserving machine learning (PPML), whose goal is to provide machine learning services while hiding the user's data, during either training or inference.
Within this field, we could list three main techniques:
- The first is a trusted execution environment (TEE): a tamper-resistant device that receives encrypted data, decrypts it and performs the computations inside a secure enclave, then returns the result encrypted. Security here relies on the physical security of the device.
- The second is multi-party computation (MPC), which has been known in cryptography for a long time. Its main disadvantage is the enormous amount of communication required between the parties. Security here relies on the assumption that at least one of the parties is honest.
- The last is Fully Homomorphic Encryption (FHE), an encryption technique that allows data to be processed blindly, without ever decrypting it.
FHE has been known as a concept for decades, but it was only in the late 2000s that the first practical construction was proposed. Classical cryptography protects data at rest or in transit, as HTTPS does; FHE also protects data during computation. Instead of protecting the data only while you send it to a third-party server (against eavesdroppers) and then sharing the decryption key and the data with that server, you send encrypted data that the server will never be able to see in the clear, but can still operate and compute on. It is a bit like receiving a message in a language you don't know, yet still being able to compute statistics on the number of words used or the length of the sentences.
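To make the idea of "computing on data you cannot read" concrete, here is a toy illustration. It is not FHE and not secure (textbook RSA with tiny keys), but it shows a homomorphic property in action: multiplying two ciphertexts yields a valid encryption of the product of the plaintexts, without ever decrypting them.

```python
# Toy illustration of a homomorphic property (NOT secure, NOT FHE):
# textbook RSA is multiplicatively homomorphic, meaning
# Enc(a) * Enc(b) mod n is a valid encryption of a * b.
p, q = 61, 53
n = p * q        # 3233, the public modulus
e = 17           # public exponent
d = 2753         # private exponent: e * d = 1 (mod lcm(p-1, q-1))

def encrypt(m: int) -> int:
    return pow(m, e, n)

def decrypt(c: int) -> int:
    return pow(c, d, n)

c1, c2 = encrypt(6), encrypt(7)
product_ciphertext = (c1 * c2) % n        # computed without decrypting
assert decrypt(product_ciphertext) == 42  # a valid encryption of 6 * 7
```

A real FHE scheme goes much further: it supports both additions and multiplications on ciphertexts, enough to evaluate arbitrary programs, and does so with actual security guarantees.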
FHE allows you to replace computations on data that you don't want to share with equivalent computations over encrypted data that no one but you will be able to read in the clear, as shown in this quick demonstration. Isn't it incredible? Getting a health analysis without sending your medical records; asking trading companies for advice on your investments without giving them any information on your holdings; sending your code to companies for security analysis without sharing your intellectual property: all of this becomes possible with FHE.
The question, then, is: why isn't FHE everywhere already?
The answer is simple: it's not everywhere because it used to be too slow in practice, and implementing and applying FHE used to be limited to a small group of cryptographers. Overcoming this second problem has long been part of our mission at Zama, where the focus is on creating open-source tools that developers can use even without any knowledge of cryptography. With Concrete ML, you can turn your scikit-learn or Torch models into equivalent FHE models; you'll be surprised by how simple it is, and how accurate the results are.
Regarding speed, let's be clear: some use cases are still too slow in practice. We don't claim that every workload can be run in FHE today on CPUs alone. A minority of cases are already practical on pure CPU: typically high-value inferences with no real-time requirement. You are already used to waiting for health analysis results, so you can wait for the corresponding FHE execution. For other use cases that require fast and frequent results, one may have to wait for the next revolution, which has already started: the rise of hardware accelerators. Many companies have seized this business opportunity: big players like Intel and AMD are working on hardware accelerators, as are startups such as Cornami and Optalysys.
When the first such products become available, we will see a significant speed-up and be able to truly explore the potential of the technology.
Benoit Chevallier-Mames is a security engineer and researcher who currently leads the Cloud and ML division at Zama, developing an FHE compiler and privacy-preserving ML libraries. He has spent more than 20 years between cryptographic research and secure implementations in a wide range of domains, such as side-channel security, provable security, whitebox cryptography, fully homomorphic encryption and, more recently, machine learning. Prior to Zama, he securely implemented public-key algorithms on smartcards at Gemplus for seven years, worked for the French governmental agency ANSSI, and then designed and developed whitebox implementations at Apple for 12 years.