Author

Michael Naehrig

Bio: Michael Naehrig is an academic researcher at Microsoft. He has contributed to research on topics including encryption and homomorphic encryption, has an h-index of 38, and has co-authored 96 publications receiving 7,912 citations. His previous affiliations include Eindhoven University of Technology and the University of Bristol.


Papers
Proceedings Article
19 Jun 2016
TL;DR: It is shown that the cloud service can apply the neural network to encrypted data to make encrypted predictions and return them in encrypted form, allowing high-throughput, accurate, and private predictions.
Abstract: Applying machine learning to a problem which involves medical, financial, or other types of sensitive data, not only requires accurate predictions but also careful attention to maintaining data privacy and security. Legal and ethical requirements may prevent the use of cloud-based machine learning solutions for such tasks. In this work, we will present a method to convert learned neural networks to CryptoNets, neural networks that can be applied to encrypted data. This allows a data owner to send their data in an encrypted form to a cloud service that hosts the network. The encryption ensures that the data remains confidential since the cloud does not have access to the keys needed to decrypt it. Nevertheless, we will show that the cloud service is capable of applying the neural network to the encrypted data to make encrypted predictions, and also return them in encrypted form. These encrypted predictions can be sent back to the owner of the secret key who can decrypt them. Therefore, the cloud service does not gain any information about the raw data nor about the prediction it made. We demonstrate CryptoNets on the MNIST optical character recognition tasks. CryptoNets achieve 99% accuracy and can make around 59000 predictions per hour on a single PC. Therefore, they allow high throughput, accurate, and private predictions.
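The constraint driving the conversion described above is that a homomorphic encryption scheme only evaluates additions and multiplications, so every layer of a CryptoNet must be a low-degree polynomial of its inputs; in particular, non-polynomial activations such as ReLU or sigmoid are replaced by the square function. A minimal sketch of that idea on plaintext data follows (NumPy only, no encryption library; the layer sizes and random weights are illustrative, not the network from the paper):

```python
# Minimal sketch of the CryptoNets-style restriction: the network uses only
# additions and multiplications, so it could later be evaluated on ciphertexts.
# Layer sizes and random weights are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

# A tiny 2-layer network for 28x28 inputs (MNIST-sized), 10 output classes.
W1 = rng.normal(scale=0.1, size=(784, 64))
b1 = rng.normal(scale=0.1, size=64)
W2 = rng.normal(scale=0.1, size=(64, 10))
b2 = rng.normal(scale=0.1, size=10)

def square_activation(z):
    # x -> x^2 is a degree-2 polynomial, hence HE-friendly,
    # unlike ReLU or sigmoid, which are not polynomials.
    return z * z

def forward(x):
    h = square_activation(x @ W1 + b1)   # only additions, multiplications, squaring
    return h @ W2 + b2                   # linear output layer; argmax taken after decryption

x = rng.random(784)                      # stand-in for one MNIST image
scores = forward(x)
print(scores.argmax())
```

Once the network has this form, each arithmetic operation can in principle be carried out on ciphertexts instead of plaintext vectors.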

1,246 citations

Proceedings ArticleDOI
21 Oct 2011
TL;DR: The authors present a proof-of-concept implementation of the recent somewhat homomorphic encryption scheme of Brakerski and Vaikuntanathan, whose security relies on the "ring learning with errors" (Ring LWE) problem, together with a number of application-specific optimizations to the encryption scheme, including the ability to convert between different message encodings in a ciphertext.
Abstract: The prospect of outsourcing an increasing amount of data storage and management to cloud services raises many new privacy concerns for individuals and businesses alike. The privacy concerns can be satisfactorily addressed if users encrypt the data they send to the cloud. If the encryption scheme is homomorphic, the cloud can still perform meaningful computations on the data, even though it is encrypted. In fact, we now know a number of constructions of fully homomorphic encryption schemes that allow arbitrary computation on encrypted data. In the last two years, solutions for fully homomorphic encryption have been proposed and improved upon, but it is hard to ignore the elephant in the room, namely efficiency: can homomorphic encryption ever be efficient enough to be practical? Certainly, it seems that all known fully homomorphic encryption schemes have a long way to go before they can be used in practice. Given this state of affairs, our contribution is two-fold. First, we exhibit a number of real-world applications, in the medical, financial, and advertising domains, which require only that the encryption scheme is "somewhat" homomorphic. Somewhat homomorphic encryption schemes, which support a limited number of homomorphic operations, can be much faster and more compact than fully homomorphic encryption schemes. Secondly, we show a proof-of-concept implementation of the recent somewhat homomorphic encryption scheme of Brakerski and Vaikuntanathan, whose security relies on the "ring learning with errors" (Ring LWE) problem. The scheme is very efficient, and has reasonably short ciphertexts. Our unoptimized implementation in Magma enjoys comparable efficiency to even optimized pairing-based schemes with the same level of security and homomorphic capacity. We also show a number of application-specific optimizations to the encryption scheme, most notably the ability to convert between different message encodings in a ciphertext.
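To make the "somewhat homomorphic" structure concrete, here is a toy Ring-LWE encryption over Z_q[x]/(x^n + 1) showing that ciphertext addition decrypts to the sum of the plaintexts. It is a simplified sketch, not the Brakerski-Vaikuntanathan scheme from the paper: the parameters are far too small to be secure, the key is symmetric, and homomorphic multiplication (which needs relinearization) is omitted.

```python
# Toy symmetric Ring-LWE encryption in Z_q[x]/(x^n + 1), illustrating why
# addition commutes with encryption. Insecure toy parameters; sketch only.
import numpy as np

n, q, t = 16, 1 << 15, 2               # ring degree, ciphertext modulus, plaintext modulus
delta = q // t                         # scaling factor for the message
rng = np.random.default_rng(1)

def poly_mul(a, b):
    """Multiply two polynomials modulo x^n + 1 and reduce coefficients mod q."""
    full = np.convolve(a, b)
    res = full[:n].copy()
    res[: full.size - n] -= full[n:]   # x^n = -1, so high coefficients wrap with a sign flip
    return res % q

def small_poly():
    return rng.integers(-1, 2, n)      # ternary coefficients for secrets and noise

s = small_poly()                       # symmetric secret key

def encrypt(m):
    a = rng.integers(0, q, n)          # uniformly random ring element
    e = small_poly()                   # fresh small noise
    b = (delta * np.asarray(m) - poly_mul(a, s) + e) % q
    return (b, a)                      # decryption relation: b + a*s = delta*m + e (mod q)

def decrypt(ct):
    b, a = ct
    noisy = (b + poly_mul(a, s)) % q
    return np.rint(t * noisy / q).astype(int) % t   # rescale and round the noise away

m1, m2 = rng.integers(0, t, n), rng.integers(0, t, n)
c1, c2 = encrypt(m1), encrypt(m2)
c_add = ((c1[0] + c2[0]) % q, (c1[1] + c2[1]) % q)  # homomorphic addition is componentwise
assert np.array_equal(decrypt(c_add), (m1 + m2) % t)
print("homomorphic sum decrypts correctly")
```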

1,053 citations

Book ChapterDOI
11 Aug 2005
TL;DR: This paper describes a method to construct elliptic curves of prime order and embedding degree k = 12 and shows that the ability to handle log(D)/log(r) ∼ (q - 3)/(q - 1) enables building curves with ρ ∼ q/(q - 1).
Abstract: Previously known techniques to construct pairing-friendly curves of prime or near-prime order are restricted to embedding degree k ≤ 6. More general methods produce curves over F_p where the bit length of p is often twice as large as that of the order r of the subgroup with embedding degree k; the best published results achieve ρ ≡ log(p)/log(r) ∼ 5/4. In this paper we make the first step towards surpassing these limitations by describing a method to construct elliptic curves of prime order and embedding degree k = 12. The new curves lead to very efficient implementation: non-pairing operations need no more than F_{p^4} arithmetic, and pairing values can be compressed to one third of their length in a way compatible with point reduction techniques. We also discuss the role of large CM discriminants D to minimize ρ; in particular, for embedding degree k = 2q where q is prime we show that the ability to handle log(D)/log(r) ∼ (q - 3)/(q - 1) enables building curves with ρ ∼ q/(q - 1).
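For reference, the curves constructed here (now commonly called Barreto-Naehrig curves) are parameterized by fixed polynomials in an integer u, and the identity r = p + 1 - t is what forces prime order and ρ ≈ 1. A short check of that identity (the sample values of u are arbitrary, not standardized curve parameters):

```python
# Barreto-Naehrig parameterization: for an integer u, the field size p, the
# curve order r, and the Frobenius trace t are given by fixed polynomials.
# The identity r = p + 1 - t holds for every u, which is why these curves
# have prime order and rho = log(p)/log(r) ~ 1 whenever p(u) and r(u) are
# simultaneously prime.
def bn_params(u):
    p = 36*u**4 + 36*u**3 + 24*u**2 + 6*u + 1
    r = 36*u**4 + 36*u**3 + 18*u**2 + 6*u + 1
    t = 6*u**2 + 1
    return p, r, t

for u in (7, -62, 1_000_003):          # arbitrary sample values, not standardized parameters
    p, r, t = bn_params(u)
    assert r == p + 1 - t              # curve order equals subgroup order: prime-order curves
    print(u, p.bit_length(), r.bit_length())
```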

899 citations

Journal Article
TL;DR: In particular, for embedding degree k = 2q where q is prime, the authors showed that the ability to handle log(D)/log(r) ∼ (q - 3)/(q - 1) enables building elliptic curves with ρ ∼ q/(q - 1).
Abstract: Previously known techniques to construct pairing-friendly curves of prime or near-prime order are restricted to embedding degree k ≤ 6. More general methods produce curves over F_p where the bit length of p is often twice as large as that of the order r of the subgroup with embedding degree k; the best published results achieve ρ ≡ log(p)/log(r) ∼ 5/4. In this paper we make the first step towards surpassing these limitations by describing a method to construct elliptic curves of prime order and embedding degree k = 12. The new curves lead to very efficient implementation: non-pairing operations need no more than F_{p^4} arithmetic, and pairing values can be compressed to one third of their length in a way compatible with point reduction techniques. We also discuss the role of large CM discriminants D to minimize ρ; in particular, for embedding degree k = 2q where q is prime we show that the ability to handle log(D)/log(r) ∼ (q - 3)/(q - 1) enables building curves with ρ ∼ q/(q - 1).

469 citations

Book ChapterDOI
28 Nov 2012
TL;DR: The authors define a new class of machine learning algorithms whose predictions can be expressed as polynomials of bounded degree, and propose confidential algorithms for binary classification based on polynomial approximations to least-squares solutions obtained by a small number of gradient descent steps.
Abstract: We demonstrate that, by using a recently proposed leveled homomorphic encryption scheme, it is possible to delegate the execution of a machine learning algorithm to a computing service while retaining confidentiality of the training and test data. Since the computational complexity of the homomorphic encryption scheme depends primarily on the number of levels of multiplications to be carried out on the encrypted data, we define a new class of machine learning algorithms in which the algorithm's predictions, viewed as functions of the input data, can be expressed as polynomials of bounded degree. We propose confidential algorithms for binary classification based on polynomial approximations to least-squares solutions obtained by a small number of gradient descent steps. We present experimental validation of the confidential machine learning pipeline and discuss the trade-offs regarding computational complexity, prediction accuracy and cryptographic security.
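A plaintext sketch of the kind of classifier the abstract describes: least-squares binary classification trained with a small, fixed number of gradient-descent steps, so that every intermediate value is a bounded-degree polynomial in the data (only additions and multiplications, plus scaling by public constants). The synthetic data, step size, and step count below are illustrative assumptions, not the parameters used in the paper:

```python
# Sketch of a "bounded-degree" confidential classifier: binary classification
# via least squares, trained with a fixed number of gradient-descent steps.
# Only additions and multiplications appear, and the step count is fixed in
# advance, so the whole computation has bounded degree in the data.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data with labels in {-1, +1} (illustrative).
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = np.sign(X @ w_true + 0.1 * rng.normal(size=n))

# Least-squares objective ||X w - y||^2 / n, minimized by a fixed number of
# gradient steps: no data-dependent stopping rule, and the only division is
# by the public constant n.
w = np.zeros(d)
eta = 0.1
for _ in range(10):                       # fixed, small number of steps
    grad = (2.0 / n) * X.T @ (X @ w - y)  # only additions and multiplications
    w = w - eta * grad

# The prediction score is a degree-1 polynomial of the test point; the data
# owner takes the sign after decrypting the score.
x_test = rng.normal(size=d)
score = x_test @ w
print("predicted label:", 1 if score >= 0 else -1)
```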

440 citations


Cited by
Journal ArticleDOI
TL;DR: This work introduces a comprehensive secure federated-learning framework, which includes horizontal federated learning, vertical federated learning, and federated transfer learning, and provides a comprehensive survey of existing works on this subject.
Abstract: Today's artificial intelligence still faces two major challenges. One is that, in most industries, data exists in the form of isolated islands. The other is the strengthening of data privacy and security. We propose a possible solution to these challenges: secure federated learning. Beyond the federated-learning framework first proposed by Google in 2016, we introduce a comprehensive secure federated-learning framework, which includes horizontal federated learning, vertical federated learning, and federated transfer learning. We provide definitions, architectures, and applications for the federated-learning framework, and provide a comprehensive survey of existing works on this subject. In addition, we propose building data networks among organizations based on federated mechanisms as an effective solution that allows knowledge to be shared without compromising user privacy.
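A minimal sketch of the horizontal setting, in which parties share a feature space, train locally on their own samples, and only exchange model parameters that a coordinator averages (federated-averaging style). The linear-regression model, synthetic data, and uniform weighting are illustrative assumptions:

```python
# Minimal sketch of horizontal federated learning: parties with the same
# feature space train locally and only share model parameters, which a
# coordinator averages. Data, model, and weighting are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d = 3
w_true = np.array([1.0, -2.0, 0.5])

# Each party holds its own private dataset ("isolated island").
parties = []
for _ in range(4):
    X = rng.normal(size=(50, d))
    y = X @ w_true + 0.05 * rng.normal(size=50)
    parties.append((X, y))

w_global = np.zeros(d)
for round_ in range(20):
    local_models = []
    for X, y in parties:
        w = w_global.copy()
        for _ in range(5):                          # a few local gradient steps
            grad = (2.0 / len(y)) * X.T @ (X @ w - y)
            w -= 0.1 * grad
        local_models.append(w)                      # only parameters leave the party
    w_global = np.mean(local_models, axis=0)        # coordinator averages the updates

print(np.round(w_global, 2))                        # approaches w_true without pooling raw data
```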

2,593 citations

Proceedings ArticleDOI
22 May 2017
TL;DR: This work quantitatively investigates how machine learning models leak information about the individual data records on which they were trained and empirically evaluates the inference techniques on classification models trained by commercial "machine learning as a service" providers such as Google and Amazon.
Abstract: We quantitatively investigate how machine learning models leak information about the individual data records on which they were trained. We focus on the basic membership inference attack: given a data record and black-box access to a model, determine if the record was in the model's training dataset. To perform membership inference against a target model, we make adversarial use of machine learning and train our own inference model to recognize differences in the target model's predictions on the inputs that it trained on versus the inputs that it did not train on. We empirically evaluate our inference techniques on classification models trained by commercial "machine learning as a service" providers such as Google and Amazon. Using realistic datasets and classification tasks, including a hospital discharge dataset whose membership is sensitive from the privacy perspective, we show that these models can be vulnerable to membership inference attacks. We then investigate the factors that influence this leakage and evaluate mitigation strategies.
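A compact sketch of the black-box attack setup described above, with scikit-learn standing in for the ML-as-a-service API: a single shadow model generates labeled (member / non-member) prediction vectors, and a logistic-regression attack model learns to distinguish them. This simplifies the paper's setup, which uses multiple shadow models and per-class attack models; the dataset and model choices are illustrative:

```python
# Compact sketch of a black-box membership-inference attack: an attack model
# learns, from a model's prediction vector on a record, whether that record
# was in the model's training set. Data and models are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=4000, n_features=20, n_informative=10, random_state=0)

# Split into: target training set (members), target holdout (non-members),
# and a disjoint "shadow" pool the attacker uses to mimic the target.
X_tgt_in, y_tgt_in = X[:1000], y[:1000]
X_tgt_out, y_tgt_out = X[1000:2000], y[1000:2000]
X_sh_in, y_sh_in = X[2000:3000], y[2000:3000]
X_sh_out, y_sh_out = X[3000:], y[3000:]

target = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0).fit(X_tgt_in, y_tgt_in)
shadow = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=1).fit(X_sh_in, y_sh_in)

# Attack training data: the shadow model's prediction vectors, labeled by
# whether the record was in the shadow model's training set.
A = np.vstack([shadow.predict_proba(X_sh_in), shadow.predict_proba(X_sh_out)])
m = np.concatenate([np.ones(len(X_sh_in)), np.zeros(len(X_sh_out))])
attack = LogisticRegression().fit(A, m)

# Evaluate the attack against the real target model (black-box access only).
A_test = np.vstack([target.predict_proba(X_tgt_in), target.predict_proba(X_tgt_out)])
m_test = np.concatenate([np.ones(len(X_tgt_in)), np.zeros(len(X_tgt_out))])
print("membership-inference accuracy:", attack.score(A_test, m_test))
```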

2,059 citations

Proceedings ArticleDOI
08 Jan 2012
TL;DR: The authors present a novel approach to fully homomorphic encryption (FHE) that dramatically improves performance and bases security on weaker assumptions, using new techniques recently introduced by Brakerski and Vaikuntanathan (FOCS 2011).
Abstract: We present a novel approach to fully homomorphic encryption (FHE) that dramatically improves performance and bases security on weaker assumptions. A central conceptual contribution in our work is a new way of constructing leveled fully homomorphic encryption schemes (capable of evaluating arbitrary polynomial-size circuits), without Gentry's bootstrapping procedure. Specifically, we offer a choice of FHE schemes based on the learning with errors (LWE) or ring-LWE (RLWE) problems that have 2^λ security against known attacks. For RLWE, we have:
• A leveled FHE scheme that can evaluate L-level arithmetic circuits with O(λ · L^3) per-gate computation, i.e., computation quasi-linear in the security parameter. Security is based on RLWE for an approximation factor exponential in L. This construction does not use the bootstrapping procedure.
• A leveled FHE scheme that uses bootstrapping as an optimization, where the per-gate computation (which includes the bootstrapping procedure) is O(λ^2), independent of L. Security is based on the hardness of RLWE for quasi-polynomial factors (as opposed to the sub-exponential factors needed in previous schemes).
We obtain similar results to the above for LWE, but with worse performance. Based on the Ring LWE assumption, we introduce a number of further optimizations to our schemes. As an example, for circuits of large width, e.g., where a constant fraction of levels have width at least λ, we can reduce the per-gate computation of the bootstrapped version to O(λ), independent of L, by batching the bootstrapping operation. Previous FHE schemes all required Ω(λ^3.5) computation per gate. At the core of our construction is a much more effective approach for managing the noise level of lattice-based ciphertexts as homomorphic operations are performed, using some new techniques recently introduced by Brakerski and Vaikuntanathan (FOCS 2011).
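The noise-management technique alluded to in the final sentence is modulus switching. Schematically (a summary of the standard BGV-style presentation, with symbols chosen here for illustration, not taken from the abstract):

```latex
% Modulus switching, schematically: rescale a ciphertext c over Z_q to a
% smaller modulus q' < q (with a small correction so the plaintext residue is
% preserved); the decryption noise e shrinks by roughly the same factor, up to
% a small additive rounding term.
\[
  \mathbf{c}' \;=\; \left\lfloor \tfrac{q'}{q}\,\mathbf{c} \right\rceil ,
  \qquad
  \lVert e' \rVert \;\lesssim\; \tfrac{q'}{q}\,\lVert e \rVert \;+\; \eta_{\mathrm{round}} .
\]
```

Because homomorphic multiplication roughly squares the noise while modulus switching scales it back down, a ladder of decreasing moduli q_L > q_{L-1} > ... > q_0 keeps the noise roughly constant per level, which is what removes the need for bootstrapping in the leveled scheme.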

1,924 citations

Proceedings ArticleDOI
12 Oct 2015
TL;DR: This paper presents a practical system that enables multiple parties to jointly learn an accurate neural-network model for a given objective without sharing their input datasets, and exploits the fact that the optimization algorithms used in modern deep learning, namely, those based on stochastic gradient descent, can be parallelized and executed asynchronously.
Abstract: Deep learning based on artificial neural networks is a very popular approach to modeling, classifying, and recognizing complex data such as images, speech, and text. The unprecedented accuracy of deep learning methods has turned them into the foundation of new AI-based services on the Internet. Commercial companies that collect user data on a large scale have been the main beneficiaries of this trend since the success of deep learning techniques is directly proportional to the amount of data available for training. Massive data collection required for deep learning presents obvious privacy issues. Users' personal, highly sensitive data such as photos and voice recordings is kept indefinitely by the companies that collect it. Users can neither delete it, nor restrict the purposes for which it is used. Furthermore, centrally kept data is subject to legal subpoenas and extra-judicial surveillance. Many data owners, for example, medical institutions that may want to apply deep learning methods to clinical records, are prevented by privacy and confidentiality concerns from sharing the data and thus benefitting from large-scale deep learning. In this paper, we design, implement, and evaluate a practical system that enables multiple parties to jointly learn an accurate neural-network model for a given objective without sharing their input datasets. We exploit the fact that the optimization algorithms used in modern deep learning, namely, those based on stochastic gradient descent, can be parallelized and executed asynchronously. Our system lets participants train independently on their own datasets and selectively share small subsets of their models' key parameters during training. This offers an attractive point in the utility/privacy tradeoff space: participants preserve the privacy of their respective data while still benefitting from other participants' models and thus boosting their learning accuracy beyond what is achievable solely on their own inputs. We demonstrate the accuracy of our privacy-preserving deep learning on benchmark datasets.
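A minimal sketch of the selective parameter-sharing idea: each participant runs a few steps of local gradient descent on its private data and uploads only a small fraction of its largest parameter updates to a shared parameter server, then refreshes its local copy before the next round. The linear model, synthetic data, sharing fraction, and selection rule are illustrative assumptions, and the paper's differential-privacy noise and asynchronous scheduling are omitted:

```python
# Minimal sketch of collaborative learning with selective parameter sharing:
# each participant trains locally and uploads only a small fraction of its
# largest parameter updates to a shared global model, then refreshes its local
# copy from it. Model, data, and the 10% sharing fraction are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d = 20
w_true = rng.normal(size=d)

participants = []
for _ in range(3):
    X = rng.normal(size=(100, d))
    y = X @ w_true + 0.05 * rng.normal(size=100)
    participants.append({"X": X, "y": y})

w_server = np.zeros(d)                      # parameters kept by the parameter server
share_fraction = 0.1

for round_ in range(30):
    for p in participants:
        w = w_server.copy()                 # download current global parameters
        w_before = w.copy()
        for _ in range(5):                  # local gradient steps on private data
            grad = (2.0 / len(p["y"])) * p["X"].T @ (p["X"] @ w - p["y"])
            w -= 0.05 * grad
        delta = w - w_before
        k = max(1, int(share_fraction * d)) # upload only the k largest updates
        top = np.argsort(np.abs(delta))[-k:]
        w_server[top] += delta[top]

print(np.round(np.abs(w_server - w_true).max(), 2))   # shared model moves toward w_true
```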

1,836 citations