Posted Content

Privacy-Preserving Deep Learning via Additively Homomorphic Encryption.

TL;DR: This work revisits the previous work of Shokri and Shmatikov (ACM CCS 2015) and builds an enhanced system with the following properties: no information is leaked to the server, and accuracy is kept intact compared with that of an ordinary deep learning system trained over the same combined dataset.
Abstract: We present a privacy-preserving deep learning system in which many learning participants perform neural network-based deep learning over a combined dataset of all, without revealing the participant...
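
The aggregation mechanism summarized in the abstract can be illustrated with a short sketch. Assuming the third-party python-paillier (phe) package, each participant encrypts its local gradients under a shared public key, the honest-but-curious server adds the ciphertexts without decrypting anything, and only the participants (who hold the secret key) recover the aggregated gradient. This is a minimal illustration of additively homomorphic gradient aggregation, not the authors' implementation; the names and toy values below are illustrative.

```python
# Minimal sketch of additively homomorphic gradient aggregation.
# Assumes the third-party `phe` (python-paillier) package; the variable
# names and toy gradients are illustrative, not taken from the paper.
from phe import paillier

# In the paper's setting, the participants share a key pair and the
# server only ever sees ciphertexts.
public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

# Toy local gradients from three participants (one value per parameter).
local_gradients = [
    [0.12, -0.40, 0.05],
    [-0.03, 0.22, 0.10],
    [0.30, -0.11, -0.25],
]

# Participants: encrypt each gradient component before uploading.
encrypted_uploads = [[public_key.encrypt(g) for g in grads]
                     for grads in local_gradients]

# Server: add ciphertexts component-wise; it learns nothing about the
# plaintext gradients.
aggregated = encrypted_uploads[0]
for upload in encrypted_uploads[1:]:
    aggregated = [acc + ct for acc, ct in zip(aggregated, upload)]

# Participants: decrypt the aggregate and apply the weight update locally.
summed_gradient = [private_key.decrypt(ct) for ct in aggregated]
print(summed_gradient)  # approximately [0.39, -0.29, -0.10]
```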


Citations
Journal ArticleDOI
TL;DR: This work introduces a comprehensive secure federated-learning framework, which includes horizontal federated learning, vertical federated learning, and federated transfer learning, and provides a comprehensive survey of existing works on this subject.
Abstract: Today’s artificial intelligence still faces two major challenges. One is that, in most industries, data exists in the form of isolated islands. The other is the strengthening of data privacy and security. We propose a possible solution to these challenges: secure federated learning. Beyond the federated-learning framework first proposed by Google in 2016, we introduce a comprehensive secure federated-learning framework, which includes horizontal federated learning, vertical federated learning, and federated transfer learning. We provide definitions, architectures, and applications for the federated-learning framework, and provide a comprehensive survey of existing works on this subject. In addition, we propose building data networks among organizations based on federated mechanisms as an effective solution to allowing knowledge to be shared without compromising user privacy.
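
As a concrete illustration of the horizontal setting surveyed here, the sketch below shows one round of server-side weighted averaging of client models, in the spirit of federated averaging. The function name, client sizes, and parameter vectors are illustrative assumptions, not code from the survey.

```python
# Sketch of one aggregation round in horizontal federated learning:
# clients sharing the same feature space submit locally trained weights,
# and the server averages them weighted by local dataset size.
import numpy as np

def federated_average(client_weights, client_sizes):
    """Weighted average of flattened client parameter vectors."""
    total = sum(client_sizes)
    stacked = np.stack(client_weights)        # shape: (num_clients, num_params)
    weights = np.array(client_sizes) / total  # each client's contribution
    return weights @ stacked                  # aggregated global parameters

# Three clients, each holding a flattened parameter vector of length 4.
client_weights = [np.array([0.1, 0.2, 0.3, 0.4]),
                  np.array([0.0, 0.1, 0.2, 0.3]),
                  np.array([0.4, 0.4, 0.4, 0.4])]
client_sizes = [100, 300, 600]                # local training examples

global_weights = federated_average(client_weights, client_sizes)
print(global_weights)
```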

2,593 citations


Cites background or methods from "Privacy-Preserving Deep Learning vi..."

  • ...The above architecture is proved to protect data leakage against the semi-honest server if gradient aggregation is done with SMC [9] or homomorphic encryption [51].... (A minimal sketch of mask-based secure aggregation follows this list.)


  • ...However, no security guarantee is provided and the leakage of these gradients may actually leak important data information [51] when exposed together with data structure, such as in the case of image pixels....


  • ...A horizontal federated learning system typically assumes honest participants and security against an honest-but-curious server [9, 51]....


  • ...The authors of [51] used additively homomorphic encryption to preserve the privacy of gradients and enhance the security of the system....


  • ...A typical assumption is that the participants are honest whereas the server is honest but curious; therefore, no leakage of information from any participants to the server is allowed [51]....

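The first bullet above mentions gradient aggregation with SMC [9]; the sketch below illustrates the basic idea behind such mask-based secure aggregation: pairs of clients agree on random masks that cancel in the server's sum, so the server only learns the aggregate. This is a simplified illustration under the honest-but-curious model and omits the dropout handling of the full protocol; the names and values are illustrative.

```python
# Simplified sketch of mask-based secure aggregation: each pair of clients
# shares a random mask; one adds it, the other subtracts it, so the masks
# cancel in the sum and the server only ever sees masked updates.
import numpy as np

rng = np.random.default_rng(0)
num_clients, dim = 3, 4
true_updates = rng.normal(size=(num_clients, dim))    # toy local updates

# pair_masks[(i, j)] is the mask shared between clients i and j (i < j).
pair_masks = {(i, j): rng.normal(size=dim)
              for i in range(num_clients) for j in range(i + 1, num_clients)}

def masked_update(i):
    """What client i actually sends to the server."""
    masked = true_updates[i].copy()
    for (a, b), mask in pair_masks.items():
        if a == i:
            masked += mask        # lower-indexed client adds the mask
        elif b == i:
            masked -= mask        # higher-indexed client subtracts it
    return masked

server_sum = sum(masked_update(i) for i in range(num_clients))
assert np.allclose(server_sum, true_updates.sum(axis=0))   # masks cancel
print(server_sum)
```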

Posted Content
TL;DR: This work proposes building data networks among organizations based on federated mechanisms as an effective solution to allow knowledge to be shared without compromising user privacy.
Abstract: Today's AI still faces two major challenges. One is that in most industries, data exists in the form of isolated islands. The other is the strengthening of data privacy and security. We propose a possible solution to these challenges: secure federated learning. Beyond the federated learning framework first proposed by Google in 2016, we introduce a comprehensive secure federated learning framework, which includes horizontal federated learning, vertical federated learning and federated transfer learning. We provide definitions, architectures and applications for the federated learning framework, and provide a comprehensive survey of existing works on this subject. In addition, we propose building data networks among organizations based on federated mechanisms as an effective solution to allow knowledge to be shared without compromising user privacy.

1,317 citations


Cites background or methods from "Privacy-Preserving Deep Learning vi..."

  • ...Security Analysis. The above architecture is proved to protect data leakage against the semi-honest server, if gradient aggregation is done with SMC [9] or Homomorphic Encryption [51]. But it may be subject to attack in another security model by a malicious participant training a Generative Adversarial Network (GAN) in the collaborative learning process [29]....


  • ...from an optimization algorithm like Stochastic Gradient Descent (SGD) [41, 58]; however, no security guarantee is provided and the leakage of these gradients may actually leak important data information [51] when exposed together with the data structure, such as in the case of image pixels. Researchers have considered the situation when one of the members of a federated learning system maliciously attacks others....


  • ...the centralized model together with other data owners. A secure aggregation scheme to protect the privacy of aggregated user updates under their federated learning framework is also introduced [9]. Ref. [51] uses additively homomorphic encryption for model parameter aggregation to provide security against the central server. In [60], a multi-task style federated learning system is proposed to allow multiple sites to com....


  • ...X_i = X_j, Y_i = Y_j, I_i ≠ I_j, ∀ D_i, D_j, i ≠ j (2). Security Definition. A horizontal federated learning system typically assumes honest participants and security against an honest-but-curious server [9, 51]. That is, only the server can compromise....


  • ...or cloud server. A typical assumption is that the participants are honest whereas the server is honest-but-curious, therefore no leakage of information from any participants to the server is allowed [51]. The training process of such a system usually contains the following four steps: Step 1: participants locally compute training gradients, mask a selection of gradients with encryption [51], differential privacy....


Posted Content
TL;DR: In a large-scale and complex mobile edge network, heterogeneous devices with varying constraints are involved, which raises challenges of communication costs, resource allocation, and privacy and security in the implementation of FL at scale.
Abstract: In recent years, mobile devices are equipped with increasingly advanced sensing and computing capabilities. Coupled with advancements in Deep Learning (DL), this opens up countless possibilities for meaningful applications. Traditional cloud-based Machine Learning (ML) approaches require the data to be centralized in a cloud server or data center. However, this results in critical issues related to unacceptable latency and communication inefficiency. To this end, Mobile Edge Computing (MEC) has been proposed to bring intelligence closer to the edge, where data is produced. However, conventional enabling technologies for ML at mobile edge networks still require personal data to be shared with external parties, e.g., edge servers. Recently, in light of increasingly stringent data privacy legislation and growing privacy concerns, the concept of Federated Learning (FL) has been introduced. In FL, end devices use their local data to train an ML model required by the server. The end devices then send the model updates rather than raw data to the server for aggregation. FL can serve as an enabling technology in mobile edge networks since it enables the collaborative training of an ML model and also enables DL for mobile edge network optimization. However, in a large-scale and complex mobile edge network, heterogeneous devices with varying constraints are involved. This raises challenges of communication costs, resource allocation, and privacy and security in the implementation of FL at scale. In this survey, we begin with an introduction to the background and fundamentals of FL. Then, we highlight the aforementioned challenges of FL implementation and review existing solutions. Furthermore, we present the applications of FL for mobile edge network optimization. Finally, we discuss the important challenges and future research directions in FL.

701 citations


Cites methods from "Privacy-Preserving Deep Learning vi..."

  • ...Although both the encryption techniques presented in [153] and [79] can prevent the curious server from extracting information....


  • ...In [153], the homomorphic encryption technique is introduced to protect the privacy of participants’ shared parameters from an honest-but-curious server....


Journal ArticleDOI
TL;DR: This paper aims to provide a comprehensive study concerning FL’s security and privacy aspects that can help bridge the gap between the current state of federated AI and a future in which mass adoption is possible.

565 citations

Book ChapterDOI
21 Jun 2019
TL;DR: In this paper, the authors show that they can obtain the private training set from the publicly shared gradients, a leakage they call deep leakage from gradients, and practically validate the effectiveness of their algorithm on both computer vision and natural language processing tasks.
Abstract: Passing gradients is a widely used scheme in modern multi-node learning systems (e.g., distributed training, collaborative learning). For a long time, people believed that gradients are safe to share: i.e., the training set will not be leaked by gradient sharing. However, in this paper, we show that we can obtain the private training set from the publicly shared gradients. The leakage only takes a few gradient steps to process and recovers the original training set rather than look-alike alternatives. We name this leakage deep leakage from gradients and practically validate the effectiveness of our algorithm on both computer vision and natural language processing tasks. We empirically show that our attack is much stronger than previous approaches and thereby raise people's awareness to rethink the safety of gradients. We also discuss possible strategies to defend against this deep leakage.
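
The attack described above amounts to a gradient-matching optimization: the adversary initializes dummy data and adjusts it so that the gradients it induces match the gradients the victim shared. The code below is a minimal PyTorch sketch of that idea on a toy linear model; for simplicity the label is assumed known, whereas the paper also recovers the label, and the optimizer and model are illustrative choices rather than the authors' setup.

```python
# Minimal sketch of "deep leakage from gradients": recover a private input
# by optimizing dummy data whose gradients match the victim's shared
# gradients. Toy linear model; the label is assumed known for simplicity.
import torch

torch.manual_seed(0)
model = torch.nn.Linear(8, 3)            # toy model with known weights
criterion = torch.nn.CrossEntropyLoss()

# Victim computes and shares gradients on one private example.
x_true = torch.randn(1, 8)
y_true = torch.tensor([1])
shared_grads = torch.autograd.grad(criterion(model(x_true), y_true),
                                   model.parameters())

# Attacker optimizes a dummy input to reproduce the shared gradients.
x_dummy = torch.randn(1, 8, requires_grad=True)
optimizer = torch.optim.Adam([x_dummy], lr=0.1)

for step in range(500):
    optimizer.zero_grad()
    dummy_loss = criterion(model(x_dummy), y_true)
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(),
                                      create_graph=True)
    # Gradient-matching objective: squared distance between gradient sets.
    match = sum(((dg - sg) ** 2).sum()
                for dg, sg in zip(dummy_grads, shared_grads))
    match.backward()
    optimizer.step()

print(torch.norm(x_dummy.detach() - x_true))   # small value => input recovered
```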

450 citations

References
Proceedings Article
01 Jan 2010
TL;DR: Adaptive subgradient methods as discussed by the authors dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning, which allows us to find needles in haystacks in the form of very predictive but rarely seen features.
Abstract: We present a new family of subgradient methods that dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning. Metaphorically, the adaptation allows us to find needles in haystacks in the form of very predictive but rarely seen features. Our paradigm stems from recent advances in stochastic optimization and online learning which employ proximal functions to control the gradient steps of the algorithm. We describe and analyze an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal function that can be chosen in hindsight. We give several efficient algorithms for empirical risk minimization problems with common and important regularization functions and domain constraints. We experimentally study our theoretical analysis and show that adaptive subgradient methods outperform state-of-the-art, yet non-adaptive, subgradient algorithms.
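
The per-coordinate adaptation described in the abstract scales each coordinate's step by the inverse square root of its accumulated squared gradients, so rarely seen but informative features receive larger steps. A minimal sketch of that diagonal update follows; the learning rate, epsilon, and toy objective are illustrative assumptions.

```python
# Minimal sketch of the diagonal AdaGrad update: each parameter's step is
# scaled by the accumulated squared gradients observed for that parameter.
import numpy as np

def adagrad_step(params, grad, accum, lr=0.1, eps=1e-8):
    accum += grad ** 2                            # running sum of squared grads
    params -= lr * grad / (np.sqrt(accum) + eps)  # per-coordinate step size
    return params, accum

# Toy usage: minimize f(x) = ||x||^2, whose gradient is 2x.
x = np.array([1.0, -2.0, 0.5])
accum = np.zeros_like(x)
for _ in range(200):
    x, accum = adagrad_step(x, 2 * x, accum)
print(x)   # moves toward the minimizer at the origin
```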

7,244 citations

Book ChapterDOI
02 May 1999
TL;DR: A new trapdoor mechanism is proposed and three encryption schemes are derived: a trapdoor permutation and two homomorphic probabilistic encryption schemes computationally comparable to RSA, which are provably secure under appropriate assumptions in the standard model.
Abstract: This paper investigates a novel computational problem, namely the Composite Residuosity Class Problem, and its applications to public-key cryptography. We propose a new trapdoor mechanism and derive from this technique three encryption schemes : a trapdoor permutation and two homomorphic probabilistic encryption schemes computationally comparable to RSA. Our cryptosystems, based on usual modular arithmetics, are provably secure under appropriate assumptions in the standard model.
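
Since this is the additively homomorphic scheme the privacy-preserving system relies on, a self-contained toy sketch may help: with g = n + 1, multiplying two ciphertexts modulo n² yields a ciphertext of the sum of the plaintexts. The primes below are tiny and purely illustrative, so the code offers no security; it only demonstrates the algebra.

```python
# Toy Paillier cryptosystem (g = n + 1 variant) illustrating the additive
# homomorphism: Enc(m1) * Enc(m2) mod n^2 decrypts to m1 + m2.
# Tiny, insecure parameters for illustration only.
import math
import secrets

p, q = 1789, 1907                   # toy primes; real keys use ~1024-bit primes
n = p * q
n_sq = n * n
lam = math.lcm(p - 1, q - 1)        # Carmichael's lambda(n)
mu = pow(lam, -1, n)                # modular inverse of lambda modulo n

def encrypt(m):
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:      # r must be invertible modulo n
        r = secrets.randbelow(n - 1) + 1
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c):
    u = pow(c, lam, n_sq)
    return ((u - 1) // n * mu) % n  # L(u) = (u - 1) / n, then scale by mu

c1, c2 = encrypt(123), encrypt(456)
assert decrypt((c1 * c2) % n_sq) == 123 + 456    # additive homomorphism
print(decrypt(c1), decrypt(c2))
```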

7,008 citations


"Privacy-Preserving Deep Learning vi..." refers background in this paper

  • ...For decryption and CPA security, see the paper [25]....


01 Jan 2011
TL;DR: A new benchmark dataset for research use is introduced containing over 600,000 labeled digits cropped from Street View images, and variants of two recently proposed unsupervised feature learning methods are employed, finding that they are convincingly superior on benchmarks.
Abstract: Detecting and reading text from natural images is a hard computer vision task that is central to a variety of emerging applications. Related problems like document character recognition have been widely studied by computer vision and machine learning researchers and are virtually solved for practical applications like reading handwritten digits. Reliably recognizing characters in more complex scenes like photographs, however, is far more difficult: the best existing methods lag well behind human performance on the same tasks. In this paper we attack the problem of recognizing digits in a real application using unsupervised feature learning methods: reading house numbers from street level photos. To this end, we introduce a new benchmark dataset for research use containing over 600,000 labeled digits cropped from Street View images. We then demonstrate the difficulty of recognizing these digits when the problem is approached with hand-designed features. Finally, we employ variants of two recently proposed unsupervised feature learning methods and find that they are convincingly superior on our benchmarks.

5,311 citations

Proceedings Article
03 Dec 2012
TL;DR: This paper considers the problem of training a deep network with billions of parameters using tens of thousands of CPU cores and develops two algorithms for large-scale distributed training, Downpour SGD and Sandblaster L-BFGS, which increase the scale and speed of deep network training.
Abstract: Recent work in unsupervised feature learning and deep learning has shown that being able to train large models can dramatically improve performance. In this paper, we consider the problem of training a deep network with billions of parameters using tens of thousands of CPU cores. We have developed a software framework called DistBelief that can utilize computing clusters with thousands of machines to train large models. Within this framework, we have developed two algorithms for large-scale distributed training: (i) Downpour SGD, an asynchronous stochastic gradient descent procedure supporting a large number of model replicas, and (ii) Sandblaster, a framework that supports a variety of distributed batch optimization procedures, including a distributed implementation of L-BFGS. Downpour SGD and Sandblaster L-BFGS both increase the scale and speed of deep network training. We have successfully used our system to train a deep network 30x larger than previously reported in the literature, achieving state-of-the-art performance on ImageNet, a visual object recognition task with 16 million images and 21k categories. We show that these same techniques dramatically accelerate the training of a more modestly-sized deep network for a commercial speech recognition service. Although we focus on and report performance of these methods as applied to training large neural networks, the underlying algorithms are applicable to any gradient-based machine learning algorithm.
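
The Downpour-style asynchrony described above can be caricatured as workers that independently fetch the current parameters, compute a gradient on their own data shard, and push an update back without any global lock-step. The threaded sketch below uses a toy quadratic objective and illustrative names; it only shows the fetch/compute/push pattern, not the DistBelief system itself.

```python
# Toy sketch of asynchronous SGD with a shared parameter store: workers
# fetch (possibly stale) parameters, compute a gradient on their own data
# shard, and push updates without synchronizing with each other.
import threading
import numpy as np

params = np.array([5.0, -3.0])           # shared "parameter server" state
lock = threading.Lock()                  # serializes pushes only
shards = [np.random.default_rng(i).normal(size=(50, 2)) for i in range(4)]

def worker(shard, steps=100, lr=0.05):
    global params
    for _ in range(steps):
        local = params.copy()            # fetch possibly stale parameters
        # Gradient of the mean squared distance between params and the shard.
        grad = 2 * (local - shard).mean(axis=0)
        with lock:                       # push the update
            params -= lr * grad

threads = [threading.Thread(target=worker, args=(s,)) for s in shards]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(params)   # pulled toward the shard means (all near zero here)
```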

3,475 citations


"Privacy-Preserving Deep Learning vi..." refers background in this paper

  • ...1) Asynchronous SGD (ASGD) [16], [27], No Privacy Protection: Both our system and that of [28] rely on the fact that neural networks can be trained via a variant of SGD called asynchronous SGD [16], [27] with data parallelism and model parallelism....


  • ...Our system achieves identical accuracy to a corresponding deep learning system (i.e., asynchronous SGD (ASGD)) trained over the joint dataset of all participants....


  • ...3) Our System: Our system can be called gradients-encrypted ASGD for the following reasons....


  • ...can be called gradients-selective ASGD for the following reasons....


  • ...1) Asynchronous SGD (ASGD) [16], [27], No Privacy Protection....


Proceedings ArticleDOI
24 Oct 2016
TL;DR: In this paper, the authors develop new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy, and demonstrate that they can train deep neural networks with nonconvex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality.
Abstract: Machine learning techniques based on neural networks are achieving remarkable results in a wide variety of domains. Often, the training of models requires large, representative datasets, which may be crowdsourced and contain sensitive information. The models should not expose private information in these datasets. Addressing this goal, we develop new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy. Our implementation and experiments demonstrate that we can train deep neural networks with non-convex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality.
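
The technique described above combines per-example gradient clipping with Gaussian noise before averaging. The sketch below shows that core step on a toy logistic-regression gradient; the clip norm, noise multiplier, and model are illustrative, and the paper's privacy accounting (the moments accountant) is omitted.

```python
# Minimal sketch of the core DP-SGD step: clip each per-example gradient
# to a fixed L2 norm, add Gaussian noise to the clipped sum, then average.
import numpy as np

rng = np.random.default_rng(0)

def dp_sgd_step(w, X, y, lr=0.1, clip=1.0, noise_multiplier=1.1):
    grads = []
    for xi, yi in zip(X, y):                       # per-example gradients
        pred = 1.0 / (1.0 + np.exp(-xi @ w))       # logistic regression
        g = (pred - yi) * xi
        g *= min(1.0, clip / (np.linalg.norm(g) + 1e-12))   # clip L2 norm
        grads.append(g)
    noisy_sum = np.sum(grads, axis=0) + rng.normal(
        scale=noise_multiplier * clip, size=w.shape)         # Gaussian noise
    return w - lr * noisy_sum / len(X)             # average and take a step

X = rng.normal(size=(64, 5))
y = (X[:, 0] > 0).astype(float)
w = np.zeros(5)
for _ in range(100):
    w = dp_sgd_step(w, X, y)
print(w)
```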

2,944 citations