Journal ArticleDOI

DS-ADMM++: A Novel Distributed Quantized ADMM to Speed up Differentially Private Matrix Factorization

TL;DR: Wang et al. integrate the local differential privacy paradigm into DS-ADMM to provide the privacy-preserving property, and introduce a stochastic quantization function to reduce transmission overheads in ADMM and further improve efficiency.
Abstract: Matrix factorization is a powerful method for implementing collaborative filtering recommender systems. This article addresses two major challenges facing matrix factorization: privacy and efficiency. We base our work on DS-ADMM, a distributed matrix factorization algorithm with good efficiency, and make two contributions: (1) we integrate the local differential privacy paradigm into DS-ADMM to provide the privacy-preserving property; (2) we introduce a stochastic quantization function to reduce transmission overheads in ADMM and further improve efficiency. We name our work DS-ADMM++, in which one '+' refers to differential privacy and the other '+' refers to quantization techniques. DS-ADMM++ is the first approach to perform efficient and private matrix factorization in the combined setting of differential privacy and DS-ADMM. We conduct experiments on benchmark data sets to demonstrate that our approach provides differential privacy and excellent scalability with only a modest loss of accuracy.
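The stochastic quantization idea can be illustrated with a minimal sketch (not the paper's implementation; the function name, level count, and clipping range are assumptions): an unbiased quantizer rounds each transmitted value up or down to a grid point, with probabilities chosen so the expectation equals the input.

```python
import numpy as np

def stochastic_quantize(v, levels=16, lo=-1.0, hi=1.0):
    """Unbiased stochastic quantization: E[Q(v)] = v for v in [lo, hi]."""
    v = np.clip(v, lo, hi)
    step = (hi - lo) / (levels - 1)
    scaled = (v - lo) / step                  # position in units of quantization steps
    floor = np.floor(scaled)
    prob_up = scaled - floor                  # round up with this probability
    rounded = floor + (np.random.rand(*np.shape(v)) < prob_up)
    return lo + rounded * step
```

Because E[Q(v)] = v, the quantization noise averages out across iterations while each transmitted value compresses to log2(levels) bits.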
Citations
Journal ArticleDOI
TL;DR: Wang et al. propose a proximal alternating-direction-method-of-multipliers-based nonnegative latent factor analysis (PAN) model with two ideas: 1) adopting the principle of the alternating direction method of multipliers to implement an efficient learning scheme with fast convergence and high computational efficiency; and 2) incorporating proximal regularization into the learning scheme to suppress optimization fluctuation, yielding high representation learning accuracy on HDI data.
Abstract: High-dimensional and incomplete (HDI) data subject to nonnegativity constraints are commonly encountered in big data-related applications concerning interactions among numerous nodes. A nonnegative latent factor analysis (NLFA) model can perform representation learning on HDI data efficiently. However, existing NLFA models suffer from either a slow convergence rate or representation accuracy loss. To address this issue, this paper proposes a proximal alternating-direction-method-of-multipliers-based nonnegative latent factor analysis (PAN) model with two ideas: 1) adopting the principle of the alternating direction method of multipliers to implement an efficient learning scheme with fast convergence and high computational efficiency; and 2) incorporating proximal regularization into the learning scheme to suppress optimization fluctuation and achieve high representation learning accuracy on HDI data. Theoretical studies verify that PAN converges to a Karush-Kuhn-Tucker (KKT) stationary point of its nonnegativity-constrained learning objective. Experimental results on eight HDI matrices from real applications demonstrate that the proposed PAN model outperforms several state-of-the-art models in both estimation accuracy for the missing data of an HDI matrix and computational efficiency.
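The proximal idea can be illustrated with a single primal update for a nonnegativity-constrained least-squares subproblem (a hedged sketch, not the authors' algorithm; the names `prox_nn_update`, `rho`, and `tau` are assumptions): the tau-weighted term anchors the new iterate to the previous one, which is what suppresses optimization fluctuation.

```python
import numpy as np

def prox_nn_update(A, b, x_prev, rho=1.0, tau=0.5, z=None, u=None):
    """One proximal ADMM-style primal update for min ||Ax - b||^2 s.t. x >= 0.
    The tau-term penalizes distance from x_prev, damping oscillation."""
    n = A.shape[1]
    z = np.zeros(n) if z is None else z
    u = np.zeros(n) if u is None else u
    # Closed-form minimizer of the regularized quadratic, then project to x >= 0.
    H = A.T @ A + (rho + tau) * np.eye(n)
    g = A.T @ b + rho * (z - u) + tau * x_prev
    return np.maximum(np.linalg.solve(H, g), 0.0)
```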

3 citations

Journal ArticleDOI
TL;DR: This paper provides a detailed analysis of the state of the art in collaborative ML approaches from a privacy perspective, giving a detailed threat model and security and privacy considerations for each collaborative method.
Abstract: As machine learning and artificial intelligence (ML/AI) become more popular and advanced, there is a desire to turn sensitive data into valuable information via ML/AI techniques, revealing only data permitted by the concerned parties or without revealing any information about the data to third parties. Collaborative ML approaches such as federated learning (FL) help address these needs and concerns, bringing a way to use sensitive data without disclosing critically sensitive features of that data. In this paper, we provide a detailed analysis of the state of the art in collaborative ML approaches from a privacy perspective. A detailed threat model and security and privacy considerations are given for each collaborative method. We deeply analyze Privacy Enhancing Technologies (PETs), covering secure multi-party computation (SMPC), homomorphic encryption (HE), differential privacy (DP), and confidential computing (CC), in the context of collaborative ML. We introduce a guideline on the selection of privacy-preserving technologies for collaborative ML and privacy practitioners. This study constitutes the first survey to provide an in-depth focus on collaborative ML requirements and constraints for privacy solutions while also providing guidelines on the selection of PETs.

2 citations

Journal ArticleDOI
TL;DR: A practical SLF (PSLF) model is proposed that realizes hyperparameter self-adaptation with a distributed particle swarm optimizer (DPSO), which is gradient-free and parallelized; experiments indicate that the PSLF model has a competitive advantage over state-of-the-art models in data representation ability.
Abstract: Latent factor (LF) models are effective in representing high-dimensional and sparse (HiDS) data via low-rank matrix approximation. Hessian-free (HF) optimization is an efficient method for utilizing second-order information of an LF model's objective function, and it has been used to optimize the second-order LF (SLF) model. However, the low-rank representation ability of an SLF model relies heavily on its multiple hyperparameters. Determining these hyperparameters is time-consuming, which largely reduces the practicability of an SLF model. To address this issue, a practical SLF (PSLF) model is proposed in this work. It realizes hyperparameter self-adaptation with a distributed particle swarm optimizer (DPSO), which is gradient-free and parallelized. Experiments on real HiDS data sets indicate that the PSLF model has a competitive advantage over state-of-the-art models in data representation ability.
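The gradient-free hyperparameter search can be sketched with a minimal single-machine particle swarm optimizer; the distributed DPSO in the paper splits particles across nodes, but the update rule is the same in spirit. Function name, swarm size, and coefficients below are assumptions:

```python
import numpy as np

def pso_tune(loss, bounds, n_particles=10, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimizer: gradient-free search over hyperparameters."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds).T
    x = rng.uniform(lo, hi, (n_particles, len(lo)))
    v = np.zeros_like(x)
    pbest, pbest_val = x.copy(), np.array([loss(p) for p in x])
    g = pbest[pbest_val.argmin()].copy()           # global best position
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([loss(p) for p in x])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        g = pbest[pbest_val.argmin()].copy()
    return g
```

Here `loss` would be, e.g., the validation RMSE of the SLF model at a given hyperparameter vector.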
Proceedings ArticleDOI
15 Dec 2022
TL;DR: A distributed adaptive SLF (DASLF) model is proposed that realizes hyperparameter self-adaptation with a distributed particle swarm optimizer (DPSO), which is gradient-free and parallelized.
Abstract: Latent factor (LF) models are effective in representing high-dimensional and sparse (HiDS) data via low-rank matrix approximation. Building an LF model is a large-scale non-convex problem. Hessian-free (HF) optimization is an efficient method for utilizing second-order information of an LF model's objective function, and it has been used to optimize the second-order LF (SLF) model. However, the low-rank representation ability of an SLF model relies heavily on its multiple hyperparameters. Determining these hyperparameters is time-consuming, which largely reduces the practicability of an SLF model. To address this issue, a distributed adaptive SLF (DASLF) model is proposed in this work. It realizes hyperparameter self-adaptation with a distributed particle swarm optimizer (DPSO), which is gradient-free and parallelized. Experiments on real HiDS data sets indicate that the DASLF model has a competitive advantage over state-of-the-art models in data representation ability.
Proceedings ArticleDOI
17 Jul 2022
TL;DR: The authors propose a deblurring algorithm based on the deep unfolding method, a combination of traditional algorithms and neural networks, which achieves good performance while remaining interpretable.
Abstract: Due to atmospheric turbulence, defocusing, noise, and other factors, acquired optical remote sensing images may be blurred, so it is critical to deblur them algorithmically. In recent years, neural network algorithms have shown excellent performance in optical remote sensing image deblurring. However, neural network algorithms also have limitations: they lack interpretability and need large amounts of training samples. Traditional deblurring algorithms are interpretable, but their performance is not as good as that of neural network algorithms. To obtain an interpretable deblurring algorithm with good performance, this paper proposes a deblurring algorithm based on the deep unfolding method, a combination of traditional algorithms and neural networks, which achieves good performance while remaining interpretable. We demonstrate the effectiveness of the algorithm on remote sensing datasets with PSNR values and visual deblurring results. The experiments show that the proposed algorithm achieves better deblurring results.
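Deep unfolding can be illustrated by unrolling the classic ISTA sparse-recovery iteration into a fixed number of "layers" (a generic sketch, not the paper's network; in a trained unfolded network the per-layer thresholds, and often the step size and filters, become learnable parameters):

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def unrolled_ista(A, y, thetas, step):
    """K unrolled ISTA iterations, one 'layer' per entry of thetas.
    Deep unfolding would learn thetas (and possibly step) from data."""
    x = np.zeros(A.shape[1])
    for theta in thetas:
        x = soft_threshold(x - step * A.T @ (A @ x - y), theta)
    return x
```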
References
Book
23 May 2011
TL;DR: It is argued that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas.
Abstract: Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this review, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas–Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for l1 problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.
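One of the applications surveyed, the lasso, shows the splitting pattern concisely. Below is a minimal scaled-form ADMM sketch (fixed `rho`, no stopping criterion; the names are my own):

```python
import numpy as np

def lasso_admm(A, b, lam, rho=1.0, iters=100):
    """ADMM for the lasso: min 0.5*||Ax - b||^2 + lam*||z||_1  s.t.  x = z."""
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    # The x-update solves the same linear system every iteration.
    H = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(iters):
        x = np.linalg.solve(H, Atb + rho * (z - u))                      # x-update
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # soft-threshold
        u = u + x - z                                                    # scaled dual update
    return z
```

The x-update is a linear solve, the z-update is elementwise soft-thresholding, and u accumulates the residual of the constraint x = z.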

17,433 citations

Journal ArticleDOI
TL;DR: As the Netflix Prize competition has demonstrated, matrix factorization models are superior to classic nearest neighbor techniques for producing product recommendations, allowing the incorporation of additional information such as implicit feedback, temporal effects, and confidence levels.
Abstract: As the Netflix Prize competition has demonstrated, matrix factorization models are superior to classic nearest neighbor techniques for producing product recommendations, allowing the incorporation of additional information such as implicit feedback, temporal effects, and confidence levels.
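The basic factorization behind such recommenders can be sketched as SGD over the observed ratings (a minimal sketch without the bias, implicit-feedback, and temporal terms the article discusses; names and defaults are assumptions):

```python
import numpy as np

def mf_sgd(ratings, n_users, n_items, k=8, lr=0.01, reg=0.05, epochs=200, seed=0):
    """Factor a rating matrix as P @ Q.T by SGD on observed (user, item, rating) triples."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))
    Q = 0.1 * rng.standard_normal((n_items, k))
    for _ in range(epochs):
        for u, i, r in ratings:
            e = r - P[u] @ Q[i]                  # prediction error on this rating
            P[u] += lr * (e * Q[i] - reg * P[u])
            Q[i] += lr * (e * P[u] - reg * Q[i])
    return P, Q
```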

9,583 citations

Book ChapterDOI
04 Mar 2006
TL;DR: In this article, the authors show that for several particular applications substantially less noise is needed than was previously understood to be the case, and also obtain separation results showing the increased value of interactive sanitization mechanisms over non-interactive ones.
Abstract: We continue a line of research initiated in [10,11] on privacy-preserving statistical databases. Consider a trusted server that holds a database of sensitive information. Given a query function f mapping databases to reals, the so-called true answer is the result of applying f to the database. To protect privacy, the true answer is perturbed by the addition of random noise generated according to a carefully chosen distribution, and this response, the true answer plus noise, is returned to the user. Previous work focused on the case of noisy sums, in which f = Σ_i g(x_i), where x_i denotes the i-th row of the database and g maps database rows to [0,1]. We extend the study to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f. Roughly speaking, this is the amount by which any single argument to f can change its output. The new analysis shows that for several particular applications substantially less noise is needed than was previously understood to be the case. The first step is a very clean characterization of privacy in terms of indistinguishability of transcripts. Additionally, we obtain separation results showing the increased value of interactive sanitization mechanisms over non-interactive ones.
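The calibration described above is the Laplace mechanism: add noise with scale Δf/ε, where Δf is the sensitivity of the query. A minimal sketch (function and parameter names are my own):

```python
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=None):
    """Release f(DB) + Lap(sensitivity / epsilon): epsilon-DP for a query
    whose output changes by at most `sensitivity` when one row changes."""
    rng = np.random.default_rng() if rng is None else rng
    scale = sensitivity / epsilon
    return true_answer + rng.laplace(0.0, scale)
```

For a counting query, sensitivity is 1, so noise with scale 1/ε suffices regardless of the database size.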

6,211 citations

Book ChapterDOI
Cynthia Dwork1
10 Jul 2006
TL;DR: In this article, the authors give a general impossibility result showing that a formalization of Dalenius' goal along the lines of semantic security cannot be achieved, and suggest a new measure, differential privacy, which, intuitively, captures the increased risk to one's privacy incurred by participating in a database.
Abstract: In 1977 Dalenius articulated a desideratum for statistical databases: nothing about an individual should be learnable from the database that cannot be learned without access to the database. We give a general impossibility result showing that a formalization of Dalenius' goal along the lines of semantic security cannot be achieved. Contrary to intuition, a variant of the result threatens the privacy even of someone not in the database. This state of affairs suggests a new measure, differential privacy, which, intuitively, captures the increased risk to one's privacy incurred by participating in a database. The techniques developed in a sequence of papers [8, 13, 3], culminating in those described in [12], can achieve any desired level of privacy under this measure. In many cases, extremely accurate information about the database can be provided while simultaneously ensuring very high levels of privacy.
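A concrete mechanism satisfying this measure (not from this paper; randomized response predates it, and the helper names are mine) is per-user bit flipping: each user's report shifts the output distribution by at most a factor of e^ε, and the aggregate can still be debiased.

```python
import numpy as np

def randomized_response(bit, epsilon, rng):
    """Report the true bit with prob e^eps / (1 + e^eps), else flip it.
    This classic mechanism satisfies epsilon-differential privacy."""
    p_truth = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit

def debias(responses, epsilon):
    """Unbiased estimate of the true proportion of 1s from noisy reports."""
    p = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    return (np.mean(responses) - (1 - p)) / (2 * p - 1)
```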

4,134 citations

Proceedings ArticleDOI
18 May 2008
TL;DR: This work applies the de-anonymization methodology to the Netflix Prize dataset, which contains anonymous movie ratings of 500,000 subscribers of Netflix, the world's largest online movie rental service, and demonstrates that an adversary who knows only a little bit about an individual subscriber can easily identify this subscriber's record in the dataset.
Abstract: We present a new class of statistical de-anonymization attacks against high-dimensional micro-data, such as individual preferences, recommendations, transaction records, and so on. Our techniques are robust to perturbation in the data and tolerate some mistakes in the adversary's background knowledge. We apply our de-anonymization methodology to the Netflix Prize dataset, which contains anonymous movie ratings of 500,000 subscribers of Netflix, the world's largest online movie rental service. We demonstrate that an adversary who knows only a little bit about an individual subscriber can easily identify this subscriber's record in the dataset. Using the Internet Movie Database as the source of background knowledge, we successfully identified the Netflix records of known users, uncovering their apparent political preferences and other potentially sensitive information.

2,241 citations