scispace - formally typeset
Search or ask a question
Author

Erkang Xue

Bio: Erkang Xue is an academic researcher from China University of Geosciences (Wuhan). The author has contributed to research in topics: Differential privacy & Matrix decomposition. The author has an hindex of 2, co-authored 3 publications receiving 6 citations.

Papers
More filters
Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper integrated local differential privacy paradigm into DS-ADMM to provide the privacy-preserving property and introduced a stochastic quantized function to reduce transmission overheads in ADMM to further improve efficiency.
Abstract: Matrix factorization is a powerful method to implement collaborative filtering recommender systems. This article addresses two major challenges, privacy and efficiency, which matrix factorization is facing. We based our work on DS-ADMM, a distributed matrix factorization algorithm with decent efficiency, to achieve the following two pieces of work: (1) Integrated local differential privacy paradigm into DS-ADMM to provide the privacy-preserving property; (2) Introduced a stochastic quantized function to reduce transmission overheads in ADMM to further improve efficiency. We named our work DS-ADMM++, in which one ’+’ refers to differential privacy, and the other ’+’ refers to quantized techniques. DS-ADMM++ is the first to perform efficient and private matrix factorization under the scenarios of differential privacy and DS-ADMM. We conducted experiments with benchmark data sets to demonstrate that our approach provides differential privacy and excellent scalability with a decent loss of accuracy.

13 citations

Journal ArticleDOI
TL;DR: This paper defines a tighter bound for binomial distribution and central limit theorem, and indicates that the reliability of the bound is related to the deviation of data, which can be measured by the data’s coefficient of standard deviation.
Abstract: A count-min sketch is a probabilistic data structure, which serves as a frequency table of events to process a stream of big data. It uses hash functions to map events to frequencies. Querying a count-min sketch returns the targeted event along with an estimated frequency, which is not less than the actual frequency. The estimated error, i.e., the difference between the estimated frequency and the actual, can be measured by a pre-defined confidence bound. However, the bound originally defined is too loose. The reason is that the Markov inequality used to derive the bound does not perform well. In this paper, based on binomial distribution and central limit theorem, we define a tighter bound. We indicate that the reliability of the bound is related to the deviation of data, which can be measured by the data’s coefficient of standard deviation. Our extensive experiments well support the effectiveness and efficiency of the new bound.

4 citations

Proceedings ArticleDOI
01 Aug 2019
TL;DR: The novelty of the work rests in that it is the first to successfully integrate the two innovative techniques to address both privacy and efficiency for MF.
Abstract: Matrix factorization (MF) is an essential technique to implement intelligent recommender systems widely applied in industry. Privacy and efficiency are two essential issues concerning MF. We leverage two techniques, differential privacy (DP) and distributed computing, to address the two concerns, respectively. (1) Differentially private MF is still challenging since conventional strategies lead to significant error accumulation; we adopt the objective function perturbation technique to tackle such a challenge. (2) We adopt the alternating direction method of multipliers (ADMM) framework to parallelize the factorization to improve performance; to implement this parallelization, we adopt the effective matrix split method and introduce a novel integration strategy for distributed DP based on the post-processing theorem. We identify our work as distributed differentially private MF based on ADMM. The novelty of the work rests in that it is the first to successfully integrate the two innovative techniques to address both privacy and efficiency for MF. We establish the mathematical model and conduct experiments to validate the soundness of our idea. The experimental results based on industrial datasets show that the distributed differentially private MF algorithm provides scalable speedup performance within a limited precision loss while preserving user privacy.

3 citations


Cited by
More filters
Posted Content
TL;DR: This survey provides a comprehensive and structured overview of the local differential privacy technology and summarise and analyze state-of-the-art research in LDP and compare a range of methods in the context of answering a variety of queries and training different machine learning models.
Abstract: With the fast development of Information Technology, a tremendous amount of data have been generated and collected for research and analysis purposes. As an increasing number of users are growing concerned about their personal information, privacy preservation has become an urgent problem to be solved and has attracted significant attention. Local differential privacy (LDP), as a strong privacy tool, has been widely deployed in the real world in recent years. It breaks the shackles of the trusted third party, and allows users to perturb their data locally, thus providing much stronger privacy protection. This survey provides a comprehensive and structured overview of the local differential privacy technology. We summarise and analyze state-of-the-art research in LDP and compare a range of methods in the context of answering a variety of queries and training different machine learning models. We discuss the practical deployment of local differential privacy and explore its application in various domains. Furthermore, we point out several research gaps, and discuss promising future research directions.

68 citations

01 Dec 2004
TL;DR: In this paper, the authors introduce a sublinear space data structure called the countmin sketch for summarizing data streams, which allows fundamental queries in data stream summarization such as point, range, and inner product queries to be approximately answered very quickly; in addition it can be applied to solve several important problems in data streams such as finding quantiles, frequent items, etc.
Abstract: We introduce a new sublinear space data structure--the count-min sketch--for summarizing data streams. Our sketch allows fundamental queries in data stream summarization such as point, range, and inner product queries to be approximately answered very quickly; in addition, it can be applied to solve several important problems in data streams such as finding quantiles, frequent items, etc. The time and space bounds we show for using the CM sketch to solve these problems significantly improve those previously known--typically from 1/e2 to 1/e in factor.

65 citations

Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper integrated local differential privacy paradigm into DS-ADMM to provide the privacy-preserving property and introduced a stochastic quantized function to reduce transmission overheads in ADMM to further improve efficiency.
Abstract: Matrix factorization is a powerful method to implement collaborative filtering recommender systems. This article addresses two major challenges, privacy and efficiency, which matrix factorization is facing. We based our work on DS-ADMM, a distributed matrix factorization algorithm with decent efficiency, to achieve the following two pieces of work: (1) Integrated local differential privacy paradigm into DS-ADMM to provide the privacy-preserving property; (2) Introduced a stochastic quantized function to reduce transmission overheads in ADMM to further improve efficiency. We named our work DS-ADMM++, in which one ’+’ refers to differential privacy, and the other ’+’ refers to quantized techniques. DS-ADMM++ is the first to perform efficient and private matrix factorization under the scenarios of differential privacy and DS-ADMM. We conducted experiments with benchmark data sets to demonstrate that our approach provides differential privacy and excellent scalability with a decent loss of accuracy.

13 citations

Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper integrated local differential privacy paradigm into DS-ADMM to provide the privacy-preserving property and introduced a stochastic quantized function to reduce transmission overheads in ADMM to further improve efficiency.
Abstract: Matrix factorization is a powerful method to implement collaborative filtering recommender systems. This article addresses two major challenges, privacy and efficiency, which matrix factorization is facing. We based our work on DS-ADMM, a distributed matrix factorization algorithm with decent efficiency, to achieve the following two pieces of work: (1) Integrated local differential privacy paradigm into DS-ADMM to provide the privacy-preserving property; (2) Introduced a stochastic quantized function to reduce transmission overheads in ADMM to further improve efficiency. We named our work DS-ADMM++, in which one ’+’ refers to differential privacy, and the other ’+’ refers to quantized techniques. DS-ADMM++ is the first to perform efficient and private matrix factorization under the scenarios of differential privacy and DS-ADMM. We conducted experiments with benchmark data sets to demonstrate that our approach provides differential privacy and excellent scalability with a decent loss of accuracy.

7 citations

Journal ArticleDOI
TL;DR: This paper provides a state-of-the-art overview of approaches used in recognizing relational intelligence — with special focus given to Online Social Networks (OSNs).

7 citations