Proceedings Article

Heavy Hitter Estimation over Set-Valued Data with Local Differential Privacy

Abstract
In local differential privacy (LDP), each user perturbs her data locally before sending the noisy data to a data collector. The latter then analyzes the data to obtain useful statistics. Unlike the setting of centralized differential privacy, in LDP the data collector never gains access to the exact values of sensitive data, which protects not only the privacy of data contributors but also the collector itself against the risk of potential data leakage. Existing LDP solutions in the literature are mostly limited to the case where each user possesses a tuple of numeric or categorical values, and the data collector computes basic statistics such as counts or mean values. To the best of our knowledge, no existing work tackles more complex data mining tasks such as heavy hitter discovery over set-valued data. In this paper, we present a systematic study of heavy hitter mining under LDP. We first review existing solutions, extend them to heavy hitter estimation, and explain why their effectiveness is limited. We then propose LDPMiner, a two-phase mechanism for obtaining accurate heavy hitters with LDP. The main idea is to first gather a candidate set of heavy hitters using a portion of the privacy budget, and focus the remaining budget on refining the candidate set in a second phase, which is much more efficient budget-wise than obtaining the heavy hitters directly from the whole dataset. We provide both in-depth theoretical analysis and extensive experiments to compare LDPMiner against adaptations of previous solutions. The results show that LDPMiner significantly improves over existing methods. More importantly, LDPMiner successfully identifies the majority of the true heavy hitters in practical settings.
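
The two-phase idea in the abstract can be illustrated with a minimal Python sketch. This is not the LDPMiner algorithm itself. It is a simplified stand-in that assumes each user samples one item from her set and reports it through generalized randomized response, with the budget split evenly between the phases. The function names, the `split` parameter, the `2 * k` candidate-set size, and the `DUMMY` placeholder are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter

DUMMY = None  # hypothetical placeholder reported when a user holds no candidate


def krr_perturb(value, domain, epsilon, rng):
    """Generalized randomized response: keep `value` with probability
    e^eps / (e^eps + m - 1), otherwise report a uniformly chosen other value."""
    m = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + m - 1)
    if rng.random() < p:
        return value
    while True:  # rejection-sample a value different from the true one
        other = rng.choice(domain)
        if other != value:
            return other


def krr_estimate(reports, domain, epsilon):
    """Unbiased estimate of how many users held each value before perturbation."""
    m, n = len(domain), len(reports)
    p = math.exp(epsilon) / (math.exp(epsilon) + m - 1)
    q = 1.0 / (math.exp(epsilon) + m - 1)
    counts = Counter(reports)
    return {v: (counts[v] - n * q) / (p - q) for v in domain}


def two_phase_heavy_hitters(user_sets, domain, k, epsilon, split=0.5, seed=0):
    """Sketch: find candidates with part of the budget, then refine with the rest.
    Assumes every user set is non-empty and its items are sortable."""
    rng = random.Random(seed)
    eps1 = epsilon * split
    eps2 = epsilon - eps1  # sequential composition: eps1 + eps2 = epsilon

    # Phase 1: each user reports one sampled item over the whole domain.
    reports1 = [krr_perturb(rng.choice(sorted(s)), domain, eps1, rng)
                for s in user_sets]
    est1 = krr_estimate(reports1, domain, eps1)
    candidates = sorted(domain, key=est1.get, reverse=True)[:2 * k]

    # Phase 2: each user reports one sampled item restricted to the candidates,
    # so the randomized-response domain shrinks and the noise drops.
    domain2 = candidates + [DUMMY]
    candidate_set = set(candidates)
    reports2 = []
    for s in user_sets:
        hits = sorted(s & candidate_set)
        value = rng.choice(hits) if hits else DUMMY
        reports2.append(krr_perturb(value, domain2, eps2, rng))
    est2 = krr_estimate(reports2, domain2, eps2)
    return sorted(candidates, key=est2.get, reverse=True)[:k]
```

Because users hold sets of different sizes, sampling a single item skews the estimates toward items held by users with small sets. The sketch ignores this bias, so its scores are only suitable for ranking, not as frequency estimates.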
