Open Access · Posted Content

Comparative Study of Differentially Private Synthetic Data Algorithms from the NIST PSCR Differential Privacy Synthetic Data Challenge

TLDR
In this paper, the authors present an in-depth evaluation of several differentially private synthetic data algorithms using actual differentially private synthetic data sets created by contestants in the 2018-2019 National Institute of Standards and Technology Public Safety Communications Research (NIST PSCR) Division's "Differential Privacy Synthetic Data Challenge."
Abstract
Differentially private synthetic data generation offers a recent solution to release analytically useful data while preserving the privacy of individuals in the data. In order to utilize these algorithms for public policy decisions, policymakers need an accurate understanding of these algorithms' comparative performance. Correspondingly, data practitioners require standard metrics for evaluating the analytic qualities of the synthetic data. In this paper, we present an in-depth evaluation of several differentially private synthetic data algorithms using actual differentially private synthetic data sets created by contestants in the 2018-2019 National Institute of Standards and Technology Public Safety Communications Research (NIST PSCR) Division's "Differential Privacy Synthetic Data Challenge." We offer analyses of these algorithms based on both the accuracy of the data they created and their usability by potential data providers. We frame the methods used in the NIST PSCR data challenge within the broader differentially private synthetic data literature. We implement additional utility metrics, including two of our own, on the differentially private synthetic data and compare mechanism utility across three categories. Our comparative assessment of the differentially private data synthesis methods and the quality metrics shows their relative usefulness and general strengths and weaknesses, and offers preferred choices of algorithms and metrics. Finally, we describe the implications of our evaluation for policymakers seeking to implement differentially private synthetic data algorithms on future data products.
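To make the notion of a utility metric concrete, below is a minimal sketch of the propensity-score mean-squared-error (pMSE), a common general-purpose utility measure in the synthetic data literature. It is shown for illustration only and is not necessarily one of the metrics implemented in the paper; the function name and the use of pandas/scikit-learn are assumptions.

```python
# Illustrative utility metric: propensity-score MSE (pMSE). A classifier is
# trained to distinguish real from synthetic records; if the synthetic data
# is analytically similar, predicted propensities stay near the synthetic
# fraction c and the pMSE is small.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def pmse(real: pd.DataFrame, synthetic: pd.DataFrame) -> float:
    combined = pd.concat([real, synthetic], ignore_index=True)
    labels = np.r_[np.zeros(len(real)), np.ones(len(synthetic))]  # 1 = synthetic
    X = pd.get_dummies(combined)  # one-hot encode categorical columns
    propensity = LogisticRegression(max_iter=1000).fit(X, labels).predict_proba(X)[:, 1]
    c = len(synthetic) / len(combined)  # expected propensity if indistinguishable
    return float(np.mean((propensity - c) ** 2))
```

A pMSE of zero means the classifier cannot tell the two data sets apart; larger values indicate lower utility.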


Citations
Journal Article

Synthetic Data - what, why and how?

TL;DR: In this paper, the authors propose approaches for empirically evaluating synthetic data in terms of both its privacy and its utility, though their approach is limited in scope.
Proceedings Article

Synthetic and Private Smart Health Care Data Generation using GANs

TL;DR: In this paper, a GAN coupled with differential privacy mechanisms is proposed for generating a realistic and private smart health care dataset; the model generates differentially private synthetic data samples under two settings: learning from a noisy distribution or noising the learned distribution.
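The two settings named in this summary correspond to input perturbation (noise the training data before learning) and output perturbation (noise what was learned). Below is a toy sketch of the distinction using a trivial Gaussian "model" in place of a GAN; the noise scales are placeholders rather than calibrated privacy parameters, and this is not the paper's pipeline.

```python
# Toy contrast of the two settings: perturb the inputs before learning, or
# perturb what was learned. Noise scales here are placeholders; real DP
# calibrates them to sensitivity and the privacy budget.
import numpy as np

rng = np.random.default_rng(0)
records = rng.normal(loc=50.0, scale=10.0, size=1000)  # stand-in for health data

# Setting 1: learn from a noisy distribution (input perturbation).
noisy_records = records + rng.laplace(scale=2.0, size=records.shape)
mu1, sd1 = noisy_records.mean(), noisy_records.std()

# Setting 2: noise the learned distribution (output perturbation); here the
# "model" is just a fitted mean and standard deviation.
mu2 = records.mean() + rng.laplace(scale=2.0)
sd2 = abs(records.std() + rng.laplace(scale=2.0))

synthetic = rng.normal(mu2, sd2, size=1000)  # private synthetic draws
```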
Journal Article

Statistical Data Privacy: A Song of Privacy and Utility

TL;DR: The statistical foundations common to both statistical disclosure control (SDC) and differential privacy (DP) are discussed, major developments in statistical data privacy (SDP) are highlighted, and exciting open research problems in private inference are presented.
Proceedings Article

Archimedes Meets Privacy: On Privately Estimating Quantiles in High Dimensions Under Minimal Assumptions

TL;DR: This work shows how one can privately, with polynomially many samples, output an approximate interior point of the floating body (FB) and produce an approximate uniform sample from the FB by constructing a private noisy projection oracle, all under very mild distributional assumptions.
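For one-dimensional intuition about private quantile estimation (a far simpler setting than the high-dimensional one above), a standard textbook approach is the exponential mechanism over a discretized domain. The sketch below is illustrative, assuming a bounded domain [lo, hi] and a fixed data size; it is not the method of the paper.

```python
# Exponential mechanism for a private quantile in 1-D: score each candidate
# by how close its rank is to the target rank, then sample a candidate with
# probability proportional to exp(eps * score / 2).
import numpy as np

def private_quantile(data, q, epsilon, lo, hi, grid=1000, rng=None):
    rng = rng or np.random.default_rng()
    candidates = np.linspace(lo, hi, grid)
    ranks = np.searchsorted(np.sort(data), candidates)
    utility = -np.abs(ranks - q * len(data))  # sensitivity 1 if one record is swapped
    weights = np.exp(epsilon * (utility - utility.max()) / 2)  # stabilized softmax
    return rng.choice(candidates, p=weights / weights.sum())
```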
Book Chapter

An Incentive Mechanism for Trading Personal Data in Data Markets

TL;DR: In this article, a pricing mechanism that takes into account the trade-off between privacy and accuracy is proposed; it induces the data provider to report her privacy price accurately and is optimized to maximize the data consumer's profit within budget constraints.
References
Book Chapter

Calibrating noise to sensitivity in private data analysis

TL;DR: In this article, the authors show that for several particular applications substantially less noise is needed than was previously understood to be the case, and also prove separation results showing the increased value of interactive sanitization mechanisms over non-interactive ones.
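This is the reference that introduced the Laplace mechanism: add noise drawn from Lap(Δf/ε), where Δf is the query's global sensitivity (the most its answer can change when one record changes). A minimal sketch of that idea:

```python
# Minimal sketch of the Laplace mechanism: calibrate noise to the query's
# global sensitivity divided by the privacy budget epsilon.
import numpy as np

_rng = np.random.default_rng()

def laplace_mechanism(true_answer: float, sensitivity: float, epsilon: float) -> float:
    return true_answer + _rng.laplace(scale=sensitivity / epsilon)

# A counting query changes by at most 1 when one record changes, so it has
# sensitivity 1; smaller epsilon means more noise and stronger privacy.
ages = [34, 45, 29, 61, 50]
noisy_count = laplace_mechanism(sum(a > 40 for a in ages), sensitivity=1.0, epsilon=0.5)
```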
Proceedings Article

Deep Learning with Differential Privacy

TL;DR: In this paper, the authors develop new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy, and demonstrate that they can train deep neural networks with nonconvex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality.
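The core training technique of this reference (now widely known as DP-SGD) clips each per-example gradient, adds Gaussian noise to the clipped sum, and averages. Below is a framework-free sketch of one update step; the per-example gradients are assumed to be supplied by the caller.

```python
# One DP-SGD update step: clip each per-example gradient to L2 norm C, add
# Gaussian noise N(0, (sigma*C)^2) to the sum, average over the batch, step.
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm, noise_multiplier, lr, rng):
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    noise = rng.normal(scale=noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (np.sum(clipped, axis=0) + noise) / len(per_example_grads)
    return params - lr * noisy_mean
```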
Proceedings Article

Privacy integrated queries: an extensible platform for privacy-preserving data analysis

TL;DR: PINQ's unconditional structural guarantees require no trust placed in the expertise or diligence of the analysts, substantially broadening the scope for design and deployment of privacy-preserving data analysis, especially by non-experts.
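PINQ's key idea is that a trusted runtime, not the analyst, tracks the privacy budget consumed by each query. Below is a toy Python analogue of that budget accounting; PINQ itself is a C#/LINQ platform, so this is purely illustrative and not its API.

```python
# Toy analogue of PINQ-style budget accounting: the runtime tracks epsilon
# spent and refuses queries once the budget is exhausted.
import numpy as np

class PrivateDataset:
    def __init__(self, data, total_epsilon):
        self._data = np.asarray(data)
        self._remaining = total_epsilon
        self._rng = np.random.default_rng()

    def noisy_count(self, predicate, epsilon):
        if epsilon > self._remaining:
            raise RuntimeError("privacy budget exhausted")
        self._remaining -= epsilon  # sequential composition
        true_count = int(np.sum(predicate(self._data)))
        return true_count + self._rng.laplace(scale=1.0 / epsilon)  # sensitivity 1

ds = PrivateDataset([23, 41, 37, 58], total_epsilon=1.0)
print(ds.noisy_count(lambda x: x > 30, epsilon=0.5))  # leaves 0.5 of the budget
```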
Journal Article

Data Privacy: Effects on Customer and Firm Performance

TL;DR: In this article, a conceptual framework grounded in gossip theory is used to link customer vulnerability to negative performance effects and to show that transparency and control in firms' data management practices can suppress the negative effects of customer data vulnerability.