DPCube: Releasing Differentially Private Data Cubes for Health Information
Citations
A data- and workload-aware algorithm for range queries under differential privacy
The matrix mechanism: optimizing linear counting queries under differential privacy
Quantifying Differential Privacy under Temporal Correlations
DPPro: Differentially Private High-Dimensional Data Release via Random Projection
Differentially Private Synthesization of Multi-Dimensional Data using Copula Functions
References
Differential privacy: a survey of results
Privacy-preserving data publishing: A survey of recent developments
Privacy integrated queries: an extensible platform for privacy-preserving data analysis
A firm foundation for private data analysis
Improved histograms for selectivity estimation of range predicates
Frequently Asked Questions (11)
Q2. What is the basic idea of the technique?
The basic idea is to apply probabilistic inference to integrate multiple differentially private views (histograms) of the original data to derive posterior distributions over the data sets.
Q3. What is the key novelty of DPCube?
A cell-based partitioning, driven by the attribute domains rather than by the data, is used to generate a fine-grained equi-width cell histogram.
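A minimal sketch of such a cell histogram follows. It assumes numeric attributes, a count histogram with sensitivity 1 (one record added or removed), and Laplace noise with scale 1/epsilon. The function names, the per-dimension bin counts, and the clamping of out-of-domain values are illustrative choices, not details taken from the paper.

```python
import itertools
import random


def cell_counts(records, domains, bins):
    """Exact equi-width cell counts; cells come from the domains, not the data."""
    counts = {cell: 0 for cell in itertools.product(*(range(b) for b in bins))}
    for rec in records:
        cell = tuple(
            # Map each value to its equal-width bin and clamp to a valid index.
            max(0, min(int((v - lo) / (hi - lo) * b), b - 1))
            for v, (lo, hi), b in zip(rec, domains, bins)
        )
        counts[cell] += 1
    return counts


def laplace_noise(scale, rng):
    """The difference of two exponentials with mean `scale` is Laplace(0, scale)."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_cell_histogram(records, domains, bins, epsilon, seed=None):
    """Add Laplace(1/epsilon) noise to every cell (assumed sensitivity: 1)."""
    rng = random.Random(seed)
    exact = cell_counts(records, domains, bins)
    return {cell: c + laplace_noise(1.0 / epsilon, rng) for cell, c in exact.items()}
```

Because the cells are enumerated from the domains, empty cells also receive noise, so the set of released cells does not reveal where the data lie.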
Q4. What is the key step in DPCube?
A multi-dimensional partitioning is performed on Dc, the differentially private cell histogram, which gives an approximation of the original data distribution.
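The partitioning step can be sketched for a two-dimensional histogram as below. This is an illustration under stated assumptions, not the paper's algorithm: the sum of squared deviations from the block mean stands in for the variance-like metric, and the stopping threshold and function names are hypothetical.

```python
def sse(grid, r0, r1, c0, c1):
    """Sum of squared deviations from the mean over the block [r0, r1) x [c0, c1)."""
    vals = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals)


def partition(grid, r0, r1, c0, c1, threshold):
    """Recursively split a block until every part is close to uniform."""
    single_cell = (r1 - r0 == 1) and (c1 - c0 == 1)
    if single_cell or sse(grid, r0, r1, c0, c1) <= threshold:
        return [(r0, r1, c0, c1)]
    best = None
    # Try every axis-aligned cut and keep the one with the lowest total SSE.
    for r in range(r0 + 1, r1):
        cost = sse(grid, r0, r, c0, c1) + sse(grid, r, r1, c0, c1)
        if best is None or cost < best[0]:
            best = (cost, (r0, r, c0, c1), (r, r1, c0, c1))
    for c in range(c0 + 1, c1):
        cost = sse(grid, r0, r1, c0, c) + sse(grid, r0, r1, c, c1)
        if best is None or cost < best[0]:
            best = (cost, (r0, r1, c0, c), (r0, r1, c, c1))
    _, first, second = best
    return partition(grid, *first, threshold) + partition(grid, *second, threshold)
```

Unlike a kd-tree median split, the cut position here is chosen for uniformity of the resulting blocks, so the two sides may differ greatly in size.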
Q5. What is the privacy preserving data publishing layer?
The record publisher is used to publish individual records that are either de-identified (using the HIPAA Safe Harbor method) or anonymized under a given privacy principle such as k-anonymity or l-diversity.
Q6. What is the main goal of DPCube?
In contrast to kd-tree construction, which aims for a balanced tree, the main goal is to generate uniform or close-to-uniform partitions, so that the approximation error is minimized when answering a query whose predicates are smaller than the partitions.
Q7. What is the purpose of the demo?
Through the user interface, the conference audience can freely issue predicate queries using different parameter settings and observe the results.
Q8. How did the authors adapt the proportional estimation technique to their two-phase strategy?
In addition to proportional estimation using only the subcube histogram, which assumes a uniform distribution within each partition [18], the authors also adapted the inference technique in [10], originally designed for that work's hierarchical strategy, to their two-phase strategy.
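Proportional estimation under the uniformity assumption can be sketched as follows for two-dimensional, half-open blocks. The data layout and function names are assumptions made for illustration; the inference technique of [10] is not shown.

```python
def overlap(a0, a1, b0, b1):
    """Length of the intersection of the half-open intervals [a0, a1) and [b0, b1)."""
    return max(0, min(a1, b1) - max(a0, b0))


def estimate_range_count(partitions, query):
    """Estimate a range count from (block, noisy_count) pairs.

    Each block and the query are (r0, r1, c0, c1). A block contributes its
    count scaled by the fraction of its cells that fall inside the query.
    """
    qr0, qr1, qc0, qc1 = query
    total = 0.0
    for (r0, r1, c0, c1), count in partitions:
        area = (r1 - r0) * (c1 - c0)
        inside = overlap(r0, r1, qr0, qr1) * overlap(c0, c1, qc0, qc1)
        total += count * inside / area
    return total
```

The estimate is exact for a query that covers whole blocks. For a query that cuts through a block, the error depends on how far that block is from uniform, which is why the partitioning favors uniform partitions.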
Q9. What is the purpose of this paper?
The authors will demonstrate loading structured and unstructured data into HIDE, de-identifying and anonymizing the data, and releasing differentially private data cubes using DPCube.
Q10. How will the authors demonstrate the performance of HIDE and DPCube?
In addition, the authors will use a large set of synthesized pathology reports (1 million reports) generated at the Department of Pathology and Laboratory Medicine at the School of Medicine at UCLA to demonstrate the performance of HIDE and DPCube.
Q11. What is the difference between a kd-tree and DPCube?
In addition to the variance-like metric defined in [18], DPCube also implements information gain to favor uniform or homogeneous distributions, similar to the way decision tree construction uses it to favor class homogeneity.
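For reference, the decision-tree form of information gain that the answer alludes to is sketched below: the entropy of the parent's class labels minus the size-weighted entropy of the two children. How DPCube maps noisy cell counts onto this metric is not specified here, so the sketch shows only the standard decision-tree definition. It assumes non-empty label lists.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted
```

A split that separates the classes perfectly yields the highest gain, and a split whose children mirror the parent's mix yields a gain of zero.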