Journal Article

Secure, Privacy-Preserving Analysis of Distributed Databases.

TLDR
This article shows how tools from information technology—specifically, secure multiparty computation and networking—can be used to perform statistically valid analyses of distributed databases, and presents protocols for securely performing regression, maximum likelihood estimation, and Bayesian analysis.
Abstract
In industrial and government settings, there is often a need to perform statistical analyses that require data stored in multiple distributed databases. However, the barriers to literally integrating these data can be substantial, even insurmountable. In this article we show how tools from information technology—specifically, secure multiparty computation and networking—can be used to perform statistically valid analyses of distributed databases. The common characteristic of these methods is that the owners share sufficient statistics computed on the local databases in a way that protects each owner's data from the other owners. Our focus is on horizontally partitioned data, in which data records rather than attributes are spread among the databases. We present protocols for securely performing regression, maximum likelihood estimation, and Bayesian analysis, as well as secure construction of contingency tables. We outline three current research directions: a software system implementing the protocols, se...
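The idea of sharing only sufficient statistics can be illustrated for simple linear regression on horizontally partitioned data. The sketch below is an assumption-laden illustration, not the article's own protocol: it assumes a ring-style secure summation in which the first owner adds a random mask that it removes at the end, and it simulates all owners in one process. The names `MODULUS`, `SCALE`, `local_stats`, `secure_sum`, and `pooled_regression` are hypothetical.

```python
import secrets

MODULUS = 2**64  # assumed public modulus, far larger than any true sum
SCALE = 10**6    # assumed fixed-point scale for encoding reals as integers


def local_stats(xs, ys):
    """Sufficient statistics for y = a + b*x on one owner's records."""
    return [
        float(len(xs)),                      # n
        sum(xs),                             # sum of x
        sum(ys),                             # sum of y
        sum(x * x for x in xs),              # sum of x^2
        sum(x * y for x, y in zip(xs, ys)),  # sum of x*y
    ]


def encode(value):
    """Map a real number to a fixed-point integer modulo MODULUS."""
    return round(value * SCALE) % MODULUS


def decode(total):
    """Invert encode, treating the upper half of the range as negative."""
    if total >= MODULUS // 2:
        total -= MODULUS
    return total / SCALE


def secure_sum(values):
    """Sum one integer per owner; each hop sees only a masked running total."""
    mask = secrets.randbelow(MODULUS)         # known only to the first owner
    running = (mask + values[0]) % MODULUS    # first owner starts the ring
    for value in values[1:]:                  # each later owner adds its share
        running = (running + value) % MODULUS
    return (running - mask) % MODULUS         # first owner removes the mask


def pooled_regression(databases):
    """Fit y = a + b*x from securely summed statistics; returns (a, b)."""
    per_owner = [[encode(s) for s in local_stats(xs, ys)] for xs, ys in databases]
    n, sx, sy, sxx, sxy = (
        decode(secure_sum(list(column))) for column in zip(*per_owner)
    )
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Three owners hold disjoint records that all lie on y = 1 + 2x.
databases = [([0, 1], [1, 3]), ([2, 3], [5, 7]), ([4], [9])]
intercept, slope = pooled_regression(databases)
```

In a real deployment each step of the loop in `secure_sum` would be a network message between owners. This simple ring scheme also offers no protection if an owner's two neighbors collude, so it should be read as a toy model of the approach rather than a hardened implementation.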

Citations

Registries for Evaluating Patient Outcomes: A User's Guide

TL;DR: Information on routine medical care and practice, with more clinical context than ever before, is provided.
Proceedings Article

Privacy-Preserving Ridge Regression on Hundreds of Millions of Records

TL;DR: This work implements the complete system and experiments with it on real data-sets, and shows that it significantly outperforms pure implementations based only on homomorphic encryption or Yao circuits.
Proceedings Article

Privacy-preserving matrix factorization

TL;DR: This work shows that a recommender can profile items without ever learning the ratings users provide, or even which items they have rated, by designing a system that performs matrix factorization, a popular method used in a variety of modern recommendation systems, through a cryptographic technique known as garbled circuits.
Journal Article

DataSHIELD: resolving a conflict in contemporary bioscience—performing a pooled analysis of individual-level data without sharing the data

TL;DR: The aim of this conceptual article is to encourage others to address the challenges and opportunities that DataSHIELD presents, and to explore potential extensions, for example to its use when different data sources hold different data on the same individuals.