Journal ArticleDOI

Inference Controls for Statistical Databases

Denning, +1 more
01 Jul 1983 - Vol. 16, Iss. 7, pp. 69-82
TL;DR: This article surveys some of the controls for the inference problem in on-line, general-purpose database systems that allow both statistical and nonstatistical access, dividing them into two categories: those that place restrictions on the set of allowable queries and those that add "noise" to the data or to the released statistics.
Abstract
The goal of statistical databases is to provide frequencies, averages, and other statistics about groups of persons (or organizations), while protecting the privacy of the individuals represented in the database. This objective is difficult to achieve, since seemingly innocuous statistics contain small vestiges of the data used to compute them. By correlating enough statistics, sensitive data about an individual can be inferred. As a simple example, suppose there is only one female professor in an electrical engineering department. If statistics are released for the total salary of all professors in the department and the total salary of all male professors, the female professor's salary is easily obtained by subtraction. The problem of protecting against such indirect disclosures of sensitive data is called the inference problem. Over the last several decades, census agencies have developed many techniques for controlling inferences in population surveys. These techniques are applied before data are released so that the distributed data are free from disclosure problems. The data are typically released either in the form of microstatistics, which are files of "sanitized" records, or in the form of macrostatistics, which are tables of counts, sums, and higher-order statistics. Starting with a study by Hoffman and Miller, computer scientists began to look at the inference problem in on-line, general-purpose database systems allowing both statistical and nonstatistical access. A hospital database, for example, can give doctors direct access to a patient's medical records, while hospital administrators are permitted access only to statistical summaries of the records. Up until the late 1970s, most studies of the inference problem in these systems led to negative results; every conceivable control seemed to be easy to circumvent, to severely restrict the free flow of information, or to be intractable to implement. Recently, the results have become more positive, since we are now discovering controls that can potentially keep security and information loss at acceptable levels for a reasonable cost. This article surveys some of the controls that have been studied, comparing them with respect to their security, information loss, and cost. The controls are divided into two categories: those that place restrictions on the set of allowable queries and those that add "noise" to the data or to the released statistics. The controls are described and further classified within the context of a lattice model.
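
As a concrete illustration, here is a minimal Python sketch of the subtraction attack described in the example above; the records and salary figures are invented for demonstration:

    # Toy statistical database; names and salaries are made up.
    records = [
        {"name": "A", "sex": "M", "salary": 90000},
        {"name": "B", "sex": "M", "salary": 85000},
        {"name": "C", "sex": "F", "salary": 95000},  # the only female professor
    ]

    def query_sum(predicate):
        """A released statistic: the sum of salaries over matching records."""
        return sum(r["salary"] for r in records if predicate(r))

    total_all = query_sum(lambda r: True)               # total for all professors
    total_male = query_sum(lambda r: r["sex"] == "M")   # total for male professors

    # Neither statistic names an individual, yet their difference is exactly
    # the one female professor's salary:
    print(total_all - total_male)  # prints 95000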


Citations
Proceedings ArticleDOI

Dynamic inference control in privacy preference enforcement

TL;DR: This paper proposes using dynamic Bayesian networks to track adversaries' most current beliefs about dynamic domains, in order to evaluate which contexts in those domains can be released safely in various situations.
Book ChapterDOI

A Theoretically-Sound Accuracy/Privacy-Constrained Framework for Computing Privacy Preserving Data Cubes in OLAP Environments

TL;DR: A theoretically-sound accuracy/privacy-constrained framework for computing privacy-preserving data cubes in OLAP environments is proposed; the approach is designed to be efficient and scalable by moving beyond a purely algorithmic view of the privacy-preserving OLAP problem.
Journal ArticleDOI

Security of statistical databases with an output perturbation technique

TL;DR: This paper proposes a new type of output perturbation method that may be very difficult to compromise and that provides unbiased responses.
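
The paper's specific method is not detailed in this summary; as a generic sketch of the output-perturbation idea, the following adds zero-mean Gaussian noise to each released sum, so responses are unbiased in expectation (the Gaussian noise and its scale are assumptions for illustration, not the paper's scheme):

    import random

    def perturbed_sum(values, noise_scale=1000.0):
        """Generic output-perturbation sketch (not the paper's exact method):
        release the true sum plus zero-mean noise, so the response is unbiased
        in expectation but never reveals the exact statistic."""
        return sum(values) + random.gauss(0.0, noise_scale)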
Book ChapterDOI

Inference Control in Data Integration Systems

TL;DR: This paper proposes an approach that lets the security administrator derive the sets of queries whose combined results could lead to security breaches, and then detects the additional rules needed to extend the mediator's policy in order to block those breaches.
Posted Content

Private Disclosure of Information in Health Tele-monitoring

TL;DR: This paper shows cases where it is possible to achieve perfect privacy regardless of the adversary's auxiliary knowledge while preserving the full utility of the information to the intended recipient, and provides sufficient conditions for such cases.
References
Book

Cryptography and data security

TL;DR: The goal of this book is to introduce the mathematical principles of data security and to show how these principles apply to operating systems, database systems, and computer networks.
Journal ArticleDOI

Data-swapping: A technique for disclosure control

TL;DR: Data-swapping is a data transformation technique that preserves the underlying statistics of the data; it can be used as a basis for microdata release or to justify the release of tabulations.
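
As a rough sketch of the general idea (not the paper's exact algorithm), the following swaps a sensitive attribute among records that agree on the released key attributes, which leaves every tabulation of the sensitive attribute by those key attributes unchanged:

    import random

    def swap_sensitive(records, key_attrs, sensitive):
        """Toy data-swapping sketch (not the paper's exact algorithm): shuffle
        the sensitive attribute among records that share the same values on
        key_attrs.  Tabulations of the sensitive attribute by key_attrs are
        preserved, but record-level linkage to the sensitive value is broken."""
        groups = {}
        for r in records:
            groups.setdefault(tuple(r[a] for a in key_attrs), []).append(r)
        for group in groups.values():
            values = [r[sensitive] for r in group]
            random.shuffle(values)
            for r, v in zip(group, values):
                r[sensitive] = v
        return records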
Journal ArticleDOI

Suppression Methodology and Statistical Disclosure Control

TL;DR: In this paper, the authors discuss the theory and methods of complementary cell suppression and related topics in statistical disclosure control, focusing on the development of methods that are theoretically broad but also practical to implement.
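
As a toy illustration of the idea only (the threshold, the complementary-cell choice, and the restriction to rows below are assumptions; real methodology must also protect column totals), primary suppression hides small cells, and complementary suppression hides an additional cell in any row where a lone suppressed value could be recovered from a published row total:

    def suppress(table, threshold=5):
        """Toy cell-suppression sketch (rows only; not the paper's methodology).
        Primary suppression hides cells below the threshold; complementary
        suppression then hides one more cell in any row with a single
        suppressed cell, so the hidden value cannot be recovered from the
        published row total."""
        masked = [[v if v >= threshold else None for v in row] for row in table]
        for row in masked:
            hidden = [i for i, v in enumerate(row) if v is None]
            visible = [i for i, v in enumerate(row) if v is not None]
            if len(hidden) == 1 and visible:
                row[min(visible, key=lambda i: row[i])] = None  # complementary cell
        return masked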
Journal ArticleDOI

Secure databases: protection against user influence

TL;DR: Users may be able to compromise a database by asking a series of questions and then inferring new information from the answers; the complexity of protecting a database against this technique is discussed here.
Journal ArticleDOI

Secure statistical databases with random sample queries

TL;DR: A new inference control, called random sample queries, is proposed for safeguarding confidential data in on-line statistical databases; it deals directly with the basic principle of compromise by making it impossible for a questioner to control precisely the formation of query sets.
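
As an illustrative sketch of the idea (the sampling rate and the unbiased scaling are assumptions for demonstration, not the paper's exact rules): the system, rather than the questioner, draws a random subset of each query set, so the questioner can never control precisely which records contribute to an answer:

    import random

    def sampled_count(records, predicate, p=0.9):
        """Random-sample-queries sketch (illustrative only): count a random
        subset of the query set, drawn by the system with inclusion
        probability p, and scale by 1/p for an unbiased estimate."""
        sample = [r for r in records if predicate(r) and random.random() < p]
        return len(sample) / p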