Open Access, Proceedings Article

Evaluating probabilistic queries over imprecise data

TLDR
This paper addresses the important issue of measuring the quality of the answers to query evaluation based upon uncertain data, and provides algorithms for efficiently pulling data from relevant sensors or moving objects in order to improve the quality of the executing queries.
Abstract
Many applications employ sensors for monitoring entities such as temperature and wind speed. A centralized database tracks these entities to enable query processing. Due to continuous changes in these values and limited resources (e.g., network bandwidth and battery power), it is often infeasible to store the exact values at all times. A similar situation exists for moving object environments that track the constantly changing locations of objects. In this environment, it is possible for database queries to produce incorrect or invalid results based upon old data. However, if the degree of error (or uncertainty) between the actual value and the database value is controlled, one can place more confidence in the answers to queries. More generally, query answers can be augmented with probabilistic estimates of the validity of the answers. In this paper we study probabilistic query evaluation based upon uncertain data. A classification of queries is made based upon the nature of the result set. For each class, we develop algorithms for computing probabilistic answers. We address the important issue of measuring the quality of the answers to these queries, and provide algorithms for efficiently pulling data from relevant sensors or moving objects in order to improve the quality of the executing queries. Extensive experiments are performed to examine the effectiveness of several data update policies.
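To make the idea of augmenting query answers with probabilistic estimates concrete, here is a minimal sketch of a probabilistic range query. It is an illustration only, not the paper's algorithm: it assumes each sensor value is known to lie in a bounded uncertainty interval and is uniformly distributed within it. The names `UncertainValue`, `range_probability`, and `probabilistic_range_query` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UncertainValue:
    """A database value known only to lie within [lower, upper] (assumed model)."""
    lower: float
    upper: float


def range_probability(v: UncertainValue, lo: float, hi: float) -> float:
    """Probability that the true value lies in [lo, hi], assuming a uniform pdf."""
    width = v.upper - v.lower
    if width <= 0:
        # Degenerate interval: the value is known exactly.
        return 1.0 if lo <= v.lower <= hi else 0.0
    # Length of the overlap between the uncertainty interval and the query range.
    overlap = min(v.upper, hi) - max(v.lower, lo)
    return max(0.0, overlap) / width


def probabilistic_range_query(readings: dict, lo: float, hi: float) -> dict:
    """Return each object with a non-zero probability of satisfying the query."""
    result = {}
    for name, value in readings.items():
        p = range_probability(value, lo, hi)
        if p > 0:
            result[name] = p
    return result
```

Under these assumptions, a sensor whose interval is [20, 30] satisfies the query "value between 25 and 35" with probability 0.5, while a sensor whose interval is [40, 50] is excluded from the answer. A narrower uncertainty interval that falls entirely inside or outside the query range yields a probability of 1 or 0, which is why refreshing stale readings can sharpen the answer.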


Citations
Book Chapter

Model-driven data acquisition in sensor networks

TL;DR: This paper enriches interactive sensor querying with statistical modeling techniques, and demonstrates that such models can help provide answers that are both more meaningful and, by introducing approximations with probabilistic confidences, significantly more efficient to compute in both time and energy.
Journal Article

Efficient query evaluation on probabilistic databases

TL;DR: It is shown that the data complexity of some queries is #P-complete, which implies that these queries do not admit any efficient evaluation methods, and an optimization algorithm is described that can efficiently compute most queries.
Journal Article

Protecting Location Privacy with Personalized k-Anonymity: Architecture and Algorithms

TL;DR: A scalable architecture for protecting the location privacy from various privacy threats resulting from uncontrolled usage of LBSs is described, including the development of a personalized location anonymization model and a suite of location perturbation algorithms.
Journal Article

Information Extraction

TL;DR: A taxonomy of the field is created along various dimensions derived from the nature of the extraction task, the techniques used for extraction, the variety of input resources exploited, and the type of output produced to survey techniques for optimizing the various steps in an information extraction pipeline.
Proceedings Article

Trio: A System for Integrated Management of Data, Accuracy, and Lineage

TL;DR: This paper provides numerous motivating applications for Trio and lays out preliminary plans for the data model, query language, and prototype system.
References
Journal Article

The mathematical theory of communication

TL;DR: The Mathematical Theory of Communication (MTOC) as discussed by the authors was originally published as a paper on communication theory more than fifty years ago and has since gone through four hardcover and sixteen paperback printings.
Journal Article

The Mathematical Theory of Communication

TL;DR: The theory of communication is extended to include a number of new factors, in particular the effect of noise in the channel, and the savings possible due to the statistical structure of the original message and due to the nature of the final destination of the information.
Proceedings Article

New sampling-based summary statistics for improving approximate query answers

TL;DR: This paper introduces two new sampling-based summary statistics, concise samples and counting samples, and presents new techniques for their fast incremental maintenance regardless of the data distribution, and considers their application to providing fast approximate answers to hot list queries.
Journal Article

Updating and Querying Databases that Track Mobile Units

TL;DR: The update problem is to determine when the location of a moving object in the database (namely its database location) should be updated, and an information cost model is proposed that captures uncertainty, deviation, and communication.
Journal Article

Querying imprecise data in moving object environments

TL;DR: Algorithms for computing these queries are presented for a generic object movement model and detailed solutions are discussed for two common models of uncertainty in moving object databases.