Evaluating probabilistic queries over imprecise data
Reynold Cheng, Dmitri V. Kalashnikov, Sunil Prabhakar
- pp 551-562
TLDR
This paper addresses the important issue of measuring the quality of answers to queries evaluated over uncertain data, and provides algorithms for efficiently pulling data from relevant sensors or moving objects in order to improve the quality of the executing queries.
Abstract:
Many applications employ sensors for monitoring entities such as temperature and wind speed. A centralized database tracks these entities to enable query processing. Due to continuous changes in these values and limited resources (e.g., network bandwidth and battery power), it is often infeasible to store the exact values at all times. A similar situation exists for moving object environments that track the constantly changing locations of objects. In this environment, it is possible for database queries to produce incorrect or invalid results based upon old data. However, if the degree of error (or uncertainty) between the actual value and the database value is controlled, one can place more confidence in the answers to queries. More generally, query answers can be augmented with probabilistic estimates of the validity of the answers. In this paper we study probabilistic query evaluation based upon uncertain data. A classification of queries is made based upon the nature of the result set. For each class, we develop algorithms for computing probabilistic answers. We address the important issue of measuring the quality of the answers to these queries, and provide algorithms for efficiently pulling data from relevant sensors or moving objects in order to improve the quality of the executing queries. Extensive experiments are performed to examine the effectiveness of several data update policies.
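To make the idea of a query answer "augmented with probabilistic estimates" concrete, the following is a minimal sketch of a probabilistic range query. It is not the paper's algorithm: it assumes each stored value is only known to lie in a bounded interval and that the true value is uniformly distributed over that interval. The names `UncertainValue`, `range_probability`, and `probabilistic_range_query` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainValue:
    """A reading known only to lie in [low, high] (hypothetical type)."""
    name: str
    low: float
    high: float


def range_probability(item, query_low, query_high):
    """Probability that the true value falls in [query_low, query_high].

    Assumption: the true value is uniform over [item.low, item.high],
    so the probability is the overlap length divided by the interval width.
    """
    width = item.high - item.low
    if width <= 0:
        # Degenerate interval: the value is known exactly.
        return 1.0 if query_low <= item.low <= query_high else 0.0
    overlap = min(item.high, query_high) - max(item.low, query_low)
    return max(0.0, overlap) / width


def probabilistic_range_query(items, query_low, query_high):
    """Return {name: probability} for items with non-zero probability."""
    result = {}
    for item in items:
        p = range_probability(item, query_low, query_high)
        if p > 0:
            result[item.name] = p
    return result


# Illustrative data only: three temperature sensors with uncertainty intervals.
sensors = [
    UncertainValue("s1", 18.0, 22.0),
    UncertainValue("s2", 25.0, 27.0),
    UncertainValue("s3", 30.0, 34.0),
]
answer = probabilistic_range_query(sensors, 20.0, 26.0)
print(answer)
```

Under these assumptions, a sensor whose interval lies entirely outside the query range is dropped from the answer, while one that partially overlaps is returned with a probability below 1. Narrowing an interval (for example by pulling a fresh reading) pushes that probability toward 0 or 1, which is the intuition behind updating data to improve answer quality.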
Citations
Book Chapter
Model-driven data acquisition in sensor networks
TL;DR: This paper enriches interactive sensor querying with statistical modeling techniques, and demonstrates that such models can help provide answers that are both more meaningful and, by introducing approximations with probabilistic confidences, significantly more efficient to compute in both time and energy.
Journal Article
Efficient query evaluation on probabilistic databases
Nilesh Dalvi, Dan Suciu
TL;DR: It is shown that the data complexity of some queries is #P-complete, which implies that these queries do not admit any efficient evaluation methods, and an optimization algorithm is described that can compute efficiently most queries.
Journal Article
Protecting Location Privacy with Personalized k-Anonymity: Architecture and Algorithms
Bugra Gedik, Ling Liu
TL;DR: A scalable architecture for protecting the location privacy from various privacy threats resulting from uncontrolled usage of LBSs is described, including the development of a personalized location anonymization model and a suite of location perturbation algorithms.
Journal Article
Information Extraction
TL;DR: A taxonomy of the field is created along various dimensions derived from the nature of the extraction task, the techniques used for extraction, the variety of input resources exploited, and the type of output produced to survey techniques for optimizing the various steps in an information extraction pipeline.
Proceedings Article
Trio: A System for Integrated Management of Data, Accuracy, and Lineage
TL;DR: This paper provides numerous motivating applications for Trio and lays out preliminary plans for the data model, query language, and prototype system.
References
Journal Article
The mathematical theory of communication
Claude E. Shannon, Warren Weaver
TL;DR: The Mathematical Theory of Communication (MTOC) as discussed by the authors was originally published as a paper on communication theory more than fifty years ago and has since gone through four hardcover and sixteen paperback printings.
Journal Article
The Mathematical Theory of Communication
TL;DR: The theory of communication is extended to include a number of new factors, in particular the effect of noise in the channel, and the savings possible due to the statistical structure of the original message and due to the nature of the final destination of the information.
Proceedings Article
New sampling-based summary statistics for improving approximate query answers
Phillip B. Gibbons, Yossi Matias
TL;DR: This paper introduces two new sampling-based summary statistics, concise samples and counting samples, and presents new techniques for their fast incremental maintenance regardless of the data distribution, and considers their application to providing fast approximate answers to hot list queries.
Journal Article
Updating and Querying Databases that Track Mobile Units
TL;DR: The update problem is to determine when the location of a moving object in the database (namely its database location) should be updated, and an information cost model is proposed that captures uncertainty, deviation, and communication.
Journal Article
Querying imprecise data in moving object environments
TL;DR: Algorithms for computing these queries are presented for a generic object movement model and detailed solutions are discussed for two common models of uncertainty in moving object databases.