Showing papers by "Reynold Cheng" published in 2004


Book Chapter
31 Aug 2004
TL;DR: This paper develops two index structures and associated algorithms to efficiently answer Probabilistic Threshold Queries (PTQs). By mapping one-dimensional intervals to a two-dimensional space, it shows that indexing intervals with probabilities is significantly harder than plain interval indexing, which is considered a well-studied problem.
Abstract: It is infeasible for a sensor database to contain the exact value of each sensor at all points in time. This uncertainty is inherent in these systems due to measurement and sampling errors, and resource limitations. In order to avoid drawing erroneous conclusions based upon stale data, the use of uncertainty intervals that model each data item as a range and an associated probability density function (pdf) rather than a single value has recently been proposed. Querying these uncertain data introduces imprecision into answers, in the form of probability values that specify the likelihood that the answer satisfies the query. These queries are more expensive to evaluate than their traditional counterparts but are guaranteed to be correct and more informative due to the probabilities accompanying the answers. Although the answer probabilities are useful, for many applications it is only necessary to know whether the probability exceeds a given threshold; we term these Probabilistic Threshold Queries (PTQs). In this paper we address the efficient computation of these types of queries. In particular, we develop two index structures and associated algorithms to efficiently answer PTQs. The first index scheme is based on the idea of augmenting uncertainty information to an R-tree. We establish the difficulty of this problem by mapping one-dimensional intervals to a two-dimensional space, and show that the problem of interval indexing with probabilities is significantly harder than interval indexing, which is considered a well-studied problem. To overcome the limitations of this R-tree based structure, we apply a technique we call variance-based clustering, where data points with similar degrees of uncertainty are clustered together. Our extensive index structure can answer the queries for various kinds of uncertainty pdfs, in an almost optimal sense. We conduct experiments to validate the superior performance of both indexing schemes.
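
To make the query semantics concrete, the following is a minimal sketch of a probabilistic threshold range query, not the paper's index structures. It assumes each item's pdf is uniform over its uncertainty interval and answers the query with a plain linear scan; the names `UncertainItem`, `range_probability`, and `threshold_query` are hypothetical, and treating a probability equal to the threshold as qualifying is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainItem:
    """A data item modeled as an uncertainty interval [lo, hi].

    Assumption for this sketch: the value is uniformly distributed
    over the interval. The abstract allows other pdfs as well.
    """
    item_id: str
    lo: float
    hi: float


def range_probability(item, a, b):
    """Probability that the item's value lies in the query range [a, b]."""
    width = item.hi - item.lo
    if width <= 0:
        # Degenerate interval: the value is known exactly.
        return 1.0 if a <= item.lo <= b else 0.0
    # Length of the overlap between [lo, hi] and [a, b], clamped at zero.
    overlap = min(item.hi, b) - max(item.lo, a)
    return max(0.0, overlap) / width


def threshold_query(items, a, b, p):
    """Return ids of items whose probability of lying in [a, b] is at least p.

    This is a linear scan over all items; the indexes described in the
    abstract exist precisely to avoid examining every item.
    """
    return [it.item_id for it in items if range_probability(it, a, b) >= p]


items = [
    UncertainItem("s1", 10.0, 20.0),
    UncertainItem("s2", 18.0, 30.0),
    UncertainItem("s3", 0.0, 5.0),
]
print(threshold_query(items, 15.0, 25.0, 0.5))  # ['s1', 's2']
```

Note that the query returns only item identifiers, not the probabilities themselves, which is what distinguishes a threshold query from a query that reports every answer probability.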

305 citations


Proceedings Article
15 Oct 2004
TL;DR: This paper proposes a simple method for selecting an appropriate set of sensors that provides reliable answers, minimizing the sensor data aggregation workload in a network environment while meeting the probabilistic requirement of the continuous probabilistic query (CPQ).
Abstract: Due to the error-prone properties of sensors, it is important to use multiple low-cost sensors to improve the reliability of query results. However, using multiple sensors to generate the value for a data item can be expensive, especially in wireless environments where continuous queries are executed. Further, we need to distinguish effectively which sensors are not working properly and exclude them from use. In this paper, we propose a probabilistic approach to decide which sensor nodes should be used to answer a query. In particular, we propose to solve the problem with the aid of a continuous probabilistic query (CPQ), which was originally used to manage uncertain data and is associated with a probabilistic guarantee on the query result. Based on the historical data values from the sensor nodes, the query type, and the probabilistic requirement on the query result, we derive a simple method to select an appropriate set of sensors to provide reliable answers. We examine a wide range of common aggregate queries: average, sum, minimum, maximum, and range count queries, but we believe our method can be extended to other query types. Our goal is to minimize sensor data aggregation workload in a network environment and at the same time meet the probabilistic requirement of the CPQ.
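
The abstract does not spell out the selection method, so the following is only an illustrative sketch of the general idea for an average query, not the authors' algorithm. It assumes that sensor errors are independent and roughly normal, that each sensor's noise variance can be estimated from its historical readings of a quantity that was constant over that history, and that the probabilistic requirement is "the averaged reading is within `tolerance` of the true value with probability at least `p`". The function `select_sensors` and its parameters are hypothetical.

```python
import math
from statistics import pvariance


def select_sensors(history, tolerance, p):
    """Pick a small sensor set whose averaged reading meets a probability target.

    history: dict mapping sensor id -> list of past readings (assumed to be
             noisy observations of the same constant value).
    Returns the chosen sensor ids, or None if no prefix of the ranking
    satisfies the requirement.
    """
    # Rank sensors from least to most noisy according to their history.
    ranked = sorted(history, key=lambda s: pvariance(history[s]))
    chosen = []
    var_sum = 0.0
    for sensor in ranked:
        chosen.append(sensor)
        var_sum += pvariance(history[sensor])
        # Std. deviation of the plain average of k independent readings.
        sd = math.sqrt(var_sum) / len(chosen)
        # P(|error| <= tolerance) for a zero-mean normal error.
        if sd == 0:
            confidence = 1.0
        else:
            confidence = math.erf(tolerance / (math.sqrt(2) * sd))
        if confidence >= p:
            return chosen
    return None


# Five equally noisy sensors; the loop stops as soon as the target is met.
history = {f"s{i}": [20.0, 21.0, 19.0, 20.0] for i in range(1, 6)}
print(select_sensors(history, tolerance=0.5, p=0.8))
```

Stopping at the first set that meets the target reflects the stated goal of keeping the aggregation workload low: sensors beyond that point add network cost without being needed for the probabilistic requirement. A sensor with a much larger variance than the rest is ranked last and is therefore unlikely to be selected, which loosely mirrors the need to exclude sensors that are not working properly.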

18 citations