Retrieval of Spatial Join Pattern Instances from Sensor Networks
Man Lung Yiu
Department of Computer Science
Aalborg University
DK-9220 Aalborg, Denmark
mly@cs.aau.dk
Nikos Mamoulis
Department of Computer Science
University of Hong Kong
Pokfulam Road, Hong Kong
nikos@cs.hku.hk
Spiridon Bakiras
Dept. of Math. and Comp. Science
John Jay College
City University of New York
sbakiras@jjay.cuny.edu
Abstract
We study the continuous evaluation of spatial join queries and extensions thereof, defined by
interesting combinations of sensor readings (events) that co-occur in a spatial neighborhood.
An example of such a pattern is “a high temperature reading in the vicinity of at least four high-
pressure readings”. We devise protocols for ‘in-network’ evaluation of this class of queries,
aiming at the minimization of power consumption. In addition, we develop cost models that
suggest the appropriateness of each protocol, based on various factors, including selectivity of
query elements, energy requirements for sensing, and network topology. Finally, we experi-
mentally compare the effectiveness of the proposed solutions on an experimental platform that
emulates real sensor networks.
Work supported by grant HKU 7155/06E from Hong Kong RGC.
A preliminary version of this work appeared in [25], available at http://www.cs.aau.dk/mly/ssdbm07 senpat.pdf
1 Introduction
Advances in computer hardware have made available small and relatively cheap devices that form
a powerful network, which interacts with and collects information from the environment where it
is deployed [27]. Sensor networks have several applications, including environmental monitoring
[15, 13], control/maintenance of industrial infrastructure [1], military applications [20], structural
monitoring [17], etc. Recently, the problem of evaluating queries over a sensor network has at-
tracted significant research interest from the database community, leading to the development of
two research DBMS prototypes [24, 14]. These systems provide the user with an interface through which
queries are expressed in a declarative way; the user need not deal with how queries are evaluated.
Suitable extensions of SQL were proposed with clauses that consider the special features of sen-
sor networks. These features include the transient, on-demand nature of sampled data, extended
lifetime of continuous (non-transient) queries, sampling rate or compression of sensor readings,
event-triggered queries, etc.
The main focus of existing work on sensor networks has been the minimization of power consump-
tion at sensor nodes, during query evaluation. Sensors are usually battery-operated and they are
often deployed in hostile environments or rough terrains, where the network runs unsupervised for
long time intervals. Thus, power is of utmost importance, since it is directly related to the longevity
of the network. Previously studied topics include the energy-efficient retrieval of aggregations or
data summaries [13, 5, 3, 7, 6, 19], the derivation and maintenance of data models that describe
the data distribution [8, 4], and the optimal in-network placement of operators or filter predicates
on the sensed values [14, 2, 1, 22]. To our knowledge, there is no prior work on the in-network evaluation
of queries that spatially correlate measurements from different sensors. An example of such
a query (taken from [3]) is “generate a notification whenever two sensors within 5 yards from each
other simultaneously measure an abnormal temperature”. A spatial pattern query retrieves sets
of sensors (pairs in this example) whose readings satisfy some selection predicates (e.g., abnormal
temperatures) and whose locations satisfy some pairwise distance predicates (e.g., within five
yards). Data analysts may be interested in the on-line identification of pattern instances that occur
rarely in the environments where sensors are deployed and may indicate exceptional events. For
instance, an unusually high temperature detected in the vicinity of multiple low-humidity readings
may indicate a high chance of a fire outbreak in the local area where the pattern is detected. Another
application of spatial pattern queries is the prediction of weather phenomena based on spatial
combinations of sensor readings.
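As an illustration only, the example query taken from [3] could be checked over a collected set of readings as in the following Python sketch. The readings, the abnormality threshold of 40, the inclusive distance test, and all function names are hypothetical assumptions made for this sketch; they are not taken from the paper.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available in Python 3.8+

# Hypothetical readings: (sensor_id, (x, y) position in yards, temperature).
readings = [
    ("s1", (0.0, 0.0), 41.0),
    ("s2", (3.0, 3.0), 43.5),   # abnormal and about 4.24 yards from s1
    ("s3", (20.0, 0.0), 44.0),  # abnormal, but far from the others
    ("s4", (1.0, 1.0), 21.0),   # close to s1, but normal
]

def is_abnormal(temp, threshold=40.0):
    """Selection predicate (assumed): a reading above the threshold is abnormal."""
    return temp > threshold

def spatial_join_pairs(readings, max_dist=5.0):
    """Return id pairs whose readings are both abnormal and lie within max_dist."""
    # Apply the selection predicate first, then the pairwise distance predicate.
    selected = [(sid, pos) for sid, pos, temp in readings if is_abnormal(temp)]
    return [
        (a, b)
        for (a, pa), (b, pb) in combinations(selected, 2)
        if dist(pa, pb) <= max_dist
    ]

print(spatial_join_pairs(readings))  # [('s1', 's2')]
```

Only s1 and s2 qualify: s3 satisfies the selection predicate but not the distance predicate, while s4 satisfies the distance predicate but not the selection predicate.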
A straightforward way to evaluate spatial pattern queries is to program the sensors to transmit
their readings together with their locations to a central basestation (via a routing tree [10, 14]),
where their spatial associations are validated. Although this approach is easy to implement, it may
waste more energy than necessary, as sensor readings that are not part of query results may be
sent all the way up to the root. Motivated by the lack of effective evaluation protocols for spatial
pattern queries, in this paper, we study this problem in depth, focusing on (i) filtering techniques
for readings that do not participate in the result, and (ii) in-network computation of query results. We
propose optimized evaluation protocols for binary spatial joins and more complex query patterns
and compare them for different problem parameters. Our solutions are orthogonal to snapshot-
based schemes (e.g., [11]), which apply query evaluation only to a small (self-maintained) sample
of the network and to techniques that summarize sensor readings over long time intervals before
applying query evaluation on them (e.g., [6]). The contributions of this paper can be summarized
as follows:
• We identify the interesting class of spatial pattern queries. We formally define them and
  discuss how they can be expressed using the language extensions of [14].
• We propose energy-efficient protocols for in-network evaluation of spatial pattern queries.
  In addition, we provide cost models which can be used by a query optimizer to determine a
  suitable evaluation method based on query parameters and data statistics.
• We experimentally evaluate the effectiveness of the proposed techniques by tuning various
  parameters, including query selectivity, network size, topology and density, sampling cost,
  etc.
The remainder of the paper is organized as follows. Section 2 reviews related work. Section 3
formally defines spatial pattern queries. In Section 4, we describe in detail the proposed solutions,
and analyze their costs in Section 5. Section 6 discusses the evaluation of variants and extensions
of pattern queries, as well as advanced issues, like multiple query evaluation. Section 7 experimen-
tally demonstrates the applicability and efficiency of our techniques. Finally, Section 8 concludes
the paper.
2 Background and Related Work
The special characteristics of a sensor network compared to a generic wireless network are (i)
the limited resources of nodes (energy, communication range, network bandwidth and capacity),
(ii) unreliable communication with high packet loss rates and frequent node failures, and (iii)
unsupervised operation, with nodes placed in hostile environments (e.g., remote areas, battlefields).
Thus, query evaluation techniques for sensor networks aim at minimizing the energy cost, subject
to the constraints of the network (e.g., communication range, maximum data volume that can be
sent by a node in a cycle). In addition, sensor networks are inherently redundant (i.e., dense),
in order to keep the network connected after node failures and increase the reliability of sensed
information.
Query evaluation in sensor networks is performed in two steps [10, 24, 14]. Suppose that the query
should collect the readings from all sensors. The query is registered at a basestation, which is
connected to a root node r. In the first step, the query is disseminated to the sensors, and a spanning
tree of the network, rooted at r, is dynamically constructed. If a node receives the query for the
first time, it selects one of the senders as its parent in the tree and broadcasts the query. Otherwise,
the message is ignored. In the second step, the resulting communication (or routing) tree is used to
route sensor readings related to the query up to the basestation. Delivery of sensor readings (or query results) to
the root is performed in multiple phases. During a specific phase, one level of the tree sends and the
level above listens and receives the information addressed to it. Finally, the root collects all readings
and sends them to the basestation.
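The two-step scheme described above can be sketched in Python as follows, purely for illustration. The sketch assumes an idealized, loss-free network in which the query propagates hop by hop in lockstep, so that flooding behaves like a breadth-first traversal. The example graph, the readings, the function names, and the simple transmission count (one transmission per reading per hop, with no merging of packets) are hypothetical assumptions, not part of the cited systems.

```python
from collections import deque

def build_routing_tree(neighbors, root):
    """Flood the query from root; each node adopts the first sender it hears as parent."""
    parent = {root: None}
    level = {root: 0}
    queue = deque([root])
    while queue:
        sender = queue.popleft()
        for node in neighbors[sender]:
            if node not in parent:           # the query is received for the first time
                parent[node] = sender
                level[node] = level[sender] + 1
                queue.append(node)           # the node re-broadcasts the query
            # otherwise the duplicate message is ignored
    return parent, level

def collect_readings(parent, level, readings):
    """Forward readings phase by phase, deepest level first, up to the root."""
    inbox = {node: [(node, readings[node])] for node in parent}
    root = None
    for node in sorted(parent, key=level.get, reverse=True):
        if parent[node] is None:
            root = node                      # the root keeps everything it received
        else:
            inbox[parent[node]].extend(inbox[node])
    return sorted(inbox[root])

def forwarding_cost(level):
    """Assumed cost measure: each reading is transmitted once per hop to the root."""
    return sum(level.values())

# Hypothetical four-node network; "r" is the root attached to the basestation.
neighbors = {"r": ["a", "b"], "a": ["r", "b", "c"], "b": ["r", "a", "c"], "c": ["a", "b"]}
readings = {"r": 20, "a": 21, "b": 22, "c": 23}

parent, level = build_routing_tree(neighbors, "r")
print(collect_readings(parent, level, readings))  # [('a', 21), ('b', 22), ('c', 23), ('r', 20)]
print(forwarding_cost(level))                     # 4
```

In a real deployment, which sender a node hears first depends on timing and link quality, so the resulting tree need not be a breadth-first tree.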
Queries over sensor networks are usually continuous, i.e., they remain active for a lengthy time
interval (e.g., minutes, hours). Otherwise, the cost of disseminating the query may not be amortized.
Frequent instantaneous queries are best processed if the network operates in a push-based
manner; sensors periodically and unconditionally collect measurements and route them to a bases-
tation, where queries are registered and evaluated as queries over streaming data. For example,
in the work of [9], efficient algorithms are developed for processing continuous constraint queries
at a centralized basestation, without considering communication cost in the underlying infrastruc-
ture (e.g., sensor network). In this paper, we exploit in-network evaluation techniques in order to
minimize power consumption of the sensor network for processing continuous queries. Next, we
review work on (continuous) query evaluation on sensor networks.
2.1 Aggregation and summarization
Madden et al. [13] proposed a simple, but powerful protocol for computing common aggregate
functions (e.g., count, sum, max, min). Each sensor combines the information received from its
children with its own measurement to derive and send data of constant size, capturing a partial
computation of the aggregate function. In [5], a multi-path algorithm for computing aggregates is
presented to reduce communication errors as multiple parents may hear and aggregate the infor-
mation broadcast by a single child. [16] proposes a hybrid method that combines the tree topology
of [13] with the ring network topology of [5]. In addition, [7] describes a method for pushing error
tolerance into network nodes, in order to avoid sending information if the aggregate is within some
error bound. The problem of redistributing the error tolerance among nodes in order to minimize
the overall error in dynamic environments is also studied. A similar approach was independently
proposed in [19]. To minimize network communication, [6] presents a methodology for in-network
compression of multiple (time-series) signals generated by sensors (e.g., one for temperature, one
for humidity, etc.). The rationale is that measurements observed at the same node are likely to
follow similar trends. Soheili et al. [21] focused on the processing of spatial aggregation queries,
which derive the aggregate (e.g., average) of sensor values (e.g., temperatures) in a user-defined
spatial window W. They developed a distributed and hierarchical structure on the sensor network
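To illustrate the constant-size partial computation attributed to [13] earlier in this subsection, the following Python sketch computes an average over a routing tree by passing a (sum, count) pair upwards. It is a minimal sketch under stated assumptions: the tree, the readings, and the function name are hypothetical, and the recursion stands in for the message exchange between children and parents.

```python
def partial_average(node, children, readings):
    """Return the constant-size partial state (sum, count) for the subtree of node."""
    total, count = readings[node], 1         # start with the node's own measurement
    for child in children.get(node, []):
        child_sum, child_count = partial_average(child, children, readings)
        total += child_sum                   # merge the child's partial state
        count += child_count
    return total, count

# Hypothetical routing tree rooted at "r" (child lists) and temperature readings.
children = {"r": ["a", "b"], "a": ["c"]}
readings = {"r": 20.0, "a": 22.0, "b": 24.0, "c": 26.0}

total, count = partial_average("r", children, readings)
print(total / count)  # 23.0
```

Each node sends two numbers regardless of the size of its subtree, which is what keeps the transmitted data constant in size.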