
Showing papers on "Data aggregator" published in 2004


Proceedings Article
03 Nov 2004
TL;DR: This paper proposes a data aggregation scheme that significantly extends the class of queries that can be answered using sensor networks, and provides strict theoretical guarantees on the approximation quality of the queries in terms of the message size.
Abstract: Wireless sensor networks offer the potential to span and monitor large geographical areas inexpensively. Sensors, however, have significant power constraints (battery life), making communication very expensive. Another important issue in the context of sensor-based information systems is that individual sensor readings are inherently unreliable. In order to address these two aspects, sensor database systems like TinyDB and Cougar enable in-network data aggregation to reduce the communication cost and improve reliability. The existing data aggregation techniques, however, are limited to relatively simple types of queries such as SUM, COUNT, AVG, and MIN/MAX. In this paper we propose a data aggregation scheme that significantly extends the class of queries that can be answered using sensor networks. These queries include (approximate) quantiles, such as the median, the most frequent data values, such as the consensus value, a histogram of the data distribution, as well as range queries. In our scheme, each sensor aggregates the data it has received from other sensors into a message of fixed (user-specified) size. We provide strict theoretical guarantees on the approximation quality of the queries in terms of the message size. We evaluate the performance of our aggregation scheme by simulation and demonstrate its accuracy, scalability and low resource utilization for highly variable input data sets.

498 citations
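
The abstract above contrasts its scheme with the "relatively simple" queries (SUM, COUNT, AVG, MIN/MAX) that existing in-network aggregation already handles. As a point of reference, the following is a minimal sketch of that baseline only, not of the paper's quantile scheme. The names `Partial`, `merge`, and `aggregate`, and the dictionary encoding of the routing tree, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partial:
    """Fixed-size partial state that a node forwards to its parent (hypothetical layout)."""
    total: float
    count: int
    lo: float
    hi: float


def from_reading(x: float) -> Partial:
    # A single reading is its own sum, minimum, and maximum.
    return Partial(x, 1, x, x)


def merge(a: Partial, b: Partial) -> Partial:
    # Merging two partial states never grows the message size.
    return Partial(a.total + b.total, a.count + b.count,
                   min(a.lo, b.lo), max(a.hi, b.hi))


def aggregate(tree: dict, readings: dict, node: str) -> Partial:
    # Each node combines its own reading with the states of its children.
    state = from_reading(readings[node])
    for child in tree.get(node, []):
        state = merge(state, aggregate(tree, readings, child))
    return state
```

With this layout the sink can answer SUM, COUNT, MIN, and MAX directly and derive AVG as `total / count`. Quantiles and histograms cannot be recovered from such a state, which is the gap the abstract describes.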


Proceedings Article
25 Oct 2004
TL;DR: This paper examines several approaches for making these aggregation schemes more resilient against certain attacks, and proposes a mathematical framework for formally evaluating their security.
Abstract: This paper studies security for data aggregation in sensor networks. Current aggregation schemes were designed without security in mind and there are easy attacks against them. We examine several approaches for making these aggregation schemes more resilient against certain attacks, and we propose a mathematical framework for formally evaluating their security.

400 citations


Proceedings Article
24 Aug 2004
TL;DR: This work presents exact and approximate algorithms to find the minimum number of aggregation points in order to maximize the network lifetime in WSNs and studies the tradeoffs between energy savings and the potential delay involved in the data aggregation process.
Abstract: A fundamental challenge in the design of wireless sensor networks (WSNs) is to maximize their lifetimes. Data aggregation has emerged as a basic approach in WSNs in order to reduce the number of transmissions of sensor nodes, and hence to minimize the overall power consumption in the network. We study optimal data aggregation in WSNs. Data aggregation is affected by several factors, such as the placement of aggregation points, the aggregation function, and the density of sensors in the network. The determination of an optimal selection of aggregation points is thus extremely important. We present exact and approximate algorithms to find the minimum number of aggregation points in order to maximize the network lifetime. Our algorithms use a fixed virtual wireless backbone that is built on top of the physical topology. We also study the tradeoffs between energy savings and the potential delay involved in the data aggregation process. Numerical results show that our approach provides substantial energy savings.

269 citations


Patent
27 Dec 2004
TL;DR: A data aggregation module is configured to store financial and risk-related information from a plurality of data sources, including private client data sources and public data sources.
Abstract: The present system provides information on risks and related hedging strategies. A plurality of client terminals are coupled to the system, for providing access to the system for accessing information on risks and related hedging strategies. A data aggregation module is configured to store financial and risk related information from a plurality of data sources, including private client data sources (34) and public data sources (30). An analytical module (fig. 19) is coupled to the data aggregation module, and configured to perform benchmarking estimates (fig. 18) based on information retrieved from the private client data sources (34) and the public data sources (30). The benchmarking estimates (fig. 18) are performed against the private data and the public data obtained from a plurality of industries.

211 citations


Proceedings Article
20 Jun 2004
TL;DR: Through extensive simulations, it is shown that setting up the clock out timer based on a node's position in the aggregation tree results in a beneficial "cascading effect", yielding considerable energy efficiency, yet maintaining data accuracy and freshness.
Abstract: This paper evaluates the effect of timing in data aggregation algorithms. In-network aggregation achieves energy-efficient data propagation by processing data as it flows from information sources to sinks. Our goal is to show that the decision of when to "clock out" data as it is processed by nodes has a significant performance impact in terms of data accuracy and freshness. Using the sensor network paradigm where all nodes produce information periodically, we compare three aggregation timing policies. Through extensive simulations, we show that setting up the clock out timer based on a node's position in the aggregation tree results in a beneficial "cascading effect", yielding considerable energy efficiency, yet maintaining data accuracy and freshness.

178 citations
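
The abstract above says only that the clock-out timer is set from a node's position in the aggregation tree. One way to picture the resulting "cascading effect" is sketched below. The timeout formula, the `per_hop` delay parameter, and the function name `cascading_timeouts` are assumptions for illustration and are not taken from the paper.

```python
def cascading_timeouts(parent: dict, per_hop: float) -> dict:
    """Assign each node a clock-out timeout that shrinks with its depth.

    `parent` maps every node to its parent in the aggregation tree
    (the sink maps to None). Deeper nodes time out earlier, so their
    data reaches a parent before that parent clocks out.
    """
    def depth(node):
        d = 0
        while parent[node] is not None:
            node = parent[node]
            d += 1
        return d

    depths = {node: depth(node) for node in parent}
    max_depth = max(depths.values())
    # Hypothetical rule: one per-hop delay for each level below the node, plus one.
    return {node: (max_depth - d + 1) * per_hop for node, d in depths.items()}
```

Under this rule every child clocks out strictly before its parent, so a parent's single outgoing message can include the data of its whole subtree from the same round.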


Book Chapter
14 Mar 2004
TL;DR: A new algorithm is introduced, based on potential gains, which adaptively redistributes the error thresholds to those nodes that benefit the most and tries to minimize the total number of transmitted messages in the network.
Abstract: Earlier work has demonstrated the effectiveness of in-network data aggregation in order to minimize the amount of messages exchanged during continuous queries in large sensor networks. The key idea is to build an aggregation tree, in which parent nodes aggregate the values received from their children. Nevertheless, for large sensor networks with severe energy constraints the reduction obtained through the aggregation tree might not be sufficient. In this paper we extend prior work on in-network data aggregation to support approximate evaluation of queries to further reduce the number of exchanged messages among the nodes and extend the longevity of the network. A key ingredient to our framework is the notion of the residual mode of operation that is used to eliminate messages from sibling nodes when their cumulative change is small. We introduce a new algorithm, based on potential gains, which adaptively redistributes the error thresholds to those nodes that benefit the most and tries to minimize the total number of transmitted messages in the network. Our experiments demonstrate that our techniques significantly outperform previous approaches and reduce the network traffic by exploiting the super-imposed tree hierarchy.

162 citations


Proceedings Article
26 Sep 2004
TL;DR: Simulation results show that the proposed protocol yields significant savings in energy consumption while preserving data security, and also establishes secure connectivity among sensor nodes without any online key distribution.
Abstract: Data aggregation in wireless sensor networks eliminates data redundancy, thereby improving bandwidth usage and energy utilization. The paper presents a secure data aggregation protocol, called SRDA (secure reference-based data aggregation), for wireless sensor networks. In order to reduce the number of bits transmitted, sensor nodes compare their raw sensed data value with their reference data value and then transfer only the difference data. In addition to reducing the number of transmitted bits, SRDA also establishes secure connectivity among sensor nodes without any online key distribution. The security level of the communication links is gradually increased as packets are transmitted at higher level cluster-heads, since intercepting a packet at higher levels of the clustering hierarchy provides a summary of a large number of transmissions at lower levels. Simulation results show that the proposed protocol yields significant savings in energy consumption while preserving data security.

105 citations
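
The bit-saving idea in the SRDA abstract above, sending only the difference between a raw reading and a reference value, can be sketched as follows. This covers the differencing step only, not the protocol's security mechanisms. The integer readings, the signed-magnitude bit count, and all function names are assumptions for illustration.

```python
def encode(readings: list, reference: int) -> list:
    # Transmit only the offset of each raw reading from the reference value.
    return [r - reference for r in readings]


def decode(diffs: list, reference: int) -> list:
    # The receiver, which knows the same reference, restores the raw readings.
    return [d + reference for d in diffs]


def bits_needed(value: int) -> int:
    # Hypothetical cost model: one sign bit plus the magnitude bits.
    return 1 + abs(value).bit_length()
```

For readings that stay close to the reference, the differences need far fewer bits than the raw values. For example, a raw value of 1012 costs 11 bits under this model, while its offset of 2 from a reference of 1010 costs 3.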


Patent
19 May 2004
TL;DR: An improved method of and apparatus for aggregating data, including a scalable multi-dimensional database (MDDB) storing multidimensional data logically organized along N dimensions and a high-performance aggregation engine that performs multi-stage data aggregation operations on the multidimensional data.
Abstract: An improved method of and apparatus for aggregating data including a scalable multi-dimensional database (MDDB) storing multidimensional data logically organized along N dimensions and a high performance aggregation engine that performs multi-stage data aggregation operations on the multidimensional data. A first stage of such data aggregation operations is performed along a first dimension of the N dimensions; and a second stage of such data aggregation operations is performed for a given slice in the first dimension along another dimension of the N dimensions. Such multi-stage data aggregation operations achieve a significant increase in system performance (e.g. decreased access/search time). The MDDB and high performance aggregation engine of the present invention may be integrated into a standalone data aggregation server supporting an OLAP system (one or more OLAP servers and clients), or may be integrated into a database management system (DBMS), thus achieving improved user flexibility and ease of use. The improved DBMS system of the present invention can be used to realize an improved Data Warehouse for supporting on-line analytical processing (OLAP) operations or to realize an improved informational database system, operational database system, or the like.

85 citations
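
The two-stage aggregation described in the patent abstract above (first along one dimension, then, for a given slice of that dimension, along another) can be pictured with a small sparse cube. This is a sketch under assumptions, not the patented engine. The tuple-keyed cell dictionary, the roll-up hierarchy, the `"ALL"` marker, and both function names are hypothetical.

```python
from collections import defaultdict


def roll_up(cells: dict, dim: int, hierarchy: dict) -> dict:
    """Stage 1: aggregate along dimension `dim` by mapping members to their parents."""
    out = defaultdict(float)
    for key, value in cells.items():
        new_key = key[:dim] + (hierarchy[key[dim]],) + key[dim + 1:]
        out[new_key] += value
    return dict(out)


def aggregate_slice(cells: dict, slice_dim: int, slice_value, along_dim: int) -> dict:
    """Stage 2: within one slice of `slice_dim`, sum over all members of `along_dim`."""
    out = defaultdict(float)
    for key, value in cells.items():
        if key[slice_dim] != slice_value:
            continue  # only the requested slice is touched
        new_key = key[:along_dim] + ("ALL",) + key[along_dim + 1:]
        out[new_key] += value
    return dict(out)
```

Because the second stage works on the already reduced output of the first and only on the requested slice, it scans fewer cells than a single pass over the base data would for the same query.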


Proceedings Article
01 Dec 2004
TL;DR: A data communication and aggregation framework is presented that manipulates the degree of data aggregation to maintain specified acceptable latency bounds on data delivery while attempting to minimize energy consumption.
Abstract: Sensor networks have recently emerged as a new paradigm for distributed sensing and actuation. This paper describes fundamental performance trade-offs in sensor networks and the utility of simple feedback control mechanisms for distributed performance optimization. A data communication and aggregation framework is presented that manipulates the degree of data aggregation to maintain specified acceptable latency bounds on data delivery while attempting to minimize energy consumption. An analytic model is constructed to describe the relationships between timeliness, energy, and the degree of aggregation, as well as to quantify constraints that stem from real-time requirements. Feedback control is used to adapt the degree of data aggregation dynamically in response to network load conditions while meeting application deadlines. The results illustrate the usefulness of feedback control in the sensor network domain.

76 citations


Book Chapter
30 Aug 2004
TL;DR: This paper proposes a group-aware network configuration method and presents two algorithms that “cluster” along the same path sensor nodes which belong to the same group; the method provides energy savings over existing network configuration schemes and improves quality of data in systems with imperfect quality of data such as TiNA.
Abstract: In-network aggregation has been proposed as one method for reducing energy consumption in networked sensors. In this paper, we explore the idea of influencing the construction of the routing trees for sensor networks with the goal of reducing the size of transmitted data for networks with in-network aggregation involving Group By queries. Toward this, we propose a group-aware network configuration method and present two algorithms, that “cluster” along the same path sensor nodes which belong to the same group. We evaluate our proposed scheme experimentally, in the context of existing in-network aggregation schemes, with respect to energy consumption and quality of data. Overall, our routing tree construction scheme provides energy savings over existing network configuration schemes and improves quality of data in systems with imperfect quality of data such as TiNA.

54 citations


Proceedings Article
25 Oct 2004
TL;DR: An information model for sensed data is first formulated, and a new metric for evaluating the data aggregation process, data aggregation quality (DAQ), is formally derived; DAQ may be readily applied to most continuous data gathering protocols and is therefore significant to the future development of sensor network protocols.
Abstract: In-network data gathering and data fusion are essential for the efficient operation of wireless sensor networks. While most existing data gathering routing protocols have addressed the issue of energy efficiency, few of them have considered the quality of the implied data aggregation process. In this work, an information model for sensed data is first formulated. A new metric for evaluating the data aggregation process, data aggregation quality (DAQ), is formally derived. DAQ does not assume any prior knowledge of the values or statistical distributions of sensing data, and may be applied to most data gathering protocols. Next, two new protocols are proposed: the enhanced LEACH and the clustered PEGASIS, enhanced from two major existing protocols: the cluster-based LEACH and the chain-based PEGASIS. By carefully accounting for listening energy, the energy efficiency of all four protocols is evaluated. In addition, DAQ is applied to evaluate their data aggregation process. It is found that, while chain-based protocols are more energy efficient than cluster-based protocols, they suffer from poor data aggregation quality. DAQ may be readily applied to most continuous data gathering protocols; it is therefore significant to the future development of sensor network protocols.

Proceedings Article
01 Aug 2004
TL;DR: This paper proposes a second database technology, namely active rules, that provides a natural computational paradigm for sensor network applications which require reactive behavior, such as security management and rapid forest fire response.
Abstract: Recent years have witnessed a rapidly growing interest in query processing in sensor and actuator networks. This is mainly due to the increased awareness of query processing as the most appropriate computational paradigm for a wide range of sensor network applications, such as environmental monitoring. In this paper we propose a second database technology, namely active rules, that provides a natural computational paradigm for sensor network applications which require reactive behavior, such as security management and rapid forest fire response. Like query processing, efficient and effective active rule execution mechanisms have to address several technical challenges including language design, data aggregation, data verification, robustness under topology changes, routing, power management and many more. Nonetheless, active rules change the context and the requirements of these issues and hence a new set of solutions is appropriate. To this end, we outline the implications of active rules for sensor networks and contrast these against query processing. We then proceed to discuss work in progress carried out in project Asene that aims to effectively address these issues. Finally, we introduce our architecture for a decentralized event broker based on the publish/subscribe paradigm and our early design of an ECA language for sensor networks.

Book Chapter
TL;DR: A new network architecture, CODA (Cluster-based self-Organizing Data Aggregation), based on the Kohonen Self-Organizing Map, is presented to aggregate sensor data in clusters; it increases the quality of data, reduces data traffic, and conserves energy.
Abstract: Sensor networks have recently emerged as a ubiquitous computing platform. However, the constrained energy and limited computing resources of the sensor nodes present major challenges in gathering data. In this work, we propose a self-organizing method for aggregating data in ad-hoc wireless sensor networks. We present a new network architecture, CODA (Cluster-based self-Organizing Data Aggregation), based on the Kohonen Self-Organizing Map, to aggregate sensor data in clusters. Before deploying the network, we train the nodes to classify the sensor data. This increases the quality of data, reduces data traffic, and conserves energy. Our simulation results show that CODA yields more accurate data than the traditional aggregation of a database system. Finally, we show a real-world platform, TIP, on which we will implement the idea.

Proceedings Article
18 Jun 2004
TL;DR: An integrated analytical model is presented to study the joint performance of in-network aggregation and topology control and indicates that to achieve high fidelity levels under medium to high event reporting load, shorter and fatter aggregation/routing trees (toward the sink) offer the best delay-energy tradeoff as long as topology control is well coordinated with routing.
Abstract: Wireless sensor networks are characterized by limited energy resources. To conserve energy, application-specific aggregation (fusion) of data reports from multiple sensors can be beneficial in reducing the amount of data flowing over the network. Furthermore, controlling the topology by scheduling the activity of nodes between active and sleep modes has often been used to uniformly distribute the energy consumption among all nodes by de-synchronizing their activities. We present an integrated analytical model to study the joint performance of in-network aggregation and topology control. We define performance metrics that capture the tradeoffs among delay, energy, and fidelity of the aggregation. Our results indicate that to achieve high fidelity levels under medium to high event reporting load, shorter and fatter aggregation/routing trees (toward the sink) offer the best delay-energy tradeoff as long as topology control is well coordinated with routing.

Proceedings Article
14 Dec 2004
TL;DR: A data aggregation and dilution scheme is introduced for the wireless sensor network, which can be perceived as a distributed relational database and can reduce the number of transmitted packets by 50% on average compared to the case where aggregation or dilution is not used.
Abstract: A data aggregation and dilution scheme is introduced for the wireless sensor network, which can be perceived as a distributed relational database. A new algorithm that can run on tiny sensor nodes to aggregate or dilute the sensed data packets is developed. Two location-based hash functions are also introduced to determine how the sensed data can be grouped or which sensors should be excluded from a query. Analytical models are provided for the performance evaluation. The numerical results show that our scheme can reduce the number of transmitted packets by 50% on average compared to the case where aggregation or dilution is not used.

Journal Article
TL;DR: A double-sided technique was developed that could refine the selection of aggregation levels based on two self-defined indices: the dissimilarity index and the information loss index and it is expected that the proposed technique will maximize the use of real-time ITS data and improve data needs in the urban transportation planning process.
Abstract: Intelligent transportation system (ITS) data, which are normally collected at a 15- to 30-s interval, are a rich and valuable resource for a variety of applications, including transportation planning. However, raw ITS data exhibit a wide range of fluctuations, which may not be directly useful for planning purposes, for which aggregations taken over longer time intervals are commonly needed. Proper determination of aggregation level of ITS data will ensure the retention of necessary information and the elimination of as much unnecessary information as possible. The traditional approaches to determining aggregation level are intuitive and easy to implement, yet they may be unable to determine whether particular information is kept or lost. The newly developed wavelet-based approach, although good for decomposing original data sets, needs to be further improved. A double-sided technique was developed that could refine the selection of aggregation levels based on two self-defined indices: the dissimilarity index and the information loss index. A computer program coded in MATLAB with the use of the Wavelet Toolbox was developed to implement the proposed technique. A case study illustrating the process was conducted by using real-time ITS data from traffic management centers in the San Antonio TransGuide. It is expected that the proposed technique will maximize the use of real-time ITS data and improve data needs in the urban transportation planning process.

Journal Article
TL;DR: In this paper, spatial and temporal multiple aggregation (STMA) is proposed to minimize energy consumption and traffic load when a single or multiple users gather state-based sensor data from various subareas through multi-hop paths.
Abstract: Sensor nodes are scattered over remote environments for deployment and constitute a multi-hop sensor network covering a wide area. Users hardly have global information on the distribution of sensor nodes. Hence, when users request state-based sensor readings such as temperature and humidity in an arbitrary area, networks may suffer unpredictable heavy traffic. This problem needs data aggregation to comply with user requirements and manage overlapped aggregation trees of multiple users efficiently. In this paper, spatial and temporal multiple aggregation (STMA) is proposed to minimize energy consumption and traffic load when a single or multiple users gather state-based sensor data from various subareas through multi-hop paths. Spatial aggregation builds the aggregation tree with an optimal intermediary between a target area and a sink. The broadcast nature of wireless communication is exploited to build the aggregation tree in the confined area. Temporal aggregation uses the interval so that users obtain an appropriate amount of data they need without suffering excess traffic. The performance of STMA is evaluated in terms of energy consumption and area-to-sink delay in the simulation based on real parameters of Berkeley's MICA motes.

Proceedings Article
Xi Zhang, Tony Pan, Ümit V. Çatalyürek, Tahsin Kurc, Joel H. Saltz
19 Apr 2004
TL;DR: A suite of services is presented that supports storage, indexing, and data processing (data sampling and data aggregation) on datasets that consist of a collection of multi-resolution Grids.
Abstract: This paper is concerned with efficient querying of very large multi-resolution datasets on storage and compute clusters. We present a suite of services that support storage, indexing, and data processing (data sampling and data aggregation) on datasets that consist of a collection of multi-resolution Grids. We empirically evaluate the performance impact of different data declustering, indexing, and query processing strategies. The experimental evaluation is carried out using a data server implemented to serve multi-terabyte multi-resolution volumetric datasets to remote visualization clients and a one-terabyte multi-resolution volumetric dataset on a PC cluster with distributed disk space.

Book Chapter
TL;DR: A new data aggregation algorithm named DAUCH (Data Aggregation algorithm Using DAG rooted at the Cluster Head) for clustering distributed nodes in sensor networks, combining the random cluster head election technique in LEACH with DAG in TORA.
Abstract: The sensor nodes in sensor networks are limited in power, computational capacity, and memory. An appropriate strategy is needed to cope with these limitations. Data aggregation is one of the power saving strategies in sensor networks, combining the data that comes from many sensor nodes into a set of meaningful information. This paper proposes a new data aggregation algorithm named DAUCH (Data Aggregation algorithm Using DAG rooted at the Cluster Head) for clustering distributed nodes in sensor networks, combining the random cluster head election technique in LEACH with DAG in TORA. The proposed algorithm outperforms LEACH because it uses less transmission power. Our simulation reveals an improvement of approximately 4% in the number of nodes alive compared with LEACH.

Journal Article
TL;DR: This paper specifies the functions and the classification of data aggregation, and analyzes the main aggregation schemes proposed in detail.
Abstract: Sensor networks, consisting of a large number of sensor nodes with limited battery power and the ability for local computation, gather useful information from the nodes using wireless communication technology. Such systems have been proposed for broad use in many applications, including military surveillance and environmental monitoring. It is a critical consideration to collect sensed information in an energy-efficient manner to obtain a long lifetime for the sensor network. As one of the mechanisms that can make use of the energy of the sensor nodes efficiently, data aggregation can reduce the traffic in the network by utilizing the abilities of the nodes in local computation and storage. Based on a simple introduction of sensor networks, this paper specifies the functions and the classification of data aggregation, and analyzes the main aggregation schemes proposed in detail.

Book Chapter
21 Oct 2004
TL;DR: In this paper, an extended service data aggregator service based on a notification mechanism in a Grid environment is presented, where the aggregator parses messages and extracts information about the status of services as well as computing resources.
Abstract: This paper presents an extended service data aggregator service based on a notification mechanism in a Grid environment. To solve the scalability problem in its infrastructure, the extended aggregator aperiodically aggregates the Service Data Elements (SDEs) based on a notification scheme about the kinds of data which are gathered. The aggregator parses messages and extracts information about the status of services as well as computing resources. In order to provide a persistent grid information service, we also apply the Xindice DBMS to maintain SDEs on multiple collections for storing the collection of the resource information as well as its services.



Journal Article
TL;DR: The Parallel Data Shipping with Priority Transmission scheme is extended to be workload sensitive and the new algorithm is called PAST with Workload Sensitivity (PAST-WS), which reduces the data aggregation cost significantly and distributes the aggregation workload more evenly among the nodes in the system.
Abstract: In this paper, we study the reliability issue in aggregating data versions for execution of real-time queries in a wireless sensor network in which sensor nodes are distributed to monitor the events occurring in the environment. We extend the Parallel Data Shipping with Priority Transmission (PAST) scheme to be workload sensitive (the new algorithm is called PAST with Workload Sensitivity (PAST-WS)) in selecting the coordinator node and the paths for transmitting the data from the participating nodes to the coordinator node. PAST-WS considers the workload at each relay node to minimize the total cost and delay in data transmission. PAST-WS not only reduces the data aggregation cost significantly, but also distributes the aggregation workload more evenly among the nodes in the system. Both properties are very important for extending the lifetime of sensor networks since the energy consumption rate of the nodes highly depends on the data transmission workloads.