
Showing papers on "Data aggregator published in 2012"


Patent
02 Feb 2012
TL;DR: In this paper, a private stream aggregation (PSA) system is proposed to contribute a user's data to a data aggregator without compromising the user's privacy, where the aggregator can decrypt an aggregate value without decrypting individual data values associated with the set of users, and without interacting with the users while decrypting the aggregate value.
Abstract: A private stream aggregation (PSA) system contributes a user's data to a data aggregator without compromising the user's privacy. The system can begin by determining (302) a private key for a local user in a set of users, wherein the sum of the private keys associated with the set of users and the data aggregator is equal to zero. The system also selects a set of data values associated with the local user. Then, the system encrypts individual data values in the set based in part on the private key to produce a set of encrypted data values, thereby allowing the data aggregator to decrypt an aggregate value across the set of users without decrypting individual data values associated with the set of users, and without interacting with the set of users while decrypting the aggregate value. The system also sends (308) the set of encrypted data values to the data aggregator.
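The central mechanism here (private keys that sum to zero across the users and the aggregator, so that adding the masked contributions cancels every mask and reveals only the aggregate) can be illustrated with a toy additive-masking sketch. This is not the patent's cryptographic construction; the modulus, key-generation routine and function names below are illustrative assumptions.

```python
import random

MOD = 2**61 - 1  # illustrative modulus; the patent does not specify one


def gen_keys(num_users, rng=random.SystemRandom()):
    """Give each user a random key; the aggregator's key is the negation of
    their sum, so all keys (users + aggregator) sum to zero mod MOD."""
    user_keys = [rng.randrange(MOD) for _ in range(num_users)]
    agg_key = (-sum(user_keys)) % MOD
    return user_keys, agg_key


def encrypt(value, key):
    # Each user masks its own reading with its key before sending it.
    return (value + key) % MOD


def aggregate(ciphertexts, agg_key):
    # Adding the aggregator key cancels every user mask, leaving only the sum.
    return (sum(ciphertexts) + agg_key) % MOD


if __name__ == "__main__":
    data = [12, 7, 30]                      # users' private readings
    user_keys, agg_key = gen_keys(len(data))
    cts = [encrypt(v, k) for v, k in zip(data, user_keys)]
    print(aggregate(cts, agg_key))          # 49; individual readings stay masked
```

Note that the keys above are single-use masks; a real PSA construction derives fresh per-period masks cryptographically so the same key setup can be reused across a stream of aggregation rounds.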

494 citations


Journal ArticleDOI
TL;DR: The design has been generalized and adopted on both homogeneous and heterogeneous wireless sensor networks, and the base station can recover all sensing data even if these data have been aggregated; this property is called "recoverable."
Abstract: Recently, several data aggregation schemes based on privacy homomorphism encryption have been proposed and investigated on wireless sensor networks. These data aggregation schemes provide better security compared with traditional aggregation since cluster heads (aggregators) can directly aggregate the ciphertexts without decryption; consequently, transmission overhead is reduced. However, the base station only retrieves the aggregated result, not individual data, which causes two problems. First, the usage of aggregation functions is constrained. For example, the base station cannot retrieve the maximum value of all sensing data if the aggregated result is the summation of sensing data. Second, the base station cannot confirm data integrity and authenticity by attaching message digests or signatures to each sensing sample. In this paper, we attempt to overcome the above two drawbacks. In our design, the base station can recover all sensing data even if these data have been aggregated. This property is called "recoverable." Experimental results demonstrate that the transmission overhead is still reduced even though our approach is recoverable on sensing data. Furthermore, the design has been generalized and adopted on both homogeneous and heterogeneous wireless sensor networks.
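A common way to make homomorphic aggregation "recoverable" is to pack each sensor's reading into its own bit-field before summation, so the base station can unpack every individual value (and then compute max or min, or verify per-sample digests) from a single aggregate. The sketch below illustrates only that recoverability idea with plaintext integers; it is not the paper's concealed data aggregation scheme, and a real design would apply the additive privacy-homomorphism encryption on top of the packed values.

```python
FIELD_BITS = 16  # assumed per-sensor field width; each reading must fit in it


def pack(reading, sensor_index):
    # Shift each reading into its own bit-field so summed values never overlap.
    assert 0 <= reading < (1 << FIELD_BITS)
    return reading << (sensor_index * FIELD_BITS)


def aggregate(packed_values):
    # A cluster head can add these blindly; with a real additive privacy
    # homomorphism it would add ciphertexts of the packed values instead.
    return sum(packed_values)


def recover(aggregate_value, num_sensors):
    # The base station unpacks every individual reading from one aggregate.
    mask = (1 << FIELD_BITS) - 1
    return [(aggregate_value >> (i * FIELD_BITS)) & mask for i in range(num_sensors)]


if __name__ == "__main__":
    readings = [301, 512, 47]
    agg = aggregate(pack(r, i) for i, r in enumerate(readings))
    print(recover(agg, len(readings)))  # [301, 512, 47]; max, min, etc. now computable
```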

131 citations


Journal ArticleDOI
TL;DR: A trust-based framework for data aggregation with fault tolerance based on the multilayer aggregation architecture of WMSNs is designed to reduce the impact of erroneous data and provide measurable trustworthiness for aggregated results.
Abstract: For wireless multimedia sensor networks (WMSNs) deployed in noisy and unattended environments, it is necessary to establish a comprehensive framework that protects the accuracy of the gathered multimedia information. In this paper, we jointly consider data aggregation, information trust, and fault tolerance to enhance the correctness and trustworthiness of collected information. Based on the multilayer aggregation architecture of WMSNs, we design a trust-based framework for data aggregation with fault tolerance with a goal to reduce the impact of erroneous data and provide measurable trustworthiness for aggregated results. By extracting statistical characteristics from different sources and extending Josang's trust model, we propose how to compute self-data trust opinion, peer node trust opinion, and peer data trust opinion. According to the trust transfer and trust combination rules designed in our framework, we derive the trust opinion of the sink node on the final aggregated result. In particular, this framework can evaluate both discrete data and continuous media streams in WMSNs through a uniform mechanism. Results obtained from both simulation study and experiments on a real WMSN testbed demonstrate the validity and efficiency of our framework, which can significantly improve the quality of multimedia information as well as more precisely evaluate the trustworthiness of collected information.
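The trust machinery rests on Josang's subjective-logic opinions, triples of belief, disbelief and uncertainty that can be transferred (discounted) along a trust relationship and combined (consensus) across sources. The sketch below implements those two standard operators; the paper's actual rules for forming self-data, peer-node and peer-data opinions from statistical characteristics are more elaborate, so the numbers and the fusion order here are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Opinion:
    """Subjective-logic opinion: belief + disbelief + uncertainty = 1."""
    b: float
    d: float
    u: float


def discount(trust_in_source: Opinion, source_opinion: Opinion) -> Opinion:
    """Trust transfer: weight a node's opinion by how much we trust that node."""
    return Opinion(
        b=trust_in_source.b * source_opinion.b,
        d=trust_in_source.b * source_opinion.d,
        u=trust_in_source.d + trust_in_source.u + trust_in_source.b * source_opinion.u,
    )


def consensus(o1: Opinion, o2: Opinion) -> Opinion:
    """Trust combination: fuse two independent opinions about the same data."""
    k = o1.u + o2.u - o1.u * o2.u
    if k == 0:  # both opinions are dogmatic (zero uncertainty)
        return Opinion((o1.b + o2.b) / 2, (o1.d + o2.d) / 2, 0.0)
    return Opinion(
        (o1.b * o2.u + o2.b * o1.u) / k,
        (o1.d * o2.u + o2.d * o1.u) / k,
        (o1.u * o2.u) / k,
    )


if __name__ == "__main__":
    peer_trust = Opinion(0.8, 0.1, 0.1)  # sink's trust opinion of an aggregator
    self_data = Opinion(0.7, 0.2, 0.1)   # aggregator's opinion of its own result
    peer_data = Opinion(0.6, 0.3, 0.1)   # a peer node's opinion of that result
    print(consensus(discount(peer_trust, self_data), peer_data))
```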

115 citations


Proceedings ArticleDOI
25 Mar 2012
TL;DR: This paper identifies a new security threat in collaborative sensing through a testbed implementation and shows that attackers could geo-locate a secondary user from its sensing report with a success rate above 90%, even in the presence of data aggregation.
Abstract: Collaborative spectrum sensing has been regarded as a promising approach to enable secondary users to detect primary users by exploiting spatial diversity. In this paper, we consider a converse question: could spatial diversity be exploited by a malicious entity, e.g., an external attacker or an untrusted Fusion Center (FC), to achieve involuntary geolocation of a secondary user by linking his location-dependent sensing report to his physical position? We answer this question by identifying a new security threat in collaborative sensing from a testbed implementation, and show that attackers could geo-locate a secondary user from its sensing report with a success rate above 90% even in the presence of data aggregation. We then introduce a novel location privacy definition to quantify the location privacy leaking in collaborative sensing. We propose a Privacy Preserving collaborative Spectrum Sensing (PPSS) scheme, which includes two primitive protocols: the Privacy Preserving Sensing Report Aggregation protocol (PPSRA) and the Distributed Dummy Report Injection Protocol (DDRI). Specifically, the PPSRA scheme utilizes applied cryptographic techniques to allow the FC to obtain the aggregated result from various secondary users without learning each individual's values, while the DDRI algorithm can provide differential location privacy for secondary users by introducing a novel sensing data randomization technique. We implement and evaluate the PPSS scheme in a real-world testbed. The evaluation results show that PPSS can significantly improve the secondary user's location privacy with a reasonable security overhead in collaborative sensing.

108 citations


Proceedings ArticleDOI
30 Oct 2012
TL;DR: An efficient protocol to obtain the Sum aggregate is proposed, which employs an additive homomorphic encryption and a novel key management technique to support large plaintext space and is orders of magnitude faster than existing solutions.
Abstract: The proliferation and ever-increasing capabilities of mobile devices such as smart phones give rise to a variety of mobile sensing applications. This paper studies how an untrusted aggregator in mobile sensing can periodically obtain desired statistics over the data contributed by multiple mobile users, without compromising the privacy of each user. Although there are some existing works in this area, they either require bidirectional communications between the aggregator and mobile users in every aggregation period, or have high computation overhead and cannot support large plaintext spaces. Also, they do not consider the Min aggregate, which is quite useful in mobile sensing. To address these problems, we propose an efficient protocol to obtain the Sum aggregate, which employs additive homomorphic encryption and a novel key management technique to support a large plaintext space. We also extend the sum aggregation protocol to obtain the Min aggregate of time-series data. Evaluations show that our protocols are orders of magnitude faster than existing solutions.
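The Min extension can be read as a reduction to Sum: if every user privately contributes an indicator vector over a small set of value buckets, the aggregator only learns per-bucket counts and takes the first non-empty bucket as the minimum. The sketch below shows that reduction with plain modular masking standing in for the paper's additive homomorphic encryption and key management; the bucket count, modulus and key handling are illustrative assumptions.

```python
import random

MOD = 2**32
NUM_BUCKETS = 16  # assumed coarse value domain [0, NUM_BUCKETS)


def gen_keys(num_users, rng=random.SystemRandom()):
    """Per-bucket masks that cancel across all users and the aggregator."""
    user_keys = [[rng.randrange(MOD) for _ in range(NUM_BUCKETS)]
                 for _ in range(num_users)]
    agg_key = [(-sum(col)) % MOD for col in zip(*user_keys)]
    return user_keys, agg_key


def encrypt_indicator(value, keys):
    # One masked counter per bucket; only the user's own bucket contributes a 1.
    return [((1 if b == value else 0) + k) % MOD for b, k in enumerate(keys)]


def min_aggregate(reports, agg_key):
    # The aggregator learns only per-bucket counts, then reads off the minimum.
    counts = [(sum(col) + k) % MOD for col, k in zip(zip(*reports), agg_key)]
    return next(b for b, c in enumerate(counts) if c > 0)


if __name__ == "__main__":
    values = [9, 4, 11]
    user_keys, agg_key = gen_keys(len(values))
    reports = [encrypt_indicator(v, k) for v, k in zip(values, user_keys)]
    print(min_aggregate(reports, agg_key))  # 4
```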

84 citations


Proceedings ArticleDOI
Fengjun Li, Bo Luo
01 Nov 2012
TL;DR: An end-to-end signature scheme is introduced, which generates a homomorphic signature for the aggregation result, and an incremental verification protocol is presented, which is computationally inexpensive, while ensuring faithfulness and undeniability properties.
Abstract: In smart grid systems, secure in-network data aggregation approaches have been introduced to efficiently collect aggregated data while preserving the data privacy of individual meters. Nevertheless, it is also important to maintain the integrity of aggregate data in the presence of accidental errors and internal/external attacks. To ensure the correctness of the aggregation against unintentional errors, we introduce an end-to-end signature scheme, which generates a homomorphic signature for the aggregation result. The homomorphic signature scheme is compatible with the in-network aggregation schemes that are also based on homomorphic encryption, and supports efficient batch verification of the aggregation results. Next, to defend against suspicious/compromised meters and external attacks, we present a hop-by-hop signature scheme and an incremental verification protocol. In this approach, signatures are managed distributedly and verification is only triggered on an ex post facto basis, when anomalies in the aggregation results are detected at the collector. The incremental verification process starts from the collector and traces the anomaly in a breadth-first manner. The abnormal node is identified within O(logN) iterations. Therefore, the verification process is computationally inexpensive, while ensuring faithfulness and undeniability properties.
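The incremental verification step can be pictured without the cryptographic machinery: starting at the collector, walk the aggregation tree and flag the first node whose reported partial aggregate does not verify against its own reading plus its children's reports. In the actual scheme the local check is a homomorphic signature verification and verified branches are pruned, which is what yields the O(logN) bound; the plaintext tree walk below is only a simplified sketch with made-up node names.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List


@dataclass
class Meter:
    name: str
    reading: int                 # the node's own measurement
    reported: int                # partial aggregate it sent upstream
    children: List["Meter"] = field(default_factory=list)


def locally_consistent(node: Meter) -> bool:
    """An honest aggregator reports its own reading plus its children's reports.
    In the real protocol this check is a (homomorphic) signature verification."""
    return node.reported == node.reading + sum(c.reported for c in node.children)


def trace_anomaly(collector: Meter):
    """Breadth-first descent from the collector; returns the first node whose
    reported partial aggregate does not verify against its inputs."""
    queue = deque([collector])
    while queue:
        node = queue.popleft()
        if not locally_consistent(node):
            return node.name
        queue.extend(node.children)
    return None


if __name__ == "__main__":
    m3 = Meter("m3", 3, 3)
    a1 = Meter("a1", 5, 50, [m3])          # compromised aggregator inflates its report
    a2 = Meter("a2", 4, 4)
    root = Meter("collector", 0, 0 + 50 + 4, [a1, a2])
    print(trace_anomaly(root))             # a1
```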

83 citations


Journal ArticleDOI
TL;DR: An adaptive forwarding delay control scheme, namely Catch-Up, which dynamically changes the forwarding speed of nearby reports so that they have a better chance to meet each other and be aggregated together.
Abstract: In-network data aggregation is a useful technique to reduce redundant data and to improve communication efficiency. Traditional data aggregation schemes for wireless sensor networks usually rely on a fixed routing structure to ensure data can be aggregated at certain sensor nodes. However, they cannot be applied in highly mobile vehicular environments. In this paper, we propose an adaptive forwarding delay control scheme, namely Catch-Up, which dynamically changes the forwarding speed of nearby reports so that they have a better chance to meet each other and be aggregated together. The Catch-Up scheme is designed based on a distributed learning algorithm. Each vehicle learns from local observations and chooses a delay based on learning results. The simulation results demonstrate that our scheme can efficiently reduce the number of redundant reports and achieve a good trade-off between delay and communication overhead.

58 citations


Proceedings ArticleDOI
06 Dec 2012
TL;DR: Simulation results show that the DHCS outperforms conventional clustering protocols in terms of energy conservation, network lifetime and network latency.
Abstract: Energy-efficient data aggregation is one of the key research areas of wireless sensor networks (WSNs). Many clustering techniques have been proposed for topology maintenance and routing in these networks. In addition, these techniques are also beneficial in prolonging the network lifetime. Clustering protocols proposed in the existing literature use a single Cluster Head (CH) for a group of nodes (a cluster). In these protocols, the CH performs a number of activities, such as data gathering, data aggregation and data forwarding. As a result, the CH depletes its energy quickly compared to its member nodes. Hence, re-clustering is required frequently, which consumes considerable energy. This paper proposes an energy-efficient Dual Head Clustering Scheme (DHCS) for WSNs. DHCS selects two different nodes within the cluster for cluster management and aggregation, namely the Cluster Head (CH) and the Aggregator Head (AH), respectively. Simulation results show that the DHCS outperforms conventional clustering protocols in terms of energy conservation, network lifetime and network latency.

58 citations


Proceedings ArticleDOI
25 Mar 2012
TL;DR: This paper works on the problem of constructing a data aggregation tree that minimizes the total energy cost of data transmission in a wireless sensor network, and proposes O(1)-approximation algorithms for each of them.
Abstract: In many applications, it is a basic operation for the sink to periodically collect reports from all sensors. Since the data gathering process usually proceeds for many rounds, it is important to collect these data efficiently, that is, to reduce the energy cost of data transmission. Under such applications, a tree is usually adopted as the routing structure to save the computation costs for maintaining the routing tables of sensors. In this paper, we work on the problem of constructing a data aggregation tree that minimizes the total energy cost of data transmission in a wireless sensor network. In addition, we also address such a problem in the wireless sensor network where relay nodes exist. We show these two problems are NP-complete, and propose O(1)-approximation algorithms for each of them. Simulations show that the proposed algorithms each have good performance in terms of the energy cost.

57 citations


Journal ArticleDOI
02 Oct 2012
TL;DR: A novel optimisation algorithm called intelligent water drops (IWD) is adopted to construct optimal data aggregation trees for WSNs, and it is shown to obtain a better data aggregation tree, with a smaller number of edges representing direct communication between two nodes, when compared with well-known optimisation methods such as ant colony optimisation.
Abstract: Energy conservation is an important aspect in wireless sensor networks (WSNs) to extend the network lifetime. In order to obtain energy-efficient data transmission within the network, sensor nodes can be organised into an optimal data aggregation tree with optimally selected aggregation nodes to transfer data. Various nature-inspired optimisation methods have been shown to outperform conventional methods when solving this problem in a distributed manner, that is, where each sensor node makes its own decision on routing the data. In this study, a novel optimisation algorithm called intelligent water drops (IWD) is adopted to construct optimal data aggregation trees for WSNs. A further enhancement of the basic IWD algorithm is proposed to improve the construction of the tree by attempting to increase the probability of selecting optimum aggregation nodes. The computational experiment results show that the IWD algorithm is able to obtain a better data aggregation tree, with a smaller number of edges representing direct communication between two nodes, when compared with well-known optimisation methods such as ant colony optimisation. In addition, the proposed improved version of the IWD algorithm provides better performance in comparison with the basic IWD algorithm in saving the energy of WSNs.

51 citations


Proceedings ArticleDOI
12 Jun 2012
TL;DR: In this paper, the authors consider how an external aggregator or multiple parties can learn some algebraic statistics over participants' data while any individual's input data is kept secret from others (the aggregator and other participants).
Abstract: Much research has been conducted to securely outsource multiple parties' data aggregation to an untrusted aggregator without disclosing each individual's data, or to enable multiple parties to jointly aggregate their data while preserving privacy. However, those works either assume a secure channel or suffer from high complexity. Here we consider how an external aggregator or multiple parties can learn some algebraic statistics (e.g., summation, product) over participants' data while any individual's input data is kept secret from others (the aggregator and other participants). We assume the channels in our construction are insecure; that is, all channels are subject to eavesdropping attacks, and all the communications throughout the aggregation are open to others. We successfully guarantee data confidentiality under this weak assumption while limiting both the communication and computation complexity to at most linear.

Proceedings Article
01 Jan 2012
TL;DR: This work introduces an approach for visual analysis of large multivariate time-dependent data, based on the idea of projecting multivariate measurements to a 2D display, visualizing the time dimension by trajectories, and uses visual data aggregation metaphors based on grouping of similar data elements to scale withMultivariate time series.
Abstract: The analysis of time-dependent data is an important problem in many application domains, and interactive visualization of time-series data can help in understanding patterns in large time series data. Many effective approaches already exist for visual analysis of univariate time series, supporting tasks such as assessment of data quality, detection of outliers, or identification of periodically or frequently occurring patterns. However, far fewer approaches exist which support multivariate time series. The existence of multiple values per time stamp makes the analysis task per se harder, and existing visualization techniques often do not scale well. We introduce an approach for visual analysis of large multivariate time-dependent data, based on the idea of projecting multivariate measurements to a 2D display, visualizing the time dimension by trajectories. We use visual data aggregation metaphors based on grouping of similar data elements to scale with multivariate time series. Aggregation procedures can either be based on statistical properties of the data or on data clustering routines. Appropriately defined user controls allow users to navigate and explore the data and interactively steer the parameters of the data aggregation to enhance data analysis. We present an implementation of our approach and apply it to a comprehensive data set from the field of earth observation, demonstrating the applicability and usefulness of our approach.
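The pipeline the authors describe (project each multivariate time stamp to 2D, render time as a trajectory, and aggregate similar points so the display scales) can be approximated in a few lines. The sketch below uses PCA for the projection and a plain k-means loop for the grouping; these are stand-ins chosen for brevity, not the paper's projection, aggregation metaphors or interaction controls.

```python
import numpy as np


def project_and_aggregate(series, num_groups=5, seed=0):
    """series: array of shape (timestamps, variables). Returns 2D trajectory
    points plus a group label per timestamp that can drive visual aggregation."""
    rng = np.random.default_rng(seed)
    centered = series - series.mean(axis=0)

    # PCA via SVD: the first two right-singular vectors span the 2D display.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    points2d = centered @ vt[:2].T

    # Plain k-means over the projected points (stand-in for the paper's grouping).
    centers = points2d[rng.choice(len(points2d), num_groups, replace=False)]
    for _ in range(50):
        dists = np.linalg.norm(points2d[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        centers = np.array([points2d[labels == k].mean(axis=0)
                            if np.any(labels == k) else centers[k]
                            for k in range(num_groups)])
    return points2d, labels


if __name__ == "__main__":
    t = np.linspace(0, 6 * np.pi, 400)
    data = np.column_stack([np.sin(t), np.cos(t), 0.1 * t, np.sin(2 * t)])
    pts, labels = project_and_aggregate(data)
    print(pts.shape, np.bincount(labels))  # trajectory points and group sizes
```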

Patent
20 Jun 2012
TL;DR: In this paper, a data aggregator discovery (DAD) message is distributed by an associated data aggregator; the DAD message identifies the initiating data aggregator and comprises a recorded route taken from the data aggregator to a receiving particular node, as well as a total path cost for the particular node to reach a root node of the DAG through the recorded route and via the data aggregator.
Abstract: In one embodiment, a data aggregator discovery (DAD) message may be distributed by an associated data aggregator, the DAD message identifying the initiating data aggregator, and comprising a recorded route taken from the data aggregator to a receiving particular node as well as a total path cost for the particular node to reach a root node of the DAG through the recorded route and via the data aggregator. The receiving particular node determines a path cost increase (PCI) associated with use of the data aggregator based on the total path cost as compared to a DAG-based path cost for the particular node to reach the root node via the DAG. If the PCI is below a configured threshold, the particular node may redirect traffic to the data aggregator as source-routed traffic according to the recorded route. The traffic may then be aggregated by the data aggregator, accordingly.
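The redirect decision in this patent reduces to one comparison: compute the path cost increase (PCI) of reaching the DAG root via the aggregator's recorded route versus the normal DAG route, and redirect traffic as source-routed traffic only if the PCI is under a configured threshold. The sketch below captures just that decision logic; the message fields, cost units and threshold value are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DadMessage:
    aggregator_id: str
    recorded_route: List[str]   # hops from the aggregator to this node
    total_path_cost: float      # cost to reach the DAG root via the aggregator


def should_redirect(dad: DadMessage, dag_path_cost: float,
                    pci_threshold: float = 2.0) -> bool:
    """Redirect traffic to the aggregator only if the detour is cheap enough."""
    pci = dad.total_path_cost - dag_path_cost
    return pci < pci_threshold


if __name__ == "__main__":
    dad = DadMessage("agg-17", ["n4", "n9", "agg-17"], total_path_cost=11.5)
    dag_cost = 10.0  # this node's normal DAG-based cost to the root
    if should_redirect(dad, dag_cost):
        # Source-route the traffic back along the recorded route.
        print("redirect via", list(reversed(dad.recorded_route)))
    else:
        print("keep the DAG route")
```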

Journal ArticleDOI
TL;DR: This paper proposes an efficient and effective architecture and mechanism of energy-efficient techniques for data aggregation and collection in WSNs, using principles such as global weight calculation of nodes, data collection for the cluster head, and data aggregation techniques based on data cube aggregation.
Abstract: Wireless sensor networks (WSNs), a multidisciplinary research area, have enabled the monitoring of remote physical environments and are used for a wide range of applications, from defense to scientific research, statistical applications, disaster areas and war zones. These networks are constrained in energy, memory and computing power; hence, efficient techniques are needed for data aggregation, data collection, query processing, decision making and routing in sensor networks. The problem encountered in the recent past was higher battery power consumption as activity increases, calling for more efficient data aggregation and collection techniques with the right decision-making capabilities. Therefore, this paper proposes an efficient and effective architecture and mechanism of energy-efficient techniques for data aggregation and collection in WSNs, using principles such as global weight calculation of nodes, data collection for the cluster head, and data aggregation techniques based on data cube aggregation.

Patent
13 Jan 2012
TL;DR: In this paper, techniques for managing aggregation of data in a distributed manner, such as for a particular client based on specified configuration information, are described for receiving information about multi-stage data manipulation operations that are to be performed as part of the data aggregation.
Abstract: Techniques are described for managing aggregation of data in a distributed manner, such as for a particular client based on specified configuration information. The described techniques may include receiving information about multi-stage data manipulation operations that are to be performed as part of the data aggregation, with each stage able to be performed in a distributed manner using multiple computing nodes—for example, a map-reduce architecture may be used, with a first stage involving the use of one or more specified map functions to be performed, and with at least a second stage involving the use of one or more specified reduce functions to be performed. In some situations, a particular set of input data may be used to generate the data for a multi-dimensional OLAP (“online analytical processing”) cube, such as for input data corresponding to a large quantity of transactions of one or more types.
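The multi-stage manipulation described here maps naturally onto a map stage that emits cube-cell keys from raw transactions and a reduce stage that folds them into aggregates. The tiny in-process sketch below mimics that flow for a two-dimensional OLAP cube; the dimension names, transaction fields and the sum measure are made-up examples, not the patent's configuration format or a distributed map-reduce runtime.

```python
from collections import defaultdict
from itertools import combinations


def map_stage(transaction):
    """Emit (cube-cell, measure) pairs for every cell the record contributes to,
    including roll-up cells obtained by dropping dimensions."""
    dims = {"region": transaction["region"], "product": transaction["product"]}
    for r in range(len(dims) + 1):
        for keys in combinations(sorted(dims), r):
            cell = tuple((k, dims[k]) for k in keys)  # () is the grand total
            yield cell, transaction["amount"]


def reduce_stage(pairs):
    """Fold the mapped pairs into cube cells (here: a simple sum per cell)."""
    cube = defaultdict(float)
    for cell, amount in pairs:
        cube[cell] += amount
    return dict(cube)


if __name__ == "__main__":
    transactions = [
        {"region": "EU", "product": "book", "amount": 12.0},
        {"region": "EU", "product": "pen", "amount": 3.0},
        {"region": "US", "product": "book", "amount": 7.5},
    ]
    pairs = (pair for t in transactions for pair in map_stage(t))
    for cell, total in sorted(reduce_stage(pairs).items()):
        print(cell or ("ALL",), total)
```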

Journal ArticleDOI
TL;DR: A new protocol is presented that provides integrity control for aggregation in wireless sensor networks without requiring referral to the base station for verifying and detecting faulty aggregated readings, thus providing a totally distributed scheme to guarantee data integrity.

Proceedings ArticleDOI
01 Oct 2012
TL;DR: Detailed security analysis has shown that the proposed LPDA scheme is robust against many security and privacy threats in smart grid and performance evaluation via extensive simulations demonstrates its efficiency in terms of low average aggregation delay.
Abstract: Security and privacy are challenging issues in the smart grid; failure to address them will hinder the flourishing of the smart grid. In this paper, aiming to secure electricity consumption data and preserve residential user privacy, we propose an efficient lightweight privacy-preserving aggregation scheme, called LPDA, for the smart grid. The proposed LPDA is characterized by employing a one-time masking technique to protect users' privacy while achieving lightweight data aggregation. Detailed security analysis shows that the proposed LPDA scheme is robust against many security and privacy threats in the smart grid. Furthermore, performance evaluation via extensive simulations demonstrates its efficiency in terms of low average aggregation delay.

Posted Content
TL;DR: A novel data aggregation architecture model that integrates a multi-resolution hierarchical structure with CS to further optimize the amount of data transmitted and obtains substantial energy savings compared to other existing methods.
Abstract: Compressive Sensing (CS) is a burgeoning technique being applied to diverse areas including wireless sensor networks (WSNs). In WSNs, it has been studied in the context of data gathering and aggregation, particularly aimed at reducing data transmission cost and improving power efficiency. Existing CS-based data gathering work in WSNs assumes a fixed and uniform compression threshold across the network, regardless of the data field characteristics. In this paper, we present a novel data aggregation architecture model that combines a multi-resolution structure with compressed sensing. The compression thresholds vary over the aggregation hierarchy, reflecting the underlying data field. Compared with previous relevant work, the proposed model shows significant energy savings from theoretical analysis. We have also implemented the proposed CS-based data aggregation framework on the SIDnet SWANS platform, a discrete event simulator commonly used for WSN simulations. Our experiments show substantial energy savings, ranging from 37% to 77% for different nodes in the network, depending on their position in the hierarchy.

Journal ArticleDOI
13 Apr 2012
TL;DR: DataSHIELD as mentioned in this paper is a tool to coordinate analyses of data that cannot be pooled by simply using summary statistics from each study, and it is also an efficient approach to carry out a study level meta-analysis when this is appropriate and when the analysis can be pre-planned.
Abstract: Very large sample sizes are required for estimating effects which are known to be small, and for addressing intricate or complex statistical questions. This is often only achievable by pooling data from multiple studies, especially in genetic epidemiology where associations between individual genetic variants and phenotypes of interest are generally weak. However, the physical pooling of experimental data across a consortium is frequently prohibited by the ethico-legal constraints that govern agreements and consents for individual studies. Study level meta-analyses are frequently used so that data from multiple studies need not be pooled to conduct an analysis, though the resulting analysis is necessarily restricted by the available summary statistics. The idea of maintaining data security is also of importance in other areas and approaches to carrying out ‘secure analyses’ that do not require sharing of data from different sources have been proposed in the technometrics literature. Crucially, the algorithms for fitting certain statistical models can be manipulated so that an individual level meta-analysis can essentially be performed without the need for pooling individual-level data by combining particular summary statistics obtained individually from each study. DataSHIELD (Data Aggregation Through Anonymous Summary-statistics from Harmonised Individual levEL Databases) is a tool to coordinate analyses of data that cannot be pooled. In this paper, we focus on explaining why a DataSHIELD approach yields identical results to an individual level meta-analysis in the case of a generalised linear model, by simply using summary statistics from each study. It is also an efficient approach to carrying out a study level meta-analysis when this is appropriate and when the analysis can be pre-planned. We briefly comment on the IT requirements, together with the ethical and legal challenges which must be addressed.
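The reason a DataSHIELD-style analysis reproduces the individual-level result is that the iteratively reweighted least squares (IRLS) updates of a generalised linear model only need the matrices X'WX and X'Wz from each study, which can be summed centrally without pooling records. The sketch below demonstrates that principle for logistic regression on simulated data; it is a minimal illustration under those assumptions, not the DataSHIELD software or its interface.

```python
import numpy as np


def local_summaries(X, y, beta):
    """Computed inside each study: only these aggregate matrices leave the study."""
    eta = X @ beta
    mu = 1.0 / (1.0 + np.exp(-eta))                 # logistic mean function
    w = np.clip(mu * (1.0 - mu), 1e-12, None)       # IRLS weights
    z = eta + (y - mu) / w                          # working response
    return X.T @ (w[:, None] * X), X.T @ (w * z)


def pooled_irls(studies, num_coef, iters=25):
    """Run at the coordinating centre using only each study's summaries."""
    beta = np.zeros(num_coef)
    for _ in range(iters):
        xtwx = np.zeros((num_coef, num_coef))
        xtwz = np.zeros(num_coef)
        for X, y in studies:          # in practice these arrive over the network
            a, b = local_summaries(X, y, beta)
            xtwx += a
            xtwz += b
        beta = np.linalg.solve(xtwx, xtwz)
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    true_beta = np.array([-0.5, 1.2])

    def make_study(n):
        X = np.column_stack([np.ones(n), rng.normal(size=n)])
        y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))
        return X, y

    studies = [make_study(800), make_study(1200), make_study(600)]
    print(pooled_irls(studies, num_coef=2))  # close to the individual-level estimate
```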

Journal ArticleDOI
TL;DR: A novel nonlinear adaptive pulse coded modulation-based compression (NADPCMC) scheme is proposed for data aggregation in a wireless sensor network (WSN) and the performance of the proposed scheme is contrasted with the available compression schemes in an NS-2 environment through several benchmarking datasets.
Abstract: Data aggregation is necessary for extending the network lifetime of wireless sensor nodes with limited processing and power capabilities, since the energy expended in transmitting a single data bit would be at least several orders of magnitude higher when compared to that needed for a 32-bit computation. Therefore, in this article, a novel nonlinear adaptive pulse coded modulation-based compression (NADPCMC) scheme is proposed for data aggregation in a wireless sensor network (WSN). The NADPCMC comprises two estimators: one at the source or transmitter and a second one at the destination node. The estimator at the source node approximates the data value for each sample. The difference between the data sample and its estimate is quantized and transmitted to the next hop node instead of the actual data sample, thus reducing the amount of data transmission and yielding energy savings. A similar estimator at the next hop node or base station reconstructs the original data. It is demonstrated that repeated application of the NADPCMC scheme along the route in a WSN results in data aggregation. Satisfactory performance of the proposed scheme in terms of distortion, compression ratio, and energy efficiency, and in the presence of estimation and quantization errors for data aggregation, is demonstrated using the Lyapunov approach. Then the performance of the proposed scheme is contrasted with the available compression schemes in an NS-2 environment through several benchmarking datasets. Simulation and hardware results demonstrate that almost 50% energy savings with low distortion levels below 5% and low overhead are observed when compared to no compression. Iteratively applying the proposed compression scheme at the cluster head nodes along the routes over the network yields an additional improvement of 20% in energy savings per aggregation with an overall distortion below 8%.
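The NADPCMC pipeline is, at its core, predictive coding: an estimator at the source predicts the next sample, only the quantised prediction error is transmitted, and a matching estimator at the destination rebuilds the signal, so the reconstruction error stays within the quantiser step. The sketch below uses a plain previous-sample predictor and a uniform quantiser to show that structure; the paper's nonlinear adaptive estimator and its Lyapunov analysis are not reproduced.

```python
def quantize(x, step=0.5):
    return round(x / step) * step


class DpcmCodec:
    """Matching predictor state kept at both the source and the destination."""

    def __init__(self):
        self.estimate = 0.0        # prediction of the next sample

    def encode(self, sample):
        residual = quantize(sample - self.estimate)   # only this value is sent
        self.estimate += residual                     # track what the decoder knows
        return residual

    def decode(self, residual):
        reconstructed = self.estimate + residual
        self.estimate = reconstructed                 # same update as the encoder
        return reconstructed


if __name__ == "__main__":
    import math

    source, sink = DpcmCodec(), DpcmCodec()
    samples = [20 + 5 * math.sin(0.1 * i) for i in range(50)]
    recovered = [sink.decode(source.encode(s)) for s in samples]
    max_err = max(abs(a - b) for a, b in zip(samples, recovered))
    print(f"max reconstruction error: {max_err:.3f}")  # bounded by the quantiser step
```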

Book ChapterDOI
09 Jul 2012
TL;DR: This paper investigates the problem of finding all pairs of nodes generating similar data sets such that similarity between each pair of sets is above a threshold t and proposes a new frequency filtering approach and several optimizations using sets similarity functions to solve this problem.
Abstract: In-network data aggregation is considered an effective technique for conserving communication energy in wireless sensor networks. It consists in eliminating the inherent redundancy in raw data collected from the sensor nodes. Prior work on data aggregation protocols has focused on measurement data redundancy. In this paper, our goal, in addition to reducing measurement redundancy, is to identify near-duplicate nodes that generate similar data sets. We consider a tree-based bi-level periodic data aggregation approach implemented at the source node and at the aggregator levels. We investigate the problem of finding all pairs of nodes generating similar data sets such that the similarity between each pair of sets is above a threshold t. We propose a new frequency filtering approach and several optimizations using set similarity functions to solve this problem. To evaluate the performance of the proposed filtering method, experiments on real sensor data have been conducted. The obtained results show that our approach offers significant data reduction by eliminating in-network redundancy and outperforms existing filtering techniques.
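The pairwise task (find all node pairs whose data sets have similarity above a threshold t) can be prototyped with Jaccard similarity plus a standard prefix filter that orders elements by global frequency and only compares pairs sharing a prefix element. This generic filter is a simplification used for illustration, not the authors' frequency filtering approach, and the sample readings are made up.

```python
from collections import defaultdict
from itertools import combinations
from math import ceil


def jaccard(a, b):
    return len(a & b) / len(a | b)


def similar_pairs(node_sets, t=0.6):
    """node_sets: {node_id: set of readings}. Returns pairs with Jaccard >= t.
    A prefix filter over globally rare elements prunes most pairs before the
    exact Jaccard check."""
    freq = defaultdict(int)
    for s in node_sets.values():
        for x in s:
            freq[x] += 1

    def prefix(s):
        # Rarest elements first; a qualifying pair must share a prefix element.
        ordered = sorted(s, key=lambda x: (freq[x], x))
        return ordered[:len(s) - ceil(t * len(s)) + 1]

    index = defaultdict(set)          # prefix element -> nodes carrying it
    for node, s in node_sets.items():
        for token in prefix(s):
            index[token].add(node)

    candidates = set()
    for nodes in index.values():
        candidates.update(combinations(sorted(nodes), 2))

    return [(u, v, jaccard(node_sets[u], node_sets[v]))
            for u, v in candidates
            if jaccard(node_sets[u], node_sets[v]) >= t]


if __name__ == "__main__":
    sets = {
        "s1": {21.0, 21.5, 22.0, 22.5},
        "s2": {21.0, 21.5, 22.0, 23.0},
        "s3": {30.0, 31.0, 32.5, 33.0},
    }
    print(similar_pairs(sets, t=0.5))  # s1 and s2 are near-duplicate nodes
```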

Proceedings ArticleDOI
23 Jul 2012
TL;DR: Unlike many other secure data aggregation algorithms which require separate phases for secure aggregation and integrity verification, the secure hierarchical data aggregation algorithm does not require an additional phase for verification and saves energy by avoiding additional transmissions and computational overhead on the sensor nodes.
Abstract: Secure data aggregation in wireless sensor networks has two contrasting objectives, i) Efficiently collecting and aggregating data and ii) Aggregating the data securely. Many schemes do not take into account the possibility of corrupt aggregators and allow the aggregator to decrypt data in hop by hop algorithms. On the other hand using public key cryptography for providing end to end security is not energy efficient. In this paper we present and analyze the performance of the secure hierarchical data aggregation algorithm which uses an efficient public key cryptosystem (elliptic curve cryptography) to achieve end to end security. Unlike many other secure data aggregation algorithms which require separate phases for secure aggregation and integrity verification, the secure hierarchical data aggregation algorithm does not require an additional phase for verification. This saves energy by avoiding additional transmissions and computational overhead on the sensor nodes. We present and implement the secure data aggregation algorithm on Mica2 and TelosB sensor network platforms and measure the execution time and energy consumption of various cryptographic functions. We have also simulated our algorithms to analyze how an end to end scheme increases the network life time. We experimentally analyze our algorithms based on parameters like throughput, end to end delay and resilience to node failures.

Proceedings ArticleDOI
16 Jul 2012
TL;DR: Two crucial aspects of the data aggregation process in ODCleanStore - resolution of data conflicts and computation of aggregate quality helping consumers to decide whether the aggregated data are worth using are described.
Abstract: The paradigm of publishing governmental data is shifting from data trapped in relational databases, scanned images, or PDF files to open data, or even linked open data, bringing the information consumers (citizens, companies) unrestricted access to the data and enabling an agile information aggregation, which has up to now not been possible. Such information aggregation comes with inherent problems, such as provision of poor quality, inaccurate, irrelevant or fraudulent information. As part of the OpenData.cz initiative, we are developing projects which will enable creation, maintenance, and usage of the data infrastructure formed by the Czech governmental linked open data. In particular, the project ODCleanStore will enable data consumers seamless automated data aggregation to simplify the manual aggregation process, which would have to be performed otherwise, and will also provide provenance tracking and justifications why the aggregated data should be trusted by the consumer in the given situation. In this paper, we describe two crucial aspects of the data aggregation process in ODCleanStore - resolution of data conflicts and computation of aggregate quality helping consumers to decide whether the aggregated data are worth using. Since the data aggregation algorithm is executed during query time, we show that the proposed algorithm is fast enough to work in real-world settings.

Proceedings ArticleDOI
01 Dec 2012
TL;DR: This work proposes a Two Tier Cluster based Data Aggregation (TTCDA) algorithm for randomly distributed nodes to minimize computation and communication costs and prevent transmission of redundant data packets, thereby improving energy consumption.
Abstract: A Wireless Sensor Network (WSN) is used for monitoring and control applications where sensor nodes gather data and send it to the sink. Most of the energy of these nodes is consumed in the transmission of data packets to the sink without aggregation, whether the sink is located at single- or multi-hop distance. The direct transmission of data packets from nodes to the sink causes increased communication costs in terms of energy, average delay and network lifetime. In this context, data aggregation techniques minimize the communication cost with efficient bandwidth utilization by decreasing the packet count reaching the sink. Here we propose the Two Tier Cluster based Data Aggregation (TTCDA) algorithm for randomly distributed nodes to minimize computation and communication costs. The TTCDA is energy and bandwidth efficient, since it reduces the number of packets transmitted to the sink. It is based on additive and divisible aggregation functions at the cluster head, applied according to the data packets generated by each node while considering spatial and temporal correlation. The aggregation functions used in TTCDA effectively reduce the packet count reported to the sink and prevent transmission of redundant data packets, thereby improving energy consumption. The performance of the algorithm is validated using examples and simulations.

Proceedings ArticleDOI
25 Apr 2012
TL;DR: Simulation results demonstrate that the proposed Energy efficient Cluster Based Data Aggregation scheme for sensor networks (ECBDA) effectively reduces the energy consumed and it helps to increase the network lifetime.
Abstract: One of the major issues in Wireless Sensor Network is maximizing the lifetime of the network. In general, all sensor nodes directly send the information to the Base station so the energy requirement is very high. Clustering is used to decrease energy consumption and collision. In this paper, we propose an Energy efficient Cluster Based Data Aggregation scheme for sensor networks (ECBDA). This scheme has four phases: Cluster formation, Cluster head election, Data aggregation and Maintenance. Cluster members send the data only to its corresponding local cluster head. Data generated from neighboring sensors are often redundant and highly correlated thus the cluster head performs the data aggregation to reduce the redundant packet transmission. In our scheme, clusters are formed in a non-periodic manner to avoid unnecessary setup message transmissions. Simulation results demonstrate that our approach effectively reduces the energy consumed and it helps to increase the network lifetime.

Journal ArticleDOI
TL;DR: Various data aggregation techniques and their solutions are discussed and analysed; the goal of data aggregation is to combine messages and disseminate them over a larger region.

Journal ArticleDOI
TL;DR: A fuzzy based secure data aggregation technique which performs clustering and cluster head election process, efficiently checks for malicious nodes based on the system parameters and maintains a secure aggregation process in the network.
Abstract: Problem statement: Secure data aggregation is a challenging task in wireless sensor networks due to factors such as higher complexity and greater overhead in the case of cryptographic techniques. These issues need to be overcome using an efficient technique. Approach: We propose a fuzzy based secure data aggregation technique that has three phases. In the first phase, it performs clustering and cluster head election. In the second phase, within each cluster, the power consumed, distance and trust values are calculated for each member. In the third phase, based on these parameters, a fuzzy logic technique is used to select secure and non-faulty members for data aggregation. Finally, the aggregated data from the cluster heads is transmitted to the sink. Results: Simulation results show that our technique improves throughput and packet delivery ratio with reduced packet drop and lower energy consumption. Conclusion: The proposed technique efficiently checks for malicious nodes based on the system parameters and maintains a secure aggregation process in the network.

Journal ArticleDOI
TL;DR: The paper outlines novel data structures and algorithms to tackle the above problem, when the model mined out of the data is a classifier, and the introduced model and the overall ensemble architecture are presented in details.
Abstract: Mining data streams has become an important and challenging task for a wide range of applications. In these scenarios, data tend to arrive in multiple, rapid and time-varying streams, thus constraining data mining algorithms to look at data only once. Maintaining an accurate model, e.g. a classifier, while the stream goes by requires a smart way of keeping track of the data that have already passed by. Such a synthetic structure has to serve two purposes: distilling most of the information out of past data and allowing a fast reaction to concept drift, i.e. the change in the data trend that necessarily affects the model. The paper outlines novel data structures and algorithms to tackle the above problem when the model mined out of the data is a classifier. The introduced model and the overall ensemble architecture are presented in detail, including how the approach can be extended to treat numerical attributes. A large part of the paper discusses the experiments and the comparisons with several existing systems. The comparisons show that the performance of our system in general, and in particular with respect to the reaction to concept drift, is at the top level.

Journal ArticleDOI
TL;DR: This work provides a technique for getting the optimal set of subqueries with their incoherency bounds which satisfies client query's coherency requirement with least number of refresh messages sent from aggregators to the client.
Abstract: Continuous queries are used to monitor changes to time varying data and to provide results useful for online decision making. Typically a user desires to obtain the value of some aggregation function over distributed data items, for example, to know value of portfolio for a client; or the AVG of temperatures sensed by a set of sensors. In these queries a client specifies a coherency requirement as part of the query. We present a low-cost, scalable technique to answer continuous aggregation queries using a network of aggregators of dynamic data items. In such a network of data aggregators, each data aggregator serves a set of data items at specific coherencies. Just as various fragments of a dynamic webpage are served by one or more nodes of a content distribution network, our technique involves decomposing a client query into subqueries and executing subqueries on judiciously chosen data aggregators with their individual subquery incoherency bounds. We provide a technique for getting the optimal set of subqueries with their incoherency bounds which satisfies client query's coherency requirement with least number of refresh messages sent from aggregators to the client. For estimating the number of refresh messages, we build a query cost model which can be used to estimate the number of messages required to satisfy the client specified incoherency bound. Performance results using real-world traces show that our cost-based query planning leads to queries being executed using less than one third the number of messages required by existing schemes.
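The planning problem (decompose a client query into subqueries over chosen data aggregators and split the client's incoherency bound among them so that refresh messages are minimised) can be sketched as a greedy cover followed by a proportional bound allocation. The real planner relies on the paper's query cost model to estimate refresh messages; the aggregator capabilities, greedy rule and proportional split below are illustrative assumptions.

```python
def plan_subqueries(query_items, client_bound, aggregators):
    """aggregators: {name: set of data items served}. Greedily cover the query's
    items, then split the client's incoherency bound across the chosen
    subqueries in proportion to how many items each one carries."""
    remaining = set(query_items)
    chosen = {}                      # aggregator name -> items assigned to it
    while remaining:
        name, served = max(aggregators.items(),
                           key=lambda kv: len(kv[1] & remaining))
        covered = served & remaining
        if not covered:
            raise ValueError(f"no aggregator serves {sorted(remaining)}")
        chosen[name] = covered
        remaining -= covered
    total = sum(len(items) for items in chosen.values())
    return {name: (sorted(items), client_bound * len(items) / total)
            for name, items in chosen.items()}


if __name__ == "__main__":
    aggs = {"A1": {"ibm", "msft", "orcl"}, "A2": {"msft", "goog"}, "A3": {"goog"}}
    plan = plan_subqueries({"ibm", "msft", "goog"}, client_bound=3.0, aggregators=aggs)
    for name, (items, bound) in plan.items():
        print(name, items, f"incoherency bound {bound:.2f}")
```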

Proceedings ArticleDOI
10 Oct 2012
TL;DR: The DEDA algorithm minimizes data aggregation latency by building a delay-efficient network structure and considers the distances between network nodes for saving sensor transmission power and network energy.
Abstract: Data aggregation is a fundamental problem in wireless sensor networks that has attracted great attention in recent years. To design a data aggregation scheme, delay and energy efficiencies are two crucial issues that require much consideration. In this paper, we propose a distributed, energy-efficient algorithm for collecting data from all sensor nodes with minimum latency called Delay-minimized Energy-efficient Data Aggregation algorithm (DEDA). The DEDA algorithm minimizes data aggregation latency by building a delay-efficient network structure. At the same time, it also considers the distances between network nodes for saving sensor transmission power and network energy. Energy consumption is also well-balanced between sensors to achieve an acceptable network lifetime. The simulation results show that the scheme could significantly decrease data aggregation delay and obtain a reasonable network lifetime compared with other approaches.