Journal ArticleDOI

Adaptive Multivariate Data Compression in Smart Metering Internet of Things

TL;DR: Performance studies indicate that, compared to the state of the art, the proposed technique achieves impressive bandwidth savings for data transmission over the communication network without compromising faithful reconstruction of the data at the receiver.
Abstract: Recent advances in electric metering infrastructure have given rise to the generation of enormous volumes of data. Transmitting all of these data poses a significant challenge in the bandwidth- and storage-constrained Internet of Things (IoT), where smart meters act as sensors. In this work, a novel multivariate data compression scheme is proposed for smart metering IoT. The proposed algorithm exploits the cross-correlation between the different variables sensed by smart meters to reduce the dimension of the data. Subsequently, sparsity in each of the decorrelated streams is utilized for temporal compression. To examine the quality of compression, the multivariate data is characterized using multivariate normal–autoregressive integrated moving average modeling before compression as well as after reconstruction of the compressed data. Our performance studies indicate that, compared to the state of the art, the proposed technique achieves impressive bandwidth savings for data transmission over the communication network without compromising faithful reconstruction of the data at the receiver. The proposed algorithm is tested in a real smart metering setup and its time complexity is also analyzed.
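The two-stage idea the abstract describes — decorrelating the variables to reduce dimension, then exploiting sparsity in each decorrelated stream for temporal compression — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the PCA/SVD decorrelation, the DCT as the sparsifying basis, and all block sizes and thresholds are assumptions made for this sketch.

```python
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(0)
# Hypothetical smart-meter block: 256 time samples of 4 correlated variables
# (e.g. voltage, current, active power, reactive power).
t = np.arange(256)
base = np.sin(2 * np.pi * t / 64)
X = np.column_stack([base + 0.01 * rng.standard_normal(256) for _ in range(4)])

# Stage 1: decorrelate across variables (PCA via SVD) and keep k components.
mu = X.mean(axis=0)
U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
k = 1                                # strong cross-correlation -> 1 component
scores = (X - mu) @ Vt[:k].T         # 256 x k decorrelated stream(s)

# Stage 2: temporal compression -- keep only the largest DCT coefficients
# of each decorrelated stream (sparsity in the transform domain).
coeffs = dct(scores, axis=0, norm="ortho")
keep = 16
idx = np.argsort(-np.abs(coeffs[:, 0]))[:keep]
sparse = np.zeros_like(coeffs)
sparse[idx, 0] = coeffs[idx, 0]

# Receiver side: invert both stages.
X_hat = idct(sparse, axis=0, norm="ortho") @ Vt[:k] + mu
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(f"sent {k * keep} coefficients instead of {X.size} samples")
```

Transmitting k·16 coefficients instead of 1024 samples is the source of the bandwidth saving; the reconstruction error stays small because the four streams are highly cross-correlated.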
Citations
Journal ArticleDOI
TL;DR: This article surveys the existing green sensing and communication approaches to realize sustainable IoT systems for various applications and presents a few case studies that aim to generate sensed traffic data intelligently as well as prune it efficiently without sacrificing the required service quality.
Abstract: With the advent of Internet of Things (IoT) devices, their reconfigurability, networking, task automation, and control ability have been a boost to the evolution of traditional industries such as health-care, agriculture, power, education, and transport. However, the quantum of data produced by the IoT devices poses serious challenges on its storage, communication, computation, security, scalability, and system’s energy sustainability. To address these challenges, the concept of green sensing and communication has gained importance. This article surveys the existing green sensing and communication approaches to realize sustainable IoT systems for various applications. Further, a few case studies are presented that aim to generate sensed traffic data intelligently as well as prune it efficiently without sacrificing the required service quality. Challenges associated with these green techniques, various open issues, and future research directions for improving the energy efficiency of the IoT systems are also discussed.

27 citations


Cites methods from "Adaptive Multivariate Data Compress..."

  • ...More robust frameworks for effective characterization and reduction of high frequency smart meter data using adaptive compressive sampling are proposed in [63] and [52], respectively for single-variate and multivariate data samples, based on adaptive sparsity selection over optimum batch size before data transmission....


Journal ArticleDOI
TL;DR: In this paper, three main techniques that utilize the edge computing paradigm to perform ML and data processing on intermediary nodes are categorized according to where data processing occurs: Device and Edge, Edge and Cloud, and Device and Cloud (Federated Learning).
Abstract: The use of IoT has become pervasive and IoT devices are common in many domains. Industrial IoT (IIoT) utilises IoT devices and sensors to monitor machines and environments to ensure optimal performance of equipment and processes. Predictive Maintenance (PM), which monitors the health of machines to determine the probable failure of components, is one IIoT technique which is receiving attention lately. To achieve effective PM, massive amounts of data are collected, processed and ultimately analysed by Machine Learning (ML) algorithms. Traditionally, IoT sensors transmit their data readings to the cloud for processing and modelling. Handling and transmitting massive amounts of data between IoT devices and infrastructure has a cost. Edge Computing (EC), in which both sensors and intermediate nodes can process data, provides opportunities to reduce data transmission costs and increase processing speed. This article examines IIoT for PM and discusses how and where data can be processed and analysed. Initially, this article presents sampling and data reduction techniques. These techniques allow for a reduction in the amount of data transmitted to the cloud for processing, but there are potential accuracy trade-offs when ML algorithms utilise reduced datasets. An alternative approach is to move ML algorithms closer to the data to reduce data transmission. There are three main techniques that utilise the EC paradigm to perform ML and data processing on intermediary nodes. These techniques are categorized according to where data processing occurs: Device and Edge, Edge and Cloud, and Device and Cloud (Federated Learning). In addition to exploring traditional approaches, these three state-of-the-art techniques are examined in this article and their benefits and weaknesses are presented. A novel architecture to demonstrate how EC can be utilized both for data reduction and PM in IIoT is also proposed.

25 citations

Journal ArticleDOI
TL;DR: From the results of this paper, it is found that machine learning techniques can detect IoT attacks, but there are a few issues in the design of detection models.
Abstract: In many enterprises and the private sector, the Internet of Things (IoT) has spread globally. The growing number of different devices connected to the IoT and their various protocols have contributed to the increasing number of attacks, such as denial-of-service (DoS) and remote-to-local (R2L) ones. There are several approaches and techniques that can be used to construct attack detection models, such as machine learning, data mining, and statistical analysis. Nowadays, these techniques are commonly used because they can provide precise analysis and results. Therefore, we decided to study the previous literature on the detection of IoT attacks and machine learning in order to understand the process of creating detection models. We also evaluated various datasets used for the models, IoT attack types, independent variables used for the models, evaluation metrics for assessment of models, and monitoring infrastructure using DevSecOps pipelines. We found 49 primary studies, and the detection models were developed using seven different types of machine learning techniques. Most primary studies used IoT device testbed datasets, and others used public datasets such as NSL-KDD and UNSW-NB15. When it comes to measuring the efficiency of models, both numerical and graphical measures are commonly used. Most IoT attacks occur at the network layer according to the literature. Detection models that applied DevSecOps pipelines in the development processes for IoT devices were more secure. From the results of this paper, we found that machine learning techniques can detect IoT attacks, but there are a few issues in the design of detection models. We also recommend the continued use of hybrid frameworks for the improved detection of IoT attacks, advanced monitoring infrastructure configurations using methods based on software pipelines, and the use of machine learning techniques for advanced supervision and monitoring.

12 citations


Cites background from "Adaptive Multivariate Data Compress..."

  • ...In addition, IoT devices produce an enormous amount of data [14]....


Proceedings ArticleDOI
07 Oct 2020
TL;DR: This research work shows why lossless compression techniques are needed in NBIoT and LTE-M and reviews the challenges posed by low-bandwidth IoT networks.
Abstract: In recent years, the Internet of Things (IoT) has become an integral part of the modern digital ecosystem. It has the ability to handle tasks smartly in many different situations and is therefore one of the main technologies for autonomous systems. IoT devices deal with large amounts of information. As the resources of IoT devices are limited, data compression is an essential need. Some of the information transmitted over IoT networks cannot be compromised at all; any loss of such sensitive data may cause serious consequences. Therefore, lossless data compression techniques are preferred for such data so that integrity can be maintained. Low-bandwidth IoT networks have become very popular in recent times. They provide services over a large coverage area with limited resources. These networks are known as low power wide area networks (LPWANs). In the 3GPP framework, there are some popular LPWANs such as narrowband IoT (NBIoT) and LTE machine-type communication (LTE-M). This article focuses on the lossless compression techniques employed in these popular LPWANs. It shows why lossless compression techniques are needed in NBIoT and LTE-M, reviews the challenges posed by low-bandwidth IoT networks, and discusses the compression techniques recently used in them.
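The delta-plus-entropy-coding pattern behind many lossless schemes for slowly varying sensor streams can be sketched in a few lines. This is a generic illustration, not the coder of any specific LPWAN standard; the readings and the choice of DEFLATE (zlib) are assumptions made for this sketch.

```python
import struct
import zlib

# Hypothetical sequence of 16-bit sensor readings; a real NBIoT or LTE-M
# payload would additionally be framed by the protocol stack.
readings = [5000, 5002, 5001, 5005, 5004, 5007, 5006, 5010] * 32

# Delta encoding: consecutive readings change slowly, so the differences
# are small and highly compressible by a generic lossless coder.
deltas = [readings[0]] + [b - a for a, b in zip(readings, readings[1:])]
raw = struct.pack(f"{len(readings)}h", *readings)
packed = zlib.compress(struct.pack(f"{len(deltas)}h", *deltas), level=9)

# Lossless: the receiver recovers the readings bit-exactly via prefix sums.
out = list(struct.unpack(f"{len(readings)}h", zlib.decompress(packed)))
for i in range(1, len(out)):
    out[i] += out[i - 1]
assert out == readings
print(f"{len(raw)} bytes -> {len(packed)} bytes, reconstructed exactly")
```

The bit-exact round trip is the defining property here: unlike the lossy multivariate scheme above, nothing about the data may be compromised.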

9 citations


Cites background from "Adaptive Multivariate Data Compress..."

  • ...techniques are also being tried for low bandwidth IoTs [3]....


  • ...In IoT, data compression is needed in several instances as the data rates and bandwidth allocated for IoTs are limited [3]–[6]....


  • ...metering [3], and image transmission [4]....


Journal ArticleDOI
TL;DR: A novel delay-aware priority access classification (DPAC) based ACB is proposed, in which MTC devices whose packets have a smaller leftover delay budget are given higher priority in ACB.
Abstract: Massive Machine-type Communications (mMTC) is one of the principal features of 5th Generation and beyond (5G+) mobile network services. Due to the sparse but synchronous nature of MTC, a large number of devices tend to access a base station simultaneously for transmitting data, leading to congestion. To accommodate a large number of simultaneous arrivals in mMTC, efficient congestion control techniques such as access class barring (ACB) are incorporated in LTE-A random access. ACB introduces access delay, which may not be acceptable in delay-constrained scenarios such as eHealth, self-driven vehicles, and smart grid applications. In such scenarios, MTC devices may be forced to drop packets that exceed their delay budget, leading to decreased system throughput. To this end, in this paper a novel delay-aware priority access classification (DPAC) based ACB is proposed, where MTC devices whose packets have a smaller leftover delay budget are given higher priority in ACB. A reinforcement learning (RL) aided framework, called DPAC-RL, is also proposed for online learning of the DPAC model parameters. Simulation studies show that the proposed scheme increases successful preamble transmissions by up to 75% while ensuring that the access delay stays well within the delay budget.
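The delay-aware prioritisation idea can be illustrated with a toy simulation. This is not the paper's DPAC-RL scheme: the barring factors, delay-budget classes, and the threefold priority rule are all made-up assumptions, kept only to show how a smaller leftover budget can translate into a higher pass probability.

```python
import random

random.seed(4)
# Hypothetical population of MTC devices with different leftover delay
# budgets (ms); the budget classes are assumptions of this sketch.
base_factor = 0.3
devices = [{"id": i, "budget_ms": random.choice([10, 50, 200])}
           for i in range(1000)]

def pass_probability(budget_ms: int) -> float:
    # Assumed priority rule: scale the ACB barring factor up for packets
    # whose leftover delay budget is small (i.e. urgent packets).
    if budget_ms <= 10:
        return min(1.0, 3 * base_factor)
    if budget_ms <= 50:
        return min(1.0, 2 * base_factor)
    return base_factor

# Each device draws against its class-specific factor, as in ACB.
admitted = [d for d in devices if random.random() < pass_probability(d["budget_ms"])]
urgent = [d for d in admitted if d["budget_ms"] <= 10]
print(f"{len(admitted)} of {len(devices)} devices admitted; {len(urgent)} urgent")
```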

8 citations


Cites background from "Adaptive Multivariate Data Compress..."

  • ...MTC applications include but not limited to smart metering, payment, object tracking, remote surveillance, e-health [2], [3]....


References
Journal Article
TL;DR: Copyright (©) 1999–2012 R Foundation for Statistical Computing; permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and permission notice are preserved on all copies.
Abstract: Copyright (©) 1999–2012 R Foundation for Statistical Computing. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the R Core Team.

272,030 citations


"Adaptive Multivariate Data Compress..." refers methods in this paper

  • ...arima function of forecast package in R [37], where Akaike Information Criterion (AIC) is used to compare models, and the order of differencing d is computed based on Kwiatkowski–Phillips–Schmidt–Shin test....

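The selection procedure the excerpt describes — the degree of differencing chosen by a stationarity test, then the (p, q) orders compared by AIC — can be sketched in Python. This is only an analogue of R's auto.arima: a crude variance heuristic stands in for the actual KPSS test, and a least-squares AR fit replaces a full ARIMA fit; both simplifications are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical load series: a random walk, so one round of differencing
# should be enough to make it stationary.
y = np.cumsum(rng.standard_normal(400))

# Crude stand-in for the KPSS step: difference while doing so still
# shrinks the variance substantially (the real test uses critical values).
d = 0
series = y.copy()
while d < 2 and np.var(np.diff(series)) < 0.5 * np.var(series):
    series = np.diff(series)
    d += 1

def ar_aic(x: np.ndarray, p: int) -> float:
    """AIC of a least-squares AR(p) fit: n*log(RSS/n) + 2*(p + 1)."""
    if p == 0:
        rss = float(np.sum((x - x.mean()) ** 2))
        return len(x) * np.log(rss / len(x)) + 2.0
    # Columns are the p lagged copies of x; target is x shifted by p.
    X = np.column_stack([x[p - i - 1 : len(x) - i - 1] for i in range(p)])
    beta, rss, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    rss = float(rss[0]) if rss.size else float(np.sum((x[p:] - X @ beta) ** 2))
    n = len(x) - p
    return n * np.log(rss / n) + 2.0 * (p + 1)

# Model comparison by AIC, as in auto.arima (here over AR orders only).
best_p = min(range(4), key=lambda p: ar_aic(series, p))
print(f"selected order: ({best_p}, {d}, 0)")
```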

Book
01 May 1986
TL;DR: In this article, the authors present a graphical representation of data using Principal Component Analysis (PCA) for time series and other non-independent data, as well as a generalization and adaptation of principal component analysis.
Abstract: Introduction * Properties of Population Principal Components * Properties of Sample Principal Components * Interpreting Principal Components: Examples * Graphical Representation of Data Using Principal Components * Choosing a Subset of Principal Components or Variables * Principal Component Analysis and Factor Analysis * Principal Components in Regression Analysis * Principal Components Used with Other Multivariate Techniques * Outlier Detection, Influential Observations and Robust Estimation * Rotation and Interpretation of Principal Components * Principal Component Analysis for Time Series and Other Non-Independent Data * Principal Component Analysis for Special Types of Data * Generalizations and Adaptations of Principal Component Analysis

17,446 citations

Reference EntryDOI
15 Oct 2005
TL;DR: Principal component analysis (PCA) as discussed by the authors replaces the p original variables by a smaller number, q, of derived variables, the principal components, which are linear combinations of the original variables.
Abstract: When large multivariate datasets are analyzed, it is often desirable to reduce their dimensionality. Principal component analysis is one technique for doing this. It replaces the p original variables by a smaller number, q, of derived variables, the principal components, which are linear combinations of the original variables. Often, it is possible to retain most of the variability in the original variables with q very much smaller than p. Despite its apparent simplicity, principal component analysis has a number of subtleties, and it has many uses and extensions. A number of choices associated with the technique are briefly discussed, namely, covariance or correlation, how many components, and different normalization constraints, as well as confusion with factor analysis. Various uses and extensions are outlined. Keywords: dimension reduction; factor analysis; multivariate analysis; variance maximization
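The p-to-q reduction this entry describes, together with the "how many components" choice it raises, can be illustrated with a retained-variance rule. The dataset, the 95% threshold, and the covariance-eigendecomposition route are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical dataset: p = 6 observed variables driven by 2 latent
# factors plus a little noise, so q should come out much smaller than p.
n, p = 500, 6
latent = rng.standard_normal((n, 2))
X = latent @ rng.standard_normal((2, p)) + 0.1 * rng.standard_normal((n, p))

# Principal components from the eigendecomposition of the covariance matrix.
C = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # descending order

# Choose q as the smallest number of components retaining 95% of variance.
ratio = np.cumsum(eigvals) / np.sum(eigvals)
q = int(np.searchsorted(ratio, 0.95) + 1)

# The q derived variables: linear combinations of the p original ones.
Z = (X - X.mean(axis=0)) @ eigvecs[:, :q]
print(f"q = {q} components retain {ratio[q - 1]:.1%} of the variance")
```

This is exactly the sense in which "most of the variability" is retained with q very much smaller than p: here the two dominant eigenvalues absorb nearly all of the variance of six variables.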

14,773 citations

Book
01 Jan 1982
TL;DR: In this article, the authors present an overview of the basic concepts of multivariate analysis, including matrix algebra and random vectors, as well as a strategy for analyzing multivariate models.
Abstract: (NOTE: Each chapter begins with an Introduction, and concludes with Exercises and References.) I. GETTING STARTED. 1. Aspects of Multivariate Analysis. Applications of Multivariate Techniques. The Organization of Data. Data Displays and Pictorial Representations. Distance. Final Comments. 2. Matrix Algebra and Random Vectors. Some Basics of Matrix and Vector Algebra. Positive Definite Matrices. A Square-Root Matrix. Random Vectors and Matrices. Mean Vectors and Covariance Matrices. Matrix Inequalities and Maximization. Supplement 2A Vectors and Matrices: Basic Concepts. 3. Sample Geometry and Random Sampling. The Geometry of the Sample. Random Samples and the Expected Values of the Sample Mean and Covariance Matrix. Generalized Variance. Sample Mean, Covariance, and Correlation as Matrix Operations. Sample Values of Linear Combinations of Variables. 4. The Multivariate Normal Distribution. The Multivariate Normal Density and Its Properties. Sampling from a Multivariate Normal Distribution and Maximum Likelihood Estimation. The Sampling Distribution of X̄ and S. Large-Sample Behavior of X̄ and S. Assessing the Assumption of Normality. Detecting Outliers and Data Cleaning. Transformations to Near Normality. II. INFERENCES ABOUT MULTIVARIATE MEANS AND LINEAR MODELS. 5. Inferences About a Mean Vector. The Plausibility of μ0 as a Value for a Normal Population Mean. Hotelling's T² and Likelihood Ratio Tests. Confidence Regions and Simultaneous Comparisons of Component Means. Large Sample Inferences about a Population Mean Vector. Multivariate Quality Control Charts. Inferences about Mean Vectors When Some Observations Are Missing. Difficulties Due To Time Dependence in Multivariate Observations. Supplement 5A Simultaneous Confidence Intervals and Ellipses as Shadows of the p-Dimensional Ellipsoids. 6. Comparisons of Several Multivariate Means. Paired Comparisons and a Repeated Measures Design. Comparing Mean Vectors from Two Populations.
Comparison of Several Multivariate Population Means (One-Way MANOVA). Simultaneous Confidence Intervals for Treatment Effects. Two-Way Multivariate Analysis of Variance. Profile Analysis. Repeated Measures Designs and Growth Curves. Perspectives and a Strategy for Analyzing Multivariate Models. 7. Multivariate Linear Regression Models. The Classical Linear Regression Model. Least Squares Estimation. Inferences About the Regression Model. Inferences from the Estimated Regression Function. Model Checking and Other Aspects of Regression. Multivariate Multiple Regression. The Concept of Linear Regression. Comparing the Two Formulations of the Regression Model. Multiple Regression Models with Time-Dependent Errors. Supplement 7A The Distribution of the Likelihood Ratio for the Multivariate Regression Model. III. ANALYSIS OF A COVARIANCE STRUCTURE. 8. Principal Components. Population Principal Components. Summarizing Sample Variation by Principal Components. Graphing the Principal Components. Large-Sample Inferences. Monitoring Quality with Principal Components. Supplement 8A The Geometry of the Sample Principal Component Approximation. 9. Factor Analysis and Inference for Structured Covariance Matrices. The Orthogonal Factor Model. Methods of Estimation. Factor Rotation. Factor Scores. Perspectives and a Strategy for Factor Analysis. Structural Equation Models. Supplement 9A Some Computational Details for Maximum Likelihood Estimation. 10. Canonical Correlation Analysis. Canonical Variates and Canonical Correlations. Interpreting the Population Canonical Variables. The Sample Canonical Variates and Sample Canonical Correlations. Additional Sample Descriptive Measures. Large Sample Inferences. IV. CLASSIFICATION AND GROUPING TECHNIQUES. 11. Discrimination and Classification. Separation and Classification for Two Populations. Classifications with Two Multivariate Normal Populations. Evaluating Classification Functions.
Fisher's Discriminant Function: Separation of Populations. Classification with Several Populations. Fisher's Method for Discriminating among Several Populations. Final Comments. 12. Clustering, Distance Methods and Ordination. Similarity Measures. Hierarchical Clustering Methods. Nonhierarchical Clustering Methods. Multidimensional Scaling. Correspondence Analysis. Biplots for Viewing Sample Units and Variables. Procrustes Analysis: A Method for Comparing Configurations. Appendix. Standard Normal Probabilities. Student's t-Distribution Percentage Points. χ² Distribution Percentage Points. F-Distribution Percentage Points. F-Distribution Percentage Points (α = .10). F-Distribution Percentage Points (α = .05). F-Distribution Percentage Points (α = .01). Data Index. Subject Index.

11,697 citations

Journal ArticleDOI
TL;DR: Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest l1 norm of coefficients among all such decompositions.
Abstract: The time-frequency and time-scale communities have recently developed a large number of overcomplete waveform dictionaries --- stationary wavelets, wavelet packets, cosine packets, chirplets, and warplets, to name a few. Decomposition into overcomplete systems is not unique, and several methods for decomposition have been proposed, including the method of frames (MOF), matching pursuit (MP), and, for special dictionaries, the best orthogonal basis (BOB). Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest l1 norm of coefficients among all such decompositions. We give examples exhibiting several advantages over MOF, MP, and BOB, including better sparsity and superresolution. BP has interesting relations to ideas in areas as diverse as ill-posed problems, abstract harmonic analysis, total variation denoising, and multiscale edge denoising. BP in highly overcomplete dictionaries leads to large-scale optimization problems. With signals of length 8192 and a wavelet packet dictionary, one gets an equivalent linear program of size 8192 by 212,992. Such problems can be attacked successfully only because of recent advances in linear programming by interior-point methods. We obtain reasonable success with a primal-dual logarithmic barrier method and conjugate-gradient solver.
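The linear-program reformulation the abstract mentions can be sketched at toy scale. The dictionary here is a random Gaussian matrix rather than a wavelet packet dictionary, the sizes are far smaller than the 8192-by-212,992 programs the authors describe, and the off-the-shelf simplex/interior-point solver stands in for their barrier method; all of these substitutions are assumptions of this sketch.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
# Hypothetical overcomplete dictionary (n = 32 samples, m = 64 atoms) and a
# signal built from 3 atoms; BP should recover a similarly sparse code.
n, m = 32, 64
A = rng.standard_normal((n, m)) / np.sqrt(n)
x_true = np.zeros(m)
x_true[[5, 20, 41]] = [1.0, -2.0, 1.5]
b = A @ x_true

# Basis pursuit: min ||x||_1  s.t.  Ax = b, rewritten as a linear program
# by splitting x = u - v with u, v >= 0 and minimising sum(u) + sum(v).
res = linprog(
    c=np.ones(2 * m),
    A_eq=np.hstack([A, -A]),
    b_eq=b,
    bounds=[(0, None)] * (2 * m),
    method="highs",
)
x_hat = res.x[:m] - res.x[m:]
print("recovered support:", np.flatnonzero(np.abs(x_hat) > 1e-6))
```

For a well-conditioned random dictionary and a sufficiently sparse signal, the l1 minimiser coincides with the sparse generating coefficients, which is the "optimal superposition" property the abstract describes.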

9,950 citations


"Adaptive Multivariate Data Compress..." refers background in this paper

  • ...orthogonal matching pursuit (OMP) [26] have been found to be faster than approximation algorithms such as basis pursuit [27], which can be handled by linear programming (LP) solvers....
