# Pattern Matching Based Algorithms for Graph Compression

01 Nov 2018, pp. 93-97

TL;DR: The results show that large graphs can be stored efficiently in less memory, processed using the parallel power of compute nodes, and transferred efficiently between resources.

Abstract: Graphs can represent a wide variety of data from different domains and capture the relationships among data items efficiently, so they are widely used. With the advent of Big Data, there is a need to store and compute on large data sets efficiently. However, given the size of these data sets, finding optimal methods to store and process the data has been a challenge. In this paper, we therefore study different graph compression techniques and propose novel compression algorithms. Specifically, given a graph $G = (V, E)$, where $V$ is the set of vertices, $E$ is the set of edges, and $|V| = n$, we propose techniques to compress the adjacency matrix representation of the graph. Our algorithms find patterns within the adjacency matrix data and replace the common patterns with specific markers. All of the proposed techniques are lossless. In our experiments, the proposed techniques achieve almost 70% compression relative to the adjacency matrix representation. The results show that large graphs can be stored efficiently in less memory, processed using the parallel power of compute nodes, and transferred efficiently between resources.
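The abstract does not spell out the pattern-matching algorithms, so the following is only a minimal sketch of the general idea (find common patterns in the adjacency matrix, replace them with markers, keep the scheme lossless). The fixed block width, the pattern table size, and the function names `compress` and `decompress` are all assumptions made for illustration, not the paper's method.

```python
from collections import Counter

def compress(adj, width=4, max_patterns=4):
    """Replace frequent fixed-width bit blocks with integer markers.

    Hypothetical scheme: `width` and `max_patterns` are illustrative
    parameters, not values taken from the paper.
    """
    n = len(adj)
    # Flatten the adjacency matrix row by row into one bit string.
    bits = "".join(str(b) for row in adj for b in row)
    blocks = [bits[i:i + width] for i in range(0, len(bits), width)]
    # The most common blocks form the pattern table; a marker is a table index.
    table = [p for p, _ in Counter(blocks).most_common(max_patterns)]
    index = {p: i for i, p in enumerate(table)}
    # Each token is either a marker (int) or an unmatched literal block (str).
    tokens = [index.get(b, b) for b in blocks]
    return n, table, tokens

def decompress(n, table, tokens):
    """Expand markers back into bit blocks and rebuild the n x n matrix."""
    bits = "".join(table[t] if isinstance(t, int) else t for t in tokens)
    return [[int(bits[r * n + c]) for c in range(n)] for r in range(n)]

# Usage: a 4-vertex ring, whose rows alternate between two bit patterns.
ring = [[0, 1, 0, 1],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [1, 0, 1, 0]]
n, table, tokens = compress(ring)
restored = decompress(n, table, tokens)
```

Because the pattern table and every unmatched block are stored alongside the markers, decompression reproduces the original matrix exactly. Any real saving depends on how the markers and table are encoded, which this sketch does not model.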

##### Citations


05 Jun 2021

TL;DR: In this article, the authors proposed an online kernel-based algorithm for topology estimation of non-linear vector autoregressive time series by solving a sparse online optimization framework using the composite objective mirror descent method.

Abstract: Estimating the unknown causal dependencies among graph-connected time series plays an important role in many applications, such as sensor network analysis, signal processing over cyber-physical systems, and finance engineering. Inference of such causal dependencies, often known as topology identification, is not well studied for non-linear non-stationary systems, and most of the existing methods are batch-based and cannot handle streaming sensor signals. In this paper, we propose an online kernel-based algorithm for topology estimation of non-linear vector autoregressive time series by solving a sparse online optimization framework using the composite objective mirror descent method. Experiments conducted on real and synthetic data sets show that the proposed algorithm outperforms the state-of-the-art methods for topology estimation.
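To make the "sparse online optimization" idea concrete, here is a much-simplified sketch. It assumes a linear predictor, a squared-error loss, an l1 penalty, and the Euclidean distance as the mirror map, in which case a composite objective mirror descent update reduces to a gradient step followed by soft-thresholding. The cited work uses kernels, so this is not its algorithm; the names `soft_threshold` and `comid_step` are hypothetical.

```python
def soft_threshold(x, tau):
    """Proximal operator of tau * |x|: shrink x toward zero by tau."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0

def comid_step(w, x, target, step, lam):
    """One online update for a single streaming sample (x, target).

    Assumes loss 0.5 * (w.x - target)^2 and penalty lam * ||w||_1.
    """
    err = sum(wi * xi for wi, xi in zip(w, x)) - target
    # Gradient step on the loss, then the l1 proximal (shrinkage) step.
    return [soft_threshold(wi - step * err * xi, step * lam)
            for wi, xi in zip(w, x)]

# Usage: only the first input carries signal, so the second weight stays at zero.
w = comid_step([0.0, 0.0], x=[1.0, 0.0], target=1.0, step=0.5, lam=0.1)
```

Weights that stay exactly at zero correspond to absent dependencies, which is how a sparse estimate can be read as a topology.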

15 citations


TL;DR: In this paper, the authors proposed an online kernel-based algorithm for topology estimation of non-linear vector autoregressive time series by solving a sparse online optimization framework using the composite objective mirror descent method.

Abstract: Estimating the unknown causal dependencies among graph-connected time series plays an important role in many applications, such as sensor network analysis, signal processing over cyber-physical systems, and finance engineering. Inference of such causal dependencies, often known as topology identification, is not well studied for non-linear non-stationary systems, and most of the existing methods are batch-based and cannot handle streaming sensor signals. In this paper, we propose an online kernel-based algorithm for topology estimation of non-linear vector autoregressive time series by solving a sparse online optimization framework using the composite objective mirror descent method. Experiments conducted on real and synthetic data sets show that the proposed algorithm outperforms the state-of-the-art methods for topology estimation.

4 citations


31 Oct 2022

TL;DR: In this article, a nonconvex optimization problem is formed with lasso regularization and solved via block coordinate descent (BCD) for joint signal estimation and topology identification with a nonlinear model, under the modelling assumption that signals are generated by a sparse VAR model in a latent space and then transformed by a set of invertible, component-wise nonlinearities.

Abstract: Topology identification from multiple time series has proved useful for system identification, anomaly detection, denoising, and data completion. Vector autoregressive (VAR) methods have proved effective in identifying directed topology from complex networks. The task of inferring topology in the presence of noise and missing observations has been studied for linear models. As a first approach to joint signal estimation and topology identification with a nonlinear model, this paper proposes a method under the modelling assumption that signals are generated by a sparse VAR model in a latent space and then transformed by a set of invertible, component-wise nonlinearities. A nonconvex optimization problem is formed with lasso regularisation and solved via block coordinate descent (BCD). Initial experiments conducted on synthetic data sets show the identifying capability of the proposed method.


31 Oct 2022

TL;DR: In this paper, a nonconvex optimization problem is formed with lasso regularization and solved via block coordinate descent (BCD) for joint signal estimation and topology identification with a nonlinear model, under the modelling assumption that signals are generated by a sparse VAR model in a latent space and then transformed by a set of invertible, component-wise nonlinearities.

Abstract: Topology identification from multiple time series has proved useful for system identification, anomaly detection, denoising, and data completion. Vector autoregressive (VAR) methods have proved effective in identifying directed topology from complex networks. The task of inferring topology in the presence of noise and missing observations has been studied for linear models. As a first approach to joint signal estimation and topology identification with a nonlinear model, this paper proposes a method under the modelling assumption that signals are generated by a sparse VAR model in a latent space and then transformed by a set of invertible, component-wise nonlinearities. A nonconvex optimization problem is formed with lasso regularisation and solved via block coordinate descent (BCD). Initial experiments conducted on synthetic data sets show the identifying capability of the proposed method.


22 Dec 2022

TL;DR: In this paper, a nonlinear modeling technique for multiple time series is proposed that has a complexity similar to that of linear vector autoregressive (VAR) models but can account for nonlinear interactions for each sensor variable.

Abstract: Discovery of causal dependencies among time series has been tackled in the past either by using linear models or by using kernel- or deep learning-based nonlinear models, the latter entailing great complexity. This paper proposes a nonlinear modelling technique for multiple time series that has a complexity similar to that of linear vector autoregressive (VAR) models but can account for nonlinear interactions for each sensor variable. The modelling assumption is that the time series are generated in two steps: i) a VAR process in a latent space, and ii) a set of invertible nonlinear mappings applied component-wise, mapping each sensor variable into a latent space. Successful identification of the support of the VAR coefficients reveals the topology of the interconnected system. The proposed method enforces sparsity on the VAR coefficients and models the component-wise nonlinearities using invertible neural networks. To solve the estimation problem, a technique combining proximal gradient descent (PGD) and projected gradient descent is designed. Experiments conducted on real and synthetic data sets show that the proposed algorithm provides an improved identification of the support of the VAR coefficients, while also improving the prediction capabilities.
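The two-step modelling assumption can be illustrated with a small simulation. This is only a sketch under stated assumptions: a first-order VAR, Gaussian noise, and `sinh`/`asinh` as a stand-in invertible component-wise mapping (the abstract describes invertible neural networks instead). The names `simulate`, `to_latent`, and `support` are hypothetical.

```python
import math
import random

def simulate(A, steps, noise=0.1, seed=0):
    """Generate observations: VAR(1) in a latent space, then a
    component-wise invertible nonlinearity (sinh, chosen for illustration)."""
    rng = random.Random(seed)
    n = len(A)
    z = [0.0] * n
    observed = []
    for _ in range(steps):
        # Step i): latent linear VAR update with additive Gaussian noise.
        z = [sum(A[i][j] * z[j] for j in range(n)) + rng.gauss(0.0, noise)
             for i in range(n)]
        # Step ii): each sensor value is a nonlinear function of its own latent value.
        observed.append([math.sinh(zi) for zi in z])
    return observed

def to_latent(y):
    """Invert the component-wise mapping (asinh undoes sinh)."""
    return [math.asinh(v) for v in y]

def support(A, tol=1e-12):
    """Nonzero coefficients; (i, j) means series j influences series i."""
    n = len(A)
    return [(i, j) for i in range(n) for j in range(n) if abs(A[i][j]) > tol]

# Usage: series 0 drives series 1, but not the other way round.
A = [[0.5, 0.0],
     [0.3, 0.4]]
ys = simulate(A, steps=5)
edges = support(A)
```

In this sketch the topology is read directly from the known matrix `A`; the estimation problem described in the abstract is the harder task of recovering that support from the observations alone.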