Journal ArticleDOI

Compact structure for sparse undirected graphs based on a clique graph partition

TL;DR: This work proposes a novel compact representation for real sparse and clustered undirected graphs that is competitive with state-of-the-art methods in terms of compression efficiency and access times for neighbor queries, and that recovers all the maximal cliques faster than using the original graph.
About: This article was published in Information Sciences on 2021-01-12. It has received 11 citations to date. The article focuses on the topic: Clique graph.
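
A rough way to picture the idea in the TL;DR: partition the graph into cliques and store, for each vertex, the cliques it belongs to, so a neighbor query becomes a union of small sets. The Python sketch below is only an illustration of that general idea; the names cliques, member and neighbors, and the layout itself, are assumptions and do not reproduce the paper's actual compact structure.

    from collections import defaultdict

    # Illustrative only: a toy clique-based representation of an undirected graph.
    cliques = [            # assumed partition into cliques
        {0, 1, 2},         # covers edges (0,1), (0,2), (1,2)
        {2, 3},            # covers edge (2,3)
    ]

    # Inverted index: vertex -> ids of the cliques containing it.
    member = defaultdict(set)
    for cid, c in enumerate(cliques):
        for v in c:
            member[v].add(cid)

    def neighbors(v):
        """Neighbor query: union of the cliques containing v, minus v itself."""
        result = set()
        for cid in member[v]:
            result |= cliques[cid]
        result.discard(v)
        return result

    print(sorted(neighbors(2)))   # [0, 1, 3]
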
Citations
Journal ArticleDOI
TL;DR: It is shown that Zuckerli-compressed graphs are 10% to 29% smaller, and more than 20% in most cases, with a resource usage for decompression comparable to that of WebGraph.
Abstract: Zuckerli is a scalable compression system meant for large real-world graphs. Graphs are notoriously challenging structures to store efficiently due to their linked nature, which makes it hard to separate them into smaller, compact components. Therefore, effective compression is crucial when dealing with large graphs, which can have billions of nodes and edges. Furthermore, a good compression system should give the user fast and reasonably flexible access to parts of the compressed data without requiring full decompression, which may be unfeasible on their system. Compared to WebGraph, the de-facto standard in compressing real-world graphs, Zuckerli improves multiple aspects by using advanced compression techniques and novel heuristic graph algorithms. Zuckerli can produce both a compressed representation for storage and one which allows fast direct access to the adjacency lists of the compressed graph without decompressing the entire graph. We validate its effectiveness on real-world graphs with up to a billion nodes and 90 billion edges, conducting an extensive experimental evaluation of both compression density and decompression performance. We show that Zuckerli-compressed graphs are 10% to 29% smaller, and more than 20% in most cases, with a resource usage for decompression comparable to that of WebGraph.
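
Zuckerli's actual format is considerably more involved, but the basic idea of compressing adjacency lists while keeping per-list random access can be sketched with gap encoding plus a per-vertex offset table. The Python below is a generic illustration under that assumption; vbyte_encode, compress and decode_list are hypothetical helpers, not Zuckerli's or WebGraph's API.

    def vbyte_encode(n):
        """Variable-byte encode a non-negative integer (7 data bits per byte, high bit = continue)."""
        out = bytearray()
        while True:
            b = n & 0x7F
            n >>= 7
            if n:
                out.append(b | 0x80)
            else:
                out.append(b)
                return bytes(out)

    def vbyte_decode(buf, pos):
        """Decode one integer starting at buf[pos]; return (value, next position)."""
        value, shift = 0, 0
        while True:
            b = buf[pos]
            pos += 1
            value |= (b & 0x7F) << shift
            if not b & 0x80:
                return value, pos
            shift += 7

    def compress(adjacency):
        """adjacency: list of sorted neighbor lists. Returns (byte blob, per-vertex offsets)."""
        blob, offsets = bytearray(), []
        for neigh in adjacency:
            offsets.append(len(blob))
            blob += vbyte_encode(len(neigh))
            prev = 0
            for v in neigh:                  # store gaps between successive neighbors
                blob += vbyte_encode(v - prev)
                prev = v
        return bytes(blob), offsets

    def decode_list(blob, offsets, u):
        """Random access: decode only vertex u's adjacency list."""
        pos = offsets[u]
        count, pos = vbyte_decode(blob, pos)
        neigh, prev = [], 0
        for _ in range(count):
            gap, pos = vbyte_decode(blob, pos)
            prev += gap
            neigh.append(prev)
        return neigh

    adj = [[1, 2, 300], [0], [0], []]        # toy sorted neighbor lists
    blob, offs = compress(adj)
    print(decode_list(blob, offs, 0))        # [1, 2, 300]
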

7 citations


Cites background from "Compact structure for sparse undire..."

  • ...A brief experimental comparison between Zuckerli, k2-trees and 2D block trees can be found in Section IV, observing that space results can be improved if no random list decompression is supported [10]....


Journal ArticleDOI
TL;DR: This review paper surveys existing FSM techniques and presents a comparative study of them.
Abstract: Graphs are a widely used data structure and are very useful for representing, analyzing, and processing real-world data. Evolving graphs are graphs that change frequently, growing or shrinking as edges and/or vertices are added or removed. Mining is the process of knowledge discovery in graphs, and detecting patterns that repeat more often than a predefined threshold in a graph is known as frequent subgraph mining (FSM). Real-world graph data can be very large, and handling such graphs requires special mechanisms and algorithms. Our review paper surveys existing FSM techniques and presents a comparative study of them.
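
The support-threshold notion at the core of FSM can be shown with a deliberately tiny example: a pattern is frequent if it occurs in at least min_support graphs of the collection. This is only a toy Python illustration (patterns are plain edge sets here, sidestepping subgraph isomorphism) and not one of the surveyed FSM algorithms.

    # Each graph is a set of labeled edges; a pattern is a set of edges too.
    graphs = [
        {("A", "B"), ("B", "C")},
        {("A", "B")},
        {("B", "C"), ("C", "D")},
    ]

    def support(pattern, graphs):
        """Number of graphs that contain every edge of the pattern."""
        return sum(1 for g in graphs if pattern <= g)

    min_support = 2
    candidates = [{("A", "B")}, {("B", "C")}, {("A", "B"), ("B", "C")}]
    frequent = [p for p in candidates if support(p, graphs) >= min_support]
    print(frequent)   # [{('A', 'B')}, {('B', 'C')}]
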

6 citations

Journal ArticleDOI
TL;DR: This work presents the first iterated multilevel simulated annealing algorithm for large-scale graph conductance minimization, which features a novel solution-guided coarsening method and an effective solution refinement procedure based on simulated annealing.
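
The quantity being minimized is graph conductance, commonly defined as phi(S) = cut(S, V\S) / min(vol(S), vol(V\S)). The Python sketch below just evaluates that definition on a toy graph; the function name and the exact normalization are assumptions and may differ in detail from the paper's formulation.

    def conductance(adj, S):
        """phi(S) = cut(S, V\\S) / min(vol(S), vol(V\\S)), where vol is the sum of degrees."""
        S = set(S)
        cut = sum(1 for u in S for v in adj[u] if v not in S)
        vol_S = sum(len(adj[u]) for u in S)
        vol_rest = sum(len(adj[u]) for u in adj if u not in S)
        return cut / min(vol_S, vol_rest)

    # Toy graph: two triangles joined by the single edge (2,3).
    adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
    print(conductance(adj, {0, 1, 2}))   # 1/7, the natural bisection
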

4 citations


Cites background from "Compact structure for sparse undire..."

  • ...In particular, other graph representations using compact structure [14] may be considered to reduce the space complexity of the algorithm....


Journal ArticleDOI
TL;DR: The approach proposed in this paper is a lossy compression technique for answering neighborhood queries under a more general precondition, called transitivity: the result is a sparse graph optimized so that originally adjacent vertices remain within distance 2 of each other, and vice versa.
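
The transitivity precondition can be checked directly: every pair of vertices adjacent in the original graph should remain within distance 2 in the compressed graph (the reverse direction would be checked symmetrically). The Python below is an illustrative check under that reading; within_two and preserves_adjacency are hypothetical names, not the paper's procedure.

    def within_two(adj, u, v):
        """True if u and v are adjacent or share a common neighbor (distance <= 2)."""
        return v in adj[u] or bool(adj[u] & adj[v])

    def preserves_adjacency(original, compressed):
        """Every original edge must map to a pair at distance <= 2 in the compressed graph."""
        return all(within_two(compressed, u, v)
                   for u in original for v in original[u])

    # Toy example: the compressed graph drops edge (0,2) but keeps the path 0-1-2.
    original   = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    compressed = {0: {1}, 1: {0, 2}, 2: {1}}
    print(preserves_adjacency(original, compressed))   # True
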

2 citations

Proceedings ArticleDOI
28 Jun 2022
TL;DR: In this article, the authors propose a parallelization of k-clique counting on GPUs, based on the idea of traversing search trees starting at each vertex in the graph.
Abstract: Counting k-cliques in a graph is an important problem in graph analysis with many applications such as community detection and graph partitioning. Counting k-cliques is typically done by traversing search trees starting at each vertex in the graph. Parallelizing k-clique counting has been well-studied on CPUs and many solutions exist. However, there are no performant solutions for k-clique counting on GPUs.
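
The per-vertex search-tree idea can be sketched on the CPU: impose a total order on the vertices, orient edges toward the larger endpoint so each clique is enumerated exactly once, and recursively intersect candidate sets while descending the tree. The Python below illustrates that traversal only; it says nothing about the paper's GPU parallelization.

    def count_k_cliques(adj, k):
        """Count k-cliques by traversing a search tree rooted at each vertex.
        adj: dict vertex -> set of neighbors (undirected graph)."""
        # Orient each edge toward the larger endpoint so every clique is counted once.
        out = {u: {v for v in adj[u] if v > u} for u in adj}

        def expand(cand, remaining):
            if remaining == 0:
                return 1
            if remaining == 1:
                return len(cand)
            return sum(expand(cand & out[v], remaining - 1) for v in cand)

        return sum(expand(out[u], k - 1) for u in adj)

    # Toy graph: a 4-clique {0,1,2,3} plus a pendant vertex 4.
    adj = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2, 4}, 4: {3}}
    print(count_k_cliques(adj, 3))   # 4 triangles
    print(count_k_cliques(adj, 4))   # 1 four-clique
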

1 citation

References
Proceedings Article
03 Jan 2001
TL;DR: A simple spectral clustering algorithm that can be implemented using a few lines of Matlab is presented, and tools from matrix perturbation theory are used to analyze the algorithm, and give conditions under which it can be expected to do well.
Abstract: Despite many empirical successes of spectral clustering methods—algorithms that cluster points using eigenvectors of matrices derived from the data—there are several unresolved issues. First, there is a wide variety of algorithms that use the eigenvectors in slightly different ways. Second, many of these algorithms have no proof that they will actually compute a reasonable clustering. In this paper, we present a simple spectral clustering algorithm that can be implemented using a few lines of Matlab. Using tools from matrix perturbation theory, we analyze the algorithm, and give conditions under which it can be expected to do well. We also show surprisingly good experimental results on a number of challenging clustering problems.
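
The algorithm analyzed in the paper follows a short recipe: build a normalized affinity matrix, take its k largest eigenvectors, normalize the rows to unit length, and run k-means on the rows. The Python sketch below follows that recipe in broad strokes (it assumes numpy and scikit-learn, and simplifies the affinity construction), so it should be read as an illustration rather than the paper's exact procedure.

    import numpy as np
    from sklearn.cluster import KMeans

    def spectral_clustering(A, k):
        """Cluster n points given a symmetric affinity matrix A (n x n)."""
        d = A.sum(axis=1)
        D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
        L = D_inv_sqrt @ A @ D_inv_sqrt                   # normalized affinity
        _, eigvecs = np.linalg.eigh(L)                    # eigenvalues in ascending order
        X = eigvecs[:, -k:]                               # k largest eigenvectors as columns
        X = X / np.linalg.norm(X, axis=1, keepdims=True)  # project rows onto the unit sphere
        return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

    # Two obvious clusters: affinities within {0,1,2} and within {3,4,5}.
    A = np.array([[0, 1, 1, 0, 0, 0],
                  [1, 0, 1, 0, 0, 0],
                  [1, 1, 0, 0, 0, 0],
                  [0, 0, 0, 0, 1, 1],
                  [0, 0, 0, 1, 0, 1],
                  [0, 0, 0, 1, 1, 0]], dtype=float)
    print(spectral_clustering(A, 2))   # e.g. [0 0 0 1 1 1] (label names may be swapped)
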

9,043 citations


"Compact structure for sparse undire..." refers methods in this paper

  • ...Afterwards, a clustering algorithm, such as spectral clustering [41] or hierarchical clustering [42], is applied....


Journal ArticleDOI
01 Sep 1952
TL;DR: A minimum-redundancy code is one constructed in such a way that the average number of coding digits per message is minimized.
Abstract: An optimum method of coding an ensemble of messages consisting of a finite number of members is developed. A minimum-redundancy code is one constructed in such a way that the average number of coding digits per message is minimized.
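
Huffman's construction repeatedly merges the two lowest-frequency subtrees, so frequent symbols end up with short codewords and the average code length is minimized. A minimal Python sketch of that construction follows; huffman_codes is a hypothetical helper, not the variant used in the paper under discussion.

    import heapq
    from collections import Counter

    def huffman_codes(data):
        """Return a prefix-free code mapping each symbol of `data` to a bitstring."""
        freq = Counter(data)
        # Heap entries: (subtree frequency, tiebreaker, {symbol: codeword-so-far}).
        heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
        heapq.heapify(heap)
        tiebreak = len(heap)
        while len(heap) > 1:
            f1, _, c1 = heapq.heappop(heap)   # two least frequent subtrees
            f2, _, c2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in c1.items()}
            merged.update({s: "1" + c for s, c in c2.items()})
            heapq.heappush(heap, (f1 + f2, tiebreak, merged))
            tiebreak += 1
        return heap[0][2] if heap else {}

    codes = huffman_codes("abracadabra")
    print(codes)                                     # e.g. {'a': '0', 'b': '110', ...}
    print("".join(codes[c] for c in "abracadabra"))  # the encoded bitstring
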

5,221 citations


"Compact structure for sparse undire..." refers methods in this paper

  • ...For the byte sequence BB, plain Huffman compression [46] was used....


Journal ArticleDOI
09 Jun 2005-Nature
TL;DR: After defining a set of new characteristic quantities for the statistics of communities, this work applies an efficient technique for exploring overlapping communities on a large scale and finds that overlaps are significant, and the distributions introduced reveal universal features of networks.
Abstract: A network is a network — be it between words (those associated with ‘bright’ in this case) or protein structures. Many complex systems in nature and society can be described in terms of networks capturing the intricate web of connections among the units they are made of1,2,3,4. A key question is how to interpret the global organization of such networks as the coexistence of their structural subunits (communities) associated with more highly interconnected parts. Identifying these a priori unknown building blocks (such as functionally related proteins5,6, industrial sectors7 and groups of people8,9) is crucial to the understanding of the structural and functional properties of networks. The existing deterministic methods used for large networks find separated communities, whereas most of the actual networks are made of highly overlapping cohesive groups of nodes. Here we introduce an approach to analysing the main statistical features of the interwoven sets of overlapping communities that makes a step towards uncovering the modular structure of complex systems. After defining a set of new characteristic quantities for the statistics of communities, we apply an efficient technique for exploring overlapping communities on a large scale. We find that overlaps are significant, and the distributions we introduce reveal universal features of networks. Our studies of collaboration, word-association and protein interaction graphs show that the web of communities has non-trivial correlations and specific scaling properties.
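
The method described here underlies the Clique Percolation Method cited as [1] in the paper: cliques of size at least k are chained together whenever they share at least k-1 vertices, and communities are read off the connected components of that clique-overlap graph. The Python below sketches the common maximal-clique variant (it assumes networkx is available; clique_percolation is a hypothetical name), not the exact procedure of the original article.

    import networkx as nx
    from itertools import combinations

    def clique_percolation(G, k):
        """Yield communities: unions of maximal cliques (size >= k) chained by overlaps of >= k-1 vertices."""
        cliques = [frozenset(c) for c in nx.find_cliques(G) if len(c) >= k]
        # Percolation graph: one node per clique, an edge when two cliques overlap enough.
        perc = nx.Graph()
        perc.add_nodes_from(range(len(cliques)))
        for i, j in combinations(range(len(cliques)), 2):
            if len(cliques[i] & cliques[j]) >= k - 1:
                perc.add_edge(i, j)
        for component in nx.connected_components(perc):
            yield frozenset().union(*(cliques[i] for i in component))

    # Two triangles sharing an edge percolate into one community; a separate triangle stays apart.
    G = nx.Graph([(0, 1), (1, 2), (0, 2), (1, 3), (2, 3), (4, 5), (5, 6), (4, 6)])
    print([sorted(c) for c in clique_percolation(G, 3)])   # [[0, 1, 2, 3], [4, 5, 6]] (order may vary)
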

5,217 citations


"Compact structure for sparse undire..." refers background or methods in this paper

  • ...A well-known algorithm for identifying clique communities is the Clique Percolation Method (CPM) [1]....


  • ...It omits Xp[0] = 3, then it checks Xp[1] = 4, and since its associated BB element is 3, and the bitwise and with 3 is non-zero, then 4 is a neighbor of...


  • ...disease analysis [5], community discovery [1, 2, 6], recommender systems [7], graph compression [8, 9, 10], measuring relevance of network actors [11, 12], and network visualization [13, 14]....


  • ..., C[8 · bpup]
    for b = 1 to bpup do
      for k = 1 to 8 do
        cluster ← cluster + 1
        if BBp[bpup · j + b][k] = 1 then
          Insert vertex Xp[j] to C[cluster]
    CC ← CC ∪ {C[1], C[2], ....


  • ...A well-known algorithm for identifying clique communities is the Clique Percolation Method (CPM) [1]. This method first lists all of the maximal cliques and later builds a clique-clique overlap matrix....


Journal ArticleDOI
01 Jun 2000
TL;DR: The study of the web as a graph yields valuable insight into web algorithms for crawling, searching and community discovery, and the sociological phenomena which characterize its evolution.
Abstract: The study of the web as a graph is not only fascinating in its own right, but also yields valuable insight into web algorithms for crawling, searching and community discovery, and the sociological phenomena which characterize its evolution. We report on experiments on local and global properties of the web graph using two Altavista crawls each with over 200 million pages and 1.5 billion links. Our study indicates that the macroscopic structure of the web is considerably more intricate than suggested by earlier experiments on a smaller scale.

2,973 citations

Proceedings Article
25 Jan 2015
TL;DR: The aim of NR is to make it easy to discover key insights into the data extremely fast with little effort while also providing a medium for users to share data, visualizations, and insights.
Abstract: The Network Data Repository (NR) is the first interactive data repository with a web-based platform for visual interactive analytics. Unlike other data repositories (e.g., the UCI ML Data Repository and SNAP), the network data repository (networkrepository.com) allows users not only to download, but to interactively analyze and visualize such data using our web-based interactive graph analytics platform. Users can in real time analyze, visualize, compare, and explore data along many different dimensions. The aim of NR is to make it easy to discover key insights into the data extremely fast with little effort, while also providing a medium for users to share data, visualizations, and insights. Other key factors that differentiate NR from current data repositories are the number of graph datasets, their size, and their variety. While other data repositories are static, they also lack a means for users to collaboratively discuss a particular dataset, corrections, or challenges with using the data for certain applications. In contrast, NR incorporates many social and collaborative aspects that facilitate scientific research, e.g., users can discuss each graph, post observations, and share visualizations.

1,767 citations


"Compact structure for sparse undire..." refers background or methods in this paper

  • ...disease analysis [5], community discovery [1, 2, 6], recommender systems [7], graph compression [8, 9, 10], measuring relevance of network actors [11, 12], and network visualization [13, 14]....


  • ...These substructures have been used for improving network analysis, graph compression [9, 20], and visualization [13]....
