(Open Access) Finding the Hierarchy of Dense Subgraphs using Nucleus Decompositions (2014) | Ahmet Erdem Sariyuce

Citations

Journal Article•DOI•

Attribute-driven community search

Xin Huang¹, Laks V. S. Lakshmanan²•Institutions (2)

Hong Kong Baptist University¹, University of British Columbia²

01 May 2017

TL;DR: This paper develops an efficient greedy algorithmic framework that iteratively removes nodes with the least popular attributes to shrink the graph into an attributed truss community (ATC), builds an index to maintain k-truss structure and attribute information, and proposes efficient query processing algorithms.

Abstract: Recently, community search over graphs has gained significant interest. In applications such as analysis of protein-protein interaction (PPI) networks, citation graphs, and collaboration networks, nodes tend to have attributes. Unfortunately, most previous community search algorithms ignore attributes and result in communities with poor cohesion w.r.t. their node attributes. In this paper, we study the problem of attribute-driven community search, that is, given an undirected graph G where nodes are associated with attributes, and an input query Q consisting of nodes Vq and attributes Wq, find the communities containing Vq, in which most community members are densely inter-connected and have similar attributes. We formulate this problem as finding attributed truss communities (ATC), i.e., finding connected and close k-truss subgraphs containing Vq, with the largest attribute relevance score. We design a framework of desirable properties that a good score function should satisfy. We show that the problem is NP-hard. However, we develop an efficient greedy algorithmic framework to iteratively remove nodes with the least popular attributes, and shrink the graph into an ATC. In addition, we also build an elegant index to maintain k-truss structure and attribute information, and propose efficient query processing algorithms. Extensive experiments on large real-world networks with ground-truth communities show that our algorithms significantly outperform the state of the art and demonstrate their efficiency and effectiveness.

198 citations

Cites background from "Finding the Hierarchy of Dense Subg..."

...Besides k-truss, there exist several other definitions of dense subgraphs including: k-(r,s)-nucleus [34], quasiclique [10], densest subgraph [37], and k-core [35]....
[...]

Journal Article•DOI•

Truss-based community search: a truss-equivalence based indexing approach

Esra Akbas¹, Peixiang Zhao¹•Institutions (1)

Florida State University¹

01 Aug 2017

TL;DR: A novel equivalence relation, k-truss equivalence, is introduced to model the intrinsic density and cohesiveness of edges in k-truss communities, and experiments validate the efficiency and effectiveness of the resulting index, EquiTruss.

Abstract: We consider the community search problem defined upon a large graph G: given a query vertex q in G, to find as output all the densely connected subgraphs of G, each of which contains the query vertex q. As an online, query-dependent variant of the well-known community detection problem, community search enables personalized community discovery that has found widely varying applications in real-world, large-scale graphs. In this paper, we study the community search problem in the truss-based model aimed at discovering all dense and cohesive k-truss communities to which the query vertex q belongs. We introduce a novel equivalence relation, k-truss equivalence, to model the intrinsic density and cohesiveness of edges in k-truss communities. Consequently, all the edges of G can be partitioned into a series of k-truss equivalence classes that constitute a space-efficient, truss-preserving index structure, EquiTruss. Community search can be henceforth addressed directly upon EquiTruss without repeated, time-demanding accesses to the original graph, G, which proves to be theoretically optimal. In addition, EquiTruss can be efficiently updated in a dynamic fashion when G evolves with edge insertion and deletion. Experimental studies in real-world, large-scale graphs validate the efficiency and effectiveness of EquiTruss, which has achieved at least an order of magnitude speedup in community search over the state-of-the-art method, TCP-Index.

127 citations

Cites background from "Finding the Hierarchy of Dense Subg..."

...There has been a rich literature in modelling and quantification of dense and cohesive graphs, including clique or quasi-clique [6, 31], k-core [25, 1, 15], and nucleus [26, 27]....
[...]

Proceedings Article•DOI•

Scalable Large Near-Clique Detection in Large-Scale Networks via Sampling

Michael Mitzenmacher¹, Jakub Pachocki², Richard Peng³, Charalampos E. Tsourakakis¹, Shen Chen Xu²•Institutions (3)

Harvard University¹, Carnegie Mellon University², Massachusetts Institute of Technology³

10 Aug 2015

TL;DR: This paper presents a sampling scheme that gives a densest subgraph sparsifier, yielding a randomized algorithm that produces high-quality approximations while providing significant speedups and improved space complexity, and devises an exact algorithm that can treat both clique and biclique densities in a unified way.

Abstract: Extracting dense subgraphs from large graphs is a key primitive in a variety of graph mining applications, ranging from mining social networks and the Web graph to bioinformatics [41]. In this paper we focus on a family of poly-time solvable formulations, known as the k-clique densest subgraph problem (k-Clique-DSP) [57]. When k=2, the problem becomes the well-known densest subgraph problem (DSP) [22, 31, 33, 39]. Our main contribution is a sampling scheme that gives a densest subgraph sparsifier, yielding a randomized algorithm that produces high-quality approximations while providing significant speedups and improved space complexity. We also extend this family of formulations to bipartite graphs by introducing the (p,q)-biclique densest subgraph problem ((p,q)-Biclique-DSP), and devise an exact algorithm that can treat both clique and biclique densities in a unified way. As an example of performance, our sparsifying algorithm extracts the 5-clique densest subgraph, which is a large near-clique on 62 vertices, from a large collaboration network. Our algorithm achieves 100% accuracy over five runs, while achieving an average speedup factor of over 10,000. Specifically, we reduce the running time from ∼2,107 seconds to an average running time of 0.15 seconds. We also use our methods to study how the k-clique densest subgraphs change as a function of time in time-evolving networks for various small values of k. We observe significant deviations between the experimental findings on real-world networks and stochastic Kronecker graphs, a random graph model that mimics real-world networks in certain aspects. We believe that our work is a significant advance in routines with rigorous theoretical guarantees for scalable extraction of large near-cliques from networks.

120 citations

Cites methods from "Finding the Hierarchy of Dense Subg..."

...They are also used for expert team formation [19, 55], detecting link spam in Web graphs [32], graph compression [21], reachability and distance query indexing [36], insightful graph decompositions [51] and mining micro-blogging streams [9]....
[...]

Proceedings Article•DOI•

Listing k-cliques in Sparse Real-World Graphs

Maximilien Danisch¹, Oana Balalau², Mauro Sozio³•Institutions (3)

University of Paris¹, Max Planck Society², Télécom ParisTech³

23 Apr 2018

TL;DR: This work revisits the iconic algorithm of Chiba and Nishizeki and develops the most efficient parallel algorithm for listing all k-cliques in graphs containing up to tens of millions of edges, which is faster than state-of-the-art algorithms, while boasting an excellent degree of parallelism.

Abstract: Motivated by recent studies in the data mining community that require efficiently listing all k-cliques, we revisit the iconic algorithm of Chiba and Nishizeki and develop the most efficient parallel algorithm for such a problem. Our theoretical analysis provides the best asymptotic upper bound on the running time of our algorithm for the case when the input graph is sparse. Our experimental evaluation on large real-world graphs shows that our parallel algorithm is faster than state-of-the-art algorithms, while boasting an excellent degree of parallelism. In particular, we are able to list all k-cliques (for any k) in graphs containing up to tens of millions of edges as well as all 10-cliques in graphs containing billions of edges, within a few minutes and a few hours respectively. Finally, we show how our algorithm can be employed as an effective subroutine for finding the k-clique core decomposition and approximate k-clique densest subgraphs in very large real-world graphs.

109 citations

Cites methods from "Finding the Hierarchy of Dense Subg..."

...In [46] an algorithm for organizing cliques into hierarchical structures is presented, which requires to list all k-cliques....
[...]

Posted Content•

ESCAPE: Efficiently Counting All 5-Vertex Subgraphs

Ali Pinar¹, C. Seshadhri², Vaidyanathan Vishal•Institutions (2)

Sandia National Laboratories¹, University of California, Santa Cruz²

28 Oct 2016-arXiv: Social and Information Networks

TL;DR: It is proved that it suffices to enumerate only four specific subgraphs (three of them have fewer than 5 vertices) to exactly count all 5-vertex patterns; the result is the first practical algorithm for 5-vertex pattern counting that runs at this scale, computing counts on graphs with tens of millions of edges in minutes on a commodity machine.

Abstract: Counting the frequency of small subgraphs is a fundamental technique in network analysis across various domains, most notably in bioinformatics and social networks. The special case of triangle counting has received much attention. Getting results for 4-vertex or 5-vertex patterns is highly challenging, and there are few practical results known that can scale to massive sizes. We introduce an algorithmic framework that can be adopted to count any small pattern in a graph and apply this framework to compute exact counts for all 5-vertex subgraphs. Our framework is built on cutting a pattern into smaller ones, and using counts of smaller patterns to get larger counts. Furthermore, we exploit degree orientations of the graph to reduce runtimes even further. These methods avoid the combinatorial explosion that typical subgraph counting algorithms face. We prove that it suffices to enumerate only four specific subgraphs (three of them have fewer than 5 vertices) to exactly count all 5-vertex patterns. We perform extensive empirical experiments on a variety of real-world graphs. We are able to compute counts of graphs with tens of millions of edges in minutes on a commodity machine. To the best of our knowledge, this is the first practical algorithm for 5-vertex pattern counting that runs at this scale. A stepping stone to our main algorithm is a fast method for counting all 4-vertex patterns. This algorithm is typically ten times faster than the state-of-the-art 4-vertex counters.

109 citations
