Qualitative Comparison of Community Detection Algorithms

doi:10.1007/978-3-642-22027-2_23

Book Chapter•DOI•

Günce Keziban Orman¹,², Vincent Labatut¹, Hocine Cherifi²•Institutions (2)

Galatasaray University¹, University of Burgundy²

16 Jul 2012-arXiv: Social and Information Networks

TL;DR: This study generates networks with the most realistic model available to date and applies five community detection algorithms to them, finding that quantitatively assessed performance does not necessarily agree with a qualitative analysis of the identified communities.

Abstract: Community detection is a very active field in complex network analysis. It consists of identifying groups of nodes that are more densely interconnected with each other than with the rest of the network. Existing algorithms are usually tested and compared on real-world and artificial networks, their performance being assessed through some partition similarity measure. However, the realism of artificial networks can be questioned, and the appropriateness of those measures is not obvious. In this study, we take advantage of recent advances in the characterization of community structures to tackle these questions. We first generate networks using the most realistic model available to date. Their analysis reveals that they display only some of the properties observed in real-world community structures. We then apply five community detection algorithms to these networks and find that quantitatively assessed performance does not necessarily agree with a qualitative analysis of the identified communities. It therefore seems that both approaches should be applied to perform a relevant comparison of the algorithms.

Citations

Proceedings Article•DOI•

Random graphs

Alan Frieze¹•Institutions (1)

Carnegie Mellon University¹

22 Jan 2006

TL;DR: Some of the major results in random graphs and some of the more challenging open problems are reviewed, including those related to the WWW.

Abstract: We will review some of the major results in random graphs and some of the more challenging open problems. We will cover algorithmic and structural questions. We will touch on newer models, including those related to the WWW.

7,116 citations

Journal Article•DOI•

Consensus clustering in complex networks

Andrea Lancichinetti¹,², Santo Fortunato²,³•Institutions (3)

Polytechnic University of Turin¹, Institute for Scientific Interchange², Aalto University³

27 Mar 2012-Scientific Reports

TL;DR: It is shown that consensus clustering can be combined with any existing method in a self-consistent way, enhancing considerably both the stability and the accuracy of the resulting partitions.

Abstract: The community structure of complex networks reveals both their organization and hidden relationships among their constituents. Most community detection methods currently available are not deterministic, and their results typically depend on the specific random seeds, initial conditions and tie-break rules adopted for their execution. Consensus clustering is used in data analysis to generate stable results out of a set of partitions delivered by stochastic methods. Here we show that consensus clustering can be combined with any existing method in a self-consistent way, enhancing considerably both the stability and the accuracy of the resulting partitions. This framework is also particularly suitable to monitor the evolution of community structure in temporal networks. An application of consensus clustering to a large citation network of physics papers demonstrates its capability to keep track of the birth, death and diversification of topics.

727 citations

Proceedings Article•DOI•

High quality, scalable and parallel community detection for large real graphs

Arnau Prat-Pérez¹, David Dominguez-Sal, Josep Lluís Larriba-Pey¹•Institutions (1)

Polytechnic University of Catalonia¹

07 Apr 2014

TL;DR: Scalable Community Detection is proposed, a novel disjoint community detection algorithm that is able to run up to two orders of magnitude faster than practical existing solutions by exploiting the parallelism of current multi-core processors, enabling us to process graphs of unprecedented size in short execution times.

Abstract: Community detection has arisen as one of the most relevant topics in the field of graph mining, principally for its applications in domains such as social or biological networks analysis. Different community detection algorithms have been proposed during the last decade, approaching the problem from different perspectives. However, existing algorithms are, in general, based on complex and expensive computations, making them unsuitable for large graphs with millions of vertices and edges such as those usually found in the real world. In this paper, we propose a novel disjoint community detection algorithm called Scalable Community Detection (SCD). By combining different strategies, SCD partitions the graph by maximizing the Weighted Community Clustering (WCC), a recently proposed community detection metric based on triangle analysis. Using real graphs with ground truth overlapped communities, we show that SCD outperforms the current state of the art proposals (even those aimed at finding overlapping communities) in terms of quality and performance. SCD provides the speed of the fastest algorithms and the quality in terms of NMI and F1Score of the most accurate state of the art proposals. We show that SCD is able to run up to two orders of magnitude faster than practical existing solutions by exploiting the parallelism of current multi-core processors, enabling us to process graphs of unprecedented size in short execution times.

146 citations

Cites methods from "Qualitative Comparison of Community..."

...Infomap, Louvain and Walktrap are considered the best algorithms for disjoint community detection, according to [10, 15]....
[...]

Journal Article•DOI•

Comparative Evaluation of Community Detection Algorithms: A Topological Approach

Günce Keziban Orman¹,², Vincent Labatut², Hocine Cherifi¹•Institutions (2)

University of Burgundy¹, Galatasaray University²

21 Jun 2012-arXiv: Social and Information Networks

TL;DR: This comprehensive comparative study evaluates a representative set of community detection methods with both partition-based performance measures and community-oriented topological measures, and finds that the two approaches are not equivalent.

Abstract: Community detection is one of the most active fields in complex networks analysis, due to its potential value in practical applications. Many works inspired by different paradigms are devoted to the development of algorithmic solutions allowing to reveal the network structure in such cohesive subgroups. Comparative studies reported in the literature usually rely on a performance measure considering the community structure as a partition (Rand Index, Normalized Mutual information, etc.). However, this type of comparison neglects the topological properties of the communities. In this article, we present a comprehensive comparative study of a representative set of community detection methods, in which we adopt both types of evaluation. Community-oriented topological measures are used to qualify the communities and evaluate their deviation from the reference structure. In order to mimic real-world systems, we use artificially generated realistic networks. It turns out there is no equivalence between both approaches: a high performance does not necessarily correspond to correct topological properties, and vice-versa. They can therefore be considered as complementary, and we recommend applying both of them in order to perform a complete and accurate assessment.
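As a rough illustration of the partition-based performance measures named in this abstract, the sketch below computes a Normalized Mutual Information score between two partitions. It assumes each partition is given as a list of community labels, one per node, and it uses the arithmetic-mean normalization 2I/(H_a + H_b), which is only one of several normalizations in use and is not necessarily the one applied in the paper. The function name is hypothetical.

```python
from collections import Counter
from math import log


def normalized_mutual_information(part_a, part_b):
    """NMI between two partitions given as equal-length lists of labels."""
    n = len(part_a)
    if n == 0 or n != len(part_b):
        raise ValueError("partitions must be non-empty and of equal length")

    count_a = Counter(part_a)              # community sizes in partition A
    count_b = Counter(part_b)              # community sizes in partition B
    joint = Counter(zip(part_a, part_b))   # sizes of pairwise overlaps

    def entropy(counts):
        return -sum((c / n) * log(c / n) for c in counts.values())

    h_a, h_b = entropy(count_a), entropy(count_b)
    mutual = sum(
        (c / n) * log((c * n) / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    if h_a + h_b == 0:
        # Both partitions place every node in a single community.
        return 1.0
    # Assumed normalization: mutual information over the mean entropy.
    return 2 * mutual / (h_a + h_b)
```

The score depends only on how nodes are grouped, not on the label values, so relabeling the communities of one partition leaves it unchanged. It says nothing about the topology of the communities, which is the limitation the abstract points out.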

121 citations

Cites methods from "Qualitative Comparison of Community..."

...edness has a different distribution, which depends on the considered network class. To overcome this drawback, we modified the LFR model so that it produces a more realistic embeddedness distribution [32]. After some tests, we decided to focus on three classes in particular, because the generated networks were globally more similar to them: communication, Internet and biological networks. In our modif...
[...]

Journal Article•DOI•

Surprise maximization reveals the community structure of complex networks

Rodrigo Aldecoa¹, Ignacio Marín¹•Institutions (1)

Spanish National Research Council¹

14 Jan 2013-Scientific Reports

TL;DR: It is concluded that Surprise maximization precisely reveals the community structure of complex networks.

Abstract: How to determine the community structure of complex networks is an open question. It is critical to establish the best strategies for community detection in networks of unknown structure. Here, using standard synthetic benchmarks, we show that none of the algorithms hitherto developed for community structure characterization perform optimally. Significantly, evaluating the results according to their modularity, the most popular measure of the quality of a partition, systematically provides mistaken solutions. However, a novel quality function, called Surprise, can be used to elucidate which is the optimal division into communities. Consequently, we show that the best strategy to find the community structure of all the networks examined involves choosing among the solutions provided by multiple algorithms the one with the highest Surprise value. We conclude that Surprise maximization precisely reveals the community structure of complex networks.

94 citations

References

Journal Article•DOI•

Emergence of Scaling in Random Networks

Albert-László Barabási¹, Réka Albert¹•Institutions (1)

University of Notre Dame¹

15 Oct 1999-Science

TL;DR: A model based on two ingredients, continuous network growth and preferential attachment of new vertices to well-connected ones, reproduces the observed stationary scale-free distributions, which indicates that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems.

Abstract: Systems as diverse as genetic networks or the World Wide Web are best described as networks with complex topology. A common property of many large networks is that the vertex connectivities follow a scale-free power-law distribution. This feature was found to be a consequence of two generic mechanisms: (i) networks expand continuously by the addition of new vertices, and (ii) new vertices attach preferentially to sites that are already well connected. A model based on these two ingredients reproduces the observed stationary scale-free distributions, which indicates that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems.
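The two mechanisms described in this abstract, growth and preferential attachment, can be sketched in a few lines. This is a minimal illustration and not the authors' code: the function name, the parameters `n` (final number of nodes) and `m` (links added per new node), and the choice of starting from `m` isolated seed nodes are all assumptions made for the example.

```python
import random


def preferential_attachment_graph(n, m, seed=None):
    """Grow an undirected graph to n nodes; each new node adds m edges."""
    if m < 1 or n <= m:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    # Assumed initial condition: the first new node links to m seed nodes.
    targets = list(range(m))
    # Every node appears here once per incident edge, so drawing uniformly
    # from this list selects nodes with probability proportional to degree.
    degree_pool = []
    for new_node in range(m, n):           # (i) growth: one node per step
        edges.extend((new_node, t) for t in targets)
        degree_pool.extend(targets)
        degree_pool.extend([new_node] * m)
        # (ii) preferential attachment: pick m distinct, degree-biased targets
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(degree_pool))
        targets = sorted(chosen)
    return edges
```

For example, `preferential_attachment_graph(1000, 2, seed=42)` returns a list of 1996 edges. Nodes added early tend to accumulate many more links than nodes added late, which is the mechanism behind the heavy-tailed degree distribution.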

33,771 citations

"Qualitative Comparison of Community..." refers methods in this paper

...By applying Barabási & Albert’s preferential attachment model (BA) [21] instead of the CM, the degree correlation and transitivity become more stable relatively to changes in ....
[...]

Journal Article•DOI•

The Structure and Function of Complex Networks

Mark Newman

01 Jan 2003-SIAM Review

TL;DR: Developments in this field are reviewed, including such concepts as the small-world effect, degree distributions, clustering, network correlations, random graph models, models of network growth and preferential attachment, and dynamical processes taking place on networks.

Abstract: Inspired by empirical studies of networked systems such as the Internet, social networks, and biological networks, researchers have in recent years developed a variety of techniques and models to help us understand or predict the behavior of these systems. Here we review developments in this field, including such concepts as the small-world effect, degree distributions, clustering, network correlations, random graph models, models of network growth and preferential attachment, and dynamical processes taking place on networks.

17,647 citations

"Qualitative Comparison of Community..." refers background or methods in this paper

...This is realistic [8], but holds only under certain conditions....
[...]
...By construction, the LFR method guarantees values considered as realistic [1, 8] for several properties: size of the network, power-law distributed degrees and community sizes....
[...]
...Some properties common to most real-world networks are well-identified: power-law distributed degree, small-worldness, non-zero degree correlation and relatively high transitivity [8]....
[...]

Journal Article•DOI•

Community structure in social and biological networks

Michelle Girvan¹, Mark Newman•Institutions (1)

Santa Fe Institute¹

11 Jun 2002-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: This article proposes a method for detecting communities, built around the idea of using centrality indices to find community boundaries, and tests it on computer-generated and real-world graphs whose community structure is already known and finds that the method detects this known structure with high sensitivity and reliability.

Abstract: A number of recent studies have focused on the statistical properties of networked systems such as social networks and the Worldwide Web. Researchers have concentrated particularly on a few properties that seem to be common to many networks: the small-world property, power-law degree distributions, and network transitivity. In this article, we highlight another property that is found in many networks, the property of community structure, in which network nodes are joined together in tightly knit groups, between which there are only looser connections. We propose a method for detecting such communities, built around the idea of using centrality indices to find community boundaries. We test our method on computer-generated and real-world graphs whose community structure is already known and find that the method detects this known structure with high sensitivity and reliability. We also apply the method to two networks whose community structure is not well known—a collaboration network and a food web—and find that it detects significant and informative community divisions in both cases.

14,429 citations

"Qualitative Comparison of Community..." refers background in this paper

...Girvan and Newman seemingly defined the first one [5], which produces networks taking roughly the form of sets of small interconnected Erdős-Rényi...
[...]
...Authors traditionally test their community detection algorithms on real-world and/or artificial networks [5, 6]....
[...]

Journal Article•DOI•

Fast unfolding of communities in large networks

Vincent D. Blondel¹, Jean-Loup Guillaume¹,², Renaud Lambiotte¹,³, Etienne Lefebvre¹•Institutions (3)

Université catholique de Louvain¹, Pierre-and-Marie-Curie University², Imperial College London³

04 Mar 2008-arXiv: Physics and Society

TL;DR: This work proposes a heuristic method that is shown to outperform all other known community detection methods in terms of computation time, while the quality of the detected communities, as measured by modularity, is very good.

Abstract: We propose a simple method to extract the community structure of large networks. Our method is a heuristic method that is based on modularity optimization. It is shown to outperform all other known community detection methods in terms of computation time. Moreover, the quality of the communities detected is very good, as measured by the so-called modularity. This is shown first by identifying language communities in a Belgian mobile phone network of 2.6 million customers and by analyzing a web graph of 118 million nodes and more than one billion links. The accuracy of our algorithm is also verified on ad-hoc modular networks.
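For readers unfamiliar with the modularity score this abstract relies on, the sketch below evaluates it for a given partition of an unweighted, undirected graph, using the standard form Q = Σ_c [ e_c/m − (d_c/2m)² ], where e_c is the number of edges inside community c, d_c the total degree of its nodes and m the number of edges. It only scores a partition and does not implement the optimization heuristic itself. The function name and input layout (an edge list plus a node-to-community mapping) are assumptions.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity of a partition of an unweighted, undirected graph."""
    m = len(edges)
    if m == 0:
        raise ValueError("graph must contain at least one edge")
    internal = defaultdict(int)     # e_c: edges with both ends in c
    degree_sum = defaultdict(int)   # d_c: summed degree of nodes in c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )
```

On two triangles joined by a single edge, splitting the graph into the two triangles gives Q = 5/14, while placing all six nodes in one community gives Q = 0.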

13,519 citations

"Qualitative Comparison of Community..." refers methods in this paper

...Fast Greedy applies a basic greedy approach [3], and Louvain includes a community aggregation step to improve processing on large networks [23]....
[...]

Journal Article•DOI•

Finding and evaluating community structure in networks.

Mark Newman¹,², Michelle Girvan²,³•Institutions (3)

University of Michigan¹, Santa Fe Institute², Cornell University³

26 Feb 2004-Physical Review E

TL;DR: It is demonstrated that the algorithms proposed are highly effective at discovering community structure in both computer-generated and real-world network data, and can be used to shed light on the sometimes dauntingly complex structure of networked systems.

Abstract: We propose and study a set of algorithms for discovering community structure in networks: natural divisions of network nodes into densely connected subgroups. Our algorithms all share two definitive features: first, they involve iterative removal of edges from the network to split it into communities, the edges removed being identified using any one of a number of possible "betweenness" measures, and second, these measures are, crucially, recalculated after each removal. We also propose a measure for the strength of the community structure found by our algorithms, which gives us an objective metric for choosing the number of communities into which a network should be divided. We demonstrate that our algorithms are highly effective at discovering community structure in both computer-generated and real-world network data, and show how they can be used to shed light on the sometimes dauntingly complex structure of networked systems.

12,882 citations

"Qualitative Comparison of Community..." refers background or methods in this paper

...A community roughly corresponds to a group of nodes more densely interconnected, relatively to the rest of the network [3]....
[...]
...Fast Greedy applies a basic greedy approach [3], and Louvain includes a community aggregation step to improve processing on large networks [23]....
[...]