
Showing papers on "Clique percolation method" published in 2004


Journal Article
TL;DR: It is demonstrated that the algorithms proposed are highly effective at discovering community structure in both computer-generated and real-world network data, and can be used to shed light on the sometimes dauntingly complex structure of networked systems.
Abstract: We propose and study a set of algorithms for discovering community structure in networks, that is, natural divisions of network nodes into densely connected subgroups. Our algorithms all share two definitive features: first, they involve iterative removal of edges from the network to split it into communities, the edges removed being identified using any one of a number of possible "betweenness" measures, and second, these measures are, crucially, recalculated after each removal. We also propose a measure for the strength of the community structure found by our algorithms, which gives us an objective metric for choosing the number of communities into which a network should be divided. We demonstrate that our algorithms are highly effective at discovering community structure in both computer-generated and real-world network data, and show how they can be used to shed light on the sometimes dauntingly complex structure of networked systems.

12,882 citations
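
The edge-removal procedure described in the abstract above can be illustrated with a minimal sketch. It assumes an unweighted, undirected graph without self-loops, stored as a dict of neighbour sets, and it uses shortest-path edge betweenness, which is only one of the "betweenness" measures the abstract allows. The function names (`edge_betweenness`, `components`, `split_once`) are hypothetical and are not taken from the paper.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path edge betweenness (Brandes-style accumulation).
    Every path is counted from both endpoints, which does not change the ranking."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma, preds = {s: 0}, {s: 1.0}, {s: []}
        order, queue = [], deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    sigma[w] = 0.0
                    preds[w] = []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]  # count shortest paths reaching w
                    preds[w].append(v)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):  # push path counts back towards s
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    return score

def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def split_once(adj):
    """Remove the highest-betweenness edge, recomputing betweenness after
    every removal, until the number of components increases."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        score = edge_betweenness(adj)
        if not score:  # no edges left to remove
            break
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)
```

Calling `split_once` repeatedly on the resulting pieces would give the full hierarchy of divisions; choosing where to stop is the job of the strength measure mentioned in the abstract, which this sketch leaves out.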


Journal Article
TL;DR: A hierarchical agglomeration algorithm for detecting community structure which is faster than many competing algorithms: its running time on a network with n vertices and m edges is O(md log n), where d is the depth of the dendrogram describing the community structure.
Abstract: The discovery and analysis of community structure in networks is a topic of considerable recent interest within the physics community, but most methods proposed so far are unsuitable for very large networks because of their computational cost. Here we present a hierarchical agglomeration algorithm for detecting community structure which is faster than many competing algorithms: its running time on a network with n vertices and m edges is O(md log n), where d is the depth of the dendrogram describing the community structure. Many real-world networks are sparse and hierarchical, with m approximately n and d approximately log n, in which case our algorithm runs in essentially linear time, O(n log² n). As an example of the application of this algorithm we use it to analyze a network of items for sale on the web site of a large on-line retailer, items in the network being linked if they are frequently purchased by the same buyer. The network has more than 400,000 vertices and 2 × 10⁶ edges. We show that our algorithm can extract meaningful communities from this network, revealing large-scale patterns present in the purchasing habits of customers.

6,599 citations
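
To make the idea of hierarchical agglomeration concrete, here is a deliberately naive sketch. It assumes that merges are chosen greedily to maximize the modularity Q = Σ_c [e_c/m − (d_c/2m)²], where e_c is the number of edges inside community c and d_c the total degree of its vertices. It re-evaluates Q from scratch for every candidate merge, so it does not reach the O(md log n) running time quoted above, which depends on more careful data structures. The names `modularity`, `merge` and `greedy_agglomerate` are hypothetical.

```python
def modularity(edges, communities):
    """Q = sum over communities of e_c/m - (d_c/(2m))^2 for an undirected edge list."""
    m = len(edges)
    label = {v: i for i, c in enumerate(communities) for v in c}
    inside = [0] * len(communities)   # edges with both ends in the community
    degree = [0] * len(communities)   # summed degree of the community
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            inside[label[u]] += 1
    return sum(inside[i] / m - (degree[i] / (2 * m)) ** 2
               for i in range(len(communities)))

def merge(communities, i, j):
    """Return a new partition with communities i and j joined."""
    rest = [c for k, c in enumerate(communities) if k not in (i, j)]
    return rest + [communities[i] | communities[j]]

def greedy_agglomerate(edges):
    """Start from singletons, repeatedly apply the connected merge with the
    highest resulting Q, and return the best partition seen along the way."""
    nodes = sorted({v for e in edges for v in e})
    comms = [{v} for v in nodes]
    best, best_q = [set(c) for c in comms], modularity(edges, comms)
    while len(comms) > 1:
        label = {v: i for i, c in enumerate(comms) for v in c}
        # only communities joined by at least one edge are merge candidates
        pairs = {tuple(sorted((label[u], label[v])))
                 for u, v in edges if label[u] != label[v]}
        if not pairs:
            break
        i, j = max(pairs, key=lambda p: modularity(edges, merge(comms, *p)))
        comms = merge(comms, i, j)
        q = modularity(edges, comms)
        if q > best_q:
            best, best_q = [set(c) for c in comms], q
    return best, best_q
```

The sequence of merges is the dendrogram referred to in the abstract; the sketch keeps only the level with the highest Q rather than the whole tree.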


Journal Article
TL;DR: An algorithm is described which gives excellent results when tested on both computer-generated and real-world networks and is much faster, typically thousands of times faster, than previous algorithms.
Abstract: Many networks display community structure (groups of vertices within which connections are dense but between which they are sparser), and sensitive computer algorithms have in recent years been developed for detecting this structure. These algorithms, however, are computationally demanding, which limits their application to small networks. Here we describe an algorithm which gives excellent results when tested on both computer-generated and real-world networks and is much faster, typically thousands of times faster, than previous algorithms. We give several example applications, including one to a collaboration network of more than 50,000 physicists.

5,127 citations


Journal Article
TL;DR: This article proposes a local algorithm to detect communities which outperforms the existing algorithms with respect to computational cost while keeping the same level of reliability, and applies it to a network of scientific collaborations which, because of its size, cannot be attacked with the usual methods.
Abstract: The investigation of community structures in networks is an important issue in many domains and disciplines. This problem is relevant for social tasks (objective analysis of relationships on the web), biological inquiries (functional studies in metabolic and protein networks), or technological problems (optimization of large infrastructures). Several types of algorithms exist for revealing the community structure in networks, but a general and quantitative definition of community is not implemented in the algorithms, leading to an intrinsic difficulty in the interpretation of the results without any additional nontopological information. In this article we deal with this problem by showing how quantitative definitions of community are implemented in practice in the existing algorithms. In this way the algorithms for the identification of the community structure become fully self-contained. Furthermore, we propose a local algorithm to detect communities which outperforms the existing algorithms with respect to computational cost, keeping the same level of reliability. The algorithm is tested on artificial and real-world graphs. In particular, we show how the algorithm applies to a network of scientific collaborations, which, for its size, cannot be attacked with the usual methods. This type of local algorithm could open the way to applications to large-scale technological and biological systems.

2,309 citations
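
The abstract above does not spell out its quantitative definitions of community. As an illustration only, the sketch below assumes the two degree-based criteria commonly associated with this line of work: a subgraph is a community in the "strong" sense if every member has more neighbours inside the subgraph than outside it, and in the "weak" sense if the summed inside-degree of the members exceeds their summed outside-degree. The function name `community_strength` and the dict-of-sets graph representation are hypothetical.

```python
def community_strength(adj, members):
    """Return (strong, weak) for a candidate community.

    adj     -- dict mapping each node to the set of its neighbours
    members -- iterable of nodes forming the candidate community
    """
    members = set(members)
    k_in = {v: len(adj[v] & members) for v in members}   # links kept inside
    k_out = {v: len(adj[v] - members) for v in members}  # links leaving
    strong = all(k_in[v] > k_out[v] for v in members)
    weak = sum(k_in.values()) > sum(k_out.values())
    return strong, weak
```

A check of this kind gives a divisive or agglomerative procedure a purely topological stopping rule: a proposed split can be accepted only if the resulting pieces still satisfy the chosen definition.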


Journal Article
TL;DR: A number of more recent algorithms that appear to work well with real-world network data, including algorithms based on edge betweenness scores, on counts of short loops in networks and on voltage differences in resistor networks are described.
Abstract: There has been considerable recent interest in algorithms for finding communities in networks, that is, groups of vertices within which connections are dense, but between which connections are sparser. Here we review the progress that has been made towards this end. We begin by describing some traditional methods of community detection, such as spectral bisection, the Kernighan-Lin algorithm and hierarchical clustering based on similarity measures. None of these methods, however, is ideal for the types of real-world network data with which current research is concerned, such as Internet and web data and biological and social networks. We describe a number of more recent algorithms that appear to work well with these data, including algorithms based on edge betweenness scores, on counts of short loops in networks and on voltage differences in resistor networks.

2,032 citations


Journal Article
TL;DR: It is shown both numerically and analytically that random graphs and scale-free networks have modularity and it is argued that this fact must be taken into consideration to define statistically significant modularity in complex networks.
Abstract: The mechanisms by which modularity emerges in complex networks are not well understood but recent reports have suggested that modularity may arise from evolutionary selection. We show that finding the modularity of a network is analogous to finding the ground-state energy of a spin system. Moreover, we demonstrate that, due to fluctuations, stochastic network models give rise to modular networks. Specifically, we show both numerically and analytically that random graphs and scale-free networks have modularity. We argue that this fact must be taken into consideration to define statistically significant modularity in complex networks.

881 citations


Journal Article
TL;DR: An efficient and relatively fast algorithm for the detection of communities in complex networks is introduced that exploits spectral properties of the graph Laplacian matrix combined with hierarchical clustering techniques, and includes a procedure for maximizing the 'modularity' of the output.
Abstract: An efficient and relatively fast algorithm for the detection of communities in complex networks is introduced. The method exploits spectral properties of the graph Laplacian matrix combined with hierarchical clustering techniques, and includes a procedure for maximizing the 'modularity' of the output. Its performance is compared with that of other existing methods, as applied to different well-known instances of complex networks with a community structure, both computer generated and from the real world. Our results are, in all the cases tested, at least as good as the best ones obtained with any other methods, and faster in most of the cases than methods providing similar quality results. This converts the algorithm into a valuable computational tool for detecting and analysing communities and modular structures in complex networks.

451 citations


Proceedings Article
10 Jun 2004
TL;DR: This work proposes using a mixed clustering technique called clustering in quest (CLIQUE) (R. Agrawal et al., 1998) in experiments with KDD Cup '99 data to detect attacks efficiently, on the assumption that CLIQUE can handle large databases of high-dimensional network traffic data efficiently.
Abstract: We propose a grid-based technique to mine the KDD Cup '99 data. We propose the novel idea of using a mixed clustering technique called clustering in quest (CLIQUE) (R. Agrawal et al., 1998) in experiments with KDD Cup '99 data to detect attacks efficiently. The novelty lies in the fact that CLIQUE had never been used on network traffic data. The results produced by CLIQUE when evaluated on synthetic data sets improved as the dimensionality of the data increased. Based on these results we assumed that CLIQUE can handle large databases of high-dimensional network traffic data efficiently. The CLIQUE clustering technique is a combination of grid-based clustering and density-based clustering (R. Agrawal et al., 1998).

7 citations