Open Access Journal Article

Pseudo-likelihood methods for community detection in large sparse networks

TLDR
In this paper, the authors propose a fast pseudo-likelihood method for fitting the stochastic block model for networks, along with a variant that allows for an arbitrary degree distribution by conditioning on degrees.
Abstract
Many algorithms have been proposed for fitting network models with communities, but most of them do not scale well to large networks, and often fail on sparse networks. Here we propose a new fast pseudo-likelihood method for fitting the stochastic block model for networks, as well as a variant that allows for an arbitrary degree distribution by conditioning on degrees. We show that the algorithms perform well under a range of settings, including on very sparse networks, and illustrate on the example of a network of political blogs. We also propose spectral clustering with perturbations, a method of independent interest, which works well on sparse networks where regular spectral clustering fails, and use it to provide an initial value for pseudo-likelihood. We prove that pseudo-likelihood provides consistent estimates of the communities under a mild condition on the starting value, for the case of a block model with two communities.
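
The abstract does not spell out how spectral clustering with perturbations works, so the following is only a minimal sketch under stated assumptions: that the perturbation adds a small constant, proportional to the average degree divided by n, to every entry of the adjacency matrix before ordinary two-way spectral clustering is applied. The function names, the tuning constant `alpha`, and the simulated two-block network are illustrative choices, not taken from the text above.

```python
import numpy as np


def sample_two_block_sbm(n, p_in, p_out, seed=0):
    """Sample a symmetric two-community block model adjacency matrix (n even)."""
    rng = np.random.default_rng(seed)
    labels = np.repeat([0, 1], n // 2)
    # Edge probability depends only on whether two nodes share a community.
    P = np.where(labels[:, None] == labels[None, :], p_in, p_out)
    upper = np.triu(rng.random((n, n)) < P, k=1)  # no self-loops
    A = (upper | upper.T).astype(float)
    return A, labels


def perturbed_spectral_clustering(A, alpha=0.25):
    """Two-way spectral clustering on a perturbed adjacency matrix.

    Assumption: the perturbation is a constant alpha * (average degree) / n
    added to every entry; alpha is a hypothetical tuning constant.
    """
    n = A.shape[0]
    avg_degree = A.sum() / n
    A_tau = A + alpha * avg_degree / n
    # Every entry of A_tau is positive, so all degrees are nonzero and the
    # normalization below is well defined even if A has isolated nodes.
    deg = A_tau.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L = d_inv_sqrt[:, None] * A_tau * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; use the eigenvector of
    # the second-largest eigenvalue and split nodes by its sign.
    _, eigvecs = np.linalg.eigh(L)
    return (eigvecs[:, -2] > 0).astype(int)


A, z = sample_two_block_sbm(n=200, p_in=0.3, p_out=0.02, seed=0)
pred = perturbed_spectral_clustering(A)
agreement = np.mean(pred == z)
print("accuracy up to label swap:", max(agreement, 1 - agreement))
```

In a workflow like the one the abstract describes, labels of this kind would serve only as a starting value that the pseudo-likelihood iterations then refine. The sign split assumes two roughly balanced, assortative communities; more communities would need a k-means step on several eigenvectors.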


Citations

Matrix estimation by Universal Singular Value Thresholding

TL;DR: This paper introduces a simple estimation procedure, called Universal Singular Value Thresholding (USVT), that works for any matrix that has "a little bit of structure" and achieves the minimax error rate up to a constant factor.

Spectral methods for community detection and graph partitioning.

TL;DR: It is shown that with certain choices of the free parameters appearing in these spectral algorithms the algorithms for all three problems are identical, and hence there is no difference between the modularity- and inference-based community detection methods, or between either and graph partitioning.

Localization and centrality in networks

TL;DR: An alternative centrality measure based on the nonbacktracking matrix is proposed, which gives results closely similar to the standard eigenvector centrality in dense networks where the latter is well behaved but avoids localization and gives useful results in regimes where the standard centrality fails.

Fast community detection by SCORE

TL;DR: A theoretical framework is developed showing that, under mild conditions, SCORE stably yields successful community detection, with results much more satisfactory than those of classical spectral methods.
References

Normalized cuts and image segmentation

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.

Finding and evaluating community structure in networks.

TL;DR: It is demonstrated that the algorithms proposed are highly effective at discovering community structure in both computer-generated and real-world network data, and can be used to shed light on the sometimes dauntingly complex structure of networked systems.

Modularity and community structure in networks

TL;DR: In this article, the modularity of a network is expressed in terms of the eigenvectors of a characteristic matrix for the network, which is then used for community detection.

Community detection in graphs

TL;DR: This paper gives a thorough exposition of the main elements of the clustering problem, with a special focus on techniques designed by statistical physicists. It ranges from crucial issues, such as the significance of clustering and how methods should be tested and compared against each other, to applications to real networks.