Pseudo-likelihood methods for community detection in large sparse networks

doi:10.1214/13-AOS1138

Open Access, Journal Article

Arash A. Amini, +3 more

- 10 Jul 2012 -

arXiv: Social and Information Networks

TL;DR: It is proved that pseudo-likelihood provides consistent estimates of the communities under a mild condition on the starting value, for the case of a block model with two communities.

Abstract:

Many algorithms have been proposed for fitting network models with communities, but most of them do not scale well to large networks, and often fail on sparse networks. Here we propose a new fast pseudo-likelihood method for fitting the stochastic block model for networks, as well as a variant that allows for an arbitrary degree distribution by conditioning on degrees. We show that the algorithms perform well under a range of settings, including on very sparse networks, and illustrate on the example of a network of political blogs. We also propose spectral clustering with perturbations, a method of independent interest, which works well on sparse networks where regular spectral clustering fails, and use it to provide an initial value for pseudo-likelihood. We prove that pseudo-likelihood provides consistent estimates of the communities under a mild condition on the starting value, for the case of a block model with two communities.
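The initialization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the perturbation, so the form used here is an assumption: a small constant, proportional to the average degree divided by n, is added to every entry of the adjacency matrix before a normalized spectral split. The function names, the value tau=0.25, the sign-based split, and the simulation parameters are all illustrative choices. The pseudo-likelihood fit itself is not shown.

```python
import numpy as np

def sample_two_block_sbm(n, p_in, p_out, rng):
    """Sample a symmetric two-community block model (n assumed even)."""
    z = np.repeat([0, 1], n // 2)                      # true community labels
    probs = np.where(z[:, None] == z[None, :], p_in, p_out)
    upper = np.triu(rng.random((n, n)) < probs, k=1)   # independent edges above the diagonal
    A = (upper | upper.T).astype(float)                # symmetrize, no self-loops
    return A, z

def perturbed_spectral_split(A, tau=0.25):
    """Two-way spectral split on a perturbed adjacency matrix.

    Assumption: the perturbation adds tau * (average degree) / n to every entry.
    """
    n = A.shape[0]
    mean_degree = A.sum() / n
    A_tau = A + tau * mean_degree / n                  # weakly connects all node pairs
    d = A_tau.sum(axis=1)
    inv_sqrt_d = 1.0 / np.sqrt(d)
    L = inv_sqrt_d[:, None] * A_tau * inv_sqrt_d[None, :]  # D^{-1/2} A_tau D^{-1/2}
    _, eigvecs = np.linalg.eigh(L)                     # eigenvalues in ascending order
    return (eigvecs[:, -2] > 0).astype(int)            # sign of the second leading eigenvector

def accuracy_up_to_swap(pred, truth):
    """Fraction of correctly labeled nodes, ignoring which label is called 0 or 1."""
    agree = np.mean(pred == truth)
    return max(agree, 1.0 - agree)

rng = np.random.default_rng(0)
A, z = sample_two_block_sbm(200, 0.30, 0.05, rng)
labels = perturbed_spectral_split(A)
acc = accuracy_up_to_swap(labels, z)
```

The labels returned here would serve only as a starting value. On a much sparser graph the unperturbed matrix may have isolated nodes or small components, which is the situation the perturbation is meant to handle.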

Citations

Journal Article

Spectral redemption in clustering sparse networks

Florent Krzakala, +6 more

- 24 Dec 2013 -

Proceedings of the National Academy of S...

TL;DR: This work introduces a way of encoding sparse data using a “nonbacktracking” matrix and shows that the corresponding spectral algorithm performs optimally for some popular generative models, including the stochastic block model.
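As a small illustration of the nonbacktracking matrix, the sketch below builds it for an undirected edge list using the standard definition: rows and columns are indexed by directed edges, and the entry for (u→v, w→x) is 1 when v = w and x ≠ u. The function name and the triangle example are hypothetical, and the dense double loop is only suitable for tiny graphs.

```python
import numpy as np

def nonbacktracking_matrix(edges):
    """Build the nonbacktracking matrix of an undirected simple graph.

    Each undirected edge {u, v} contributes the directed edges u->v and v->u.
    """
    directed = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    index = {e: i for i, e in enumerate(directed)}
    B = np.zeros((len(directed), len(directed)))
    for (u, v), i in index.items():
        for (w, x), j in index.items():
            # A walk may continue from u->v along v->x, but may not return to u.
            if v == w and x != u:
                B[i, j] = 1.0
    return B, directed

triangle = [(0, 1), (1, 2), (0, 2)]
B, directed = nonbacktracking_matrix(triangle)
row_sums = B.sum(axis=1)                     # each row sums to deg(v) - 1
spectrum = np.linalg.eigvals(B)              # complex in general, since B is not symmetric
```

The spectral algorithm summarized above works with the leading eigenvectors of this matrix. This sketch stops at constructing the matrix and its spectrum.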

Journal Article

A useful variant of the Davis-Kahan theorem for statisticians

Yi Yu, +2 more

- 01 Jun 2015 -

Biometrika

TL;DR: In this paper, the authors present a variant of the Davis-Kahan theorem that relies only on a population eigenvalue separation condition, making it more natural and convenient for direct application in statistical contexts, and provide an improvement in many cases to the usual bound.

Journal Article

Matrix estimation by Universal Singular Value Thresholding

Sourav Chatterjee

- 06 Dec 2012 -

arXiv: Statistics Theory

TL;DR: This paper introduces a simple estimation procedure, called Universal Singular Value Thresholding (USVT), that works for any matrix that has "a little bit of structure" and achieves the minimax error rate up to a constant factor.

Journal Article

Consistency of spectral clustering in stochastic block models

Jing Lei, +1 more

- 07 Dec 2013 -

arXiv: Statistics Theory

TL;DR: It is shown that, under mild conditions, spectral clustering applied to the adjacency matrix of the network can consistently recover hidden communities even when the order of the maximum expected degree is as small as $\log n$ with $n$ the number of nodes.

Journal Article

Spectral methods for community detection and graph partitioning.

Mark Newman

- 30 Oct 2013 -

Physical Review E

TL;DR: It is shown that, with certain choices of the free parameters appearing in these spectral algorithms, the algorithms for all three problems are identical, and hence there is no difference between the modularity- and inference-based community detection methods, or between either and graph partitioning.

References

Journal Article

Normalized cuts and image segmentation

Jianbo Shi, +1 more

- 01 Aug 2000 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.

Journal Article

Finding and evaluating community structure in networks.

Mark Newman, +3 more

- 26 Feb 2004 -

Physical Review E

TL;DR: It is demonstrated that the algorithms proposed are highly effective at discovering community structure in both computer-generated and real-world network data, and can be used to shed light on the sometimes dauntingly complex structure of networked systems.

Proceedings Article

Normalized cuts and image segmentation

Jianbo Shi, +1 more

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.

Journal Article

Modularity and community structure in networks

Mark Newman

- 06 Jun 2006 -

Proceedings of the National Academy of S...

TL;DR: In this article, the modularity of a network is expressed in terms of the eigenvectors of a characteristic matrix for the network, which is then used for community detection.
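A minimal sketch of the eigenvector idea, assuming the characteristic matrix is the usual modularity matrix B = A - k kᵀ / (2m), where k is the degree vector and m the number of edges. Nodes are split by the sign of the leading eigenvector of B. The function name and the toy graph (two triangles joined by one edge) are illustrative only.

```python
import numpy as np

def leading_eigenvector_split(A):
    """Split a graph in two using the leading eigenvector of the modularity matrix."""
    k = A.sum(axis=1)                        # degree vector
    two_m = k.sum()                          # twice the number of edges
    B = A - np.outer(k, k) / two_m           # modularity matrix
    _, eigvecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    s = np.where(eigvecs[:, -1] >= 0, 1.0, -1.0)   # group membership as +1 / -1
    Q = s @ B @ s / (2.0 * two_m)            # modularity of the split: s^T B s / (4m)
    return s, Q

# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

s, Q = leading_eigenvector_split(A)          # separates the two triangles, Q = 10/28
```

The overall sign of an eigenvector is arbitrary, so which triangle receives +1 can vary between linear algebra backends. Only the grouping is meaningful.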

Journal Article

Community detection in graphs

Santo Fortunato

- 03 Jun 2009 -

arXiv: Physics and Society

TL;DR: A thorough exposition of community structure, or clustering, is attempted, from the definition of the main elements of the problem, to the presentation of most methods developed, with a special focus on techniques designed by statistical physicists.
