Open Access | Journal Article

Mixed Membership Stochastic Blockmodels

TLDR
In this article, the authors introduce a class of variance allocation models for pairwise measurements, called mixed membership stochastic blockmodels, which combine global parameters that instantiate dense patches of connectivity (blockmodel) with local parameters that instantiate node-specific variability in the connections (mixed membership). They also develop a general variational inference algorithm for fast approximate posterior inference.
Abstract
Consider data consisting of pairwise measurements, such as presence or absence of links between pairs of objects. These data arise, for instance, in the analysis of protein interactions and gene regulatory networks, collections of author-recipient email, and social networks. Analyzing pairwise measurements with probabilistic models requires special assumptions, since the usual independence or exchangeability assumptions no longer hold. Here we introduce a class of variance allocation models for pairwise measurements: mixed membership stochastic blockmodels. These models combine global parameters that instantiate dense patches of connectivity (blockmodel) with local parameters that instantiate node-specific variability in the connections (mixed membership). We develop a general variational inference algorithm for fast approximate posterior inference. We demonstrate the advantages of mixed membership stochastic blockmodels with applications to social networks and protein interaction networks.
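
The split between global and local parameters can be illustrated with a minimal sampling sketch. It assumes the standard form of the model's generative process: each node draws a membership vector from a Dirichlet prior, each ordered pair of nodes draws a sender role and a receiver role from the two membership vectors, and the link is a Bernoulli draw whose probability is read from a block matrix B. The function name `sample_mmsb` and the values chosen for `alpha` and `B` are illustrative assumptions, not taken from the article, and the sketch covers sampling only, not the variational inference algorithm.

```python
import random

def sample_mmsb(n_nodes, alpha, B, seed=0):
    """Sample a directed binary network from an MMSB-style generative process.

    alpha: Dirichlet hyperparameters, one per block (hypothetical values).
    B:     K x K matrix of block-to-block link probabilities (global parameters).
    """
    rng = random.Random(seed)
    K = len(alpha)

    # Local parameters: one mixed-membership vector per node, drawn from
    # Dirichlet(alpha) by normalizing independent Gamma variates.
    pi = []
    for _ in range(n_nodes):
        g = [rng.gammavariate(a, 1.0) for a in alpha]
        total = sum(g)
        pi.append([x / total for x in g])

    Y = [[0] * n_nodes for _ in range(n_nodes)]
    for p in range(n_nodes):
        for q in range(n_nodes):
            if p == q:
                continue  # no self-links in this sketch
            # Each node picks a role for this particular interaction.
            z_send = rng.choices(range(K), weights=pi[p])[0]
            z_recv = rng.choices(range(K), weights=pi[q])[0]
            # The link probability depends only on the two chosen blocks.
            Y[p][q] = 1 if rng.random() < B[z_send][z_recv] else 0
    return pi, Y

# Illustrative settings: three blocks, dense within a block, sparse across blocks.
alpha = [0.5, 0.5, 0.5]
B = [[0.90, 0.02, 0.02],
     [0.02, 0.90, 0.02],
     [0.02, 0.02, 0.90]]
pi, Y = sample_mmsb(30, alpha, B)
```

Here `B` plays the role of the blockmodel (dense patches of connectivity on the diagonal), while each row of `pi` lets a node spread its behavior over several blocks instead of belonging to exactly one.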


Citations
Book Chapter

Modeling and detecting community hierarchies

TL;DR: A theoretical model is proposed that explicitly formalizes both the tight connections within each community and the hierarchical nature of the communities, and an efficient algorithm is presented that provably detects all the communities in the model.
Posted Content

A Sharp Lower Bound for Mixed-membership Estimation

TL;DR: Considers an undirected network with n nodes and K communities, focusing on the special case where all $\pi_i$ are degenerate; there the goal is clustering, so Hamming distance is the natural choice of loss function, and the rate can be exponentially fast.
Journal Article

Model-based clustering of time-evolving networks through temporal exponential-family random graph models.

TL;DR: In this paper, a model-based clustering framework for time-evolving networks based on discrete time exponential-family random graph models is proposed, which simultaneously allows both modeling and detecting group structure.
Journal Article

Identifying tumor clones in sparse single-cell mutation data.

TL;DR: SBMClone is introduced, a method to infer clusters of cells, or clones, that share groups of somatic single-nucleotide mutations, and it is shown that SBMClone accurately infers the true clonal composition on simulated datasets with coverage as low as 0.2×.
Book Chapter

Multiplicative Attribute Graph Model of Real-World Networks

TL;DR: The Multiplicative Attribute Graph (MAG) model captures the interactions between the network structure and the node attributes: each node has a vector of categorical latent attributes associated with it, and the probability of an edge between a pair of nodes depends on the product of individual attribute-attribute affinities.