scispace - formally typeset
Open AccessJournal ArticleDOI

Mixed Membership Stochastic Blockmodels

TLDR
In this article, the authors introduce a class of variance allocation models for pairwise measurements, called mixed membership stochastic blockmodels, which combine global parameters that instantiate dense patches of connectivity (blockmodel) with local parameters (mixed membership), and develop a general variational inference algorithm for fast approximate posterior inference.
Abstract
Consider data consisting of pairwise measurements, such as presence or absence of links between pairs of objects. These data arise, for instance, in the analysis of protein interactions and gene regulatory networks, collections of author-recipient email, and social networks. Analyzing pairwise measurements with probabilistic models requires special assumptions, since the usual independence or exchangeability assumptions no longer hold. Here we introduce a class of variance allocation models for pairwise measurements: mixed membership stochastic blockmodels. These models combine global parameters that instantiate dense patches of connectivity (blockmodel) with local parameters that instantiate node-specific variability in the connections (mixed membership). We develop a general variational inference algorithm for fast approximate posterior inference. We demonstrate the advantages of mixed membership stochastic blockmodels with applications to social networks and protein interaction networks.

read more

Content maybe subject to copyright    Report

Citations
More filters
Proceedings ArticleDOI

To join or not to join: the illusion of privacy in social networks with mixed public and private user profiles

TL;DR: This work shows how an adversary can exploit an online social network with a mixture of public and private user profiles to predict the private attributes of users, and proposes practical models that use friendship and group membership information to infer sensitive attributes.
Journal ArticleDOI

The nested chinese restaurant process and bayesian nonparametric inference of topic hierarchies

TL;DR: The nested Chinese restaurant process (nCRP) as discussed by the authors is a stochastic process that assigns probability distributions to ensembles of infinitely deep, infinitely branching trees, and it can be used as a prior distribution in a Bayesian nonparametric model of document collections.
Posted Content

The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies

TL;DR: An application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP leads to clustering of documents according to sharing of topics at multiple levels of abstraction.
Proceedings Article

Relational Topic Models for Document Networks

TL;DR: The relational topic model (RTM) is developed, a model of documents and the links between them that derives efficient inference and learning algorithms based on variational methods and evaluates the predictive performance of the RTM for large networks of scientific abstracts and web documents.
Journal ArticleDOI

Clustering and Community Detection in Directed Networks: A Survey

TL;DR: In this article, the authors present an in-depth comparative review of the methods presented so far for clustering directed networks along with the relevant necessary methodological background and also related applications.
References
More filters
Journal ArticleDOI

Gene Ontology: tool for the unification of biology

TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Journal ArticleDOI

Latent dirichlet allocation

TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Journal ArticleDOI

Finding scientific topics

TL;DR: A generative model for documents is described, introduced by Blei, Ng, and Jordan, and a Markov chain Monte Carlo algorithm is presented for inference in this model, which is used to analyze abstracts from PNAS by using Bayesian model selection to establish the number of topics.
Related Papers (5)