scispace - formally typeset
Search or ask a question
Topic

Hierarchical Dirichlet process

About: Hierarchical Dirichlet process is a research topic. Over the lifetime, 901 publications have been published within this topic receiving 68555 citations.


Papers
More filters
Journal ArticleDOI
TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Abstract: We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.

30,570 citations

Journal ArticleDOI
TL;DR: This work considers problems involving groups of data where each observation within a group is a draw from a mixture model and where it is desirable to share mixture components between groups, and considers a hierarchical model, specifically one in which the base measure for the childDirichlet processes is itself distributed according to a Dirichlet process.
Abstract: We consider problems involving groups of data where each observation within a group is a draw from a mixture model and where it is desirable to share mixture components between groups. We assume that the number of mixture components is unknown a priori and is to be inferred from the data. In this setting it is natural to consider sets of Dirichlet processes, one for each group, where the well-known clustering property of the Dirichlet process provides a nonparametric prior for the number of mixture components within each group. Given our desire to tie the mixture models in the various groups, we consider a hierarchical model, specifically one in which the base measure for the child Dirichlet processes is itself distributed according to a Dirichlet process. Such a base measure being discrete, the child Dirichlet processes necessarily share atoms. Thus, as desired, the mixture models in the different groups necessarily share mixture components. We discuss representations of hierarchical Dirichlet processes ...

3,755 citations

Journal ArticleDOI
TL;DR: In this article, Markov chain methods for sampling from the posterior distribution of a Dirichlet process mixture model are presented, and two new classes of methods are presented. But neither of these methods is suitable for handling general models with non-conjugate priors.
Abstract: This article reviews Markov chain methods for sampling from the posterior distribution of a Dirichlet process mixture model and presents two new classes of methods. One new approach is to make Metropolis—Hastings updates of the indicators specifying which mixture component is associated with each observation, perhaps supplemented with a partial form of Gibbs sampling. The other new approach extends Gibbs sampling for these indicators by using a set of auxiliary parameters. These methods are simple to implement and are more efficient than previous ways of handling general Dirichlet process mixture models with non-conjugate priors.

2,320 citations

Journal ArticleDOI
TL;DR: In this article, the conditional distribution of the random measure, given the observations, is no longer that of a simple Dirichlet process, but can be described as being a mixture of DirICHlet processes.
Abstract: process. This paper extends Ferguson's result to cases where the random measure is a mixing distribution for a parameter which determines the distribution from which observations are made. The conditional distribution of the random measure, given the observations, is no longer that of a simple Dirichlet process, but can be described as being a mixture of Dirichlet processes. This paper gives a formal definition for these mixtures and develops several theorems about their properties, the most important of which is a closure property for such mixtures. Formulas for computing the conditional distribution are derived and applications to problems in bio-assay, discrimination, regression, and mixing distributions are given.

2,146 citations

Journal ArticleDOI
TL;DR: Two general types of Gibbs samplers that can be used to fit posteriors of Bayesian hierarchical models based on stick-breaking priors are presented and the blocked Gibbs sampler, based on an entirely different approach that works by directly sampling values from the posterior of the random measure.
Abstract: A rich and flexible class of random probability measures, which we call stick-breaking priors, can be constructed using a sequence of independent beta random variables. Examples of random measures that have this characterization include the Dirichlet process, its two-parameter extension, the two-parameter Poisson–Dirichlet process, finite dimensional Dirichlet priors, and beta two-parameter processes. The rich nature of stick-breaking priors offers Bayesians a useful class of priors for nonparametric problems, while the similar construction used in each prior can be exploited to develop a general computational procedure for fitting them. In this article we present two general types of Gibbs samplers that can be used to fit posteriors of Bayesian hierarchical models based on stick-breaking priors. The first type of Gibbs sampler, referred to as a Polya urn Gibbs sampler, is a generalized version of a widely used Gibbs sampling method currently employed for Dirichlet process computing. This method applies t...

1,701 citations


Network Information
Related Topics (5)
Estimator
97.3K papers, 2.6M citations
78% related
Cluster analysis
146.5K papers, 2.9M citations
78% related
Markov chain
51.9K papers, 1.3M citations
77% related
Robustness (computer science)
94.7K papers, 1.6M citations
77% related
Deep learning
79.8K papers, 2.1M citations
76% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202313
202244
202123
202038
201940
201842