
Showing papers on "Hierarchical Dirichlet process published in 2012"


Journal ArticleDOI
03 Dec 2012
TL;DR: This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of multi-view models and topic models, including latent Dirichlet allocation (LDA).
Abstract: Topic modeling is a generalization of clustering that posits that observations (words in a document) are generated by multiple latent factors (topics), as opposed to just one. The increased representational power comes at the cost of a more challenging unsupervised learning problem for estimating the topic-word distributions when only words are observed, and the topics are hidden. This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of multi-view models and topic models, including latent Dirichlet allocation (LDA). For LDA, the procedure correctly recovers both the topic-word distributions and the parameters of the Dirichlet prior over the topic mixtures, using only trigram statistics (i.e., third order moments, which may be estimated with documents containing just three words). The method is based on an efficiently computable orthogonal tensor decomposition of low-order moments.
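
The core computational object here is a symmetric third-order moment tensor whose robust eigenvectors correspond to topics. Below is a minimal sketch of the tensor power update only (the full method also whitens the moments and deflates recovered components); the function name and defaults are illustrative:

    import numpy as np

    def tensor_power_iteration(T, n_iter=100, rng=None):
        # Power update v <- T(I, v, v) / ||T(I, v, v)|| on a symmetric
        # 3-tensor T, recovering a single eigenvalue/eigenvector pair.
        rng = rng or np.random.default_rng(0)
        v = rng.standard_normal(T.shape[0])
        v /= np.linalg.norm(v)
        for _ in range(n_iter):
            v = np.einsum('ijk,j,k->i', T, v, v)
            v /= np.linalg.norm(v)
        lam = np.einsum('ijk,i,j,k->', T, v, v, v)  # eigenvalue T(v, v, v)
        return lam, v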

271 citations


Posted Content
TL;DR: This article develops stochastic variational inference, a scalable algorithm for approximating posterior distributions for a large class of probabilistic models, including the hierarchical Dirichlet process topic model.
Abstract: We develop stochastic variational inference, a scalable algorithm for approximating posterior distributions. We develop this technique for a large class of probabilistic models and we demonstrate it with two probabilistic topic models, latent Dirichlet allocation and the hierarchical Dirichlet process topic model. Using stochastic variational inference, we analyze several large collections of documents: 300K articles from Nature, 1.8M articles from The New York Times, and 3.8M articles from Wikipedia. Stochastic inference can easily handle data sets of this size and outperforms traditional variational inference, which can only handle a smaller subset. (We also show that the Bayesian nonparametric topic model outperforms its parametric counterpart.) Stochastic variational inference lets us apply complex Bayesian models to massive data sets.
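
The pattern the abstract describes can be sketched generically: subsample one document, optimize its local variational parameters, form the global update the document would imply if the whole corpus looked like it, and blend it in with a decaying step size. A hedged sketch, with `local_step` and `intermediate_global` standing in for the model-specific computations:

    import numpy as np

    def svi(docs, lam0, local_step, intermediate_global,
            tau=1.0, kappa=0.7, n_steps=10000, rng=None):
        # lam holds the global variational parameters (e.g., topic-word
        # Dirichlet parameters); rho follows a Robbins-Monro schedule.
        rng = rng or np.random.default_rng(0)
        lam = lam0.copy()
        for t in range(n_steps):
            d = docs[rng.integers(len(docs))]            # sample a document
            phi = local_step(d, lam)                     # local variables
            lam_hat = intermediate_global(d, phi, len(docs))
            rho = (t + tau) ** (-kappa)                  # decaying step size
            lam = (1.0 - rho) * lam + rho * lam_hat      # noisy natural-gradient step
        return lam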

119 citations


Journal ArticleDOI
TL;DR: In this article, the authors derive a stick-breaking representation for the beta process directly from its characterization as a completely random measure, which motivates a three-parameter generalization of the beta process.
Abstract: The beta-Bernoulli process provides a Bayesian nonparametric prior for models involving collections of binary-valued features. A draw from the beta process yields an infinite collection of probabilities in the unit interval, and a draw from the Bernoulli process turns these into binary-valued features. Recent work has provided stick-breaking representations for the beta process analogous to the well-known stick-breaking representation for the Dirichlet process. We derive one such stick-breaking representation directly from the characterization of the beta process as a completely random measure. This approach motivates a three-parameter generalization of the beta process, and we study the power laws that can be obtained from this generalized beta process. We present a posterior inference algorithm for the beta-Bernoulli process that exploits the stick-breaking representation, and we present experimental results for a discrete factor-analysis model.
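
A round-indexed stick-breaking construction of this kind is easy to simulate: in round i, a Poisson(γ) number of atoms each receive the i-th break of their own Beta(1, α) stick. A minimal simulation sketch (parameter names and defaults illustrative; the paper's three-parameter generalization modifies the Beta draws):

    import numpy as np

    def beta_process_weights(gamma=4.0, alpha=1.0, n_rounds=25, rng=None):
        # Atom masses in (0, 1) from a round-indexed stick-breaking scheme.
        rng = rng or np.random.default_rng(0)
        weights = []
        for i in range(1, n_rounds + 1):
            for _ in range(rng.poisson(gamma)):     # atoms born in round i
                v = rng.beta(1.0, alpha, size=i)    # i independent breaks
                weights.append(v[-1] * np.prod(1.0 - v[:-1]))
        return np.array(weights)

    w = beta_process_weights()
    features = np.random.default_rng(1).random(w.size) < w  # Bernoulli process draw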

94 citations


Proceedings Article
03 Dec 2012
TL;DR: This paper considers a nonparametric topic model based on the hierarchical Dirichlet process, and develops a novel online variational inference algorithm based on split-merge topic updates that can achieve substantially better predictions of test data than conventional online and batch variational algorithms.
Abstract: Variational methods provide a computationally scalable alternative to Monte Carlo methods for large-scale, Bayesian nonparametric learning. In practice, however, conventional batch and online variational methods quickly become trapped in local optima. In this paper, we consider a nonparametric topic model based on the hierarchical Dirichlet process (HDP), and develop a novel online variational inference algorithm based on split-merge topic updates. We derive a simpler and faster variational approximation of the HDP, and show that by intelligently splitting and merging components of the variational posterior, we can achieve substantially better predictions of test data than conventional online and batch variational algorithms. For streaming analysis of large datasets where batch analysis is infeasible, we show that our split-merge updates better capture the nonparametric properties of the underlying model, allowing continual learning of new topics.
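
To give a feel for the split-merge idea (the acceptance decision in the paper compares variational bounds and is model-specific), one natural merge proposal picks the most similar pair of topics and pools their sufficient statistics; everything below is an illustrative heuristic, not the paper's algorithm:

    import numpy as np

    def propose_merge(topic_word_counts):
        # Choose the most similar topic pair by cosine similarity and
        # return the merged statistics; acceptance would then compare the
        # variational objective before and after the merge.
        C = np.asarray(topic_word_counts, dtype=float)
        U = C / np.linalg.norm(C, axis=1, keepdims=True)
        S = U @ U.T
        np.fill_diagonal(S, -np.inf)
        j, k = np.unravel_index(np.argmax(S), S.shape)
        merged = C.copy()
        merged[j] += merged[k]
        return (j, k), np.delete(merged, k, axis=0)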

92 citations


Journal ArticleDOI
TL;DR: It is shown that, under mild conditions on the copula functions, the versions in which only the support points or only the weights depend on predictors have full weak support.
Abstract: We study the support properties of Dirichlet process-based models for sets of predictor-dependent probability distributions. Exploiting the connection between copulas and stochastic processes, we provide an alternative definition of MacEachern's dependent Dirichlet processes. Based on this definition, we provide sufficient conditions for the full weak support of different versions of the process. In particular, we show that, under mild conditions on the copula functions, the versions in which only the support points or only the weights depend on predictors have full weak support. In addition, we characterize the Hellinger and Kullback-Leibler support of mixtures induced by the different versions of the dependent Dirichlet process. A generalization of the results to the general class of dependent stick-breaking processes is also provided.
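
For reference, MacEachern's construction being analyzed has the schematic form

    G_x = \sum_{i=1}^{\infty} w_i(x)\,\delta_{\theta_i(x)}, \qquad
    w_i(x) = v_i(x) \prod_{l<i} \bigl(1 - v_l(x)\bigr),

where the "single weights" version keeps w_i(x) constant in the predictor x and the "single atoms" version keeps θ_i(x) constant; the copula construction in the paper supplies the joint law of the stick variables v_i(x) and atoms θ_i(x) across predictor values.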

82 citations


Posted Content
TL;DR: A nonparametric Bayesian prior for PAM is proposed based on a variant of the hierarchical Dirichlet process (HDP), and it is shown that nonparametric PAM achieves performance matching the best of PAM without manually tuning the number of topics.
Abstract: Recent advances in topic models have explored complicated structured distributions to represent topic correlation. For example, the pachinko allocation model (PAM) captures arbitrary, nested, and possibly sparse correlations between topics using a directed acyclic graph (DAG). While PAM provides more flexibility and greater expressive power than previous models like latent Dirichlet allocation (LDA), it is also more difficult to determine the appropriate topic structure for a specific dataset. In this paper, we propose a nonparametric Bayesian prior for PAM based on a variant of the hierarchical Dirichlet process (HDP). Although the HDP can capture topic correlations defined by nested data structure, it does not automatically discover such correlations from unstructured data. By assuming an HDP-based prior for PAM, we are able to learn both the number of topics and how the topics are correlated. We evaluate our model on synthetic and real-world text datasets, and show that nonparametric PAM achieves performance matching the best of PAM without manually tuning the number of topics.

79 citations


Journal ArticleDOI
TL;DR: This work presents an unsupervised algorithm for learning finite mixture models from multivariate positive data and develops an approach, based on the minimum message length (MML) criterion, to select the optimal number of clusters to represent the data using such a mixture.
Abstract: In this work we present an unsupervised algorithm for learning finite mixture models from multivariate positive data. Indeed, this kind of data appears naturally in many applications, yet it has not been adequately addressed in the past. This mixture model is based on the inverted Dirichlet distribution, which offers a good representation and modeling of positive non-Gaussian data. The proposed approach for estimating the parameters of an inverted Dirichlet mixture is based on maximum likelihood (ML) using the Newton-Raphson method. We also develop an approach, based on the minimum message length (MML) criterion, to select the optimal number of clusters to represent the data using such a mixture. Experimental results are presented using artificial histograms and real data sets. The challenging problem of software module classification is also investigated within the proposed statistical framework.
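
For concreteness, the inverted Dirichlet log-density that the ML step optimizes can be evaluated directly; this is a minimal sketch using the standard density for a positive vector x of length d with parameters α₁, …, α_{d+1}:

    import numpy as np
    from scipy.special import gammaln

    def inverted_dirichlet_logpdf(x, alpha):
        # x: positive vector of length d; alpha: parameters, length d + 1.
        a_sum = np.sum(alpha)
        return (gammaln(a_sum) - np.sum(gammaln(alpha))
                + np.sum((alpha[:-1] - 1.0) * np.log(x))
                - a_sum * np.log1p(np.sum(x)))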

78 citations


Proceedings Article
03 Dec 2012
TL;DR: This paper derives novel clustering algorithms from the asymptotic limit of the DP and HDP mixtures that feature the scalability of existing hard clustering methods as well as the flexibility of Bayesian nonparametric models.
Abstract: Sampling and variational inference techniques are two standard methods for inference in probabilistic models, but for many problems, neither approach scales effectively to large-scale data. An alternative is to relax the probabilistic model into a non-probabilistic formulation which has a scalable associated algorithm. This can often be fulfilled by performing small-variance asymptotics, i.e., letting the variance of particular distributions in the model go to zero. For instance, in the context of clustering, such an approach yields connections between the k-means and EM algorithms. In this paper, we explore small-variance asymptotics for exponential family Dirichlet process (DP) and hierarchical Dirichlet process (HDP) mixture models. Utilizing connections between exponential family distributions and Bregman divergences, we derive novel clustering algorithms from the asymptotic limit of the DP and HDP mixtures that feature the scalability of existing hard clustering methods as well as the flexibility of Bayesian nonparametric models. We focus on special cases of our analysis for discrete-data problems, including topic modeling, and we demonstrate the utility of our results by applying variants of our algorithms to problems arising in vision and document analysis.
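
In the DP-mixture case with Gaussian likelihoods, the small-variance limit gives a k-means-like procedure that opens a new cluster whenever every existing center is farther than a penalty λ (the general analysis swaps squared Euclidean distance for a Bregman divergence). A minimal sketch of this special case:

    import numpy as np

    def dp_means(X, lam, n_iter=50):
        # Hard clustering from the small-variance limit of a DP mixture;
        # lam is the squared-distance cost of creating a new cluster.
        centers = [X.mean(axis=0)]
        for _ in range(n_iter):
            assign = []
            for x in X:
                d2 = [np.sum((x - c) ** 2) for c in centers]
                if min(d2) > lam:
                    centers.append(x.copy())          # open a new cluster
                    assign.append(len(centers) - 1)
                else:
                    assign.append(int(np.argmin(d2)))
            assign = np.array(assign)
            centers = [X[assign == k].mean(axis=0) if np.any(assign == k)
                       else centers[k] for k in range(len(centers))]
        return np.array(centers), assign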

73 citations


Proceedings Article
01 Jan 2012
TL;DR: This work presents a truncation-free online variational inference algorithm for Bayesian nonparametric models that adapts model complexity on the fly and shows better performance than previous online variational inference algorithms.
Abstract: We present a truncation-free online variational inference algorithm for Bayesian nonparametric models. Unlike traditional (online) variational inference algorithms that require truncations for the model or the variational distribution, our method adapts model complexity on the fly. Our experiments for Dirichlet process mixture models and hierarchical Dirichlet process topic models on two large-scale data sets show better performance than previous online variational inference algorithms.

63 citations


Journal ArticleDOI
TL;DR: The discrete infinite logistic normal distribution (DILN), as mentioned in this paper, generalizes the hierarchical Dirichlet process (HDP) to model correlation structure between the weights of the atoms at the group level.
Abstract: We present the discrete infinite logistic normal distribution (DILN), a Bayesian nonparametric prior for mixed membership models. DILN generalizes the hierarchical Dirichlet process (HDP) to model correlation structure between the weights of the atoms at the group level. We derive a representation of DILN as a normalized collection of gamma-distributed random variables and study its statistical properties. We derive a variational inference algorithm for approximate posterior inference. We apply DILN to topic modeling of documents and study its empirical performance on four corpora, comparing performance with the HDP and the correlated topic model (CTM). To compute with large-scale data, we develop a stochastic variational inference algorithm for DILN and compare with similar algorithms for HDP and latent Dirichlet allocation (LDA) on a collection of 350,000 articles from Nature.
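
The normalized-gamma representation the abstract mentions is easy to sketch: group-level topic weights are gamma variables whose means are modulated by a correlated Gaussian factor, then normalized. Everything below (the top-level weights, the covariance, the shape scaling) is an illustrative stand-in for the paper's exact construction:

    import numpy as np

    rng = np.random.default_rng(1)
    K = 10
    beta = rng.dirichlet(np.ones(K))          # top-level proportions (stand-in)
    Sigma = 0.5 * np.eye(K) + 0.5             # toy between-topic covariance
    u = rng.multivariate_normal(np.zeros(K), Sigma)
    g = rng.gamma(shape=5.0 * beta, scale=np.exp(u))  # correlated gamma atoms
    pi = g / g.sum()                          # one group's topic proportions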

53 citations


Posted Content
TL;DR: It is found that split-merge MCMC for the HDP can provide significant improvements over traditional Gibbs sampling, and some understanding of the data properties that give rise to larger improvements is given.
Abstract: The hierarchical Dirichlet process (HDP) has become an important Bayesian nonparametric model for grouped data, such as document collections. The HDP is used to construct a flexible mixed-membership model where the number of components is determined by the data. As for most Bayesian nonparametric models, exact posterior inference is intractable---practitioners use Markov chain Monte Carlo (MCMC) or variational inference. Inspired by the split-merge MCMC algorithm for the Dirichlet process (DP) mixture model, we describe a novel split-merge MCMC sampling algorithm for posterior inference in the HDP. We study its properties on both synthetic data and text corpora. We find that split-merge MCMC for the HDP can provide significant improvements over traditional Gibbs sampling, and we give some understanding of the data properties that give rise to larger improvements.
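
The moves are standard Metropolis-Hastings proposals over clustering configurations: a split of configuration c into c^split is accepted with probability

    a(c \to c^{\mathrm{split}})
      = \min\!\left(1,\;
        \frac{p(c^{\mathrm{split}} \mid \mathbf{x})\, q(c \mid c^{\mathrm{split}})}
             {p(c \mid \mathbf{x})\, q(c^{\mathrm{split}} \mid c)}\right),

where q is the proposal density (in the DP antecedent, built from a restricted Gibbs scan); the contribution here is constructing valid proposals for the HDP's two-level structure.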

Journal ArticleDOI
TL;DR: Finite generalized Dirichlet mixture models are extended to the infinite case in which the number of components and relevant features do not need to be known a priori, which provides a natural representation of uncertainty regarding the challenging problem of model selection.
Abstract: Mixture modeling is one of the most useful tools in machine learning and data mining applications. An important challenge when applying finite mixture models is the selection of the number of clusters which best describes the data. Recent developments have shown that this problem can be handled by the application of non-parametric Bayesian techniques to mixture modeling. Another crucial preprocessing step in mixture learning is the selection of the most relevant features. The main approach in this paper, to tackle these problems, consists of storing the knowledge in a generalized Dirichlet mixture model by applying non-parametric Bayesian estimation and inference techniques. Specifically, we extend finite generalized Dirichlet mixture models to the infinite case, in which the number of components and relevant features do not need to be known a priori. This extension provides a natural representation of uncertainty regarding the challenging problem of model selection. We propose a Markov chain Monte Carlo algorithm to learn the resulting infinite mixture. Through applications involving text and image categorization, we show that infinite mixture models offer a more powerful and robust performance than classic finite mixtures for both clustering and feature selection.

Posted Content
TL;DR: In this article, the gamma-NB process is shown to reduce to the hierarchical Dirichlet process under normalization, highlighting its unique theoretical, structural and computational advantages, and a variety of NB processes with distinct sharing mechanisms are constructed and applied to topic modeling.
Abstract: By developing data augmentation methods unique to the negative binomial (NB) distribution, we unite seemingly disjoint count and mixture models under the NB process framework. We develop fundamental properties of the models and derive efficient Gibbs sampling inference. We show that the gamma-NB process can be reduced to the hierarchical Dirichlet process with normalization, highlighting its unique theoretical, structural and computational advantages. A variety of NB processes with distinct sharing mechanisms are constructed and applied to topic modeling, with connections to existing algorithms, showing the importance of inferring both the NB dispersion and probability parameters.
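
One elementary identity behind this framework is that a negative binomial count is a gamma-mixed Poisson,

    m \sim \mathrm{NB}(r, p)
    \;\Longleftrightarrow\;
    m \mid \lambda \sim \mathrm{Poisson}(\lambda), \quad
    \lambda \sim \mathrm{Gamma}\!\bigl(r,\, p/(1-p)\bigr)

(shape-scale parameterization), which is what allows gamma and Dirichlet process priors to be threaded through count models; the paper develops further augmentations specific to the NB distribution.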

Journal ArticleDOI
TL;DR: A Bayesian nonparametric model for clustering partial ranking data is proposed, based on the theory of random atomic measures; the posterior distribution given data is characterised, and a simple and effective Gibbs sampler is derived for posterior simulation.
Abstract: In this paper we propose a Bayesian nonparametric model for clustering partial ranking data. We start by developing a Bayesian nonparametric extension of the popular Plackett-Luce choice model that can handle an infinite number of choice items. Our framework is based on the theory of random atomic measures, with the prior specified by a completely random measure. We characterise the posterior distribution given data, and derive a simple and effective Gibbs sampler for posterior simulation. We then develop a Dirichlet process mixture extension of our model and apply it to investigate the clustering of preferences for college degree programmes amongst Irish secondary school graduates. The existence of clusters of applicants who have similar preferences for degree programmes is established and we determine that subject matter and geographical location of the third level institution characterise these clusters.
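
The Plackett-Luce model being extended gives each item a positive weight and builds a ranking by successive choices among the items not yet chosen. A minimal log-likelihood for one partial ranking in the finite-item case (the paper's contribution is the extension to infinitely many items):

    import numpy as np

    def plackett_luce_loglik(ranking, w):
        # ranking: item indices, best first; w: positive item weights.
        w = np.asarray(w, dtype=float)
        remaining = w.sum()
        ll = 0.0
        for item in ranking:
            ll += np.log(w[item]) - np.log(remaining)
            remaining -= w[item]   # remove the chosen item's weight
        return ll

    print(plackett_luce_loglik([2, 0], [0.5, 1.0, 2.0]))  # top-2 partial ranking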

Proceedings Article
03 Dec 2012
TL;DR: This work presents a truncation-free stochastic variational inference algorithm for Bayesian nonparametric models that adapts model complexity on the fly and performs better than previous stochastic variational inference algorithms.
Abstract: We present a truncation-free stochastic variational inference algorithm for Bayesian nonparametric models. While traditional variational inference algorithms require truncations for the model or the variational distribution, our method adapts model complexity on the fly. We studied our method with Dirichlet process mixture models and hierarchical Dirichlet process topic models on two large data sets. Our method performs better than previous stochastic variational inference algorithms.
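
A cartoon of the truncation-free idea, in Chinese-restaurant-process terms: assignments are sampled with an explicit "new component" option, and the variational representation grows only when that option is drawn. The interface below is illustrative, not the paper's algorithm:

    import numpy as np

    def sample_assignment(counts, lik_existing, lik_new, alpha, rng):
        # counts: sizes of instantiated components; lik_existing: data
        # likelihood under each; lik_new: likelihood under the prior.
        p = np.append(np.asarray(counts, float) * np.asarray(lik_existing),
                      alpha * lik_new)
        p /= p.sum()
        k = int(rng.choice(p.size, p=p))
        return k  # k == len(counts) signals "instantiate a new component"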

Journal ArticleDOI
TL;DR: Through applications in blind signal detection and image segmentation, it is shown that the iSMM possesses the advantages of both the Student's t-distribution and the Dirichlet process mixture, offering a more powerful and robust performance than competing models.

Journal ArticleDOI
TL;DR: This paper develops combinatorial stochastic process representations for the Bayesian nonparametric feature modeling problem, analogous to those for clustering; these include the beta process and the Indian buffet process, as well as new representations that provide insight into the connections between these processes.
Abstract: One of the focal points of the modern literature on Bayesian nonparametrics has been the problem of clustering, or partitioning, where each data point is modeled as being associated with one and only one of some collection of groups called clusters or partition blocks. Underlying these Bayesian nonparametric models are a set of interrelated stochastic processes, most notably the Dirichlet process and the Chinese restaurant process. In this paper we provide a formal development of an analogous problem, called feature modeling, for associating data points with arbitrary nonnegative integer numbers of groups, now called features or topics. We review the existing combinatorial stochastic process representations for the clustering problem and develop analogous representations for the feature modeling problem. These representations include the beta process and the Indian buffet process as well as new representations that provide insight into the connections between these processes. We thereby bring the same level of completeness to the treatment of Bayesian nonparametric feature modeling that has previously been achieved for Bayesian nonparametric clustering.
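
The Indian buffet process discussed here has a simple generative recipe: customer n takes each previously sampled dish k with probability m_k / n (m_k being the dish's popularity so far), then samples Poisson(α/n) new dishes. A direct simulation of the resulting binary feature matrix:

    import numpy as np

    def indian_buffet(n_customers, alpha, rng=None):
        rng = rng or np.random.default_rng(0)
        dish_counts, rows = [], []
        for n in range(1, n_customers + 1):
            row = [1 if rng.random() < m / n else 0 for m in dish_counts]
            for k, z in enumerate(row):
                dish_counts[k] += z
            new = rng.poisson(alpha / n)        # brand-new dishes
            row += [1] * new
            dish_counts += [1] * new
            rows.append(row)
        K = len(dish_counts)
        return np.array([r + [0] * (K - len(r)) for r in rows])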

Journal ArticleDOI
TL;DR: A generative model for hyperspectral images is proposed in which the abundances are sampled from a Dirichlet distribution (DD) mixture model whose parameters depend on a latent label process; the resulting directed Markov structure facilitates the use of scale-recursive estimation algorithms.
Abstract: This paper is concerned with joint Bayesian endmember extraction and linear unmixing of hyperspectral images using a spatial prior on the abundance vectors. We propose a generative model for hyperspectral images in which the abundances are sampled from a Dirichlet distribution (DD) mixture model, whose parameters depend on a latent label process. The label process is then used to enforce a spatial prior that encourages adjacent pixels to have the same label. A Gibbs sampling framework is used to generate samples from the posterior distributions of the abundances and the parameters of the DD mixture model. The spatial prior that is used is a tree-structured sticky hierarchical Dirichlet process (SHDP) and, when used to determine the posterior endmember and abundance distributions, results in a new unmixing algorithm called spatially constrained unmixing (SCU). The directed Markov model facilitates the use of scale-recursive estimation algorithms, and is therefore more computationally efficient as compared to standard Markov random field (MRF) models. Furthermore, the proposed SCU algorithm estimates the number of regions in the image in an unsupervised fashion. The effectiveness of the proposed SCU algorithm is illustrated using synthetic and real data.
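
For reference, the linear mixing model underlying the unmixing task writes each pixel spectrum as a convex combination of endmember signatures plus noise:

    \mathbf{y}_i = \mathbf{M}\,\mathbf{a}_i + \mathbf{n}_i, \qquad
    a_{ij} \ge 0, \quad \sum_{j} a_{ij} = 1,

where the columns of M are the endmember spectra and the abundance vectors a_i receive the label-dependent Dirichlet mixture prior described above.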

Journal ArticleDOI
TL;DR: A simple, yet efficient, procedure for approximating the Lévy measure of a Gamma(α,1) random variable is described and a finite sum-representation is derived that converges almost surely to Ferguson's representation of the Dirichlet process.

Posted Content
TL;DR: In this paper, an HMM with a countably infinite state space, recast in the hierarchical Dirichlet process (HDP) framework and hence called the HDP-HMM, is proposed for clustering gene expression time course data.
Abstract: Most existing approaches to clustering gene expression time course data treat the different time points as independent dimensions and are invariant to permutations, such as reversal, of the experimental time course. Approaches utilizing HMMs have been shown to be helpful in this regard, but are hampered by having to choose model architectures with appropriate complexities. Here we propose for a clustering application an HMM with a countably infinite state space; inference in this model is possible by recasting it in the hierarchical Dirichlet process (HDP) framework (Teh et al. 2006), and hence we call it the HDP-HMM. We show that the infinite model outperforms model selection methods over finite models, and traditional time-independent methods, as measured by a variety of external and internal indices for clustering on two large publicly available data sets. Moreover, we show that the infinite models utilize more hidden states and employ richer architectures (e.g. state-to-state transitions) without the damaging effects of overfitting.

Journal ArticleDOI
TL;DR: In this paper, a new prior process, called a beta-Dirichlet process, is introduced for the cumulative intensity functions and is proved to be conjugate; the prior is further applied in a Bayesian semiparametric regression model.
Abstract: Bayesian analysis of a finite state Markov process, which is popularly used to model multistate event history data, is considered. A new prior process, called a beta-Dirichlet process, is introduced for the cumulative intensity functions and is proved to be conjugate. In addition, the beta-Dirichlet prior is applied to a Bayesian semiparametric regression model. To illustrate the application of the proposed model, we analyse a dataset of credit histories.

Proceedings ArticleDOI
12 Aug 2012
TL;DR: A novel collapsed variational Bayes (CVB) inference for the hierarchical Dirichlet process (HDP) is proposed, which is simple to implement, does not require variance counts to be maintained, does not need hyper-parameters to be set, and has good predictive performance.
Abstract: We propose a novel collapsed variational Bayes (CVB) inference for the hierarchical Dirichlet process (HDP). While the existing CVB inference for the HDP variant of latent Dirichlet allocation (LDA) is more complicated and harder to implement than that for LDA, the proposed algorithm is simple to implement, does not require variance counts to be maintained, does not need hyper-parameters to be set, and has good predictive performance.
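
For flavor, the zeroth-order collapsed variational (CVB0) update, shown here for plain LDA rather than the paper's HDP variant, maintains only expected counts (no variances); array names are illustrative:

    import numpy as np

    def cvb0_update(gamma_dn, Ndk, Nkw_w, Nk, alpha, beta, V):
        # gamma_dn: current responsibilities of one token over K topics.
        # Ndk, Nkw_w, Nk: expected counts (document-topic, topic-word for
        # this token's word type, topic totals), all including gamma_dn,
        # which is subtracted out before the update.
        g = ((Ndk - gamma_dn + alpha)
             * (Nkw_w - gamma_dn + beta)
             / (Nk - gamma_dn + V * beta))
        return g / g.sum()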

Posted Content
TL;DR: A method is presented to infer regulatory network structures that may vary between time points, using a set of hidden states that describe the network structure at a given time point, modeled with the Hierarchical Dirichlet Process Hidden Markov Model.
Abstract: When analysing gene expression time series data an often overlooked but crucial aspect of the model is that the regulatory network structure may change over time. Whilst some approaches have addressed this problem previously in the literature, many are not well suited to the sequential nature of the data. Here we present a method that allows us to infer regulatory network structures that may vary between time points, utilising a set of hidden states that describe the network structure at a given time point. To model the distribution of the hidden states we have applied the Hierarchical Dirichlet Process Hidden Markov Model, a nonparametric extension of the traditional Hidden Markov Model, that does not require us to fix the number of hidden states in advance. We apply our method to existing microarray expression data as well as demonstrating its efficacy on simulated test data.

Journal ArticleDOI
TL;DR: In this paper, Thorne et al. present a method for inferring regulatory network structures that may vary between time points, using a set of hidden states that describe the network structure at a given time point.
Abstract: Motivation: When analysing gene expression time series data, an often overlooked but crucial aspect of the model is that the regulatory network structure may change over time. Although some approaches have addressed this problem previously in the literature, many are not well suited to the sequential nature of the data. Results: Here, we present a method that allows us to infer regulatory network structures that may vary between time points, using a set of hidden states that describe the network structure at a given time point. To model the distribution of the hidden states, we have applied the Hierarchical Dirichlet Process Hidden Markov Model, a non-parametric extension of the traditional Hidden Markov Model, which does not require us to fix the number of hidden states in advance. We apply our method to existing microarray expression data as well as demonstrating its efficacy on simulated test data. Contact: thomas.thorne@imperial.ac.uk

Posted Content
TL;DR: In this article, the explicit-duration HDP hidden semi-Markov model (HDP-HSMM) is proposed, extending the Hierarchical Dirichlet Process Hidden Markov Model to learn non-geometric state durations.
Abstract: There is much interest in the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM) as a natural Bayesian nonparametric extension of the traditional HMM. However, in many settings the HDP-HMM's strict Markovian constraints are undesirable, particularly if we wish to learn or encode non-geometric state durations. We can extend the HDP-HMM to capture such structure by drawing upon explicit-duration semi-Markovianity, which has been developed in the parametric setting to allow construction of highly interpretable models that admit natural prior information on state durations. In this paper we introduce the explicit-duration HDP-HSMM and develop posterior sampling algorithms for efficient inference in both the direct-assignment and weak-limit approximation settings. We demonstrate the utility of the model and our inference methods on synthetic data as well as experiments on a speaker diarization problem and an example of learning the patterns in Morse code.
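
To see what explicit durations buy, here is a toy generator for a semi-Markov state sequence with Poisson (hence non-geometric) dwell times; the duration family and names are illustrative stand-ins for the paper's priors:

    import numpy as np

    def sample_hsmm_states(T, pi0, A, dur_rate, rng=None):
        # pi0: initial state distribution; A: transition matrix whose rows
        # place no mass on self-transitions; dur_rate: per-state Poisson
        # rate controlling how long the chain dwells in each state.
        rng = rng or np.random.default_rng(0)
        s = int(rng.choice(len(pi0), p=pi0))
        states = []
        while len(states) < T:
            d = 1 + rng.poisson(dur_rate[s])   # explicit state duration
            states.extend([s] * d)
            s = int(rng.choice(len(A), p=A[s]))
        return np.array(states[:T])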

Proceedings Article
03 Dec 2012
TL;DR: A variety of NB processes with distinct sharing mechanisms are constructed and applied to topic modeling, with connections to existing algorithms, showing the importance of inferring both the NB dispersion and probability parameters.
Abstract: By developing data augmentation methods unique to the negative binomial (NB) distribution, we unite seemingly disjoint count and mixture models under the NB process framework. We develop fundamental properties of the models and derive efficient Gibbs sampling inference. We show that the gamma-NB process can be reduced to the hierarchical Dirichlet process with normalization, highlighting its unique theoretical, structural and computational advantages. A variety of NB processes with distinct sharing mechanisms are constructed and applied to topic modeling, with connections to existing algorithms, showing the importance of inferring both the NB dispersion and probability parameters.

Proceedings ArticleDOI
16 Jun 2012
TL;DR: This paper proposes a novel online method based on the “sticky” hierarchical Dirichlet process and the hidden Markov model that provides joint segmentation and classification of actions while processing the data in an online, recursive manner, discovering new classes as they occur, and adjusting its parameters over the streaming data.
Abstract: Since its inception, action recognition research has mainly focused on recognizing actions from closed, predefined sets of classes. Conversely, the problem of recognizing actions from open, possibly incremental sets of classes is still largely unexplored. In this paper, we propose a novel online method based on the “sticky” hierarchical Dirichlet process and the hidden Markov model [11, 5]. This approach, labelled as the online HDP-HMM, provides joint segmentation and classification of actions while a) processing the data in an online, recursive manner, b) discovering new classes as they occur, and c) adjusting its parameters over the streaming data. In a set of experiments, we have applied the online HDP-HMM to recognize actions from motion capture data from the TUM kitchen dataset, a challenging dataset of manipulation actions in a kitchen [12]. The results show significant accuracy in action classification, time segmentation and determination of the number of action classes.

Proceedings Article
26 Jun 2012
TL;DR: The MLC-HDP model is the first in the epilepsy literature capable of clustering seizures within and between patients and is found to be comparable to independent human physician clusterings.
Abstract: Driven by the multi-level structure of human intracranial electroencephalogram (iEEG) recordings of epileptic seizures, we introduce a new variant of a hierarchical Dirichlet Process--the multi-level clustering hierarchical Dirichlet Process (MLC-HDP)--that simultaneously clusters datasets on multiple levels. Our seizure dataset contains brain activity recorded in typically more than a hundred individual channels for each seizure of each patient. The MLC-HDP model clusters over channel types, seizure types, and patient types simultaneously. We describe this model and its implementation in detail. We also present the results of a simulation study comparing the MLC-HDP to a similar model, the Nested Dirichlet Process, and finally demonstrate the MLC-HDP's use in modeling seizures across multiple patients. We find the MLC-HDP's clustering to be comparable to independent human physician clusterings. To our knowledge, the MLC-HDP model is the first in the epilepsy literature capable of clustering seizures within and between patients.

Proceedings Article
Jiwei Li, Sujian Li, Xun Wang, Ye Tian, Baobao Chang 
01 Dec 2012
TL;DR: This paper borrows the idea of evolutionary clustering and proposes a three-level HDP model named h-uHDP, which reveals the diversity and commonality between aspects discovered from two different epochs (i.e. epoch history and epoch update).
Abstract: Update summarization is a new challenge which combines salience ranking with novelty detection. Previous research usually converts novelty detection to the problem of redundancy removal or salience re-ranking, and seldom explores the birth, splitting, merging and death of aspects for a given topic. In this paper, we borrow the idea of evolutionary clustering and propose a three-level HDP model named h-uHDP, which reveals the diversity and commonality between aspects discovered from two different epochs (i.e. epoch history and epoch update). Specifically, we strengthen sentence-level modeling in the h-uHDP model to fit the sentence-extraction-based framework. Automatic and manual evaluations on TAC data demonstrate the effectiveness of our update summarization algorithm, especially on the novelty criterion.

Journal ArticleDOI
TL;DR: Two generative models are developed that substantially advance the literature on evolutionary clustering, in the sense that not only do they both perform better than those in the existing literature, but more importantly, they are capable of automatically learning the cluster numbers and explicitly addressing the corresponding issues.
Abstract: This article studies evolutionary clustering, a recently emerged hot topic with many important applications, notably in dynamic social network analysis. In this article, based on the recent literature on nonparametric Bayesian models, we have developed two generative models: DPChain and HDP-HTM. DPChain is derived from the Dirichlet process mixture (DPM) model, with an exponentially decaying component along the time. HDP-HTM combines the hierarchical Dirichlet process (HDP) with a hierarchical transition matrix (HTM) based on the proposed infinite hierarchical Markov state model (iHMS). Both models substantially advance the literature on evolutionary clustering, in the sense that not only do they both perform better than those in the existing literature, but more importantly, they are capable of automatically learning the cluster numbers and explicitly addressing the corresponding issues. Extensive evaluations have demonstrated the effectiveness and the promise of these two solutions compared to the state-of-the-art literature.