
Showing papers on "Hierarchical Dirichlet process published in 2012"


Journal ArticleDOI
03 Dec 2012
TL;DR: This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of multi-view models and topic models, including latent Dirichlet allocation (LDA).
Abstract: Topic modeling is a generalization of clustering that posits that observations (words in a document) are generated by multiple latent factors (topics), as opposed to just one. The increased representational power comes at the cost of a more challenging unsupervised learning problem for estimating the topic-word distributions when only words are observed, and the topics are hidden. This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of multi-view models and topic models, including latent Dirichlet allocation (LDA). For LDA, the procedure correctly recovers both the topic-word distributions and the parameters of the Dirichlet prior over the topic mixtures, using only trigram statistics (i.e., third order moments, which may be estimated with documents containing just three words). The method is based on an efficiently computable orthogonal tensor decomposition of low-order moments.
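
The core computational object here is a symmetric third-order moment tensor whose robust eigenvectors correspond to topics. Below is a minimal sketch of the tensor power update only (the full method also whitens the moments and deflates recovered components); the function name and defaults are illustrative:

    import numpy as np

    def tensor_power_iteration(T, n_iter=100, rng=None):
        # Power update v <- T(I, v, v) / ||T(I, v, v)|| on a symmetric
        # 3-tensor T, recovering a single eigenvalue/eigenvector pair.
        rng = rng or np.random.default_rng(0)
        v = rng.standard_normal(T.shape[0])
        v /= np.linalg.norm(v)
        for _ in range(n_iter):
            v = np.einsum('ijk,j,k->i', T, v, v)
            v /= np.linalg.norm(v)
        lam = np.einsum('ijk,i,j,k->', T, v, v, v)  # eigenvalue T(v, v, v)
        return lam, v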

271 citations


Posted Content
TL;DR: This article develops stochastic variational inference, a scalable algorithm for approximating posterior distributions for a large class of probabilistic models, including the hierarchical Dirichlet process topic model.
Abstract: We develop stochastic variational inference, a scalable algorithm for approximating posterior distributions. We develop this technique for a large class of probabilistic models and we demonstrate it with two probabilistic topic models, latent Dirichlet allocation and the hierarchical Dirichlet process topic model. Using stochastic variational inference, we analyze several large collections of documents: 300K articles from Nature, 1.8M articles from The New York Times, and 3.8M articles from Wikipedia. Stochastic inference can easily handle data sets of this size and outperforms traditional variational inference, which can only handle a smaller subset. (We also show that the Bayesian nonparametric topic model outperforms its parametric counterpart.) Stochastic variational inference lets us apply complex Bayesian models to massive data sets.
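
The pattern the abstract describes can be sketched generically: subsample one document, optimize its local variational parameters, form the global update the document would imply if the whole corpus looked like it, and blend it in with a decaying step size. A hedged sketch, with `local_step` and `intermediate_global` standing in for the model-specific computations:

    import numpy as np

    def svi(docs, lam0, local_step, intermediate_global,
            tau=1.0, kappa=0.7, n_steps=10000, rng=None):
        # lam holds the global variational parameters (e.g., topic-word
        # Dirichlet parameters); rho follows a Robbins-Monro schedule.
        rng = rng or np.random.default_rng(0)
        lam = lam0.copy()
        for t in range(n_steps):
            d = docs[rng.integers(len(docs))]            # sample a document
            phi = local_step(d, lam)                     # local variables
            lam_hat = intermediate_global(d, phi, len(docs))
            rho = (t + tau) ** (-kappa)                  # decaying step size
            lam = (1.0 - rho) * lam + rho * lam_hat      # noisy natural-gradient step
        return lam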

119 citations


Journal ArticleDOI
TL;DR: In this article, the authors derive a stick-breaking representation for the beta process directly from its characterization as a completely random measure, which motivates a three-parameter generalization of the beta process.
Abstract: The beta-Bernoulli process provides a Bayesian nonparametric prior for models involving collections of binary-valued features. A draw from the beta process yields an infinite collection of probabilities in the unit interval, and a draw from the Bernoulli process turns these into binary-valued features. Recent work has provided stick-breaking representations for the beta process analogous to the well-known stick-breaking representation for the Dirichlet process. We derive one such stick-breaking representation directly from the characterization of the beta process as a completely random measure. This approach motivates a three-parameter generalization of the beta process, and we study the power laws that can be obtained from this generalized beta process. We present a posterior inference algorithm for the beta-Bernoulli process that exploits the stick-breaking representation, and we present experimental results for a discrete factor-analysis model.
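
A round-indexed stick-breaking construction of this kind is easy to simulate: in round i, a Poisson(γ) number of atoms each receive the i-th break of their own Beta(1, α) stick. A minimal simulation sketch (parameter names and defaults illustrative; the paper's three-parameter generalization modifies the Beta draws):

    import numpy as np

    def beta_process_weights(gamma=4.0, alpha=1.0, n_rounds=25, rng=None):
        # Atom masses in (0, 1) from a round-indexed stick-breaking scheme.
        rng = rng or np.random.default_rng(0)
        weights = []
        for i in range(1, n_rounds + 1):
            for _ in range(rng.poisson(gamma)):     # atoms born in round i
                v = rng.beta(1.0, alpha, size=i)    # i independent breaks
                weights.append(v[-1] * np.prod(1.0 - v[:-1]))
        return np.array(weights)

    w = beta_process_weights()
    features = np.random.default_rng(1).random(w.size) < w  # Bernoulli process draw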

94 citations


Proceedings Article
03 Dec 2012
TL;DR: This paper considers a nonparametric topic model based on the hierarchical Dirichlet process, and develops a novel online variational inference algorithm based on split-merge topic updates that can achieve substantially better predictions of test data than conventional online and batch variational algorithms.
Abstract: Variational methods provide a computationally scalable alternative to Monte Carlo methods for large-scale, Bayesian nonparametric learning. In practice, however, conventional batch and online variational methods quickly become trapped in local optima. In this paper, we consider a nonparametric topic model based on the hierarchical Dirichlet process (HDP), and develop a novel online variational inference algorithm based on split-merge topic updates. We derive a simpler and faster variational approximation of the HDP, and show that by intelligently splitting and merging components of the variational posterior, we can achieve substantially better predictions of test data than conventional online and batch variational algorithms. For streaming analysis of large datasets where batch analysis is infeasible, we show that our split-merge updates better capture the nonparametric properties of the underlying model, allowing continual learning of new topics.
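
To give a feel for the split-merge idea (the acceptance decision in the paper compares variational bounds and is model-specific), one natural merge proposal picks the most similar pair of topics and pools their sufficient statistics; everything below is an illustrative heuristic, not the paper's algorithm:

    import numpy as np

    def propose_merge(topic_word_counts):
        # Choose the most similar topic pair by cosine similarity and
        # return the merged statistics; acceptance would then compare the
        # variational objective before and after the merge.
        C = np.asarray(topic_word_counts, dtype=float)
        U = C / np.linalg.norm(C, axis=1, keepdims=True)
        S = U @ U.T
        np.fill_diagonal(S, -np.inf)
        j, k = np.unravel_index(np.argmax(S), S.shape)
        merged = C.copy()
        merged[j] += merged[k]
        return (j, k), np.delete(merged, k, axis=0)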

92 citations


Journal ArticleDOI
TL;DR: It is shown that, under mild conditions on the copula functions, the versions in which only the support points or only the weights depend on predictors have full weak support.
Abstract: We study the support properties of Dirichlet process-based models for sets of predictor-dependent probability distributions. Exploiting the connection between copulas and stochastic processes, we provide an alternative definition of MacEachern's dependent Dirichlet processes. Based on this definition, we provide sufficient conditions for the full weak support of different versions of the process. In particular, we show that, under mild conditions on the copula functions, the versions in which only the support points or only the weights depend on predictors have full weak support. In addition, we characterize the Hellinger and Kullback-Leibler support of mixtures induced by the different versions of the dependent Dirichlet process. A generalization of the results to the general class of dependent stick-breaking processes is also provided.
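
For reference, MacEachern's construction being analyzed has the schematic form

    G_x = \sum_{i=1}^{\infty} w_i(x)\,\delta_{\theta_i(x)}, \qquad
    w_i(x) = v_i(x) \prod_{l<i} \bigl(1 - v_l(x)\bigr),

where the "single weights" version keeps w_i(x) constant in the predictor x and the "single atoms" version keeps θ_i(x) constant; the copula construction in the paper supplies the joint law of the stick variables v_i(x) and atoms θ_i(x) across predictor values.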

82 citations


Posted Content
TL;DR: A nonparametric Bayesian prior for PAM is proposed based on a variant of the hierarchical Dirichlet process (HDP), and it is shown that nonparametric PAM achieves performance matching the best of PAM without manually tuning the number of topics.
Abstract: Recent advances in topic models have explored complicated structured distributions to represent topic correlation. For example, the pachinko allocation model (PAM) captures arbitrary, nested, and possibly sparse correlations between topics using a directed acyclic graph (DAG). While PAM provides more flexibility and greater expressive power than previous models like latent Dirichlet allocation (LDA), it is also more difficult to determine the appropriate topic structure for a specific dataset. In this paper, we propose a nonparametric Bayesian prior for PAM based on a variant of the hierarchical Dirichlet process (HDP). Although the HDP can capture topic correlations defined by nested data structure, it does not automatically discover such correlations from unstructured data. By assuming an HDP-based prior for PAM, we are able to learn both the number of topics and how the topics are correlated. We evaluate our model on synthetic and real-world text datasets, and show that nonparametric PAM achieves performance matching the best of PAM without manually tuning the number of topics.

79 citations


Journal ArticleDOI
TL;DR: This work presents an unsupervised algorithm for learning finite mixture models from multivariate positive data and develops an approach, based on the minimum message length (MML) criterion, to select the optimal number of clusters to represent the data using such a mixture.
Abstract: In this work we present an unsupervised algorithm for learning finite mixture models from multivariate positive data. Indeed, this kind of data appears naturally in many applications, yet it has not been adequately addressed in the past. This mixture model is based on the inverted Dirichlet distribution, which offers a good representation and modeling of positive non-Gaussian data. The proposed approach for estimating the parameters of an inverted Dirichlet mixture is based on maximum likelihood (ML) using the Newton-Raphson method. We also develop an approach, based on the minimum message length (MML) criterion, to select the optimal number of clusters to represent the data using such a mixture. Experimental results are presented using artificial histograms and real data sets. The challenging problem of software module classification is also investigated within the proposed statistical framework.
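
For concreteness, the inverted Dirichlet log-density that the ML step optimizes can be evaluated directly; this is a minimal sketch using the standard density for a positive vector x of length d with parameters α₁, …, α_{d+1}:

    import numpy as np
    from scipy.special import gammaln

    def inverted_dirichlet_logpdf(x, alpha):
        # x: positive vector of length d; alpha: parameters, length d + 1.
        a_sum = np.sum(alpha)
        return (gammaln(a_sum) - np.sum(gammaln(alpha))
                + np.sum((alpha[:-1] - 1.0) * np.log(x))
                - a_sum * np.log1p(np.sum(x)))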

78 citations


Proceedings Article
03 Dec 2012
TL;DR: This paper derives novel clustering algorithms from the asymptotic limit of the DP and HDP mixtures that feature the scalability of existing hard clustering methods as well as the flexibility of Bayesian nonparametric models.
Abstract: Sampling and variational inference techniques are two standard methods for inference in probabilistic models, but for many problems, neither approach scales effectively to large-scale data. An alternative is to relax the probabilistic model into a non-probabilistic formulation which has a scalable associated algorithm. This can often be fulfilled by performing small-variance asymptotics, i.e., letting the variance of particular distributions in the model go to zero. For instance, in the context of clustering, such an approach yields connections between the k-means and EM algorithms. In this paper, we explore small-variance asymptotics for exponential family Dirichlet process (DP) and hierarchical Dirichlet process (HDP) mixture models. Utilizing connections between exponential family distributions and Bregman divergences, we derive novel clustering algorithms from the asymptotic limit of the DP and HDP mixtures that feature the scalability of existing hard clustering methods as well as the flexibility of Bayesian nonparametric models. We focus on special cases of our analysis for discrete-data problems, including topic modeling, and we demonstrate the utility of our results by applying variants of our algorithms to problems arising in vision and document analysis.
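
In the DP-mixture case with Gaussian likelihoods, the small-variance limit gives a k-means-like procedure that opens a new cluster whenever every existing center is farther than a penalty λ (the general analysis swaps squared Euclidean distance for a Bregman divergence). A minimal sketch of this special case:

    import numpy as np

    def dp_means(X, lam, n_iter=50):
        # Hard clustering from the small-variance limit of a DP mixture;
        # lam is the squared-distance cost of creating a new cluster.
        centers = [X.mean(axis=0)]
        for _ in range(n_iter):
            assign = []
            for x in X:
                d2 = [np.sum((x - c) ** 2) for c in centers]
                if min(d2) > lam:
                    centers.append(x.copy())          # open a new cluster
                    assign.append(len(centers) - 1)
                else:
                    assign.append(int(np.argmin(d2)))
            assign = np.array(assign)
            centers = [X[assign == k].mean(axis=0) if np.any(assign == k)
                       else centers[k] for k in range(len(centers))]
        return np.array(centers), assign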

73 citations


Proceedings Article
01 Jan 2012
TL;DR: This work presents a truncation-free online variational inference algorithm for Bayesian nonparametric models that adapts model complexity on the fly and shows better performance than previous online variational inference algorithms.
Abstract: We present a truncation-free online variational inference algorithm for Bayesian nonparametric models. Unlike traditional (online) variational inference algorithms that require truncations for the model or the variational distribution, our method adapts model complexity on the fly. Our experiments for Dirichlet process mixture models and hierarchical Dirichlet process topic models on two large-scale data sets show better performance than previous online variational inference algorithms.

63 citations


Journal ArticleDOI
TL;DR: The discrete infinite logistic normal distribution (DILN), as mentioned in this paper, generalizes the hierarchical Dirichlet process (HDP) to model correlation structure between the weights of the atoms at the group level.
Abstract: We present the discrete infinite logistic normal distribution (DILN), a Bayesian nonparametric prior for mixed membership models. DILN generalizes the hierarchical Dirichlet process (HDP) to model correlation structure between the weights of the atoms at the group level. We derive a representation of DILN as a normalized collection of gamma-distributed random variables and study its statistical properties. We derive a variational inference algorithm for approximate posterior inference. We apply DILN to topic modeling of documents and study its empirical performance on four corpora, comparing performance with the HDP and the correlated topic model (CTM). To compute with large-scale data, we develop a stochastic variational inference algorithm for DILN and compare with similar algorithms for HDP and latent Dirichlet allocation (LDA) on a collection of 350,000 articles from Nature.
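
The normalized-gamma representation the abstract mentions is easy to sketch: group-level topic weights are gamma variables whose means are modulated by a correlated Gaussian factor, then normalized. Everything below (the top-level weights, the covariance, the shape scaling) is an illustrative stand-in for the paper's exact construction:

    import numpy as np

    rng = np.random.default_rng(1)
    K = 10
    beta = rng.dirichlet(np.ones(K))          # top-level proportions (stand-in)
    Sigma = 0.5 * np.eye(K) + 0.5             # toy between-topic covariance
    u = rng.multivariate_normal(np.zeros(K), Sigma)
    g = rng.gamma(shape=5.0 * beta, scale=np.exp(u))  # correlated gamma atoms
    pi = g / g.sum()                          # one group's topic proportions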

53 citations


Posted Content
TL;DR: It is found that split-merge MCMC for the HDP can provide significant improvements over traditional Gibbs sampling, and some understanding of the data properties that give rise to larger improvements is given.
Abstract: The hierarchical Dirichlet process (HDP) has become an important Bayesian nonparametric model for grouped data, such as document collections. The HDP is used to construct a flexible mixed-membership model where the number of components is determined by the data. As for most Bayesian nonparametric models, exact posterior inference is intractable---practitioners use Markov chain Monte Carlo (MCMC) or variational inference. Inspired by the split-merge MCMC algorithm for the Dirichlet process (DP) mixture model, we describe a novel split-merge MCMC sampling algorithm for posterior inference in the HDP. We study its properties on both synthetic data and text corpora. We find that split-merge MCMC for the HDP can provide significant improvements over traditional Gibbs sampling, and we give some understanding of the data properties that give rise to larger improvements.
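
The moves are standard Metropolis-Hastings proposals over clustering configurations: a split of configuration c into c^split is accepted with probability

    a(c \to c^{\mathrm{split}})
      = \min\!\left(1,\;
        \frac{p(c^{\mathrm{split}} \mid \mathbf{x})\, q(c \mid c^{\mathrm{split}})}
             {p(c \mid \mathbf{x})\, q(c^{\mathrm{split}} \mid c)}\right),

where q is the proposal density (in the DP antecedent, built from a restricted Gibbs scan); the contribution here is constructing valid proposals for the HDP's two-level structure.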

Journal ArticleDOI
TL;DR: Finite generalized Dirichlet mixture models are extended to the infinite case in which the number of components and relevant features do not need to be known a priori, which provides a natural representation of uncertainty regarding the challenging problem of model selection.
Abstract: Mixture modeling is one of the most useful tools in machine learning and data mining applications. An important challenge when applying finite mixture models is the selection of the number of clusters which best describes the data. Recent developments have shown that this problem can be handled by the application of non-parametric Bayesian techniques to mixture modeling. Another crucial preprocessing step in mixture learning is the selection of the most relevant features. The main approach in this paper, to tackle these problems, consists of storing the knowledge in a generalized Dirichlet mixture model by applying non-parametric Bayesian estimation and inference techniques. Specifically, we extend finite generalized Dirichlet mixture models to the infinite case, in which the number of components and relevant features do not need to be known a priori. This extension provides a natural representation of uncertainty regarding the challenging problem of model selection. We propose a Markov chain Monte Carlo algorithm to learn the resulting infinite mixture. Through applications involving text and image categorization, we show that infinite mixture models offer a more powerful and robust performance than classic finite mixtures for both clustering and feature selection.

Posted Content
TL;DR: In this article, the gamma-NB process is shown to reduce to the hierarchical Dirichlet process under normalization, highlighting its unique theoretical, structural and computational advantages, and a variety of NB processes with distinct sharing mechanisms are constructed and applied to topic modeling.
Abstract: By developing data augmentation methods unique to the negative binomial (NB) distribution, we unite seemingly disjoint count and mixture models under the NB process framework. We develop fundamental properties of the models and derive efficient Gibbs sampling inference. We show that the gamma-NB process can be reduced to the hierarchical Dirichlet process with normalization, highlighting its unique theoretical, structural and computational advantages. A variety of NB processes with distinct sharing mechanisms are constructed and applied to topic modeling, with connections to existing algorithms, showing the importance of inferring both the NB dispersion and probability parameters.
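
One elementary identity behind this framework is that a negative binomial count is a gamma-mixed Poisson,

    m \sim \mathrm{NB}(r, p)
    \;\Longleftrightarrow\;
    m \mid \lambda \sim \mathrm{Poisson}(\lambda), \quad
    \lambda \sim \mathrm{Gamma}\!\bigl(r,\, p/(1-p)\bigr)

(shape-scale parameterization), which is what allows gamma and Dirichlet process priors to be threaded through count models; the paper develops further augmentations specific to the NB distribution.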

Journal ArticleDOI
TL;DR: A Bayesian nonparametric model for clustering partial ranking data is proposed, based on the theory of random atomic measures; the posterior distribution given data is characterised, and a simple and effective Gibbs sampler is derived for posterior simulation.
Abstract: In this paper we propose a Bayesian nonparametric model for clustering partial ranking data. We start by developing a Bayesian nonparametric extension of the popular Plackett-Luce choice model that can handle an infinite number of choice items. Our framework is based on the theory of random atomic measures, with the prior specified by a completely random measure. We characterise the posterior distribution given data, and derive a simple and effective Gibbs sampler for posterior simulation. We then develop a Dirichlet process mixture extension of our model and apply it to investigate the clustering of preferences for college degree programmes amongst Irish secondary school graduates. The existence of clusters of applicants who have similar preferences for degree programmes is established and we determine that subject matter and geographical location of the third level institution characterise these clusters.
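
The Plackett-Luce model being extended gives each item a positive weight and builds a ranking by successive choices among the items not yet chosen. A minimal log-likelihood for one partial ranking in the finite-item case (the paper's contribution is the extension to infinitely many items):

    import numpy as np

    def plackett_luce_loglik(ranking, w):
        # ranking: item indices, best first; w: positive item weights.
        w = np.asarray(w, dtype=float)
        remaining = w.sum()
        ll = 0.0
        for item in ranking:
            ll += np.log(w[item]) - np.log(remaining)
            remaining -= w[item]   # remove the chosen item's weight
        return ll

    print(plackett_luce_loglik([2, 0], [0.5, 1.0, 2.0]))  # top-2 partial ranking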

Proceedings Article
03 Dec 2012
TL;DR: This work presents a truncation-free stochastic variational inference algorithm for Bayesian nonparametric models that adapts model complexity on the fly and performs better than previous stochastic variational inference algorithms.
Abstract: We present a truncation-free stochastic variational inference algorithm for Bayesian nonparametric models. While traditional variational inference algorithms require truncations for the model or the variational distribution, our method adapts model complexity on the fly. We studied our method with Dirichlet process mixture models and hierarchical Dirichlet process topic models on two large data sets. Our method performs better than previous stochastic variational inference algorithms.
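
A cartoon of the truncation-free idea, in Chinese-restaurant-process terms: assignments are sampled with an explicit "new component" option, and the variational representation grows only when that option is drawn. The interface below is illustrative, not the paper's algorithm:

    import numpy as np

    def sample_assignment(counts, lik_existing, lik_new, alpha, rng):
        # counts: sizes of instantiated components; lik_existing: data
        # likelihood under each; lik_new: likelihood under the prior.
        p = np.append(np.asarray(counts, float) * np.asarray(lik_existing),
                      alpha * lik_new)
        p /= p.sum()
        k = int(rng.choice(p.size, p=p))
        return k  # k == len(counts) signals "instantiate a new component"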

Journal ArticleDOI
TL;DR: Through applications in blind signal detection and image segmentation, it is shown that the iSMM possesses the advantages of both the Student's t-distribution and the Dirichlet process mixture, offering a more powerful and robust performance than competing models.

Journal ArticleDOI
TL;DR: This paper develops combinatorial stochastic process representations for the Bayesian nonparametric feature modeling problem, analogous to those for clustering; these include the beta process and the Indian buffet process, as well as new representations that provide insight into the connections between these processes.
Abstract: One of the focal points of the modern literature on Bayesian nonparametrics has been the problem of clustering, or partitioning, where each data point is modeled as being associated with one and only one of some collection of groups called clusters or partition blocks. Underlying these Bayesian nonparametric models are a set of interrelated stochastic processes, most notably the Dirichlet process and the Chinese restaurant process. In this paper we provide a formal development of an analogous problem, called feature modeling, for associating data points with arbitrary nonnegative integer numbers of groups, now called features or topics. We review the existing combinatorial stochastic process representations for the clustering problem and develop analogous representations for the feature modeling problem. These representations include the beta process and the Indian buffet process as well as new representations that provide insight into the connections between these processes. We thereby bring the same level of completeness to the treatment of Bayesian nonparametric feature modeling that has previously been achieved for Bayesian nonparametric clustering.
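
The Indian buffet process discussed here has a simple generative recipe: customer n takes each previously sampled dish k with probability m_k / n (m_k being the dish's popularity so far), then samples Poisson(α/n) new dishes. A direct simulation of the resulting binary feature matrix:

    import numpy as np

    def indian_buffet(n_customers, alpha, rng=None):
        rng = rng or np.random.default_rng(0)
        dish_counts, rows = [], []
        for n in range(1, n_customers + 1):
            row = [1 if rng.random() < m / n else 0 for m in dish_counts]
            for k, z in enumerate(row):
                dish_counts[k] += z
            new = rng.poisson(alpha / n)        # brand-new dishes
            row += [1] * new
            dish_counts += [1] * new
            rows.append(row)
        K = len(dish_counts)
        return np.array([r + [0] * (K - len(r)) for r in rows])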

Journal ArticleDOI
TL;DR: A generative model for hyperspectral images is proposed in which the abundances are sampled from a Dirichlet distribution (DD) mixture model whose parameters depend on a latent label process; the resulting directed Markov structure facilitates the use of scale-recursive estimation algorithms.
Abstract: This paper is concerned with joint Bayesian endmember extraction and linear unmixing of hyperspectral images using a spatial prior on the abundance vectors. We propose a generative model for hyperspectral images in which the abundances are sampled from a Dirichlet distribution (DD) mixture model, whose parameters depend on a latent label process. The label process is then used to enforce a spatial prior that encourages adjacent pixels to have the same label. A Gibbs sampling framework is used to generate samples from the posterior distributions of the abundances and the parameters of the DD mixture model. The spatial prior that is used is a tree-structured sticky hierarchical Dirichlet process (SHDP) and, when used to determine the posterior endmember and abundance distributions, results in a new unmixing algorithm called spatially constrained unmixing (SCU). The directed Markov model facilitates the use of scale-recursive estimation algorithms, and is therefore more computationally efficient as compared to standard Markov random field (MRF) models. Furthermore, the proposed SCU algorithm estimates the number of regions in the image in an unsupervised fashion. The effectiveness of the proposed SCU algorithm is illustrated using synthetic and real data.
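
For reference, the linear mixing model underlying the unmixing task writes each pixel spectrum as a convex combination of endmember signatures plus noise:

    \mathbf{y}_i = \mathbf{M}\,\mathbf{a}_i + \mathbf{n}_i, \qquad
    a_{ij} \ge 0, \quad \sum_{j} a_{ij} = 1,

where the columns of M are the endmember spectra and the abundance vectors a_i receive the label-dependent Dirichlet mixture prior described above.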

Journal ArticleDOI
TL;DR: A simple, yet efficient, procedure for approximating the Lévy measure of a Gamma(α,1) random variable is described and a finite sum-representation is derived that converges almost surely to Ferguson's representation of the Dirichlet process.

Posted Content
TL;DR: In this paper, an HMM with a countably infinite state space, recast in the hierarchical Dirichlet process (HDP) framework and hence called the HDP-HMM, is proposed for clustering gene expression time course data.
Abstract: Most existing approaches to clustering gene expression time course data treat the different time points as independent dimensions and are invariant to permutations, such as reversal, of the experimental time course. Approaches utilizing HMMs have been shown to be helpful in this regard, but are hampered by having to choose model architectures with appropriate complexities. Here we propose for a clustering application an HMM with a countably infinite state space; inference in this model is possible by recasting it in the hierarchical Dirichlet process (HDP) framework (Teh et al. 2006), and hence we call it the HDP-HMM. We show that the infinite model outperforms model selection methods over finite models, and traditional time-independent methods, as measured by a variety of external and internal indices for clustering on two large publicly available data sets. Moreover, we show that the infinite models utilize more hidden states and employ richer architectures (e.g. state-to-state transitions) without the damaging effects of overfitting.

Journal ArticleDOI
TL;DR: In this paper, a new prior process, called a beta-Dirichlet process, is introduced for the cumulative intensity functions and is proved to be conjugate; the prior is further applied in a Bayesian semiparametric regression model.
Abstract: Bayesian analysis of a finite state Markov process, which is popularly used to model multistate event history data, is considered. A new prior process, called a beta-Dirichlet process, is introduced for the cumulative intensity functions and is proved to be conjugate. In addition, the beta-Dirichlet prior is applied to a Bayesian semiparametric regression model. To illustrate the application of the proposed model, we analyse a dataset of credit histories.

Proceedings ArticleDOI
12 Aug 2012
TL;DR: A novel collapsed variational Bayes (CVB) inference for the hierarchical Dirichlet process (HDP) is proposed, which is simple to implement, does not require variance counts to be maintained, does not need hyper-parameters to be set, and has good predictive performance.
Abstract: We propose a novel collapsed variational Bayes (CVB) inference for the hierarchical Dirichlet process (HDP). While the existing CVB inference for the HDP variant of latent Dirichlet allocation (LDA) is more complicated and harder to implement than that for LDA, the proposed algorithm is simple to implement, does not require variance counts to be maintained, does not need hyper-parameters to be set, and has good predictive performance.
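
For flavor, the zeroth-order collapsed variational (CVB0) update, shown here for plain LDA rather than the paper's HDP variant, maintains only expected counts (no variances); array names are illustrative:

    import numpy as np

    def cvb0_update(gamma_dn, Ndk, Nkw_w, Nk, alpha, beta, V):
        # gamma_dn: current responsibilities of one token over K topics.
        # Ndk, Nkw_w, Nk: expected counts (document-topic, topic-word for
        # this token's word type, topic totals), all including gamma_dn,
        # which is subtracted out before the update.
        g = ((Ndk - gamma_dn + alpha)
             * (Nkw_w - gamma_dn + beta)
             / (Nk - gamma_dn + V * beta))
        return g / g.sum()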

Posted Content
TL;DR: A method is presented to infer regulatory network structures that may vary between time points, using a set of hidden states that describe the network structure at a given time point, modeled with the Hierarchical Dirichlet Process Hidden Markov Model.
Abstract: When analysing gene expression time series data an often overlooked but crucial aspect of the model is that the regulatory network structure may change over time. Whilst some approaches have addressed this problem previously in the literature, many are not well suited to the sequential nature of the data. Here we present a method that allows us to infer regulatory network structures that may vary between time points, utilising a set of hidden states that describe the network structure at a given time point. To model the distribution of the hidden states we have applied the Hierarchical Dirichlet Process Hidden Markov Model, a nonparametric extension of the traditional Hidden Markov Model, that does not require us to fix the number of hidden states in advance. We apply our method to existing microarray expression data as well as demonstrating its efficacy on simulated test data.

Journal ArticleDOI
TL;DR: In this paper, Thorne et al. present a method for inferring regulatory network structures that may vary between time points, using a set of hidden states that describe the network structure at a given time point.
Abstract: Motivation: When analysing gene expression time series data, an often overlooked but crucial aspect of the model is that the regulatory network structure may change over time. Although some approaches have addressed this problem previously in the literature, many are not well suited to the sequential nature of the data. Results: Here, we present a method that allows us to infer regulatory network structures that may vary between time points, using a set of hidden states that describe the network structure at a given time point. To model the distribution of the hidden states, we have applied the Hierarchical Dirichlet Process Hidden Markov Model, a non-parametric extension of the traditional Hidden Markov Model, which does not require us to fix the number of hidden states in advance. We apply our method to existing microarray expression data as well as demonstrating its efficacy on simulated test data. Contact: thomas.thorne@imperial.ac.uk

Posted Content
TL;DR: In this article, the explicit-duration HDP hidden semi-Markov model (HDP-HSMM) is proposed, extending the Hierarchical Dirichlet Process Hidden Markov Model to learn non-geometric state durations.
Abstract: There is much interest in the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM) as a natural Bayesian nonparametric extension of the traditional HMM. However, in many settings the HDP-HMM's strict Markovian constraints are undesirable, particularly if we wish to learn or encode non-geometric state durations. We can extend the HDP-HMM to capture such structure by drawing upon explicit-duration semi-Markovianity, which has been developed in the parametric setting to allow construction of highly interpretable models that admit natural prior information on state durations. In this paper we introduce the explicit-duration HDP-HSMM and develop posterior sampling algorithms for efficient inference in both the direct-assignment and weak-limit approximation settings. We demonstrate the utility of the model and our inference methods on synthetic data as well as experiments on a speaker diarization problem and an example of learning the patterns in Morse code.
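
To see what explicit durations buy, here is a toy generator for a semi-Markov state sequence with Poisson (hence non-geometric) dwell times; the duration family and names are illustrative stand-ins for the paper's priors:

    import numpy as np

    def sample_hsmm_states(T, pi0, A, dur_rate, rng=None):
        # pi0: initial state distribution; A: transition matrix whose rows
        # place no mass on self-transitions; dur_rate: per-state Poisson
        # rate controlling how long the chain dwells in each state.
        rng = rng or np.random.default_rng(0)
        s = int(rng.choice(len(pi0), p=pi0))
        states = []
        while len(states) < T:
            d = 1 + rng.poisson(dur_rate[s])   # explicit state duration
            states.extend([s] * d)
            s = int(rng.choice(len(A), p=A[s]))
        return np.array(states[:T])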

Proceedings Article
03 Dec 2012
TL;DR: A variety of NB processes with distinct sharing mechanisms are constructed and applied to topic modeling, with connections to existing algorithms, showing the importance of inferring both the NB dispersion and probability parameters.
Abstract: By developing data augmentation methods unique to the negative binomial (NB) distribution, we unite seemingly disjoint count and mixture models under the NB process framework. We develop fundamental properties of the models and derive efficient Gibbs sampling inference. We show that the gamma-NB process can be reduced to the hierarchical Dirichlet process with normalization, highlighting its unique theoretical, structural and computational advantages. A variety of NB processes with distinct sharing mechanisms are constructed and applied to topic modeling, with connections to existing algorithms, showing the importance of inferring both the NB dispersion and probability parameters.

Proceedings ArticleDOI
16 Jun 2012
TL;DR: This paper proposes a novel online method based on the “sticky” hierarchical Dirichlet process and the hidden Markov model that provides joint segmentation and classification of actions while processing the data in an online, recursive manner, discovering new classes as they occur, and adjusting its parameters over the streaming data.
Abstract: Since its inception, action recognition research has mainly focused on recognizing actions from closed, predefined sets of classes. Conversely, the problem of recognizing actions from open, possibly incremental sets of classes is still largely unexplored. In this paper, we propose a novel online method based on the “sticky” hierarchical Dirichlet process and the hidden Markov model [11, 5]. This approach, labelled as the online HDP-HMM, provides joint segmentation and classification of actions while a) processing the data in an online, recursive manner, b) discovering new classes as they occur, and c) adjusting its parameters over the streaming data. In a set of experiments, we have applied the online HDP-HMM to recognize actions from motion capture data from the TUM kitchen dataset, a challenging dataset of manipulation actions in a kitchen [12]. The results show significant accuracy in action classification, time segmentation and determination of the number of action classes.

Proceedings Article
26 Jun 2012
TL;DR: The MLC-HDP model is the first in the epilepsy literature capable of clustering seizures within and between patients and is found to be comparable to independent human physician clusterings.
Abstract: Driven by the multi-level structure of human intracranial electroencephalogram (iEEG) recordings of epileptic seizures, we introduce a new variant of a hierarchical Dirichlet Process--the multi-level clustering hierarchical Dirichlet Process (MLC-HDP)--that simultaneously clusters datasets on multiple levels. Our seizure dataset contains brain activity recorded in typically more than a hundred individual channels for each seizure of each patient. The MLC-HDP model clusters over channel types, seizure types, and patient types simultaneously. We describe this model and its implementation in detail. We also present the results of a simulation study comparing the MLC-HDP to a similar model, the Nested Dirichlet Process, and finally demonstrate the MLC-HDP's use in modeling seizures across multiple patients. We find the MLC-HDP's clustering to be comparable to independent human physician clusterings. To our knowledge, the MLC-HDP model is the first in the epilepsy literature capable of clustering seizures within and between patients.

Proceedings Article
Jiwei Li, Sujian Li, Xun Wang, Ye Tian, Baobao Chang 
01 Dec 2012
TL;DR: This paper borrows the idea of evolutionary clustering and proposes a three-level HDP model named h-uHDP, which reveals the diversity and commonality between aspects discovered from two different epochs (i.e. epoch history and epoch update).
Abstract: Update summarization is a new challenge which combines salience ranking with novelty detection. Previous research usually converts novelty detection to the problem of redundancy removal or salience re-ranking, and seldom explores the birth, splitting, merging and death of aspects for a given topic. In this paper, we borrow the idea of evolutionary clustering and propose a three-level HDP model named h-uHDP, which reveals the diversity and commonality between aspects discovered from two different epochs (i.e. epoch history and epoch update). Specifically, we strengthen sentence-level modeling in the h-uHDP model to fit the sentence-extraction-based framework. Automatic and manual evaluations on TAC data demonstrate the effectiveness of our update summarization algorithm, especially on the novelty criterion.

Journal ArticleDOI
TL;DR: Two generative models are developed that substantially advance the literature on evolutionary clustering, in the sense that not only do they both perform better than those in the existing literature, but more importantly, they are capable of automatically learning the cluster numbers and explicitly addressing the corresponding issues.
Abstract: This article studies evolutionary clustering, a recently emerged hot topic with many important applications, notably in dynamic social network analysis. In this article, based on the recent literature on nonparametric Bayesian models, we have developed two generative models: DPChain and HDP-HTM. DPChain is derived from the Dirichlet process mixture (DPM) model, with an exponentially decaying component along the time. HDP-HTM combines the hierarchical Dirichlet process (HDP) with a hierarchical transition matrix (HTM) based on the proposed infinite hierarchical Markov state model (iHMS). Both models substantially advance the literature on evolutionary clustering, in the sense that not only do they both perform better than those in the existing literature, but more importantly, they are capable of automatically learning the cluster numbers and explicitly addressing the corresponding issues. Extensive evaluations have demonstrated the effectiveness and the promise of these two solutions compared to the state-of-the-art literature.