Topic

Mixture model

About: Mixture model is a research topic. Over the lifetime, 18,155 publications have been published within this topic, receiving 588,317 citations.


Papers
Journal ArticleDOI
TL;DR: This article reviews Markov chain methods for sampling from the posterior distribution of a Dirichlet process mixture model and presents two new classes of methods, both of which can handle general models with non-conjugate priors.
Abstract: This article reviews Markov chain methods for sampling from the posterior distribution of a Dirichlet process mixture model and presents two new classes of methods. One new approach is to make Metropolis-Hastings updates of the indicators specifying which mixture component is associated with each observation, perhaps supplemented with a partial form of Gibbs sampling. The other new approach extends Gibbs sampling for these indicators by using a set of auxiliary parameters. These methods are simple to implement and are more efficient than previous ways of handling general Dirichlet process mixture models with non-conjugate priors.

2,320 citations
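The sketch below illustrates the auxiliary-parameter Gibbs sampling idea described above (in the spirit of the paper's "Algorithm 8") on a toy univariate Gaussian Dirichlet process mixture. The likelihood variance, base-measure variance, concentration parameter, and number of auxiliary components are illustrative assumptions, and a conjugate Gaussian base measure is used only so the per-component parameter refresh is an exact draw; the auxiliary-parameter construction itself does not require conjugacy.

```python
# Minimal sketch of auxiliary-parameter Gibbs sampling for a Dirichlet process
# mixture, in the spirit of the paper's "Algorithm 8". Illustrative assumptions:
# univariate Gaussian components with known variance SIGMA2, Gaussian base
# measure N(0, TAU2), concentration ALPHA, and M_AUX auxiliary components.
import numpy as np

rng = np.random.default_rng(0)
ALPHA, SIGMA2, TAU2, M_AUX = 1.0, 0.5, 4.0, 3

def log_lik(y_i, mu):
    return -0.5 * np.log(2 * np.pi * SIGMA2) - 0.5 * (y_i - mu) ** 2 / SIGMA2

def gibbs_sweep(y, z, phi):
    """One sweep over all observations plus a component-parameter refresh.
    z[i] is the component index of y[i]; phi maps index -> component mean."""
    for i in range(len(y)):
        old = z[i]
        z[i] = -1                                   # remove y[i] from its component
        counts = {c: np.sum(z == c) for c in set(z) if c != -1}
        singleton_phi = phi.pop(old, None) if counts.get(old, 0) == 0 else None
        # Auxiliary parameters drawn from the base measure; if y[i] was a
        # singleton, its old parameter occupies the first auxiliary slot.
        aux = [rng.normal(0.0, np.sqrt(TAU2)) for _ in range(M_AUX)]
        if singleton_phi is not None:
            aux[0] = singleton_phi
        # Existing components weighted by their counts, auxiliary ones by ALPHA / M_AUX.
        existing = list(counts)
        logw = np.array([np.log(counts[c]) + log_lik(y[i], phi[c]) for c in existing]
                        + [np.log(ALPHA / M_AUX) + log_lik(y[i], a) for a in aux])
        w = np.exp(logw - logw.max())
        choice = rng.choice(len(logw), p=w / w.sum())
        if choice < len(existing):                  # join an existing component
            z[i] = existing[choice]
        else:                                       # open a brand-new component
            new_label = max(list(phi) + [-1]) + 1
            phi[new_label] = aux[choice - len(existing)]
            z[i] = new_label
    # Refresh component means given their members (conjugate Gaussian draw here;
    # the algorithm only needs *some* valid MCMC update for this step).
    for c in list(phi):
        members = y[z == c]
        if len(members) == 0:
            del phi[c]
            continue
        prec = 1.0 / TAU2 + len(members) / SIGMA2
        phi[c] = rng.normal((members.sum() / SIGMA2) / prec, np.sqrt(1.0 / prec))
    return z, phi

# Toy run on two well-separated clusters.
y = np.concatenate([rng.normal(-3, 0.7, 40), rng.normal(3, 0.7, 40)])
z = np.zeros(len(y), dtype=int)
phi = {0: 0.0}
for _ in range(200):
    z, phi = gibbs_sweep(y, z, phi)
print("occupied components after 200 sweeps:", len(phi))
```

Because each observation is reassigned using temporary components drawn directly from the base measure, new clusters can be opened without integrating the likelihood against the prior, which is what makes this style of update workable when the prior is not conjugate.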

Journal ArticleDOI
TL;DR: The novelty of the approach is that it does not use a model selection criterion to choose one among a set of preestimated candidate models; instead, it seamlessly integrates estimation and model selection in a single algorithm.
Abstract: This paper proposes an unsupervised algorithm for learning a finite mixture model from multivariate data. The adjective "unsupervised" is justified by two properties of the algorithm: 1) it is capable of selecting the number of components and 2) unlike the standard expectation-maximization (EM) algorithm, it does not require careful initialization. The proposed method also avoids another drawback of EM for mixture fitting: the possibility of convergence toward a singular estimate at the boundary of the parameter space. The novelty of our approach is that we do not use a model selection criterion to choose one among a set of preestimated candidate models; instead, we seamlessly integrate estimation and model selection in a single algorithm. Our technique can be applied to any type of parametric mixture model for which it is possible to write an EM algorithm; in this paper, we illustrate it with experiments involving Gaussian mixtures. These experiments testify to the good performance of our approach.

2,182 citations
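As a rough illustration of the "start with too many components and let the weak ones die" strategy described above, the sketch below runs EM on a one-dimensional Gaussian mixture and annihilates components whose mixing weight collapses. The paper derives its annihilation rule from a minimum-message-length criterion inside a component-wise EM; the plain weight threshold used here is a hypothetical stand-in, so treat this as a sketch of the idea rather than the authors' exact algorithm.

```python
# Simplified sketch of learning the number of mixture components by starting
# with many and annihilating the weak ones. The paper derives its annihilation
# rule from a minimum-message-length criterion inside component-wise EM; the
# plain weight threshold below is a hypothetical stand-in for illustration.
import numpy as np

def em_with_annihilation(x, k_init=20, min_weight=None, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    n = len(x)
    min_weight = 2.0 / n if min_weight is None else min_weight
    # Over-provisioned start: means drawn from the data, one common variance.
    mu = rng.choice(x, size=k_init, replace=False)
    var = np.full(k_init, np.var(x))
    w = np.full(k_init, 1.0 / k_init)
    for _ in range(n_iter):
        # E-step: responsibilities under the current 1-D Gaussian components.
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = w * dens
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: weighted updates of weights, means, and variances.
        nk = resp.sum(axis=0)
        w = nk / n
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        # Annihilation: drop components whose weight has collapsed.
        keep = w > min_weight
        if keep.sum() < len(w):
            mu, var, w = mu[keep], var[keep], w[keep]
            w /= w.sum()
    return w, mu, var

# Toy data with three clusters; the surviving components estimate their number.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(m, 0.5, 150) for m in (-4.0, 0.0, 5.0)])
w, mu, var = em_with_annihilation(x, k_init=15)
print("estimated number of components:", len(w))
```

Starting over-provisioned also eases the initialization sensitivity of standard EM, since no single component has to be placed well at the start.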

Journal ArticleDOI
TL;DR: In this paper, a new SAS procedure, TRAJ, is proposed to fit semiparametric mixtures of censored normal, Poisson, zero-inflated Poisson and Bernoulli distributions to longitudinal data.
Abstract: This article introduces a new SAS procedure written by the authors that analyzes longitudinal data (developmental trajectories) by fitting a mixture model. The TRAJ procedure fits semiparametric (discrete) mixtures of censored normal, Poisson, zero-inflated Poisson, and Bernoulli distributions to longitudinal data. Applications to psychometric scale data, offense counts, and a dichotomous prevalence measure in violence research are illustrated. In addition, the use of the Bayesian information criterion to address the problem of model selection, including the estimation of the number of components in the mixture, is demonstrated.

2,085 citations
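TRAJ itself is a SAS procedure, but the latent-class idea it implements can be sketched briefly: each subject's longitudinal outcome is assumed to come from one of k groups, an E-step computes group responsibilities, an M-step updates group weights and trajectories, and BIC is compared across candidate k. As a simplifying assumption, the sketch below gives each group a free Poisson rate per time point instead of the polynomial-in-time trajectories (and the censored-normal, zero-inflated Poisson, and Bernoulli options) the procedure supports.

```python
# Minimal sketch of group-based trajectory modeling with BIC selection of the
# number of groups. Simplifying assumption: Poisson outcomes with a free rate
# per group and time point, not the polynomial trajectories used by PROC TRAJ.
import numpy as np
from scipy.special import gammaln

def fit_trajectory_mixture(Y, k, n_iter=300, seed=0):
    """Fit a k-group latent-class Poisson trajectory model to counts Y (subjects x times)."""
    rng = np.random.default_rng(seed)
    n, T = Y.shape
    lam = Y[rng.choice(n, size=k, replace=False)].astype(float) + 0.5   # (k, T) initial rates
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: log P(subject i | group g) assuming independent Poisson counts over time.
        loglik = (Y[:, None, :] * np.log(lam) - lam
                  - gammaln(Y[:, None, :] + 1)).sum(axis=2)              # (n, k)
        logpost = np.log(pi) + loglik
        resp = np.exp(logpost - logpost.max(axis=1, keepdims=True))
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: group weights and per-time-point rates.
        nk = resp.sum(axis=0)
        pi = nk / n
        lam = (resp.T @ Y) / nk[:, None] + 1e-8
    # Mixture log-likelihood and BIC (smaller BIC is better).
    loglik = (Y[:, None, :] * np.log(lam) - lam - gammaln(Y[:, None, :] + 1)).sum(axis=2)
    ll = np.logaddexp.reduce(np.log(pi) + loglik, axis=1).sum()
    n_params = k * T + (k - 1)
    return pi, lam, -2.0 * ll + n_params * np.log(n)

# Compare candidate numbers of groups by BIC on toy two-group count trajectories.
rng = np.random.default_rng(2)
Y = np.vstack([rng.poisson(np.linspace(1, 6, 8), size=(60, 8)),
               rng.poisson(np.linspace(5, 1, 8), size=(60, 8))])
for k in (1, 2, 3):
    _, _, bic = fit_trajectory_mixture(Y, k)
    print(f"k={k}  BIC={bic:.1f}")
```

With the toy two-group data, k = 2 should attain the smallest BIC, mirroring how the criterion is used to choose the number of trajectory groups.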

Proceedings Article
28 Jun 2011
TL;DR: This paper proposes a new framework for learning from large-scale datasets based on iterative learning from small mini-batches: by adding the right amount of noise to a standard stochastic gradient optimization algorithm, the iterates converge to samples from the true posterior distribution as the stepsize is annealed.
Abstract: In this paper we propose a new framework for learning from large scale datasets based on iterative learning from small mini-batches. By adding the right amount of noise to a standard stochastic gradient optimization algorithm we show that the iterates will converge to samples from the true posterior distribution as we anneal the stepsize. This seamless transition between optimization and Bayesian posterior sampling provides an inbuilt protection against overfitting. We also propose a practical method for Monte Carlo estimates of posterior statistics which monitors a "sampling threshold" and collects samples after it has been surpassed. We apply the method to three models: a mixture of Gaussians, logistic regression and ICA with natural gradients.

2,080 citations
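The update rule described in the abstract is compact enough to show directly: take a stochastic-gradient step on the log posterior, rescale the minibatch likelihood gradient by N/n, and add Gaussian noise whose variance equals the current stepsize. The sketch below applies it to Bayesian logistic regression on toy data; the prior variance, stepsize schedule, and burn-in rule are illustrative choices, not values from the paper.

```python
# Minimal sketch of stochastic gradient Langevin dynamics (SGLD) for Bayesian
# logistic regression: a stochastic-gradient step on the log posterior plus
# Gaussian noise whose variance equals the annealed stepsize. The prior
# variance, stepsize schedule, and burn-in rule are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D classification data.
N, D = 2000, 2
X = rng.normal(size=(N, D))
true_w = np.array([1.5, -2.0])
y = (rng.random(N) < 1.0 / (1.0 + np.exp(-X @ true_w))).astype(float)

def grad_log_posterior(w, Xb, yb, prior_var=10.0):
    """Log-prior gradient plus the minibatch log-likelihood gradient rescaled by N / n."""
    p = 1.0 / (1.0 + np.exp(-Xb @ w))
    return -w / prior_var + (N / len(yb)) * (Xb.T @ (yb - p))

w = np.zeros(D)
batch_size, n_steps = 50, 20000
samples = []
for t in range(1, n_steps + 1):
    eps = 5e-4 * (10.0 + t) ** -0.55                  # annealed stepsize epsilon_t
    idx = rng.choice(N, size=batch_size, replace=False)
    g = grad_log_posterior(w, X[idx], y[idx])
    w = w + 0.5 * eps * g + rng.normal(0.0, np.sqrt(eps), size=D)
    if t > n_steps // 2:                              # keep the later, near-posterior iterates
        samples.append(w.copy())

samples = np.array(samples)
print("posterior mean estimate:", samples.mean(axis=0), " true weights:", true_w)
```

Early on the gradient term dominates and the trajectory behaves like stochastic optimization; as the stepsize is annealed, the injected noise takes over and the iterates behave like approximate posterior samples, which is the seamless transition the paper describes.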

01 Jan 1996
TL;DR: In this article, the authors consider Bayesian counterparts of the classical tests for good-ness of fit and their use in judging the fit of a single Bayesian model to the observed data.
Abstract: This paper considers Bayesian counterparts of the classical tests for goodness of fit and their use in judging the fit of a single Bayesian model to the observed data. We focus on posterior predictive assessment, in a framework that also includes conditioning on auxiliary statistics. The Bayesian formulation facilitates the construction and calculation of a meaningful reference distribution not only for any (classical) statistic, but also for any parameter-dependent "statistic" or discrepancy. The latter allows us to propose the realized discrepancy assessment of model fitness, which directly measures the true discrepancy between data and the posited model, for any aspect of the model which we want to explore. The computation required for the realized discrepancy assessment is a straightforward byproduct of the posterior simulation used for the original Bayesian analysis. We illustrate with three applied examples. The first example, which serves mainly to motivate the work, illustrates the difficulty of classical tests in assessing the fitness of a Poisson model to a positron emission tomography image that is constrained to be nonnegative. The second and third examples illustrate the details of the posterior predictive approach in two problems: estimation in a model with inequality constraints on the parameters, and estimation in a mixture model. In all three examples, standard test statistics (either a χ2 or a likelihood ratio) are not pivotal: the difficulty is not just how to compute the reference distribution for the test, but that in the classical framework no such distribution exists, independent of the unknown model parameters.

2,065 citations
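A minimal posterior predictive check with a parameter-dependent (realized) discrepancy can be written directly from the description above. The sketch below uses a conjugate Poisson-Gamma setup so posterior draws are exact, and a chi-squared discrepancy; in practice the draws would come from whatever posterior simulation was already run for the analysis, exactly as the abstract notes.

```python
# Minimal sketch of a posterior predictive check with a parameter-dependent
# (realized) chi-squared discrepancy. Illustrative setup: i.i.d. Poisson data
# with a conjugate Gamma prior, so posterior draws are exact; in general the
# draws would come from whatever posterior simulation was already available.
import numpy as np

rng = np.random.default_rng(0)

# "Observed" data, deliberately overdispersed so the Poisson model should look suspect.
y = rng.negative_binomial(n=2, p=0.3, size=100)
n = len(y)

# Gamma(a0, b0) prior on the rate -> Gamma(a0 + sum(y), b0 + n) posterior.
a0, b0 = 1.0, 1.0
n_draws = 4000
lam_draws = rng.gamma(a0 + y.sum(), 1.0 / (b0 + n), size=n_draws)

def discrepancy(data, lam):
    """Chi-squared discrepancy T(y, lambda) = sum_i (y_i - lambda)^2 / lambda."""
    return np.sum((data - lam) ** 2 / lam)

# Compare the realized discrepancy of the observed data with that of replicated
# data drawn from the posterior predictive distribution, draw by draw.
t_obs = np.array([discrepancy(y, lam) for lam in lam_draws])
t_rep = np.array([discrepancy(rng.poisson(lam, size=n), lam) for lam in lam_draws])
print(f"posterior predictive p-value: {np.mean(t_rep >= t_obs):.3f}")  # near 0 or 1 flags misfit
```

Because the observed counts are simulated from an overdispersed process, the realized discrepancy of the data should typically exceed that of the replicates, pushing the posterior predictive p-value toward zero and flagging the Poisson model as a poor fit.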


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations, 89% related
Estimator: 97.3K papers, 2.6M citations, 88% related
Cluster analysis: 146.5K papers, 2.9M citations, 88% related
Convolutional neural network: 74.7K papers, 2M citations, 87% related
Feature extraction: 111.8K papers, 2.1M citations, 86% related
Performance
Metrics
No. of papers in the topic in previous years:
Year    Papers
2023    270
2022    595
2021    1,007
2020    1,127
2019    1,181
2018    1,120