Topic

Expectation–maximization algorithm

About: Expectation–maximization algorithm is a research topic. Over its lifetime, 11,823 publications have been published within this topic, receiving 528,693 citations. The topic is also known as: EM algorithm & Expectation Maximization.


Papers
Journal ArticleDOI
TL;DR: An effective search strategy is introduced that combines the ideas of marginal augmentation and conditional augmentation, together with a deterministic approximation method for selecting good augmentation schemes to obtain efficient Markov chain Monte Carlo algorithms for posterior sampling.
Abstract: The term data augmentation refers to methods for constructing iterative optimization or sampling algorithms via the introduction of unobserved data or latent variables. For deterministic algorithms, the method was popularized in the general statistical community by the seminal article by Dempster, Laird, and Rubin on the EM algorithm for maximizing a likelihood function or, more generally, a posterior density. For stochastic algorithms, the method was popularized in the statistical literature by Tanner and Wong's Data Augmentation algorithm for posterior sampling and in the physics literature by Swendsen and Wang's algorithm for sampling from the Ising and Potts models and their generalizations; in the physics literature, the method of data augmentation is referred to as the method of auxiliary variables. Data augmentation schemes were used by Tanner and Wong to make simulation feasible and simple, while auxiliary variables were adopted by Swendsen and Wang to improve the speed of iterative simulation. In...

906 citations
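
To make the data augmentation idea concrete, below is a minimal sketch (not from the paper) of a two-step sampler in the Tanner–Wong spirit: observations above a censoring threshold are treated as missing data, imputed in an I-step, and the mean is then drawn from its complete-data posterior in a P-step. The model, threshold c, flat prior, and all numbers are assumptions chosen only for illustration.

```python
# Hypothetical sketch of a data augmentation (Tanner-Wong style) sampler:
# right-censored normal data, latent variables = the unobserved censored values.
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(0)

# Simulated data: y ~ N(mu_true, 1); values above c are censored (only "y > c" is known).
mu_true, c, n = 1.0, 1.5, 200
y_full = rng.normal(mu_true, 1.0, n)
observed = y_full[y_full <= c]          # exactly observed values
n_cens = int(np.sum(y_full > c))        # number of censored observations

mu = observed.mean()                    # initial value
draws = []
for _ in range(2000):
    # I-step: impute censored values from N(mu, 1) truncated to (c, inf)
    z = truncnorm.rvs((c - mu) / 1.0, np.inf, loc=mu, scale=1.0,
                      size=n_cens, random_state=rng)
    # P-step: draw mu from its complete-data posterior (flat prior): N(ybar, 1/n)
    complete = np.concatenate([observed, z])
    mu = rng.normal(complete.mean(), 1.0 / np.sqrt(n))
    draws.append(mu)

print("posterior mean of mu ~", np.mean(draws[500:]))
```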

Journal ArticleDOI
TL;DR: The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.
Abstract: Normal mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster sets of continuous multivariate data. However, for a set of data containing a group or groups of observations with longer than normal tails or atypical observations, the use of normal components may unduly affect the fit of the mixture model. In this paper, we consider a more robust approach by modelling the data by a mixture of t distributions. The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.

903 citations
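
A minimal univariate sketch of fitting a mixture of t distributions by ECM, assuming two components and a fixed degrees-of-freedom value nu (the paper also treats multivariate data and estimates nu with an extra CM-step); the simulated data and starting values below are illustrative only, not the authors' code.

```python
# Two-component univariate t mixture fitted by ECM with fixed dof nu (illustrative sketch).
import numpy as np
from scipy.stats import t as student_t

rng = np.random.default_rng(1)
x = np.concatenate([rng.standard_t(3, 300) * 0.5,
                    rng.standard_t(3, 300) * 0.5 + 4.0])

nu = 3.0                                   # fixed dof; could also be updated by a CM-step
pi_, mu, sigma = np.array([0.5, 0.5]), np.array([0.0, 3.0]), np.array([1.0, 1.0])

for _ in range(100):
    # E-step: responsibilities tau and latent gamma weights u (t = scale mixture of normals)
    dens = np.stack([student_t.pdf((x - mu[j]) / sigma[j], df=nu) / sigma[j]
                     for j in range(2)])
    tau = pi_[:, None] * dens
    tau /= tau.sum(axis=0, keepdims=True)
    d2 = np.stack([((x - mu[j]) / sigma[j]) ** 2 for j in range(2)])
    u = (nu + 1.0) / (nu + d2)
    # CM-steps: update mixing proportions, means, and scales in turn
    pi_ = tau.mean(axis=1)
    mu = (tau * u * x).sum(axis=1) / (tau * u).sum(axis=1)
    sigma = np.sqrt((tau * u * (x - mu[:, None]) ** 2).sum(axis=1) / tau.sum(axis=1))

print(pi_, mu, sigma)
```

The down-weighting factor u shrinks the influence of points far from a component's center, which is what makes the t mixture more robust to atypical observations than a normal mixture.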

Journal ArticleDOI
TL;DR: In this article, the authors developed an efficient and effective implementation of the Newton-Raphson (NR) algorithm for estimating the parameters in mixed-effects models for repeated-measures data.
Abstract: We develop an efficient and effective implementation of the Newton-Raphson (NR) algorithm for estimating the parameters in mixed-effects models for repeated-measures data. We formulate the derivatives for both maximum likelihood and restricted maximum likelihood estimation and propose improvements to the algorithm discussed by Jennrich and Schluchter (1986) to speed convergence and ensure a positive-definite covariance matrix for the random effects at each iteration. We use matrix decompositions to develop efficient and computationally stable implementations of both the NR algorithm and an EM algorithm (Laird and Ware 1982) for this model. We compare the two methods (EM vs. NR) in terms of computational order and performance on two sample data sets and conclude that in most situations a well-implemented NR algorithm is preferable to the EM algorithm or EM algorithm with Aitken's acceleration. The term repeated measures refers to experimental designs where there are several individuals and several...

900 citations
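
For reference, here is a hedged sketch of the EM competitor (Laird and Ware 1982) for the simplest case, a balanced random-intercept model y_ij = mu + b_i + e_ij; this is not the authors' Newton-Raphson implementation, and the simulated data, group sizes, and starting values are assumptions for illustration.

```python
# EM for variance components in a balanced random-intercept model (illustrative sketch).
import numpy as np

rng = np.random.default_rng(2)
k, m = 50, 8                                   # k groups, m repeated measures each
b_true = rng.normal(0.0, 1.0, k)               # random intercepts, sigma_b = 1
y = 2.0 + b_true[:, None] + rng.normal(0.0, 0.5, (k, m))   # sigma_e = 0.5

mu, s2_b, s2_e = y.mean(), 1.0, 1.0
for _ in range(200):
    # E-step: posterior mean/variance of each random intercept given y and current params
    shrink = s2_b / (s2_b + s2_e / m)
    eb = shrink * (y.mean(axis=1) - mu)                  # E[b_i | y]
    vb = s2_b * (s2_e / m) / (s2_b + s2_e / m)           # Var[b_i | y]
    # M-step: maximize the expected complete-data log-likelihood
    mu = (y - eb[:, None]).mean()
    s2_b = np.mean(eb ** 2 + vb)
    s2_e = np.mean((y - mu - eb[:, None]) ** 2 + vb)

print(mu, np.sqrt(s2_b), np.sqrt(s2_e))
```

The paper's point is that a carefully implemented NR iteration on the (restricted) likelihood usually converges in far fewer, if more expensive, steps than this kind of EM loop.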

Journal ArticleDOI
TL;DR: In this paper, the posterior distribution and Bayes estimators of a mixture model are approximated by Gibbs sampling, relying on the missing data structure of the model. The data augmentation method is shown to converge geometrically, since a duality principle transfers properties from the discrete missing-data chain to the parameters.
Abstract: SUMMARY A formal Bayesian analysis of a mixture model usually leads to intractable calculations, since the posterior distribution takes into account all the partitions of the sample. We present approximation methods which evaluate the posterior distribution and Bayes estimators by Gibbs sampling, relying on the missing data structure of the mixture model. The data augmentation method is shown to converge geometrically, since a duality principle transfers properties from the discrete missing data chain to the parameters. The fully conditional Gibbs alternative is shown to be ergodic and geometric convergence is established in the normal case. We also consider non-informative approximations associated with improper priors, assuming that the sample corresponds exactly to a k-component mixture.

895 citations
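
A hypothetical sketch of the missing-data Gibbs sampler for a two-component normal mixture with known unit variances: allocations are drawn from their full conditional, then weights and means from theirs. The conjugate priors (Dirichlet on the weights, normal on the means), the simulated data, and the neglect of label switching are all simplifications for illustration.

```python
# Gibbs sampler for a two-component normal mixture via its missing-data structure (sketch).
import numpy as np

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-2, 1, 150), rng.normal(2, 1, 150)])
n = len(x)

mu = np.array([-1.0, 1.0])     # component means (unknown)
w = np.array([0.5, 0.5])       # mixing weights (unknown)
alpha, tau2 = 1.0, 10.0        # Dirichlet(1,1) prior on w, N(0, tau2) prior on each mean

samples = []
for _ in range(3000):
    # Step 1: sample the missing allocations z_i from their full conditional
    logp = np.log(w)[:, None] - 0.5 * (x - mu[:, None]) ** 2
    p = np.exp(logp - logp.max(axis=0))
    p /= p.sum(axis=0)
    z = (rng.random(n) > p[0]).astype(int)
    # Step 2: sample weights and means from their full conditionals given z
    counts = np.array([(z == 0).sum(), (z == 1).sum()])
    w = rng.dirichlet(alpha + counts)
    for j in range(2):
        var = 1.0 / (counts[j] + 1.0 / tau2)
        mu[j] = rng.normal(var * x[z == j].sum(), np.sqrt(var))
    samples.append(mu.copy())

print(np.mean(samples[1000:], axis=0))
```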

Proceedings Article
29 Nov 1999
TL;DR: This paper presents a novel practical framework for Bayesian model averaging and model selection in probabilistic graphical models that approximates full posterior distributions over model parameters and structures, as well as latent variables, in an analytical manner.
Abstract: This paper presents a novel practical framework for Bayesian model averaging and model selection in probabilistic graphical models. Our approach approximates full posterior distributions over model parameters and structures, as well as latent variables, in an analytical manner. These posteriors fall out of a free-form optimization procedure, which naturally incorporates conjugate priors. Unlike in large sample approximations, the posteriors are generally non-Gaussian and no Hessian needs to be computed. Predictive quantities are obtained analytically. The resulting algorithm generalizes the standard Expectation Maximization algorithm, and its convergence is guaranteed. We demonstrate that this approach can be applied to a large class of models in several domains, including mixture models and source separation.

870 citations
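
As an off-the-shelf illustration of the same variational Bayesian idea applied to mixture models, scikit-learn's BayesianGaussianMixture fits free-form conjugate posteriors over parameters and latent variables and effectively prunes surplus components; it is a related implementation, not the paper's framework, and the data below are simulated purely for the example.

```python
# Variational Bayesian mixture fitting with implicit model selection (illustrative only).
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(4)
X = np.concatenate([rng.normal(-3, 1, (200, 1)), rng.normal(3, 1, (200, 1))])

# Start with more components than needed; the variational posterior drives the
# weights of superfluous components toward zero, giving implicit model selection.
vb = BayesianGaussianMixture(n_components=8, weight_concentration_prior=0.01,
                             max_iter=500, random_state=0).fit(X)
print(np.round(vb.weights_, 3))   # most weights collapse near zero
print(vb.means_.ravel())
```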


Network Information
Related Topics (5)
Estimator: 97.3K papers, 2.6M citations, 91% related
Deep learning: 79.8K papers, 2.1M citations, 84% related
Support vector machine: 73.6K papers, 1.7M citations, 84% related
Cluster analysis: 146.5K papers, 2.9M citations, 84% related
Artificial neural network: 207K papers, 4.5M citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    114
2022    245
2021    438
2020    410
2019    484
2018    519