
Expectation–maximization algorithm

About: Expectation–maximization algorithm is a research topic. Over its lifetime, 11,823 publications have been published on this topic, receiving 528,693 citations. The topic is also known as: EM algorithm & Expectation Maximization.
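For orientation, here is a minimal sketch of the EM iteration in its textbook setting, a two-component one-dimensional Gaussian mixture. The function name, initialisation, and synthetic data are illustrative only and are not taken from any paper listed below.

```python
import numpy as np

def em_gmm_1d(x, n_iter=50):
    """Two-component 1-D Gaussian mixture fitted by EM (illustrative sketch)."""
    mu = np.percentile(x, [25, 75]).astype(float)   # crude initial means
    var = np.array([x.var(), x.var()])              # initial variances
    pi = np.array([0.5, 0.5])                       # initial mixing weights

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        dens = np.stack(
            [pi[k] / np.sqrt(2 * np.pi * var[k])
             * np.exp(-0.5 * (x - mu[k]) ** 2 / var[k]) for k in range(2)],
            axis=1,
        )
        resp = dens / dens.sum(axis=1, keepdims=True)

        # M-step: responsibility-weighted maximum-likelihood updates.
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return pi, mu, var

# Toy usage: two well-separated components.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 700)])
print(em_gmm_1d(data))
```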


Papers
Journal Article
TL;DR: In this article, a new algorithm for solving a score equation for the maximum likelihood estimate in certain problems of practical interest is presented and examined, and convergence properties of this iterative (fixed-point) algorithm are derived for estimators obtained using only a finite number of iterations.
Abstract: This article presents and examines a new algorithm for solving a score equation for the maximum likelihood estimate in certain problems of practical interest. The method circumvents the need to compute second-order derivatives of the full likelihood function. It exploits the structure of certain models that yield a natural decomposition of a very complicated likelihood function. In this decomposition, the first part is a log-likelihood from a simply analyzed model, and the second part is used to update estimates from the first part. Convergence properties of this iterative (fixed-point) algorithm are examined, and asymptotics are derived for estimators obtained using only a finite number of iterations. Illustrative examples considered in the article include multivariate Gaussian copula models, nonnormal random-effects models, generalized linear mixed models, and state-space models. Properties of the algorithm and of estimators are evaluated in simulation studies on a bivariate copula model and a nonnormal...

142 citations
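The article's specific likelihood decomposition is model-dependent, but the general idea of solving a score equation by a derivative-free fixed-point iteration can be illustrated on a simpler, standard example: the location MLE of a Student-t sample with known scale and degrees of freedom, whose score equation rearranges into a weighted-mean update. The sketch below is not the article's algorithm; all names and settings are illustrative.

```python
import numpy as np

def t_location_mle(x, nu=4.0, sigma=1.0, tol=1e-10, max_iter=200):
    """Fixed-point iteration for the location MLE of a Student-t sample.

    The score equation sum_i w_i(mu) * (x_i - mu) = 0, with weights
    w_i(mu) = (nu + 1) / (nu + ((x_i - mu) / sigma)**2), is solved by
    recomputing the weights at the current estimate and taking the
    weighted mean; no second derivatives of the likelihood are used.
    """
    mu = float(np.median(x))               # robust starting value
    for _ in range(max_iter):
        w = (nu + 1.0) / (nu + ((x - mu) / sigma) ** 2)
        mu_new = float(np.sum(w * x) / np.sum(w))
        if abs(mu_new - mu) < tol:
            return mu_new
        mu = mu_new
    return mu

# Toy usage: heavy-tailed sample centred at 2.5.
rng = np.random.default_rng(1)
sample = 2.5 + rng.standard_t(df=4, size=1000)
print(t_location_mle(sample, nu=4.0))
```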

Journal Article
TL;DR: Suboptimal algorithms based on the model provide progressive classification that is much faster than the algorithm based on single-resolution hidden Markov models.
Abstract: This paper treats a multiresolution hidden Markov model for classifying images. Each image is represented by feature vectors at several resolutions, which are statistically dependent as modeled by the underlying state process, a multiscale Markov mesh. Unknowns in the model are estimated by maximum likelihood, in particular by employing the expectation-maximization algorithm. An image is classified by finding the optimal set of states with maximum a posteriori probability. States are then mapped into classes. The multiresolution model enables multiscale information about context to be incorporated into classification. Suboptimal algorithms based on the model provide progressive classification that is much faster than the algorithm based on single-resolution hidden Markov models.

142 citations
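The paper's multiscale Markov mesh has no off-the-shelf implementation, but the single-resolution baseline it is compared against can be sketched with the third-party hmmlearn package, whose GaussianHMM estimates parameters by maximum likelihood via EM (Baum-Welch) and decodes MAP (Viterbi) state sequences. The feature data below are synthetic placeholders.

```python
import numpy as np
from hmmlearn import hmm   # third-party package: pip install hmmlearn

# Synthetic stand-in for per-block image feature vectors (one row per site).
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(0.0, 1.0, size=(200, 8)),   # e.g. "background" blocks
    rng.normal(3.0, 1.0, size=(200, 8)),   # e.g. "object" blocks
])

# Parameters estimated by maximum likelihood via EM (Baum-Welch).
model = hmm.GaussianHMM(n_components=2, covariance_type="diag", n_iter=50)
model.fit(features)

# Most probable state sequence (Viterbi decoding); states would then be
# mapped to class labels, as in the paper.
states = model.predict(features)
print(np.bincount(states))
```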

Journal Article
TL;DR: Simulation studies are presented to show the advantage of this flexible class of probability distributions in clustering heterogeneous data, and to show that the maximum likelihood estimates obtained via the EM-type algorithm have good asymptotic properties.

141 citations
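The flexible distribution class used in the paper is not identified in this summary; as a much simpler stand-in for EM-based model-based clustering, the sketch below fits an ordinary Gaussian mixture with scikit-learn (whose GaussianMixture estimator is fitted by EM) to synthetic heterogeneous data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic data: two groups with very different spreads.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=[1.0, 1.0], size=(300, 2)),
    rng.normal(loc=[4.0, 4.0], scale=[0.3, 2.0], size=(300, 2)),
])

# Gaussian mixture fitted by EM; hard cluster labels come from the
# posterior component probabilities.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
labels = gmm.fit(X).predict(X)
print(np.bincount(labels))
print(gmm.means_)
```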

Journal Article
TL;DR: A class of group ICA models is considered that can accommodate different group structures and that includes existing models, such as GIFT and tensor PICA, as special cases; a maximum likelihood (ML) approach with a modified Expectation-Maximization (EM) algorithm is proposed for estimation.

141 citations

Journal Article
TL;DR: In this paper, it is shown that the goodness-of-fit statistic is a quadratic form in the observed proportions when they are close to the expected proportions, and the asymptotic efficiency of the maximum likelihood estimator is proved at the same time.
Abstract: This paper is concerned with the theorem that the $X^2$ goodness of fit statistic for a multinomial distribution with $r$ cells and with $s$ parameters fitted by the method of maximum likelihood is distributed as $\chi^2$ with $r - s - 1$ degrees of freedom. Karl Pearson formulated and proved the theorem for the special case $s = 0$. The general theorem was formulated by Fisher [2]. The first attempt at a rigorous proof is due to Cramer [1]. A serious weakness of Cramer's proof is that, in effect, he assumes that the maximum likelihood estimator is consistent. (To be precise, he proves the theorem for the subclass of maximum likelihood estimators that are consistent. But how are we in practice to distinguish between an inconsistent maximum likelihood estimator and a consistent one?) Rao [3] has closed this gap in Cramer's proof by proving the consistency of maximum likelihood for any family of discrete distributions under very general conditions. In this paper the theorem is proved under more general conditions than the combined conditions of Rao and Cramer. Cramer assumes the existence of continuous second partial derivatives with respect to the "unknown" parameter while here only total differentiability at the "true" parameter values is postulated. There is a radical difference in the method of proof. While Cramer regards the maximum likelihood estimate as being the point where the derivative of the log-likelihood function is zero, here it is regarded as the point at which the likelihood function takes values arbitrarily near to its supremum. The method of proof consists essentially of showing that the goodness of fit statistic is a quadratic form in the observed proportions when the observed proportions are close to the expected proportions. The known asymptotic properties of the multinomial distribution are then used. The asymptotic efficiency of the maximum likelihood estimator is proved at the same time.

141 citations
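In the notation of the abstract ($r$ cells, $s$ fitted parameters), and writing $n$ for the sample size, $n_i$ for the observed count in cell $i$, and $p_i(\hat\theta)$ for the cell probability evaluated at the maximum likelihood estimate (these last symbols are standard shorthand, not copied from the paper), the theorem states:

```latex
X^2 \;=\; \sum_{i=1}^{r} \frac{\bigl(n_i - n\,p_i(\hat\theta)\bigr)^{2}}{n\,p_i(\hat\theta)}
\;\xrightarrow{d}\; \chi^{2}_{\,r-s-1} \qquad (n \to \infty).
```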


Network Information
Related Topics (5)
Estimator: 97.3K papers, 2.6M citations (91% related)
Deep learning: 79.8K papers, 2.1M citations (84% related)
Support vector machine: 73.6K papers, 1.7M citations (84% related)
Cluster analysis: 146.5K papers, 2.9M citations (84% related)
Artificial neural network: 207K papers, 4.5M citations (82% related)
Performance Metrics
No. of papers in the topic in previous years:
Year    Papers
2023    114
2022    245
2021    438
2020    410
2019    484
2018    519