Topic

Expectation–maximization algorithm

About: Expectation–maximization algorithm is a research topic. Over its lifetime, 11,823 publications have been published on this topic, receiving 528,693 citations. The topic is also known as: EM algorithm & Expectation Maximization.


Papers
Journal ArticleDOI
TL;DR: Proposes a generative mixture-model approach to clustering directional data based on the von Mises-Fisher distribution, which arises naturally for data distributed on the unit hypersphere, and derives and analyzes two variants of the Expectation Maximization framework for estimating the mean and concentration parameters of the mixture.
Abstract: Several large scale data mining applications, such as text categorization and gene expression analysis, involve high-dimensional data that is also inherently directional in nature. Often such data is L2 normalized so that it lies on the surface of a unit hypersphere. Popular models such as (mixtures of) multi-variate Gaussians are inadequate for characterizing such data. This paper proposes a generative mixture-model approach to clustering directional data based on the von Mises-Fisher (vMF) distribution, which arises naturally for data distributed on the unit hypersphere. In particular, we derive and analyze two variants of the Expectation Maximization (EM) framework for estimating the mean and concentration parameters of this mixture. Numerical estimation of the concentration parameters is non-trivial in high dimensions since it involves functional inversion of ratios of Bessel functions. We also formulate two clustering algorithms corresponding to the variants of EM that we derive. Our approach provides a theoretical basis for the use of cosine similarity that has been widely employed by the information retrieval community, and obtains the spherical kmeans algorithm (kmeans with cosine similarity) as a special case of both variants. Empirical results on clustering of high-dimensional text and gene-expression data based on a mixture of vMF distributions show that the ability to estimate the concentration parameter for each vMF component, which is not present in existing approaches, yields superior results, especially for difficult clustering tasks in high-dimensional spaces.
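As a rough illustration of the spherical k-means special case mentioned in the abstract (k-means with cosine similarity on L2-normalized data), the minimal Python sketch below shows one way such a clustering loop could look. It deliberately omits the mixture weights and the concentration-parameter estimation via Bessel-function ratio inversion that the paper addresses; all names are illustrative, not taken from the paper.

```python
import numpy as np

def spherical_kmeans(X, k, n_iter=50, seed=0):
    """Spherical k-means: k-means with cosine similarity on L2-normalized rows.
    Concentration-parameter estimation (Bessel-function ratio inversion) is
    deliberately omitted; this is only the hard-assignment special case."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each point to the centroid with the highest cosine similarity.
        labels = np.argmax(X @ centroids.T, axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                s = members.sum(axis=0)
                centroids[j] = s / np.linalg.norm(s)  # keep centroids on the unit sphere
    return labels, centroids

# Toy usage on random directional data (rows normalized to unit length).
X = np.random.default_rng(1).normal(size=(200, 10))
X /= np.linalg.norm(X, axis=1, keepdims=True)
labels, C = spherical_kmeans(X, k=3)
```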

869 citations

Journal ArticleDOI
TL;DR: In this article, a strategy of using an average information matrix is shown to be computationally convenient and efficient for estimating variance components by restricted maximum likelihood (REML) in the mixed linear model.
Abstract: A strategy of using an average information matrix is shown to be computationally convenient and efficient for estimating variance components by restricted maximum likelihood (REML) in the mixed linear model. Three applications are described. The motivation for the algorithm was the estimation of variance components in the analysis of wheat variety means from 1,071 experiments representing 10 years and 60 locations in New South Wales. We also apply the algorithm to the analysis of designed experiments by incomplete block analysis and spatial analysis of field experiments.
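Below is a hedged sketch of an average-information REML update for an assumed one-random-effect model y = Xb + Zu + e with V = s2u ZZ' + s2e I: the REML score uses tr(P dV) - y'P dV P y and the average-information matrix uses the data-based terms y'P dV_i P dV_j P y. It uses dense linear algebra for clarity and is not the authors' implementation.

```python
import numpy as np

def ai_reml(y, X, Z, n_iter=20, tol=1e-6):
    """Average-information REML for the assumed model y = X b + Z u + e,
    with V = s2u * Z Z' + s2e * I (dense, illustrative only)."""
    n = len(y)
    G = Z @ Z.T
    theta = np.array([np.var(y) / 2.0, np.var(y) / 2.0])  # start values: [s2u, s2e]
    for _ in range(n_iter):
        V = theta[0] * G + theta[1] * np.eye(n)
        Vinv = np.linalg.inv(V)
        XtVinv = X.T @ Vinv
        # REML projection: P = Vinv - Vinv X (X' Vinv X)^(-1) X' Vinv
        P = Vinv - XtVinv.T @ np.linalg.solve(XtVinv @ X, XtVinv)
        Py = P @ y
        dV = [G, np.eye(n)]  # derivatives of V w.r.t. s2u and s2e
        # REML score vector and average-information matrix.
        score = np.array([-0.5 * (np.trace(P @ d) - Py @ d @ Py) for d in dV])
        AI = np.array([[0.5 * (Py @ di @ P @ dj @ Py) for dj in dV] for di in dV])
        step = np.linalg.solve(AI, score)
        theta = np.maximum(theta + step, 1e-8)  # keep variance components positive
        if np.max(np.abs(step)) < tol:
            break
    return theta  # estimated [s2u, s2e]
```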

868 citations

Journal ArticleDOI
Dar-Shyang Lee1
TL;DR: An effective scheme to improve the convergence rate without compromising model stability is proposed by replacing the global, static retention factor with an adaptive learning rate calculated for each Gaussian at every frame.
Abstract: Adaptive Gaussian mixtures have been used for modeling nonstationary temporal distributions of pixels in video surveillance applications. However, a common problem for this approach is balancing between model convergence speed and stability. This paper proposes an effective scheme to improve the convergence rate without compromising model stability. This is achieved by replacing the global, static retention factor with an adaptive learning rate calculated for each Gaussian at every frame. Significant improvements are shown on both synthetic and real video data. Incorporating this algorithm into a statistical framework for background subtraction leads to an improved segmentation performance compared to a standard method.
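As a sketch of the general idea (replacing a fixed global learning rate with a per-Gaussian rate that starts large and decays), the snippet below uses an assumed 1/count schedule with a small floor; the paper's exact schedule may differ, and the class and parameter names are hypothetical.

```python
class AdaptiveGaussian:
    """One mixture component of a per-pixel background model, updated with a
    per-Gaussian learning rate instead of a single global one. The 1/count
    schedule with a floor is an assumed illustration of the general idea."""

    def __init__(self, mean, var=15.0, weight=0.05, alpha_floor=0.005):
        self.mean, self.var, self.weight = float(mean), var, weight
        self.count = 1            # frames this Gaussian has matched so far
        self.alpha_floor = alpha_floor

    def update(self, x):
        self.count += 1
        # Large rate while the count is small (fast convergence), decaying
        # toward a small floor (long-term stability).
        rate = max(1.0 / self.count, self.alpha_floor)
        self.mean += rate * (x - self.mean)
        self.var += rate * ((x - self.mean) ** 2 - self.var)
        return rate
```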

867 citations

Journal ArticleDOI
TL;DR: In this paper, the maximum likelihood method for fitting the linear model when residuals are correlated and when the covariance among the residuals is determined by a parametric model containing unknown parameters is described.
Abstract: We describe the maximum likelihood method for fitting the linear model when residuals are correlated and when the covariance among the residuals is determined by a parametric model containing unknown parameters. Observations are assumed to be Gaussian. We give conditions which ensure consistency and asymptotic normality of the estimators. Our main concern is with the analysis of spatial data and in this context we describe some simulation experiments to assess the small sample behaviour of estimators. We also discuss an application of the spectral approximation to the likelihood for processes on a lattice.
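To make the setup concrete, the sketch below evaluates the Gaussian negative log-likelihood for a linear model with a parametric residual covariance, here an assumed exponential spatial correlation model, profiling out the regression coefficients by generalized least squares at each evaluation. The covariance family and all names are illustrative, not taken from the paper.

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, y, X, D):
    """Negative Gaussian log-likelihood for y = X b + e, with an assumed
    exponential spatial covariance Cov(e) = sigma2 * exp(-D / phi);
    b is profiled out by generalized least squares."""
    sigma2, phi = np.exp(params)          # optimize on the log scale
    V = sigma2 * np.exp(-D / phi)
    Vinv = np.linalg.inv(V)
    b = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
    r = y - X @ b
    _, logdet = np.linalg.slogdet(V)
    return 0.5 * (logdet + r @ Vinv @ r)

# D is the matrix of pairwise distances between observation locations.
# fit = minimize(neg_loglik, x0=[0.0, 0.0], args=(y, X, D), method="Nelder-Mead")
```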

858 citations

Journal ArticleDOI
TL;DR: The mathematical connection between the Expectation-Maximization (EM) algorithm and gradient-based approaches for maximum likelihood learning of finite gaussian mixtures is built up, and an explicit expression is provided for the projection matrix that maps the gradient to the EM step.
Abstract: We build up the mathematical connection between the “Expectation-Maximization” (EM) algorithm and gradient-based approaches for maximum likelihood learning of finite gaussian mixtures. We show that the EM step in parameter space is obtained from the gradient via a projection matrix P, and we provide an explicit expression for the matrix. We then analyze the convergence of EM in terms of special properties of P and provide new results analyzing the effect that P has on the likelihood surface. Based on these mathematical results, we present a comparative discussion of the advantages and disadvantages of EM and other algorithms for the learning of gaussian mixture models.
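For reference, one EM iteration for a one-dimensional Gaussian mixture looks like the sketch below; the paper's result is that this closed-form update can be written as the likelihood gradient premultiplied by a projection matrix P, which the sketch does not compute. All names are illustrative.

```python
import numpy as np

def em_step(x, weights, means, variances):
    """One EM iteration for a one-dimensional Gaussian mixture."""
    # E-step: responsibilities r[n, k] proportional to weight_k * N(x_n; mu_k, var_k).
    diff = x[:, None] - means[None, :]
    log_r = np.log(weights) - 0.5 * (np.log(2 * np.pi * variances) + diff ** 2 / variances)
    log_r -= log_r.max(axis=1, keepdims=True)   # stabilize before exponentiating
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: closed-form re-estimation of the mixture parameters.
    nk = r.sum(axis=0)
    weights = nk / len(x)
    means = (r * x[:, None]).sum(axis=0) / nk
    variances = (r * (x[:, None] - means) ** 2).sum(axis=0) / nk
    return weights, means, variances
```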

849 citations


Network Information
Related Topics (5)
Estimator: 97.3K papers, 2.6M citations, 91% related
Deep learning: 79.8K papers, 2.1M citations, 84% related
Support vector machine: 73.6K papers, 1.7M citations, 84% related
Cluster analysis: 146.5K papers, 2.9M citations, 84% related
Artificial neural network: 207K papers, 4.5M citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    114
2022    245
2021    438
2020    410
2019    484
2018    519