Topic

Expectation–maximization algorithm

About: Expectation–maximization algorithm is a research topic. Over its lifetime, 11,823 publications have been published within this topic, receiving 528,693 citations. The topic is also known as: EM algorithm & Expectation Maximization.
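
To make the topic concrete, here is a minimal sketch of EM for a two-component univariate Gaussian mixture; the data, initial values, and iteration count are illustrative choices, not drawn from any of the papers listed below.

```python
import numpy as np

# Minimal EM for a two-component univariate Gaussian mixture (illustrative sketch).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.7, 200)])

# Initial guesses: mixing weight of component 1, means, variances.
pi, mu, var = 0.5, np.array([-1.0, 1.0]), np.array([1.0, 1.0])

def normal_pdf(x, mean, variance):
    return np.exp(-0.5 * (x - mean) ** 2 / variance) / np.sqrt(2 * np.pi * variance)

for _ in range(100):
    # E-step: posterior responsibility of component 1 for each point.
    p0 = (1 - pi) * normal_pdf(x, mu[0], var[0])
    p1 = pi * normal_pdf(x, mu[1], var[1])
    r = p1 / (p0 + p1)

    # M-step: re-estimate parameters from the responsibilities.
    pi = r.mean()
    mu = np.array([np.sum((1 - r) * x) / np.sum(1 - r),
                   np.sum(r * x) / np.sum(r)])
    var = np.array([np.sum((1 - r) * (x - mu[0]) ** 2) / np.sum(1 - r),
                    np.sum(r * (x - mu[1]) ** 2) / np.sum(r)])

print(pi, mu, var)  # should roughly recover 0.4, (-2, 3), (1, 0.49)
```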


Papers
Journal Article
TL;DR: A new statistical model for time series is introduced that iteratively segments data into regimes with approximately linear dynamics and learns the parameters of each regime; the results suggest that variational approximations are a viable method for inference and learning in switching state-space models.
Abstract: We introduce a new statistical model for time series that iteratively segments data into regimes with approximately linear dynamics and learns the parameters of each of these linear regimes. This model combines and generalizes two of the most widely used stochastic time-series models, hidden Markov models and linear dynamical systems, and is closely related to models that are widely used in the control and econometrics literatures. It can also be derived by extending the mixture of experts neural network (Jacobs, Jordan, Nowlan, & Hinton, 1991) to its fully dynamical version, in which both expert and gating networks are recurrent. Inferring the posterior probabilities of the hidden states of this model is computationally intractable, and therefore the exact expectation-maximization (EM) algorithm cannot be applied. However, we present a variational approximation that maximizes a lower bound on the log-likelihood and makes use of both the forward and backward recursions for hidden Markov models and the Kalman filter recursions for linear dynamical systems. We tested the algorithm on artificial data sets and a natural data set of respiration force from a patient with sleep apnea. The results suggest that variational approximations are a viable method for inference and learning in switching state-space models.
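
As a rough illustration of one building block the variational approximation reuses, here is a minimal sketch of the scaled forward-backward recursions for a discrete-state HMM; the transition matrix, emission likelihoods, and dimensions below are placeholder values, and this is not the paper's switching state-space model itself.

```python
import numpy as np

# Illustrative sketch: scaled forward-backward recursions for a discrete-state HMM.
rng = np.random.default_rng(1)
T, K = 50, 3                      # time steps, hidden states (arbitrary sizes)
A = rng.dirichlet(np.ones(K), K)  # K x K transition matrix (rows sum to 1)
init = np.ones(K) / K             # initial state distribution
lik = rng.random((T, K)) + 1e-3   # stand-in emission likelihoods p(y_t | s_t = k)

alpha = np.zeros((T, K))          # scaled forward messages
beta = np.ones((T, K))            # scaled backward messages
scale = np.zeros(T)

alpha[0] = init * lik[0]
scale[0] = alpha[0].sum()
alpha[0] /= scale[0]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * lik[t]
    scale[t] = alpha[t].sum()
    alpha[t] /= scale[t]

for t in range(T - 2, -1, -1):
    beta[t] = (A @ (lik[t + 1] * beta[t + 1])) / scale[t + 1]

gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)   # posterior p(s_t = k | y_1..T)
log_likelihood = np.log(scale).sum()
```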

478 citations

Book Chapter
28 May 2002
TL;DR: It is shown that EM-ICP robustly aligns the barycenters and inertia moments when the variance is high, while it tends toward the accurate ICP when the variance is small; a multi-scale approach with an annealing scheme on this parameter is therefore used to combine robustness and accuracy.
Abstract: In this article we investigate the rigid registration of large sets of points, generally sampled from surfaces. We formulate this problem as a general Maximum-Likelihood (ML) estimation of the transformation and the matches. We show that, in the specific case of Gaussian noise, it corresponds to the Iterative Closest Point algorithm (ICP) with the Mahalanobis distance. Then, considering the matches as a hidden variable, we obtain a slightly more complex criterion that can be efficiently solved using Expectation-Maximization (EM) principles. In the case of Gaussian noise, this new method corresponds to an ICP with multiple matches weighted by normalized Gaussian weights, giving rise to the name EM-ICP. The variance of the Gaussian noise is a new parameter that can be viewed as a "scale or blurring factor" on the point clouds. We show that EM-ICP robustly aligns the barycenters and inertia moments when the variance is high, while it tends toward the accurate ICP when the variance is small. Thus, the idea is to use a multi-scale approach with an annealing scheme on this parameter to combine robustness and accuracy. Moreover, we show that at each "scale" the criterion can be efficiently approximated using a simple decimation of one point set, which drastically speeds up the algorithm. Experiments on real data demonstrate a spectacular improvement in the performance of EM-ICP with respect to the standard ICP algorithm in terms of robustness (a factor of 3 to 4) and speed (a factor of 10 to 20), with similar precision. Though the multi-scale scheme is only justified for EM, it can also be used to improve ICP, in which case its performance reaches that of EM when the data are not too noisy.
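
A compact sketch of the core EM-ICP idea in 2D follows: an E-step with normalized Gaussian weights over all candidate matches, an M-step that solves a weighted rigid alignment (here via a Kabsch/SVD update), and an annealing schedule on the variance. The point clouds, schedule, and iteration counts are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def em_icp_step(src, dst, sigma):
    """One EM iteration: soft Gaussian matches, then a weighted rigid update."""
    # E-step: normalized Gaussian weights between every source and target point.
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    w = np.exp(-0.5 * d2 / sigma**2)
    w /= w.sum(axis=1, keepdims=True) + 1e-12
    virtual = w @ dst                      # per-source "virtual" match (weighted mean)

    # M-step: rigid transform (Kabsch/SVD) aligning src to the virtual matches.
    mu_s, mu_v = src.mean(0), virtual.mean(0)
    H = (src - mu_s).T @ (virtual - mu_v)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_v - R @ mu_s
    return src @ R.T + t

# Illustrative data: a noisy, rotated, translated copy of a random 2D point cloud.
rng = np.random.default_rng(2)
dst = rng.normal(size=(200, 2))
theta = 0.4
R_true = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
src = dst @ R_true.T + np.array([0.5, -0.3]) + 0.01 * rng.normal(size=dst.shape)

# Annealing: start with a large variance (robust), shrink toward plain ICP (accurate).
for sigma in [1.0, 0.5, 0.2, 0.1, 0.05]:
    for _ in range(10):
        src = em_icp_step(src, dst, sigma)

print(np.abs(src - dst).mean())   # should be small after registration
```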

470 citations

Journal Article
TL;DR: This work is able both to analyze the statistical error associated with any global optimum and to prove that a simple algorithm based on projected gradient descent converges in polynomial time to a small neighborhood of the set of all global minimizers.
Abstract: Although the standard formulations of prediction problems involve fully-observed and noiseless data drawn in an i.i.d. manner, many applications involve noisy and/or missing data, possibly involving dependence, as well. We study these issues in the context of high-dimensional sparse linear regression, and propose novel estimators for the cases of noisy, missing and/or dependent data. Many standard approaches to noisy or missing data, such as those using the EM algorithm, lead to optimization problems that are inherently nonconvex, and it is difficult to establish theoretical guarantees on practical algorithms. While our approach also involves optimizing nonconvex programs, we are able to both analyze the statistical error associated with any global optimum, and more surprisingly, to prove that a simple algorithm based on projected gradient descent will converge in polynomial time to a small neighborhood of the set of all global minimizers. On the statistical side, we provide nonasymptotic bounds that hold with high probability for the cases of noisy, missing and/or dependent data. On the computational side, we prove that under the same types of conditions required for statistical consistency, the projected gradient descent algorithm is guaranteed to converge at a geometric rate to a near-global minimizer. We illustrate these theoretical predictions with simulations, showing close agreement with the predicted scalings.
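
A minimal sketch of the computational side is shown below: projected gradient descent on a least-squares loss with an l1-ball constraint. The data, step size, choice of radius, and the l1-projection helper are illustrative assumptions; the paper's corrected objectives for noisy, missing, or dependent covariates are not reproduced here.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection onto the l1 ball of the given radius (Duchi et al. style)."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u > (css - radius) / (np.arange(len(u)) + 1))[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

# Illustrative sparse regression problem.
rng = np.random.default_rng(3)
n, p, k = 200, 500, 10
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:k] = rng.normal(size=k)
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Projected gradient descent on the least-squares loss over an l1 ball.
radius = np.abs(beta_true).sum()          # oracle radius, purely for illustration
step = 1.0 / np.linalg.norm(X, 2) ** 2    # 1 / largest eigenvalue of X^T X
beta = np.zeros(p)
for _ in range(500):
    grad = X.T @ (X @ beta - y)
    beta = project_l1_ball(beta - step * grad, radius)

print(np.linalg.norm(beta - beta_true))   # should be small
```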

465 citations

01 Jan 2014
TL;DR: Maximum likelihood is illustrated by replicating Daniel Treisman's (2016) paper, Russia's Billionaires, which connects the number of billionaires in a country to its economic characteristics and concludes that Russia has more billionaires than economic factors such as market size and tax rate predict.
Abstract: In a previous lecture, we estimated the relationship between dependent and explanatory variables using linear regression. But what if a linear relationship is not an appropriate assumption for our model? One widely used alternative is maximum likelihood estimation, which involves specifying a class of distributions, indexed by unknown parameters, and then using the data to pin down these parameter values. The benefit relative to linear regression is that it allows more flexibility in the probabilistic relationships between variables. Here we illustrate maximum likelihood by replicating Daniel Treisman’s (2016) paper, Russia’s Billionaires, which connects the number of billionaires in a country to its economic characteristics. The paper concludes that Russia has a higher number of billionaires than economic factors such as market size and tax rate predict.
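
In the same spirit, here is a minimal sketch of maximum likelihood estimation for a Poisson regression, with the log-likelihood maximized numerically; the synthetic data and the optimizer choice are illustrative assumptions, not the lecture's billionaires dataset.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative data: counts whose log-mean is linear in the covariates.
rng = np.random.default_rng(4)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # intercept + 2 covariates
beta_true = np.array([0.5, 0.8, -0.4])
y = rng.poisson(np.exp(X @ beta_true))

def neg_log_likelihood(beta):
    # Poisson log-likelihood up to a constant: sum_i [ y_i * x_i'beta - exp(x_i'beta) ]
    eta = X @ beta
    return -(y @ eta - np.exp(eta).sum())

result = minimize(neg_log_likelihood, x0=np.zeros(3), method="BFGS")
print(result.x)   # should be close to beta_true
```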

464 citations

Book Chapter
03 Sep 2008
TL;DR: This work defines the likelihood for information diffusion episodes, where an episode means a sequence of newly active nodes, and presents a method for predicting diffusion probabilities using the EM algorithm.
Abstract: We address a problem of predicting diffusion probabilities in complex networks. As one approach to this problem, we focus on the independent cascade (IC) model, and define the likelihood for information diffusion episodes, where an episode means a sequence of newly active nodes. Then, we present a method for predicting diffusion probabilities by using the EM algorithm. Our experiments using a real network data set show the proposed method works well.
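
A simplified sketch of the idea follows: EM estimation of a single shared diffusion probability for the independent cascade model from recorded episodes. Collapsing the paper's per-edge probabilities to one parameter, and the random graph and simulated episodes, are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

# Illustrative random directed graph as an out-neighbour adjacency list.
n_nodes, p_edge, p_true = 50, 0.1, 0.3
adj = {u: [v for v in range(n_nodes) if v != u and rng.random() < p_edge]
       for u in range(n_nodes)}

def simulate_episode(seed_node):
    """Run the independent cascade once; return the sequence of newly active node sets."""
    active, frontier, episode = {seed_node}, {seed_node}, [{seed_node}]
    while frontier:
        new = set()
        for u in frontier:
            for v in adj[u]:
                if v not in active and rng.random() < p_true:
                    new.add(v)
        active |= new
        frontier = new
        if new:
            episode.append(new)
    return episode

episodes = [simulate_episode(int(rng.integers(n_nodes))) for _ in range(300)]

# EM for a single shared diffusion probability p; the latent variable is how many
# of the k attempting parents of a candidate node actually succeeded.
p = 0.5
for _ in range(50):
    expected_successes, attempts = 0.0, 0
    for ep in episodes:
        active = set(ep[0])
        for t in range(len(ep)):
            frontier = ep[t]
            newly = ep[t + 1] if t + 1 < len(ep) else set()
            candidates = {v for u in frontier for v in adj[u]} - active
            for v in candidates:
                k = sum(v in adj[u] for u in frontier)   # number of attempting parents
                attempts += k
                if v in newly:
                    # E-step: expected successes given that at least one attempt succeeded.
                    expected_successes += k * p / (1.0 - (1.0 - p) ** k)
            active |= newly
    p = expected_successes / attempts   # M-step
print(p)   # should be close to p_true
```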

463 citations


Network Information
Related Topics (5)
Estimator: 97.3K papers, 2.6M citations, 91% related
Deep learning: 79.8K papers, 2.1M citations, 84% related
Support vector machine: 73.6K papers, 1.7M citations, 84% related
Cluster analysis: 146.5K papers, 2.9M citations, 84% related
Artificial neural network: 207K papers, 4.5M citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    114
2022    245
2021    438
2020    410
2019    484
2018    519