Topic

Expectation–maximization algorithm

About: Expectation–maximization algorithm is a research topic. Over its lifetime, 11,823 publications have been published within this topic, receiving 528,693 citations. The topic is also known as: EM algorithm & Expectation Maximization.
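
To make the topic concrete, here is a minimal sketch of EM for a two-component univariate Gaussian mixture; the data, initial values, and iteration count are illustrative choices, not drawn from any of the papers listed below.

```python
import numpy as np

# Minimal EM for a two-component univariate Gaussian mixture (illustrative sketch).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.7, 200)])

# Initial guesses: mixing weight of component 1, means, variances.
pi, mu, var = 0.5, np.array([-1.0, 1.0]), np.array([1.0, 1.0])

def normal_pdf(x, mean, variance):
    return np.exp(-0.5 * (x - mean) ** 2 / variance) / np.sqrt(2 * np.pi * variance)

for _ in range(100):
    # E-step: posterior responsibility of component 1 for each point.
    p0 = (1 - pi) * normal_pdf(x, mu[0], var[0])
    p1 = pi * normal_pdf(x, mu[1], var[1])
    r = p1 / (p0 + p1)

    # M-step: re-estimate parameters from the responsibilities.
    pi = r.mean()
    mu = np.array([np.sum((1 - r) * x) / np.sum(1 - r),
                   np.sum(r * x) / np.sum(r)])
    var = np.array([np.sum((1 - r) * (x - mu[0]) ** 2) / np.sum(1 - r),
                    np.sum(r * (x - mu[1]) ** 2) / np.sum(r)])

print(pi, mu, var)  # should roughly recover 0.4, (-2, 3), (1, 0.49)
```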


Papers
Journal Article
TL;DR: A new statistical model for time series is introduced that iteratively segments data into regimes with approximately linear dynamics and learns the parameters of each regime; the results suggest that variational approximations are a viable method for inference and learning in switching state-space models.
Abstract: We introduce a new statistical model for time series that iteratively segments data into regimes with approximately linear dynamics and learns the parameters of each of these linear regimes. This model combines and generalizes two of the most widely used stochastic time-series models, hidden Markov models and linear dynamical systems, and is closely related to models that are widely used in the control and econometrics literatures. It can also be derived by extending the mixture of experts neural network (Jacobs, Jordan, Nowlan, & Hinton, 1991) to its fully dynamical version, in which both expert and gating networks are recurrent. Inferring the posterior probabilities of the hidden states of this model is computationally intractable, and therefore the exact expectation-maximization (EM) algorithm cannot be applied. However, we present a variational approximation that maximizes a lower bound on the log-likelihood and makes use of both the forward and backward recursions for hidden Markov models and the Kalman filter recursions for linear dynamical systems. We tested the algorithm on artificial data sets and a natural data set of respiration force from a patient with sleep apnea. The results suggest that variational approximations are a viable method for inference and learning in switching state-space models.
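
As a rough illustration of one building block the variational approximation reuses, here is a minimal sketch of the scaled forward-backward recursions for a discrete-state HMM; the transition matrix, emission likelihoods, and dimensions below are placeholder values, and this is not the paper's switching state-space model itself.

```python
import numpy as np

# Illustrative sketch: scaled forward-backward recursions for a discrete-state HMM.
rng = np.random.default_rng(1)
T, K = 50, 3                      # time steps, hidden states (arbitrary sizes)
A = rng.dirichlet(np.ones(K), K)  # K x K transition matrix (rows sum to 1)
init = np.ones(K) / K             # initial state distribution
lik = rng.random((T, K)) + 1e-3   # stand-in emission likelihoods p(y_t | s_t = k)

alpha = np.zeros((T, K))          # scaled forward messages
beta = np.ones((T, K))            # scaled backward messages
scale = np.zeros(T)

alpha[0] = init * lik[0]
scale[0] = alpha[0].sum()
alpha[0] /= scale[0]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * lik[t]
    scale[t] = alpha[t].sum()
    alpha[t] /= scale[t]

for t in range(T - 2, -1, -1):
    beta[t] = (A @ (lik[t + 1] * beta[t + 1])) / scale[t + 1]

gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)   # posterior p(s_t = k | y_1..T)
log_likelihood = np.log(scale).sum()
```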

478 citations

Book Chapter
28 May 2002
TL;DR: It is shown that EM-ICP robustly aligns the barycenters and inertia moments when the variance is high, while it tends toward the accurate ICP when the variance is small; a multi-scale approach with an annealing scheme on this parameter is therefore used to combine robustness and accuracy.
Abstract: In this article we investigate the rigid registration of large sets of points, generally sampled from surfaces. We formulate this problem as a general Maximum-Likelihood (ML) estimation of the transformation and the matches. We show that, in the specific case of Gaussian noise, it corresponds to the Iterative Closest Point algorithm (ICP) with the Mahalanobis distance. Then, considering the matches as a hidden variable, we obtain a slightly more complex criterion that can be efficiently solved using Expectation-Maximization (EM) principles. In the case of Gaussian noise, this new method corresponds to an ICP with multiple matches weighted by normalized Gaussian weights, giving rise to the name EM-ICP. The variance of the Gaussian noise is a new parameter that can be viewed as a "scale or blurring factor" on the point clouds. We show that EM-ICP robustly aligns the barycenters and inertia moments when the variance is high, while it tends toward the accurate ICP when the variance is small. Thus, the idea is to use a multi-scale approach with an annealing scheme on this parameter to combine robustness and accuracy. Moreover, we show that at each "scale" the criterion can be efficiently approximated using a simple decimation of one point set, which drastically speeds up the algorithm. Experiments on real data demonstrate a spectacular improvement in the performance of EM-ICP with respect to the standard ICP algorithm in terms of robustness (a factor of 3 to 4) and speed (a factor of 10 to 20), with similar precision. Though the multi-scale scheme is only justified for EM, it can also be used to improve ICP, in which case its performance reaches that of EM when the data are not too noisy.
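
A compact sketch of the core EM-ICP idea in 2D follows: an E-step with normalized Gaussian weights over all candidate matches, an M-step that solves a weighted rigid alignment (here via a Kabsch/SVD update), and an annealing schedule on the variance. The point clouds, schedule, and iteration counts are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def em_icp_step(src, dst, sigma):
    """One EM iteration: soft Gaussian matches, then a weighted rigid update."""
    # E-step: normalized Gaussian weights between every source and target point.
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    w = np.exp(-0.5 * d2 / sigma**2)
    w /= w.sum(axis=1, keepdims=True) + 1e-12
    virtual = w @ dst                      # per-source "virtual" match (weighted mean)

    # M-step: rigid transform (Kabsch/SVD) aligning src to the virtual matches.
    mu_s, mu_v = src.mean(0), virtual.mean(0)
    H = (src - mu_s).T @ (virtual - mu_v)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_v - R @ mu_s
    return src @ R.T + t

# Illustrative data: a noisy, rotated, translated copy of a random 2D point cloud.
rng = np.random.default_rng(2)
dst = rng.normal(size=(200, 2))
theta = 0.4
R_true = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
src = dst @ R_true.T + np.array([0.5, -0.3]) + 0.01 * rng.normal(size=dst.shape)

# Annealing: start with a large variance (robust), shrink toward plain ICP (accurate).
for sigma in [1.0, 0.5, 0.2, 0.1, 0.05]:
    for _ in range(10):
        src = em_icp_step(src, dst, sigma)

print(np.abs(src - dst).mean())   # should be small after registration
```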

470 citations

Journal Article
TL;DR: This work is able both to analyze the statistical error associated with any global optimum and to prove that a simple algorithm based on projected gradient descent converges in polynomial time to a small neighborhood of the set of all global minimizers.
Abstract: Although the standard formulations of prediction problems involve fully-observed and noiseless data drawn in an i.i.d. manner, many applications involve noisy and/or missing data, possibly involving dependence, as well. We study these issues in the context of high-dimensional sparse linear regression, and propose novel estimators for the cases of noisy, missing and/or dependent data. Many standard approaches to noisy or missing data, such as those using the EM algorithm, lead to optimization problems that are inherently nonconvex, and it is difficult to establish theoretical guarantees on practical algorithms. While our approach also involves optimizing nonconvex programs, we are able to both analyze the statistical error associated with any global optimum, and more surprisingly, to prove that a simple algorithm based on projected gradient descent will converge in polynomial time to a small neighborhood of the set of all global minimizers. On the statistical side, we provide nonasymptotic bounds that hold with high probability for the cases of noisy, missing and/or dependent data. On the computational side, we prove that under the same types of conditions required for statistical consistency, the projected gradient descent algorithm is guaranteed to converge at a geometric rate to a near-global minimizer. We illustrate these theoretical predictions with simulations, showing close agreement with the predicted scalings.
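
A minimal sketch of the computational side is shown below: projected gradient descent on a least-squares loss with an l1-ball constraint. The data, step size, choice of radius, and the l1-projection helper are illustrative assumptions; the paper's corrected objectives for noisy, missing, or dependent covariates are not reproduced here.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection onto the l1 ball of the given radius (Duchi et al. style)."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u > (css - radius) / (np.arange(len(u)) + 1))[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

# Illustrative sparse regression problem.
rng = np.random.default_rng(3)
n, p, k = 200, 500, 10
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:k] = rng.normal(size=k)
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Projected gradient descent on the least-squares loss over an l1 ball.
radius = np.abs(beta_true).sum()          # oracle radius, purely for illustration
step = 1.0 / np.linalg.norm(X, 2) ** 2    # 1 / largest eigenvalue of X^T X
beta = np.zeros(p)
for _ in range(500):
    grad = X.T @ (X @ beta - y)
    beta = project_l1_ball(beta - step * grad, radius)

print(np.linalg.norm(beta - beta_true))   # should be small
```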

465 citations

01 Jan 2014
TL;DR: Maximum likelihood is illustrated by replicating Daniel Treisman's (2016) paper, Russia's Billionaires, which connects the number of billionaires in a country to its economic characteristics and concludes that Russia has more billionaires than economic factors such as market size and tax rate predict.
Abstract: In a previous lecture, we estimated the relationship between dependent and explanatory variables using linear regression. But what if a linear relationship is not an appropriate assumption for our model? One widely used alternative is maximum likelihood estimation, which involves specifying a class of distributions, indexed by unknown parameters, and then using the data to pin down these parameter values. The benefit relative to linear regression is that it allows more flexibility in the probabilistic relationships between variables. Here we illustrate maximum likelihood by replicating Daniel Treisman’s (2016) paper, Russia’s Billionaires, which connects the number of billionaires in a country to its economic characteristics. The paper concludes that Russia has a higher number of billionaires than economic factors such as market size and tax rate predict.
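
In the same spirit, here is a minimal sketch of maximum likelihood estimation for a Poisson regression, with the log-likelihood maximized numerically; the synthetic data and the optimizer choice are illustrative assumptions, not the lecture's billionaires dataset.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative data: counts whose log-mean is linear in the covariates.
rng = np.random.default_rng(4)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # intercept + 2 covariates
beta_true = np.array([0.5, 0.8, -0.4])
y = rng.poisson(np.exp(X @ beta_true))

def neg_log_likelihood(beta):
    # Poisson log-likelihood up to a constant: sum_i [ y_i * x_i'beta - exp(x_i'beta) ]
    eta = X @ beta
    return -(y @ eta - np.exp(eta).sum())

result = minimize(neg_log_likelihood, x0=np.zeros(3), method="BFGS")
print(result.x)   # should be close to beta_true
```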

464 citations

Book Chapter
03 Sep 2008
TL;DR: This work defines the likelihood for information diffusion episodes, where an episode means a sequence of newly active nodes, and presents a method for predicting diffusion probabilities using the EM algorithm.
Abstract: We address a problem of predicting diffusion probabilities in complex networks. As one approach to this problem, we focus on the independent cascade (IC) model, and define the likelihood for information diffusion episodes, where an episode means a sequence of newly active nodes. Then, we present a method for predicting diffusion probabilities by using the EM algorithm. Our experiments using a real network data set show the proposed method works well.
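
A simplified sketch of the idea follows: EM estimation of a single shared diffusion probability for the independent cascade model from recorded episodes. Collapsing the paper's per-edge probabilities to one parameter, and the random graph and simulated episodes, are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

# Illustrative random directed graph as an out-neighbour adjacency list.
n_nodes, p_edge, p_true = 50, 0.1, 0.3
adj = {u: [v for v in range(n_nodes) if v != u and rng.random() < p_edge]
       for u in range(n_nodes)}

def simulate_episode(seed_node):
    """Run the independent cascade once; return the sequence of newly active node sets."""
    active, frontier, episode = {seed_node}, {seed_node}, [{seed_node}]
    while frontier:
        new = set()
        for u in frontier:
            for v in adj[u]:
                if v not in active and rng.random() < p_true:
                    new.add(v)
        active |= new
        frontier = new
        if new:
            episode.append(new)
    return episode

episodes = [simulate_episode(int(rng.integers(n_nodes))) for _ in range(300)]

# EM for a single shared diffusion probability p; the latent variable is how many
# of the k attempting parents of a candidate node actually succeeded.
p = 0.5
for _ in range(50):
    expected_successes, attempts = 0.0, 0
    for ep in episodes:
        active = set(ep[0])
        for t in range(len(ep)):
            frontier = ep[t]
            newly = ep[t + 1] if t + 1 < len(ep) else set()
            candidates = {v for u in frontier for v in adj[u]} - active
            for v in candidates:
                k = sum(v in adj[u] for u in frontier)   # number of attempting parents
                attempts += k
                if v in newly:
                    # E-step: expected successes given that at least one attempt succeeded.
                    expected_successes += k * p / (1.0 - (1.0 - p) ** k)
            active |= newly
    p = expected_successes / attempts   # M-step
print(p)   # should be close to p_true
```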

463 citations


Network Information
Related Topics (5)
Estimator: 97.3K papers, 2.6M citations, 91% related
Deep learning: 79.8K papers, 2.1M citations, 84% related
Support vector machine: 73.6K papers, 1.7M citations, 84% related
Cluster analysis: 146.5K papers, 2.9M citations, 84% related
Artificial neural network: 207K papers, 4.5M citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    114
2022    245
2021    438
2020    410
2019    484
2018    519