Expectation–maximization algorithm

About: Expectation–maximization algorithm is a research topic. Over the lifetime, 11,823 publications have been published within this topic receiving 528,693 citations. The topic is also known as: EM algorithm & Expectation Maximization.


Papers
Proceedings ArticleDOI
25 Oct 2008
TL;DR: This paper compares a variety of different Bayesian estimators for Hidden Markov Model POS taggers with various numbers of hidden states on data sets of different sizes and finds that Variational Bayes was the fastest of all the estimators, especially on large data sets, and that explicit Gibbs samplers were generally faster than their collapsed counterparts on large data sets.
Abstract: There is growing interest in applying Bayesian techniques to NLP problems. There are a number of different estimators for Bayesian models, and it is useful to know what kinds of tasks each does well on. This paper compares a variety of different Bayesian estimators for Hidden Markov Model POS taggers with various numbers of hidden states on data sets of different sizes. Recent papers have given contradictory results when comparing Bayesian estimators to Expectation Maximization (EM) for unsupervised HMM POS tagging, and we show that the difference in reported results is largely due to differences in the size of the training data and the number of states in the HMM. We investigate a variety of samplers for HMMs, including some that these earlier papers did not study. We find that all of the Gibbs samplers do well with small data sets and few states, and that Variational Bayes does well on large data sets and is competitive with the Gibbs samplers. In terms of times of convergence, we find that Variational Bayes was the fastest of all the estimators, especially on large data sets, and that explicit Gibbs samplers (both pointwise and sentence-blocked) were generally faster than their collapsed counterparts on large data sets.

107 citations
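
The EM baseline that this comparison refers to for unsupervised HMM POS tagging is the Baum-Welch algorithm. Below is a minimal sketch of that baseline in NumPy for a single integer-coded observation sequence; it is illustrative only, does not implement the Gibbs or Variational Bayes estimators studied in the paper, and all function names and the toy data are made up for the example.

```python
# Minimal sketch of the EM (Baum-Welch) baseline for a discrete HMM,
# assuming one integer-coded observation sequence. Illustrative only.
import numpy as np

def forward_backward(obs, pi, A, B):
    """E-step helper: scaled forward-backward pass for one sequence."""
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K)); beta = np.zeros((T, K)); c = np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum(); alpha[t] /= c[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]
    return alpha, beta, c

def baum_welch(obs, K, V, n_iter=50, seed=0, eps=1e-12):
    """EM for an HMM with K hidden states and V observation symbols."""
    rng = np.random.default_rng(seed)
    pi = np.full(K, 1.0 / K)
    A = rng.dirichlet(np.ones(K), size=K)   # transition probabilities
    B = rng.dirichlet(np.ones(V), size=K)   # emission probabilities
    for _ in range(n_iter):
        # E-step: posterior state marginals (gamma) and transition counts (xi).
        alpha, beta, c = forward_backward(obs, pi, A, B)
        gamma = alpha * beta
        gamma /= gamma.sum(axis=1, keepdims=True)
        xi = np.zeros((K, K))
        for t in range(len(obs) - 1):
            xi += (alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1]) / c[t + 1]
        # M-step: renormalize expected counts (eps guards against empty states).
        pi = gamma[0]
        A = (xi + eps) / (xi + eps).sum(axis=1, keepdims=True)
        B_new = np.full((K, V), eps)
        for t, o in enumerate(obs):
            B_new[:, o] += gamma[t]
        B = B_new / B_new.sum(axis=1, keepdims=True)
    return pi, A, B

# Toy usage (illustrative): 2 hidden states, 3 observation symbols.
obs = np.array([0, 1, 2, 2, 1, 0, 0, 1, 2, 1])
pi, A, B = baum_welch(obs, K=2, V=3)
```

Here the E-step is the scaled forward-backward pass and the M-step simply renormalizes the expected transition and emission counts; the Bayesian estimators in the paper replace these point estimates with sampling or variational posteriors.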

Journal ArticleDOI
TL;DR: This paper generalizes the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation-Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual d-dimensional uncertainty covariance and has unique missing data properties.
Abstract: We generalize the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation-Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual d-dimensional uncertainty covariance and has unique missing data properties. This algorithm reconstructs the error-deconvolved or "underlying" distribution function common to all samples, even when the individual data points are samples from different distributions, obtained by convolving the underlying distribution with the heteroskedastic uncertainty distribution of the data point and projecting out the missing data directions. We show how this basic algorithm can be extended with conjugate priors on all of the model parameters and a "split-and-merge" procedure designed to avoid local maxima of the likelihood. We demonstrate the full method by applying it to the problem of inferring the three-dimensional velocity distribution of stars near the Sun from noisy two-dimensional, transverse velocity measurements from the Hipparcos satellite.

107 citations
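
For reference, the base algorithm being generalized here is ordinary maximum-likelihood EM for a mixture of Gaussians. The sketch below shows that standard case only; it does not implement the paper's per-point uncertainty covariances, missing-data projections, conjugate priors, or split-and-merge moves, and the function name and toy usage are illustrative.

```python
# Standard maximum-likelihood EM for a K-component Gaussian mixture
# (the base case generalized by the paper above). Illustrative sketch.
import numpy as np
from scipy.stats import multivariate_normal

def gmm_em(X, K, n_iter=100, seed=0):
    n, d = X.shape
    rng = np.random.default_rng(seed)
    means = X[rng.choice(n, K, replace=False)]            # init at data points
    covs = np.array([np.cov(X.T) + 1e-6 * np.eye(d)] * K)
    weights = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] = P(component k | x_i).
        r = np.column_stack([
            w * multivariate_normal.pdf(X, mean=m, cov=c)
            for w, m, c in zip(weights, means, covs)
        ])
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weighted maximum-likelihood updates.
        Nk = r.sum(axis=0)
        weights = Nk / n
        means = (r.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - means[k]
            covs[k] = (r[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(d)
    return weights, means, covs

# Toy usage (illustrative): two well-separated 2-D clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (300, 2)), rng.normal(5.0, 1.0, (300, 2))])
w, mu, Sigma = gmm_em(X, K=2)
```

The paper's extension replaces the plain component densities in the E-step with components convolved with each data point's own uncertainty covariance, which is what makes the recovered mixture an error-deconvolved estimate.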

Patent
23 May 2002
TL;DR: In this paper, a visual motion analysis method that uses multiple layered global motion models to both detect and reliably track an arbitrary number of moving objects appearing in image sequences is presented, where each global model includes a background layer and one or more foreground polybones, each foreground polybone including a parametric shape model, an appearance model, and a motion model describing an associated moving object.
Abstract: A visual motion analysis method that uses multiple layered global motion models to both detect and reliably track an arbitrary number of moving objects appearing in image sequences. Each global model includes a background layer and one or more foreground "polybones", each foreground polybone including a parametric shape model, an appearance model, and a motion model describing an associated moving object. Each polybone includes an exclusive spatial support region and a probabilistic boundary region, and is assigned an explicit depth ordering. Multiple global models having different numbers of layers, depth orderings, motions, etc., corresponding to detected objects are generated, refined using, for example, an EM algorithm, and then ranked/compared. Initial guesses for the model parameters are drawn from a proposal distribution over the set of potential (likely) models. Bayesian model selection is used to compare/rank the different models, and models having relatively high posterior probability are retained for subsequent analysis.

107 citations
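
As a reading aid only, here is one way the per-layer state described above could be organized in code. Every field name and type below is a guess for illustration; the patent text does not prescribe an implementation.

```python
# Hypothetical data layout for one foreground "polybone" layer as described
# in the patent text (shape, appearance, and motion models; exclusive spatial
# support; probabilistic boundary; explicit depth order). Illustrative only.
from dataclasses import dataclass
import numpy as np

@dataclass
class Polybone:
    shape_params: np.ndarray    # parametric shape model
    appearance: np.ndarray      # appearance model (e.g. an image template)
    motion_params: np.ndarray   # motion model of the associated moving object
    support_mask: np.ndarray    # exclusive spatial support region
    boundary_width: float       # width of the probabilistic boundary region
    depth_order: int            # explicit layer ordering relative to other layers
```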

Journal ArticleDOI
TL;DR: A (supervised) logistic regression algorithm for the classification of incomplete data is developed and is extended to the semisupervised case by incorporating graph-based regularization.
Abstract: We address the incomplete-data problem in which feature vectors to be classified are missing data (features). A (supervised) logistic regression algorithm for the classification of incomplete data is developed. Single or multiple imputation for the missing data is avoided by performing analytic integration with an estimated conditional density function (conditioned on the observed data). Conditional density functions are estimated using a Gaussian mixture model (GMM), with parameter estimation performed using both expectation-maximization (EM) and variational Bayesian EM (VB-EM). The proposed supervised algorithm is then extended to the semisupervised case by incorporating graph-based regularization. The semisupervised algorithm utilizes all available data, both incomplete and complete, as well as labeled and unlabeled. Experimental results of the proposed classification algorithms are shown.

106 citations
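
The imputation-free step in this approach rests on conditioning the estimated density on the observed features. As a rough sketch (assuming, for simplicity, a single Gaussian component rather than the full GMM, and omitting the integration against the logistic likelihood), the conditional moments can be computed as below; the helper name, indices, and toy numbers are illustrative.

```python
# Sketch of the conditioning building block: moments of p(x_missing | x_observed)
# under a single Gaussian component. The paper's classifier mixes such
# conditionals over GMM components and integrates its logistic likelihood
# against them; that part is not shown here. Illustrative only.
import numpy as np

def gaussian_conditional(mu, Sigma, x_obs, obs_idx, mis_idx):
    mu_o, mu_m = mu[obs_idx], mu[mis_idx]
    S_oo = Sigma[np.ix_(obs_idx, obs_idx)]
    S_mo = Sigma[np.ix_(mis_idx, obs_idx)]
    S_mm = Sigma[np.ix_(mis_idx, mis_idx)]
    gain = np.linalg.solve(S_oo, S_mo.T).T        # S_mo @ inv(S_oo)
    cond_mean = mu_m + gain @ (x_obs - mu_o)      # E[x_mis | x_obs]
    cond_cov = S_mm - gain @ S_mo.T               # Cov[x_mis | x_obs]
    return cond_mean, cond_cov

# Toy usage (illustrative): 3 features, the last one missing.
mu = np.array([0.0, 1.0, 2.0])
Sigma = np.array([[1.0, 0.3, 0.2], [0.3, 1.0, 0.4], [0.2, 0.4, 1.0]])
m, C = gaussian_conditional(mu, Sigma, x_obs=np.array([0.5, 1.5]),
                            obs_idx=[0, 1], mis_idx=[2])
```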

01 Jan 2005
TL;DR: It is shown that under general, simple, verifiable conditions, any EM sequence is convergent if the maximizer at the M-step is unique; this condition is almost always satisfied in practice.
Abstract: It is well known that the likelihood sequence of the EM algorithm is non-decreasing and convergent (Dempster, Laird and Rubin (1977)), and that the limit points of the EM algorithm are stationary points of the likelihood (Wu (1983)), but the issue of the convergence of the EM sequence itself has not been completely settled. In this paper we close this gap and show that under general, simple, verifiable conditions, any EM sequence is convergent. In pathological cases we show that the sequence is cycling in the limit among a finite number of stationary points with equal likelihood. The results apply equally to the optimization transfer class of algorithms (MM algorithm) of Lange, Hunter, and Yang (2000). Two different EM algorithms constructed on the same dataset illustrate the convergence and the cyclic behavior. This paper contains new results concerning the convergence of the EM algorithm. The EM algorithm was brought into the limelight by Dempster, Laird and Rubin (1977) as a general iterative method of computing the maximum likelihood estimator by maximizing a simpler likelihood on an augmented data space. However, the problem of the convergence of the algorithm has not been satisfactorily resolved. Wu (1983), the main theoretical contribution in this area, showed that the limit points of the EM algorithm are stationary points of the likelihood, and that when the likelihood is unimodal, any EM sequence is convergent. Boyles (1983) has a number of results along similar lines. These results still allow the possibility of a non-convergent EM sequence when the likelihood is not unimodal. More importantly, the EM algorithm is useful when the likelihood is hard to obtain directly; for these cases, the unimodality of the likelihood is very difficult to verify. Here we give simple, general, verifiable conditions for convergence: our main result (Theorem 3) is that any EM sequence is convergent if the maximizer at the M-step is unique. This condition is almost always satisfied in practice (otherwise the particular EM data augmentation scheme would …

106 citations
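
For orientation, the well-known ascent property referred to at the start of this abstract can be stated compactly in generic notation, with observed data x, latent data z, and parameters \(\theta\):
\[
Q(\theta \mid \theta^{(t)}) = \mathbb{E}\left[\log p(x, z \mid \theta) \mid x, \theta^{(t)}\right],
\qquad
\theta^{(t+1)} = \arg\max_{\theta} Q(\theta \mid \theta^{(t)}).
\]
Writing \(\log L(\theta) = Q(\theta \mid \theta^{(t)}) - H(\theta \mid \theta^{(t)})\) with \(H(\theta \mid \theta^{(t)}) = \mathbb{E}[\log p(z \mid x, \theta) \mid x, \theta^{(t)}]\), the M-step gives \(Q(\theta^{(t+1)} \mid \theta^{(t)}) \ge Q(\theta^{(t)} \mid \theta^{(t)})\) and Jensen's inequality gives \(H(\theta^{(t+1)} \mid \theta^{(t)}) \le H(\theta^{(t)} \mid \theta^{(t)})\), hence
\[
\log L(\theta^{(t+1)}) \ge \log L(\theta^{(t)}).
\]
The paper's contribution is the further step from this monotonicity of the likelihood to convergence of the parameter sequence \(\{\theta^{(t)}\}\) itself, under the condition that the M-step maximizer is unique.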


Network Information
Related Topics (5)
Estimator: 97.3K papers, 2.6M citations, 91% related
Deep learning: 79.8K papers, 2.1M citations, 84% related
Support vector machine: 73.6K papers, 1.7M citations, 84% related
Cluster analysis: 146.5K papers, 2.9M citations, 84% related
Artificial neural network: 207K papers, 4.5M citations, 82% related
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    114
2022    245
2021    438
2020    410
2019    484
2018    519