scispace - formally typeset
Topic

Expectation–maximization algorithm

About: Expectation–maximization algorithm is a research topic. Over the lifetime, 11823 publications have been published within this topic receiving 528693 citations. The topic is also known as: EM algorithm & Expectation Maximization.
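The EM algorithm alternates an expectation (E) step, which computes the posterior responsibility of each latent component for each data point, and a maximization (M) step, which re-estimates the parameters from those responsibilities. A minimal numpy sketch for a two-component one-dimensional Gaussian mixture (the synthetic data, initial values, and iteration count are all illustrative, not from any of the papers below):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: two Gaussian clusters with true means -2 and 3 (illustrative)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])

# Initial guesses for mixture weights, means, and variances
w = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

def gauss(x, mu, var):
    """Gaussian density, broadcasting over components."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

for _ in range(100):
    # E-step: responsibility r[n, k] of component k for point n
    r = w * gauss(x[:, None], mu, var)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: closed-form updates that maximize the expected log-likelihood
    n_k = r.sum(axis=0)
    w = n_k / len(x)
    mu = (r * x[:, None]).sum(axis=0) / n_k
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n_k

print(np.sort(mu))  # estimated means, close to the true -2 and 3
```

Each iteration is guaranteed not to decrease the observed-data log-likelihood, which is why EM converges to a stationary point; as the 2018 paper below notes, that point need not be the global optimum.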


Papers
Journal ArticleDOI
TL;DR: This paper develops maximum likelihood estimation of variance components.
Abstract: (1980). Maximum likelihood estimation of variance components. Series Statistics: Vol. 11, No. 4, pp. 545-561.

89 citations

Proceedings ArticleDOI
04 Dec 2007
TL;DR: A new approach to sparseness-based BSS is proposed, which iteratively estimates the DOA and the time-frequency mask for each source through the EM algorithm under the sparseness assumption.
Abstract: In this paper, we propose a new approach to sparseness-based BSS based on the EM algorithm, which iteratively estimates the DOA and the time-frequency mask for each source under the sparseness assumption. Our method has the following characteristics: 1) it enables the introduction of physical observation models such as the diffuse sound field, because the likelihood is defined in the original signal domain and not in the feature domain, 2) one does not necessarily have to know in advance the power of the background noise, since it is also a parameter that can be estimated from the observed signal, 3) it requires little computation time, 4) a common objective function is iteratively increased in the localization and separation steps, which correspond to the E-step and M-step, respectively. Although our framework is applicable to general N-channel BSS, we concentrate on the formulation of the problem in the particular case where two sensory inputs are available, and we show some numerical simulation results.

88 citations

Proceedings ArticleDOI
18 Jun 2018
TL;DR: This work proposes an alternative formulation for estimating the GMM parameters using the sliced Wasserstein distance, which gives rise to a new algorithm that can estimate high-dimensional data distributions more faithfully than the EM algorithm.
Abstract: Gaussian mixture models (GMM) are powerful parametric tools with many applications in machine learning and computer vision. Expectation maximization (EM) is the most popular algorithm for estimating the GMM parameters. However, EM guarantees only convergence to a stationary point of the log-likelihood function, which could be arbitrarily worse than the optimal solution. Inspired by the relationship between the negative log-likelihood function and the Kullback-Leibler (KL) divergence, we propose an alternative formulation for estimating the GMM parameters using the sliced Wasserstein distance, which gives rise to a new algorithm. Specifically, we propose minimizing the sliced-Wasserstein distance between the mixture model and the data distribution with respect to the GMM parameters. In contrast to the KL-divergence, the energy landscape for the sliced-Wasserstein distance is more well-behaved and therefore more suitable for a stochastic gradient descent scheme to obtain the optimal GMM parameters. We show that our formulation results in parameter estimates that are more robust to random initializations and demonstrate that it can estimate high-dimensional data distributions more faithfully than the EM algorithm.
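The sliced Wasserstein distance used in the paper above averages one-dimensional Wasserstein distances over random projections; in one dimension the optimal transport plan simply matches sorted samples, which is what makes the distance cheap to estimate. A minimal Monte-Carlo sketch between two point clouds (function name, sample sizes, and seeds are illustrative, not the paper's implementation):

```python
import numpy as np

def sliced_wasserstein(X, Y, n_proj=200, rng=None):
    """Monte-Carlo estimate of the sliced 2-Wasserstein distance
    between two equal-size point clouds X, Y of shape (n, d)."""
    if rng is None:
        rng = np.random.default_rng(0)
    d = X.shape[1]
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)  # random unit direction
        # Project both clouds onto the direction; sorting gives the
        # optimal 1-D transport matching.
        px, py = np.sort(X @ theta), np.sort(Y @ theta)
        total += np.mean((px - py) ** 2)
    return np.sqrt(total / n_proj)

rng = np.random.default_rng(1)
A = rng.normal(0.0, 1.0, size=(500, 2))
B = rng.normal(0.0, 1.0, size=(500, 2))  # same distribution as A
C = rng.normal(4.0, 1.0, size=(500, 2))  # shifted distribution

# Clouds from the same distribution are much closer than shifted ones
print(sliced_wasserstein(A, B), sliced_wasserstein(A, C))
```

Because this estimate is a smooth function of the points (away from sorting ties), it can be minimized with stochastic gradient descent over samples drawn from a parametric model such as a GMM, which is the paper's alternative to EM.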

88 citations

Proceedings ArticleDOI
25 Jul 2010
TL;DR: In this article, a probabilistic approach for statistical modeling of the loads in distribution networks is presented, where the Expectation Maximization (EM) algorithm is used to obtain the parameters of the mixture components.
Abstract: This paper presents a probabilistic approach for statistical modelling of the loads in distribution networks. In a distribution network, the Probability Density Functions (pdfs) of loads at different buses show a number of variations and cannot be represented by any specific distribution. The approach presented in this paper represents all the load pdfs through Gaussian Mixture Model (GMM). The Expectation Maximization (EM) algorithm is used to obtain the parameters of the mixture components. The performance of the method is demonstrated on a 95-bus generic distribution network model.

88 citations

Journal ArticleDOI
TL;DR: snapclust is a fast maximum-likelihood solution to the genetic clustering problem that combines the advantages of model-based and geometric approaches, using the Expectation-Maximisation (EM) algorithm.
Abstract: The investigation of genetic clusters in natural populations is a ubiquitous problem in a range of fields relying on the analysis of genetic data, such as molecular ecology, conservation biology and microbiology. Typically, genetic clusters are defined as distinct panmictic populations, or parental groups in the context of hybridisation. Two types of methods have been developed for identifying such clusters: model-based methods, which are usually computer-intensive but yield results which can be interpreted in the light of an explicit population genetic model, and geometric approaches, which are less interpretable but remarkably faster.

Here, we introduce snapclust, a fast maximum-likelihood solution to the genetic clustering problem, which combines the advantages of both model-based and geometric approaches. Our method relies on maximising the likelihood of a fixed number of panmictic populations, combining a geometric approach with fast likelihood optimisation via the Expectation-Maximisation (EM) algorithm. It can be used to assign genotypes to populations and, optionally, to identify various types of hybrids between two parental populations. Several goodness-of-fit statistics can also be used to guide the choice of the retained number of clusters.

Using extensive simulations, we show that snapclust performs comparably to current gold standards for genetic clustering as well as hybrid detection, with some advantages for identifying hybrids after several backcrosses, while being orders of magnitude faster than other model-based methods. We also illustrate how snapclust can be used for identifying the optimal number of clusters, and subsequently assign individuals to various hybrid classes simulated from an empirical microsatellite dataset. snapclust is implemented in the package adegenet for the free software R, and is therefore easily integrated into existing pipelines for genetic data analysis. It can be applied to any kind of co-dominant markers, and can easily be extended to more complex models including, for instance, varying ploidy levels. Given its flexibility and computational efficiency, it provides a useful complement to the existing toolbox for the study of genetic diversity in natural populations.

88 citations


Network Information
Related Topics (5)
Estimator
97.3K papers, 2.6M citations
91% related
Deep learning
79.8K papers, 2.1M citations
84% related
Support vector machine
73.6K papers, 1.7M citations
84% related
Cluster analysis
146.5K papers, 2.9M citations
84% related
Artificial neural network
207K papers, 4.5M citations
82% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  114
2022  245
2021  438
2020  410
2019  484
2018  519