Topic

Expectation–maximization algorithm

About: Expectation–maximization algorithm is a research topic. Over its lifetime, 11,823 publications have been published within this topic, receiving 528,693 citations. The topic is also known as: EM algorithm & Expectation Maximization.
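As a concrete reference point for the E- and M-steps that the papers below build on, here is a minimal sketch of EM for a two-component univariate Gaussian mixture. The simulated data, starting values, and fixed iteration count are hypothetical illustrations, not taken from any paper on this page.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
# Hypothetical data: a mixture of two Gaussians.
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.5, 200)])

# Initial guesses for the mixing weight, means, and standard deviations.
pi, mu, sigma = 0.5, np.array([-1.0, 1.0]), np.array([1.0, 1.0])

for _ in range(200):  # fixed number of iterations instead of a convergence check
    # E-step: posterior probability that each point belongs to component 1.
    d0 = (1 - pi) * norm.pdf(x, mu[0], sigma[0])
    d1 = pi * norm.pdf(x, mu[1], sigma[1])
    tau = d1 / (d0 + d1)

    # M-step: weighted updates of the mixing weight, means, and variances.
    pi = tau.mean()
    mu = np.array([np.average(x, weights=1 - tau), np.average(x, weights=tau)])
    sigma = np.sqrt(np.array([
        np.average((x - mu[0]) ** 2, weights=1 - tau),
        np.average((x - mu[1]) ** 2, weights=tau),
    ]))

print(pi, mu, sigma)
```

Each paper below adapts this same alternation (a posterior-weight E-step followed by a weighted M-step) to a different model.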


Papers
Journal Article
TL;DR: In this article, the author uses Monte Carlo simulations to show that maximum likelihood estimation of proportion data with the beta distribution may provide more accurate and more precise results than the usual normal (OLS) model, and then presents empirical analyses illustrating some of these differences.
Abstract: Research in political science is often concerned with modeling dependent variables that are proportions. Proportions are relevant in a wide variety of substantive areas, including elections, the bureaucracy, and interest groups. Yet because most researchers rely upon an approach, OLS, that does not recognize key aspects of proportions, the conclusions we reach from normal models may not provide the best understanding of phenomena of interest in these areas. In this paper, I use Monte Carlo simulations to show that maximum likelihood estimation of these data using the beta distribution may provide more accurate and more precise results. I then present empirical analyses illustrating some of these differences.

235 citations
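The paper above argues for modelling proportions with a beta likelihood rather than a normal one. As a hedged sketch of that idea, stripped of the covariates that the article's regression models include, the following fits a beta distribution by maximum likelihood with SciPy; the simulated vote-share data are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical proportion data (e.g., vote shares) drawn from a Beta(2, 5).
y = rng.beta(2.0, 5.0, size=500)

# Maximum likelihood fit of a beta distribution, with location and scale
# fixed so that the support stays on (0, 1).
a_hat, b_hat, loc, scale = stats.beta.fit(y, floc=0, fscale=1)
print("beta ML estimates:", a_hat, b_hat)

# For comparison, a normal (OLS-style) summary ignores the bounded support.
print("normal fit:", y.mean(), y.std(ddof=1))
```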

Journal Article
TL;DR: In this paper, a nonparametric maximum likelihood estimator (MLE) of the mixing distribution F is derived for any specified population size N, and from this the MLE of the (N, F) pair, and thus of N, is determined.
Abstract: We conduct nonparametric maximum likelihood estimation under two common heterogeneous closed population capture-recapture models. Our models specify mixture models (as did previous researchers' models) which have a common generating distribution, say F, for the capture probabilities. Using Lindsay and Roeder's (1992, Journal of the American Statistical Association 87, 785-794) mixture model results and the EM algorithm, a nonparametric maximum likelihood estimator (MLE) of F for any specified population size N is obtained. Then, the nonparametric MLE of the (N, F) pair and thus for N is determined. Perhaps most importantly, since our MLE pair maximizes the likelihood under the entire nonparametric probability model, it provides an excellent foundation for estimating properties of estimators, conducting a goodness-of-fit test, and performing a likelihood ratio test. These are illustrated in the paper.

234 citations
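The full nonparametric MLE of the (N, F) pair is beyond a short sketch, but the EM machinery the abstract relies on can be illustrated by estimating a discrete mixing distribution for binomial capture probabilities over a fixed grid of support points. This simplification (fixed support, known number of animals, no zero-truncation) is mine, not the authors'; the data and grid are hypothetical.

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(2)
T = 6                                    # number of capture occasions
p_true = rng.beta(1.0, 3.0, size=400)    # heterogeneous capture probabilities
x = rng.binomial(T, p_true)              # capture counts (zero-truncation ignored here)

# Discrete support grid for the mixing distribution F, with weights w.
grid = np.linspace(0.05, 0.95, 19)
w = np.full(grid.size, 1.0 / grid.size)

for _ in range(500):
    # E-step: posterior probability of each support point for each animal.
    like = binom.pmf(x[:, None], T, grid[None, :])   # n x K matrix of likelihoods
    post = like * w
    post /= post.sum(axis=1, keepdims=True)
    # M-step: update the weights of F.
    w = post.mean(axis=0)

print(np.round(w, 3))
```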

Journal Article
TL;DR: The proposed model is a semi-parametric generalization of the mixture model of Farewell (1982): a logistic regression model is proposed for the incidence part of the model, and a Kaplan-Meier type approach is used to estimate the latency part of the model.
Abstract: A mixture model is an attractive approach for analyzing failure time data in which there are thought to be two groups of subjects, those who could eventually develop the endpoint and those who could not develop the endpoint. The proposed model is a semi-parametric generalization of the mixture model of Farewell (1982). A logistic regression model is proposed for the incidence part of the model, and a Kaplan-Meier type approach is used to estimate the latency part of the model. The estimator arises naturally out of the EM algorithm approach for fitting failure time mixture models as described by Larson and Dinse (1985). The procedure is applied to some experimental data from radiation biology and is evaluated in a Monte Carlo simulation study. The simulation study suggests the semi-parametric procedure is almost as efficient as the correct fully parametric procedure for estimating the regression coefficient in the incidence, but less efficient for estimating the latency distribution.

234 citations
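The estimator above is semi-parametric (logistic incidence plus a Kaplan-Meier latency). As a much simpler, fully parametric sketch of the same EM idea, the code below fits a two-parameter cure model with a constant incidence probability, an exponential latency, and no covariates; the simulated data and starting values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n, pi_true, lam_true = 1000, 0.6, 0.5

# Hypothetical data: a cured fraction never experiences the event.
susceptible = rng.random(n) < pi_true
event_time = np.where(susceptible, rng.exponential(1.0 / lam_true, n), np.inf)
censor_time = rng.uniform(0.0, 8.0, n)
t = np.minimum(event_time, censor_time)
delta = (event_time <= censor_time).astype(float)   # 1 = event observed

pi, lam = 0.5, 1.0
for _ in range(200):
    # E-step: observed events are susceptible with weight 1; censored subjects
    # get the posterior probability of being susceptible.
    surv = np.exp(-lam * t)
    w = np.where(delta == 1, 1.0, pi * surv / (pi * surv + 1.0 - pi))
    # M-step: weighted updates of the incidence probability and the rate.
    pi = w.mean()
    lam = delta.sum() / (w * t).sum()

print("incidence:", round(pi, 3), "rate:", round(lam, 3))
```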

Journal Article
TL;DR: The methods are illustrated on a data set involving alternative dosage regimens for the treatment of schizophrenia using haloperidol and on a regression example, where the new methods are compared with complete-case analysis and maximum likelihood for a probit selection model.
Abstract: Pattern-mixture models stratify incomplete data by the pattern of missing values and formulate distinct models within each stratum. Pattern-mixture models are developed for analyzing a random sample on continuous variables y(1), y(2) when values of y(2) are nonrandomly missing. Methods for scalar y(1) and y(2) are here generalized to vector y(1) and y(2) with additional fixed covariates x. Parameters in these models are identified by alternative assumptions about the missing-data mechanism. Models may be underidentified (in which case additional assumptions are needed), just-identified, or overidentified. Maximum likelihood and Bayesian methods are developed for the latter two situations, using the EM and SEM algorithms, and direct and iterative simulation methods. The methods are illustrated on a data set involving alternative dosage regimens for the treatment of schizophrenia using haloperidol and on a regression example. Sensitivity to alternative assumptions about the missing-data mechanism is assessed, and the new methods are compared with complete-case analysis and maximum likelihood for a probit selection model.

233 citations
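As a minimal sketch of the pattern-mixture idea in the simplest just-identified case, the code below stratifies bivariate data by whether y2 is observed and estimates E[y2] under a complete-case identifying restriction (the regression of y2 on y1 is taken to be the same in both patterns). The simulated data and that particular restriction are illustrative assumptions, not the paper's EM/SEM machinery for vector outcomes with covariates.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
y1 = rng.normal(0.0, 1.0, n)
y2 = 1.0 + 0.8 * y1 + rng.normal(0.0, 1.0, n)

# Hypothetical missingness: y2 is more likely to be missing when y1 is small.
missing = rng.random(n) < 1.0 / (1.0 + np.exp(y1))
complete = ~missing

# Regression of y2 on y1 among completers (slope b, intercept a).
b, a = np.polyfit(y1[complete], y2[complete], 1)

# Pattern-specific estimates of E[y2]; the incomplete pattern borrows the
# completers' regression of y2 on y1 (the identifying restriction).
mean_complete = y2[complete].mean()
mean_incomplete = a + b * y1[missing].mean()

# Mix the two patterns with their observed proportions.
p_complete = complete.mean()
mu2 = p_complete * mean_complete + (1 - p_complete) * mean_incomplete
print("pattern-mixture estimate of E[y2]:", round(mu2, 3))
print("complete-case estimate:", round(mean_complete, 3))
```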

Journal Article
TL;DR: Comparisons are presented to illustrate the relative performance of the restricted and unrestricted models, and applications to three real datasets demonstrate the usefulness of the recently proposed methodology for the unrestricted MST mixture.
Abstract: Finite mixtures of multivariate skew t (MST) distributions have proven to be useful in modelling heterogeneous data with asymmetric and heavy tail behaviour. Recently, they have been exploited as an effective tool for modelling flow cytometric data. A number of algorithms for the computation of the maximum likelihood (ML) estimates for the model parameters of mixtures of MST distributions have been put forward in recent years. These implementations use various characterizations of the MST distribution, which are similar but not identical. While exact implementation of the expectation-maximization (EM) algorithm can be achieved for `restricted' characterizations of the component skew t-distributions, Monte Carlo (MC) methods have been used to fit the `unrestricted' models. In this paper, we review several recent fitting algorithms for finite mixtures of multivariate skew t-distributions, at the same time clarifying some of the connections between the various existing proposals. In particular, recent results have shown that the EM algorithm can be implemented exactly for faster computation of ML estimates for mixtures with unrestricted MST components. The gain in computational time is effected by noting that the semi-infinite integrals on the E-step of the EM algorithm can be put in the form of moments of the truncated multivariate non-central t-distribution, similar to the restricted case, which subsequently can be expressed in terms of the non-truncated form of the central t-distribution function for which fast algorithms are available. We present comparisons to illustrate the relative performance of the restricted and unrestricted models, and demonstrate the usefulness of the recently proposed methodology for the unrestricted MST mixture, by some applications to three real datasets.

233 citations
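The exact EM for unrestricted skew-t mixtures is intricate; as a deliberately reduced sketch of the machinery involved, the code below runs EM for a mixture of multivariate t distributions with the skewness dropped and the degrees of freedom held fixed, so only the latent-weight E-step (responsibilities tau and gamma weights u) and the weighted M-step are shown. The data, fixed df, and initial values are hypothetical.

```python
import numpy as np
from scipy.special import gammaln

def log_mvt(x, mu, cov, df):
    """Log density and squared Mahalanobis distances for a multivariate t."""
    p = mu.size
    diff = x - mu
    chol = np.linalg.cholesky(cov)
    sol = np.linalg.solve(chol, diff.T)            # p x n
    maha = (sol ** 2).sum(axis=0)                  # squared Mahalanobis distances
    logdet = 2.0 * np.log(np.diag(chol)).sum()
    logpdf = (gammaln((df + p) / 2.0) - gammaln(df / 2.0)
              - 0.5 * (p * np.log(df * np.pi) + logdet)
              - 0.5 * (df + p) * np.log1p(maha / df))
    return logpdf, maha

rng = np.random.default_rng(5)
# Hypothetical two-component, two-dimensional heavy-tailed data.
x = np.vstack([rng.standard_t(5, (300, 2)) + [-3.0, 0.0],
               rng.standard_t(5, (200, 2)) + [3.0, 1.0]])
n, p = x.shape
K, df = 2, 5.0                                     # df held fixed for simplicity
pis = np.full(K, 1.0 / K)
mus = np.array([[-1.0, 0.0], [1.0, 0.0]])
covs = np.array([np.cov(x.T) for _ in range(K)])

for _ in range(200):
    logd = np.empty((n, K))
    mahas = np.empty((n, K))
    for k in range(K):
        logd[:, k], mahas[:, k] = log_mvt(x, mus[k], covs[k], df)
    # E-step: responsibilities tau and the latent gamma weights u.
    logw = np.log(pis) + logd
    logw -= logw.max(axis=1, keepdims=True)
    tau = np.exp(logw)
    tau /= tau.sum(axis=1, keepdims=True)
    u = (df + p) / (df + mahas)
    # M-step: weighted updates of mixing proportions, means, and scale matrices.
    pis = tau.mean(axis=0)
    for k in range(K):
        wk = tau[:, k] * u[:, k]
        mus[k] = (wk[:, None] * x).sum(axis=0) / wk.sum()
        diff = x - mus[k]
        covs[k] = (wk * diff.T) @ diff / tau[:, k].sum()

print(np.round(pis, 3))
print(np.round(mus, 2))
```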


Network Information
Related Topics (5)
Estimator: 97.3K papers, 2.6M citations (91% related)
Deep learning: 79.8K papers, 2.1M citations (84% related)
Support vector machine: 73.6K papers, 1.7M citations (84% related)
Cluster analysis: 146.5K papers, 2.9M citations (84% related)
Artificial neural network: 207K papers, 4.5M citations (82% related)
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    114
2022    245
2021    438
2020    410
2019    484
2018    519