Showing papers on "Expectation–maximization algorithm published in 2009"


Journal ArticleDOI
TL;DR: The mixtools package for R provides a set of functions for analyzing a variety of finite mixture models, which include both traditional methods, such as EM algorithms for univariate and multivariate normal mixtures, and newer methods that reflect some recent research in finite mixture models.
Abstract: The mixtools package for R provides a set of functions for analyzing a variety of finite mixture models. These functions include both traditional methods, such as EM algorithms for univariate and multivariate normal mixtures, and newer methods that reflect some recent research in finite mixture models. In the latter category, mixtools provides algorithms for estimating parameters in a wide range of different mixture-of-regression contexts, in multinomial mixtures such as those arising from discretizing continuous multivariate data, in nonparametric situations where the multivariate component densities are completely unspecified, and in semiparametric situations such as a univariate location mixture of symmetric but otherwise unspecified densities. Many of the algorithms of the mixtools package are EM algorithms or are based on EM-like ideas, so this article includes an overview of EM algorithms for finite mixture models.
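For orientation, a minimal sketch of the EM iteration for the simplest case the abstract mentions, a univariate k-component normal mixture. The package itself is in R, so this Python version with illustrative names is not the mixtools API:

    import numpy as np

    def normal_mix_em(x, k=2, n_iter=200, seed=0):
        # x: 1-D data array. Illustrative sketch, not the mixtools API.
        rng = np.random.default_rng(seed)
        n = len(x)
        r = rng.dirichlet(np.ones(k), size=n)   # random initial responsibilities
        for _ in range(n_iter):
            # M-step: weighted MLEs from current responsibilities
            nk = r.sum(axis=0)
            lam = nk / n                                    # mixing proportions
            mu = (r * x[:, None]).sum(axis=0) / nk          # component means
            var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
            # E-step: posterior membership probabilities (log-space for stability)
            log_d = (np.log(lam) - 0.5 * np.log(2 * np.pi * var)
                     - (x[:, None] - mu) ** 2 / (2 * var))
            log_d -= log_d.max(axis=1, keepdims=True)
            r = np.exp(log_d)
            r /= r.sum(axis=1, keepdims=True)
        return lam, mu, var, r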

1,079 citations


Journal ArticleDOI
TL;DR: Experimental work demonstrates that the proposed mean shift/SIFT strategy improves the tracking performance of the classical mean shift and SIFT tracking algorithms in complicated real scenarios.

603 citations


Journal ArticleDOI
TL;DR: A generic on‐line version of the expectation–maximization (EM) algorithm applicable to latent variable models of independent observations that is suitable for conditional models, as illustrated in the case of the mixture of linear regressions model.
Abstract: In this contribution, we propose a generic online (also sometimes called adaptive or recursive) version of the Expectation-Maximisation (EM) algorithm applicable to latent variable models of independent observations. Compared to the algorithm of Titterington (1984), this approach is more directly connected to the usual EM algorithm and does not rely on integration with respect to the complete data distribution. The resulting algorithm is usually simpler and is shown to achieve convergence to the stationary points of the Kullback-Leibler divergence between the marginal distribution of the observation and the model distribution at the optimal rate, i.e., that of the maximum likelihood estimator. In addition, the proposed approach is also suitable for conditional (or regression) models, as illustrated in the case of the mixture of linear regressions model.
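The key idea admits a compact sketch: replace the full-data E-step with a stochastic approximation that blends each new observation's expected sufficient statistics into a running average with step size gamma, then map the statistics back to parameters. A hedged Python illustration for a univariate normal mixture (names are ours, not the paper's):

    import numpy as np

    def online_em_step(x_t, stats, params, gamma):
        # One online-EM update for a univariate normal mixture.
        # stats = running averages of per-component (E[1], E[x], E[x^2]);
        # initialize them from a short batch run so s0 sums to one.
        lam, mu, var = params
        s0, s1, s2 = stats
        # E-step for the single new observation x_t
        dens = lam * np.exp(-(x_t - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum()
        # stochastic approximation of the sufficient statistics
        s0 = (1 - gamma) * s0 + gamma * r
        s1 = (1 - gamma) * s1 + gamma * r * x_t
        s2 = (1 - gamma) * s2 + gamma * r * x_t ** 2
        # M-step: parameters as a deterministic function of the statistics
        lam, mu = s0, s1 / s0
        var = s2 / s0 - (s1 / s0) ** 2
        return (s0, s1, s2), (lam, mu, var)

A decaying step size such as gamma_t = t**(-0.6) satisfies the usual stochastic-approximation conditions; the paper obtains the maximum-likelihood rate with averaging on top of such a scheme.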

495 citations


Journal ArticleDOI
TL;DR: It is demonstrated how model selection in such probabilistic or generative modeling can facilitate analysis of closely related temporal data currently prevalent in biophysics, and how this technique can be applied to temporal data such as smFRET time series.

396 citations


Journal ArticleDOI
TL;DR: An easy and efficient sparse-representation-based iterative algorithm for image inpainting that allows a high degree of flexibility to recover different structural components in the image (piecewise smooth, curvilinear, texture, etc.).
Abstract: Representing the image to be inpainted in an appropriate sparse representation dictionary, and combining elements from Bayesian statistics and modern harmonic analysis, we introduce an expectation maximization (EM) algorithm for image inpainting and interpolation. From a statistical point of view, the inpainting/interpolation can be viewed as an estimation problem with missing data. Toward this goal, we propose the idea of using the EM mechanism in a Bayesian framework, where a sparsity promoting prior penalty is imposed on the reconstructed coefficients. The EM framework gives a principled way to establish formally the idea that missing samples can be recovered/interpolated based on sparse representations. We first introduce an easy and efficient sparse-representation-based iterative algorithm for image inpainting. Additionally, we derive its theoretical convergence properties. Compared to its competitors, this algorithm allows a high degree of flexibility to recover different structural components in the image (piecewise smooth, curvilinear, texture, etc.). We also suggest some guidelines to automatically tune the regularization parameter.
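A hedged sketch of the scheme's skeleton: the E-step re-imposes the observed samples, and the M-step applies sparsity-promoting shrinkage in the representation domain. For brevity we assume a single orthonormal 2-D DCT dictionary and a fixed threshold, whereas the paper allows richer dictionaries and a tuned regularization parameter:

    import numpy as np
    from scipy.fft import dctn, idctn

    def em_inpaint(y, mask, lam=0.1, n_iter=100):
        # y: image with valid values where mask is True; mask: observed pixels.
        x = np.where(mask, y, y[mask].mean())       # initial fill
        for _ in range(n_iter):
            x = np.where(mask, y, x)                # E-step: impute missing samples
            a = dctn(x, norm='ortho')               # analysis in the dictionary
            a = np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)  # soft threshold
            x = idctn(a, norm='ortho')              # M-step: synthesis
        return np.where(mask, y, x)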

350 citations


Journal ArticleDOI
TL;DR: An expectation-maximization algorithm is proposed to estimate the underlying presence-absence logistic model for presence-only data and it is shown that the population prevalence of a species is only identifiable when there is some unrealistic constraint on the structure of theLogistic model.
Abstract: In ecological modeling of the habitat of a species, it can be prohibitively expensive to determine species absence. Presence-only data consist of a sample of locations with observed presences and a separate group of locations sampled from the full landscape, with unknown presences. We propose an expectation–maximization algorithm to estimate the underlying presence–absence logistic model for presence-only data. This algorithm can be used with any off-the-shelf logistic model. For models with stepwise fitting procedures, such as boosted trees, the fitting process can be accelerated by interleaving expectation steps within the procedure. Preliminary analyses based on sampling from presence–absence records of fish in New Zealand rivers illustrate that this new procedure can reduce both deviance and the shrinkage of marginal effect estimates that occur in the naive model often used in practice. Finally, it is shown that the population prevalence of a species is only identifiable when there is some unrealistic constraint on the structure of the logistic model. In practice, it is strongly recommended that an estimate of population prevalence be provided.
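A hedged sketch of how such an EM scheme can be wired around an off-the-shelf logistic learner: each background point enters the fit twice, as a presence weighted by its current membership probability and as an absence with the complementary weight. This simplified version omits the case-control offset and prevalence constraint of the full method; pi is used only for initialization, reflecting the abstract's point that prevalence is not identifiable from the data alone:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def presence_only_em(X_pres, X_back, pi=0.5, n_iter=20):
        # X_pres: covariates at observed presences; X_back: background sample.
        clf = LogisticRegression()
        w = np.full(len(X_back), pi)                # initial membership guess
        for _ in range(n_iter):
            # M-step: weighted logistic fit with fractional background labels
            Xa = np.vstack([X_pres, X_back, X_back])
            ya = np.r_[np.ones(len(X_pres)), np.ones(len(X_back)),
                       np.zeros(len(X_back))]
            wa = np.r_[np.ones(len(X_pres)), w, 1 - w]
            clf.fit(Xa, ya, sample_weight=wa)
            # E-step: updated presence probability for each background point
            w = clf.predict_proba(X_back)[:, 1]
        return clf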

294 citations


Journal ArticleDOI
TL;DR: In this paper, a special gradient projection method is introduced that exploits effective scaling strategies and steplength updating rules, appropriately designed for improving the convergence rate, and the authors give convergence results for this scheme and evaluate its effectiveness by means of an extensive computational study on the minimization problems arising from the maximum likelihood approach to image deblurring.
Abstract: A class of scaled gradient projection methods for optimization problems with simple constraints is considered. These iterative algorithms can be useful in variational approaches to image deblurring that lead to the minimization of convex nonlinear functions subject to non-negativity constraints and, in some cases, to an additional flux conservation constraint. A special gradient projection method is introduced that exploits effective scaling strategies and steplength updating rules, appropriately designed for improving the convergence rate. We give convergence results for this scheme and we evaluate its effectiveness by means of an extensive computational study on the minimization problems arising from the maximum likelihood approach to image deblurring. Comparisons with the standard expectation maximization algorithm and with other iterative regularization schemes are also reported to show the computational gain provided by the proposed method.
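A bare-bones sketch of the gradient projection core, without the paper's scaling matrices: take a gradient step, project onto the non-negative orthant, and update the steplength with a Barzilai-Borwein rule of the kind the abstract alludes to. Names and defaults are illustrative:

    import numpy as np

    def projected_gradient(grad_f, x0, n_iter=500, alpha=1e-2):
        # Gradient projection for smooth convex f subject to x >= 0.
        x = np.maximum(x0, 0.0)
        g = grad_f(x)
        for _ in range(n_iter):
            x_new = np.maximum(x - alpha * g, 0.0)   # step + projection
            g_new = grad_f(x_new)
            s, y = x_new - x, g_new - g
            sy = s @ y
            if sy > 0:
                alpha = (s @ s) / sy                 # BB1 steplength update
            x, g = x_new, g_new
        return x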

252 citations


Journal ArticleDOI
TL;DR: A novel method for dictionary learning is presented that extends the learning problem by introducing different constraints on the dictionary; it uses the majorization method, an optimization technique that substitutes the original objective function with a surrogate function updated in each optimization step.
Abstract: In order to find sparse approximations of signals, an appropriate generative model for the signal class has to be known. If the model is unknown, it can be adapted using a set of training samples. This paper presents a novel method for dictionary learning and extends the learning problem by introducing different constraints on the dictionary. The convergence of the proposed method to a fixed point is guaranteed, unless the accumulation points form a continuum. This holds for different sparsity measures. The majorization method is an optimization method that substitutes the original objective function with a surrogate function that is updated in each optimization step. This method has been used successfully in sparse approximation and statistical estimation [ e.g., expectation-maximization (EM)] problems. This paper shows that the majorization method can be used for the dictionary learning problem too. The proposed method is compared with other methods on both synthetic and real data and different constraints on the dictionary are compared. Simulations show the advantages of the proposed method over other currently available dictionary learning methods not only in terms of average performance but also in terms of computation time.
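A hedged sketch of the majorization idea applied to dictionary learning: both the sparse-coding and dictionary subproblems are attacked with surrogate (Landweber-type) updates, and the unit-norm constraint on atoms is enforced by projection. This illustrates the general MM template, not the paper's exact update rules:

    import numpy as np

    def dict_learn_mm(Y, k, lam=0.1, n_outer=30, n_inner=20, seed=0):
        # Y: data matrix (signals as columns); k: number of atoms.
        rng = np.random.default_rng(seed)
        D = rng.standard_normal((Y.shape[0], k))
        D /= np.linalg.norm(D, axis=0)
        X = np.zeros((k, Y.shape[1]))
        for _ in range(n_outer):
            # sparse coding via a majorizing surrogate (ISTA-style updates)
            L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant
            for _ in range(n_inner):
                X = X + D.T @ (Y - D @ X) / L
                X = np.sign(X) * np.maximum(np.abs(X) - lam / L, 0.0)
            # dictionary update via the same surrogate idea, then project
            c = np.linalg.norm(X, 2) ** 2 + 1e-12
            D = D + (Y - D @ X) @ X.T / c
            D /= np.maximum(np.linalg.norm(D, axis=0), 1e-12)  # unit-norm atoms
        return D, X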

244 citations


Journal ArticleDOI
TL;DR: A new two-parameter distribution family with decreasing failure rate, arising by mixing a power-series distribution and an exponential distribution, is introduced; various properties of this family are discussed, and parameter estimates are obtained by the method of maximum likelihood.

203 citations


Journal ArticleDOI
TL;DR: This paper uses a methodology that enables estimators of ERG model parameters to be compared, and shows the superiority of the likelihood-based estimators over those based on pseudo-likelihood, with the bias-reduced pseudo-likelihood outperforming the general pseudo-likelihood.

196 citations


Journal ArticleDOI
TL;DR: A feasible EM algorithm is developed for finding the maximum likelihood estimates of parameters in this context and a general information-based method for obtaining the asymptotic covariance matrix of the maximum likelihood estimators is presented.

Journal ArticleDOI
TL;DR: In this article, the authors considered the estimation of the stress-strength parameter R = P(Y < X), when X and Y are independent and both follow three-parameter Weibull distributions with common shape and location parameters but different scale parameters.

Journal ArticleDOI
TL;DR: The semi-automatic segmentations obtained by the proposed method are within the variability of the manual segmentations of two experts, and the method is well suited to a semi-automatic context that requires minimal manual initialization.
Abstract: The goal of this work is to perform a segmentation of the intima-media thickness (IMT) of carotid arteries in view of computing various dynamical properties of that tissue, such as the elasticity distribution (elastogram). The echogenicity of a region of interest comprising the intima-media layers, the lumen, and the adventitia in an ultrasonic B-mode image is modeled by a mixture of three Nakagami distributions. In a first step, we compute the maximum a posteriori estimator of the proposed model, using the expectation maximization (EM) algorithm. We then compute the optimal segmentation based on the estimated distributions as well as a statistical prior for disease-free IMT using a variant of the exploration/selection (ES) algorithm. Convergence of the ES algorithm to the optimal solution is assured asymptotically and is independent of the initial solution. In particular, our method is well suited to a semi-automatic context that requires minimal manual initialization. Tests of the proposed method on 30 sequences of ultrasonic B-mode images of presumably disease-free control subjects are reported. They suggest that the semi-automatic segmentations obtained by the proposed method are within the variability of the manual segmentations of two experts.

Journal ArticleDOI
TL;DR: A class of group ICA models that can accommodate different group structures and include existing models, such as the GIFT and tensor PICA, as special cases are considered and a maximum likelihood (ML) approach with a modified Expectation-Maximization (EM) algorithm is proposed.

Journal ArticleDOI
TL;DR: An unsupervised approach for feature selection and extraction in mixtures of generalized Dirichlet (GD) distributions that is able to extract independent and non-Gaussian features without loss of accuracy is presented.
Abstract: This paper presents an unsupervised approach for feature selection and extraction in mixtures of generalized Dirichlet (GD) distributions. Our method defines a new mixture model that is able to extract independent and non-Gaussian features without loss of accuracy. The proposed model is learned using the expectation-maximization algorithm by minimizing the message length of the data set. Experimental results show the merits of the proposed methodology in the categorization of object images.

Journal ArticleDOI
TL;DR: An EM algorithm is tailored to operate on spectroscopic samples obtained with the Michigan-MIKE Fiber System as part of the Magellan survey of stellar radial velocities in nearby dwarf spheroidal (dSph) galaxies, and returns accurate parameter estimates much more reliably than conventional methods of contaminant removal.
Abstract: We develop an algorithm for estimating parameters of a distribution sampled with contamination. We employ a statistical technique known as expectation maximization (EM). Given models for both member and contaminant populations, the EM algorithm iteratively evaluates the membership probability of each discrete data point, then uses those probabilities to update parameter estimates for member and contaminant distributions. The EM approach has wide applicability to the analysis of astronomical data. Here we tailor an EM algorithm to operate on spectroscopic samples obtained with the Michigan-MIKE Fiber System (MMFS) as part of our Magellan survey of stellar radial velocities in nearby dwarf spheroidal (dSph) galaxies. These samples, to be presented in a companion paper, contain discrete measurements of line-of-sight velocity, projected position, and pseudo-equivalent width of the Mg-triplet feature, for ~1000-2500 stars per dSph, including some fraction of contamination by foreground Milky Way stars. The EM algorithm uses all of the available data to quantify dSph and contaminant distributions. For distributions (e.g., velocity and Mg-index of dSph stars) assumed to be Gaussian, the EM algorithm returns maximum-likelihood estimates of the mean and variance, as well as the probability that each star is a dSph member. These probabilities can serve as weights in subsequent analyses. Applied to our MMFS data, the EM algorithm identifies more than 5000 stars as probable dSph members. We test the performance of the EM algorithm on simulated data sets that represent a range of sample size, level of contamination, and amount of overlap between dSph and contaminant velocity distributions. The simulations establish that for samples ranging from large (N ~ 3000, characteristic of the MMFS samples) to small (N ~ 30, resembling new samples for extremely faint dSphs), the EM algorithm distinguishes members from contaminants and returns accurate parameter estimates much more reliably than conventional methods of contaminant removal (e.g., sigma clipping).
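The core of such a member/contaminant EM fits in a few lines when the member population is Gaussian and the contaminant density is supplied as a known function (a simplification; the paper also fits the contaminant model). Python, illustrative names:

    import numpy as np
    from scipy.stats import norm

    def member_contaminant_em(v, contam_pdf, n_iter=100):
        # v: line-of-sight velocities; contam_pdf: known foreground density.
        mu, sigma, f = np.median(v), np.std(v), 0.5   # f = member fraction
        for _ in range(n_iter):
            # E-step: probability each star belongs to the member population
            pm = f * norm.pdf(v, mu, sigma)
            pc = (1 - f) * contam_pdf(v)
            w = pm / (pm + pc)
            # M-step: weighted estimates of the member distribution
            f = w.mean()
            mu = np.sum(w * v) / np.sum(w)
            sigma = np.sqrt(np.sum(w * (v - mu) ** 2) / np.sum(w))
        return mu, sigma, f, w

The returned w are exactly the per-star membership probabilities the abstract describes as weights for subsequent analyses.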

Journal ArticleDOI
TL;DR: Using the generalized Bayesian theorem, an extension of Bayes' theorem in the belief function framework, a criterion generalizing the likelihood function is derived, demonstrating the ability of this approach to exploit partial information about class labels.

Journal ArticleDOI
01 Nov 2009-Test
TL;DR: In this paper, the authors considered the statistical inference of the unknown parameters of the generalized exponential distribution in the presence of progressive censoring and obtained maximum likelihood estimators of the unknown parameters using the EM algorithm.
Abstract: In this paper, we consider the statistical inference of the unknown parameters of the generalized exponential distribution in presence of progressive censoring. We obtain maximum likelihood estimators of the unknown parameters using EM algorithm. We also compute the expected Fisher information matrix using the missing value principle. We then use these values to determine the optimal progressive censoring plans. Different optimality criteria are considered, and selected optimal progressive censoring plans are presented. One example has been provided for illustrative purposes.

Posted Content
TL;DR: This paper investigates the use of the Metropolis-Hastings algorithm to compute a pseudo-posterior distribution based on the composite likelihood, and two methodologies for adjusting the algorithm are presented.
Abstract: Composite likelihoods are increasingly used in applications where the full likelihood is analytically unknown or computationally prohibitive. Although the maximum composite likelihood estimator has frequentist properties akin to those of the usual maximum likelihood estimator, Bayesian inference based on composite likelihoods has yet to be explored. In this paper we investigate the use of the Metropolis-Hastings algorithm to compute a pseudo-posterior distribution based on the composite likelihood. Two methodologies for adjusting the algorithm are presented and their performance on approximating the true posterior distribution is investigated using simulated data sets and real data on spatial extremes of rainfall.
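A minimal sketch of the unadjusted sampler: a random-walk Metropolis-Hastings chain whose target density is the prior times the exponentiated composite log-likelihood. The paper's two adjustment methodologies, which calibrate the spread of the pseudo-posterior, are omitted here; all names are illustrative:

    import numpy as np

    def mh_pseudo_posterior(comp_loglik, log_prior, theta0, n_samp=5000,
                            step=0.1, seed=0):
        # comp_loglik, log_prior: callables taking a 1-D parameter array.
        rng = np.random.default_rng(seed)
        theta = np.atleast_1d(np.asarray(theta0, dtype=float))
        lp = comp_loglik(theta) + log_prior(theta)
        out = np.empty((n_samp, theta.size))
        for i in range(n_samp):
            prop = theta + step * rng.standard_normal(theta.size)
            lp_prop = comp_loglik(prop) + log_prior(prop)
            if np.log(rng.uniform()) < lp_prop - lp:   # accept/reject
                theta, lp = prop, lp_prop
            out[i] = theta
        return out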

Posted Content
TL;DR: This paper proposes a new criterion for the Stochastic Block Model, called Integrated Likelihood Variational Bayes (ILvb), based on a non-asymptotic approximation of the marginal likelihood, and describes how the criterion can be computed through a variational Bayes EM algorithm.
Abstract: It is now widely accepted that knowledge can be acquired from networks by clustering their vertices according to connection profiles. Many methods have been proposed and in this paper we concentrate on the Stochastic Block Model (SBM). The clustering of vertices and the estimation of SBM model parameters have been subject to previous work and numerous inference strategies such as variational Expectation Maximization (EM) and classification EM have been proposed. However, SBM still suffers from a lack of criteria to estimate the number of components in the mixture. To our knowledge, only one model-based criterion, ICL, has been derived for SBM in the literature. It relies on an asymptotic approximation of the Integrated Complete-data Likelihood and recent studies have shown that it tends to be too conservative in the case of small networks. To tackle this issue, we propose a new criterion that we call ILvb, based on a non-asymptotic approximation of the marginal likelihood. We describe how the criterion can be computed through a variational Bayes EM algorithm.

Journal ArticleDOI
TL;DR: In this article, a new method for estimating the parameters of multi-channel pilot models that is based on maximum likelihood estimation is presented, which significantly increases the probability of finding the global optimum of the optimization problem.
Abstract: This paper presents a new method for estimating the parameters of multi-channel pilot models that is based on maximum likelihood estimation. To cope with the inherent nonlinearity of this optimization problem, the gradient-based Gauss-Newton algorithm commonly used to optimize the likelihood function in terms of output error is complemented with a genetic algorithm. This significantly increases the probability of finding the global optimum of the optimization problem. The genetic maximum likelihood method is successfully applied to data from a recent human-in-the-loop experiment. Accurate estimates of the pilot model parameters and the remnant characteristics were obtained. Multiple simulations with increasing levels of pilot remnant were performed, using the set of parameters found from the experimental data, to investigate how the accuracy of the parameter estimate is affected by increasing remnant. It is shown that only for very high levels of pilot remnant is the bias in the parameter estimates substantial. Some adjustments to the maximum likelihood method are proposed to reduce this bias.

Journal ArticleDOI
TL;DR: This work proposes an EM algorithm for computing the maximum likelihood and restricted maximum likelihood for linear and nonlinear mixed effects models with censored response, which uses closed-form expressions at the E-step, as opposed to Monte Carlo simulation.
Abstract: We propose an EM algorithm for computing the maximum likelihood and restricted maximum likelihood for linear and nonlinear mixed effects models with censored response. In contrast with previous developments, this algorithm uses closed-form expressions at the E-step, as opposed to Monte Carlo simulation. These expressions rely on formulas for the mean and variance of a truncated multinormal distribution, and can be computed using available software. This leads to an improvement in the speed of computation of up to an order of magnitude. A wide class of mixed effects models is considered, including the Laird–Ware model, and extensions to different structures for the variance components, heteroscedastic and autocorrelated errors, and multilevel models. We apply the methodology to two case studies from our own biostatistical practice, involving the analysis of longitudinal HIV viral load in two recent AIDS studies. The proposed algorithm is implemented in the R package lmec. An appendix which includes further...
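The closed-form E-step ingredients referred to above are the moments of a truncated normal distribution. A sketch of the univariate case (the algorithm itself needs the multivariate analogue, which is available in standard software):

    import numpy as np
    from scipy.stats import norm

    def trunc_normal_moments(mu, sigma, lo, hi):
        # Mean and variance of N(mu, sigma^2) truncated to [lo, hi].
        a, b = (lo - mu) / sigma, (hi - mu) / sigma
        Z = norm.cdf(b) - norm.cdf(a)                 # mass in the interval
        phi_a, phi_b = norm.pdf(a), norm.pdf(b)
        mean = mu + sigma * (phi_a - phi_b) / Z
        var = sigma ** 2 * (1 + (a * phi_a - b * phi_b) / Z
                            - ((phi_a - phi_b) / Z) ** 2)
        return mean, var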


Journal ArticleDOI
TL;DR: In this article, an extension of the dynamic logit model is proposed for multivariate categorical longitudinal data, which is based on a marginal parameterization of the conditional distribution of each vector of response variables given the covariates, the lagged response variables, and a set of subject-specific parameters for the unobserved heterogeneity.
Abstract: For the analysis of multivariate categorical longitudinal data, we propose an extension of the dynamic logit model. The resulting model is based on a marginal parameterization of the conditional distribution of each vector of response variables given the covariates, the lagged response variables, and a set of subject-specific parameters for the unobserved heterogeneity. The latter are assumed to follow a first-order Markov chain. For the maximum likelihood estimation of the model parameters, we outline an EM algorithm. The data analysis approach based on the proposed model is illustrated by a simulation study and an application to a dataset, which derives from the Panel Study on Income Dynamics and concerns fertility and female participation in the labor market.

01 Jan 2009
TL;DR: In this paper, a model-based clustering using a family of Gaussian mixture models, with parsimonious factor analysis-like covariance structure, is described and an efficient algorithm for its implementation is presented.
Abstract: Model-based clustering using a family of Gaussian mixture models, with parsimonious factor analysis-like covariance structure, is described and an efficient algorithm for its implementation is presented. This algorithm uses the alternating expectation-conditional maximization (AECM) variant of the expectation-maximization (EM) algorithm. Two central issues around the implementation of this family of models, namely model selection and convergence criteria, are discussed. These central issues also have implications for other model-based clustering techniques and for the implementation of techniques like the EM algorithm, in general. The Bayesian information criterion (BIC) is used for model selection and Aitken's acceleration, which is shown to outperform the lack of progress criterion, is used to determine convergence. A brief introduction to parallel computing is then given before the implementation of this algorithm in parallel is facilitated within the master-slave paradigm. A simulation study is then carried out to confirm the effectiveness of this parallelization. The resulting software is applied to two data sets to demonstrate its effectiveness when compared to existing software.
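The Aitken-acceleration stopping rule mentioned above estimates the limiting log-likelihood from the last three iterates and stops when the current value is close to that estimate. A minimal sketch, assuming an increasing log-likelihood sequence with acceleration factor below one:

    def aitken_converged(loglik, tol=1e-8):
        # loglik: list of log-likelihood values from successive EM iterations.
        l0, l1, l2 = loglik[-3:]
        a = (l2 - l1) / (l1 - l0)            # Aitken acceleration factor
        l_inf = l1 + (l2 - l1) / (1 - a)     # estimated limiting log-likelihood
        return abs(l_inf - l2) < tol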

Journal ArticleDOI
TL;DR: An algorithm for nonparametric estimation for finite mixtures of multivariate random vectors that strongly resembles a true EM algorithm, which is much more flexible and easily applicable than existing algorithms in the literature and yields much smaller mean integrated squared errors than an alternative algorithm in a simulation study.
Abstract: We propose an algorithm for nonparametric estimation for finite mixtures of multivariate random vectors that strongly resembles a true EM algorithm. The vectors are assumed to have independent coordinates conditional upon knowing from which mixture component they come, but otherwise their density functions are completely unspecified. Sometimes, the density functions may be partially specified by Euclidean parameters, a case we call semiparametric. Our algorithm is much more flexible and easily applicable than existing algorithms in the literature; it can be extended to any number of mixture components and any number of vector coordinates of the multivariate observations. Thus it may be applied even in situations where the model is not identifiable, so care is called for when using it in situations for which identifiability is difficult to establish conclusively. Our algorithm yields much smaller mean integrated squared errors than an alternative algorithm in a simulation study. In another example using a ...

Journal ArticleDOI
TL;DR: An expectation-maximization algorithm is presented for the maximum likelihood estimation of the model parameters in the presence of missing data and a contribution analysis method is proposed to identify which variables contribute the most to the occurrence of outliers, providing valuable information regarding the source of outlying data.

Proceedings ArticleDOI
04 Dec 2009
TL;DR: A new probabilistic model for polyphonic audio termed factorial scaled hidden Markov model (FS-HMM), which generalizes several existing models, notably the Gaussian scaled mixture model and the Itakura-Saito nonnegative matrix factorization (NMF) model is presented.
Abstract: We present a new probabilistic model for polyphonic audio termed Factorial Scaled Hidden Markov Model (FS-HMM), which generalizes several existing models, notably the Gaussian scaled mixture model and the Itakura-Saito Nonnegative Matrix Factorization (NMF) model. We describe two expectation-maximization (EM) algorithms for maximum likelihood estimation, which differ by the choice of complete data set. The second EM algorithm, based on a reduced complete data set and multiplicative updates inspired from NMF methodology, exhibits much faster convergence. We consider the FS-HMM in different configurations for the difficult problem of speech / music separation from a single channel and report satisfying results.

Journal ArticleDOI
TL;DR: This paper generalizes the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation-Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual d-dimensional uncertainty covariance and has unique missing data properties.
Abstract: We generalize the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation-Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual d-dimensional uncertainty covariance and has unique missing data properties. This algorithm reconstructs the error-deconvolved or "underlying" distribution function common to all samples, even when the individual data points are samples from different distributions, obtained by convolving the underlying distribution with the heteroskedastic uncertainty distribution of the data point and projecting out the missing data directions. We show how this basic algorithm can be extended with conjugate priors on all of the model parameters and a "split-and-merge" procedure designed to avoid local maxima of the likelihood. We demonstrate the full method by applying it to the problem of inferring the three-dimensional velocity distribution of stars near the Sun from noisy two-dimensional, transverse velocity measurements from the Hipparcos satellite.
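For the single-Gaussian special case the algorithm reduces to a clean EM in which each point's posterior uses its own noise covariance; this sketch conveys that structure (the paper's full method handles mixtures, missing-data projections, priors, and split-and-merge moves). Names are ours:

    import numpy as np

    def xd_single_gaussian(W, S, n_iter=100):
        # W: (n, d) noisy observations; S: (n, d, d) per-point noise covariances.
        # Recovers the underlying N(m, V) with W[i] = v_i + noise_i.
        n, d = W.shape
        m, V = W.mean(axis=0), np.cov(W.T)
        for _ in range(n_iter):
            b = np.empty_like(W)
            B = np.zeros((n, d, d))
            for i in range(n):
                T_inv = np.linalg.inv(V + S[i])
                # E-step: posterior mean/covariance of the true value v_i
                b[i] = m + V @ T_inv @ (W[i] - m)
                B[i] = V - V @ T_inv @ V
            # M-step: update the underlying Gaussian
            m = b.mean(axis=0)
            diff = b - m
            V = np.einsum('ni,nj->ij', diff, diff) / n + B.mean(axis=0)
        return m, V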

Journal ArticleDOI
TL;DR: This paper introduces a novel hidden Markov model where the hidden state distributions are considered to be finite mixtures of multivariate Student's t-densities, and derives an algorithm for the model parameters estimation under a maximum likelihood framework, assuming full, diagonal, and factor-analyzed covariance matrices.
Abstract: Hidden Markov (chain) models using finite Gaussian mixture models as their hidden state distributions have been successfully applied in sequential data modeling and classification applications. Nevertheless, Gaussian mixture models are well known to be highly intolerant to the presence of untypical data within the fitting data sets used for their estimation. Finite Student's t-mixture models have recently emerged as a heavier-tailed, robust alternative to Gaussian mixture models, overcoming these hurdles. To exploit these merits of Student's t-mixture models in the context of a sequential data modeling setting, we introduce, in this paper, a novel hidden Markov model where the hidden state distributions are considered to be finite mixtures of multivariate Student's t-densities. We derive an algorithm for the model parameters estimation under a maximum likelihood framework, assuming full, diagonal, and factor-analyzed covariance matrices. The advantages of the proposed model over conventional approaches are experimentally demonstrated through a series of sequential data modeling applications.