
Showing papers on "Expectation–maximization algorithm published in 2014"




Journal ArticleDOI
TL;DR: This paper proposes an efficient algorithm, called vector field consensus, for establishing robust point correspondences between two sets of points, and suggests a two-stage strategy in which the nonparametric model is first used to reduce the size of the putative set and a parametric variant of the approach is then applied to estimate the geometric parameters.
Abstract: In this paper, we propose an efficient algorithm, called vector field consensus, for establishing robust point correspondences between two sets of points. Our algorithm starts by creating a set of putative correspondences which can contain a very large number of false correspondences, or outliers, in addition to a limited number of true correspondences (inliers). Next, we solve for correspondence by interpolating a vector field between the two point sets, which involves estimating a consensus of inlier points whose matching follows a nonparametric geometrical constraint. We formulate this as maximum a posteriori (MAP) estimation of a Bayesian model with hidden/latent variables indicating whether matches in the putative set are outliers or inliers. We impose nonparametric geometrical constraints on the correspondence, as a prior distribution, using Tikhonov regularizers in a reproducing kernel Hilbert space. MAP estimation is performed by the EM algorithm, which, by also estimating the variance of the prior model (initialized to a large value), is able to obtain good estimates very quickly (e.g., avoiding many of the local minima inherent in this formulation). We illustrate this method on data sets in 2D and 3D and demonstrate that it is robust to a very large number of outliers (even up to 90%). We also show that, in the special case where there is an underlying parametric geometrical model (e.g., the epipolar line constraint), we obtain better results than standard alternatives like RANSAC when a large number of outliers are present. This suggests a two-stage strategy, where we use our nonparametric model to reduce the size of the putative set and then apply a parametric variant of our approach to estimate the geometric parameters. Our algorithm is computationally efficient and we provide code for others to use it. In addition, our approach is general and can be applied to other problems, such as learning with a badly corrupted training data set.
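
Below is a minimal sketch of the inlier/outlier EM idea described above, assuming a Gaussian residual model for inliers, a uniform outlier component, and a Tikhonov-regularized (kernel ridge) estimate of the vector field; the function names, kernel, and parameter values are illustrative and not the authors' released code.

```python
import numpy as np

def rbf_kernel(X, Y, beta=0.1):
    # Gaussian kernel between two point sets
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-beta * d2)

def vfc_em_sketch(x, y, gamma=0.9, sigma2=None, lam=3.0, a=10.0, iters=50, beta=0.1):
    """Toy EM for inlier/outlier consensus on putative matches (x_i -> y_i).

    x, y  : (N, D) arrays of matched points; the 'field' maps x to y.
    gamma : initial inlier fraction; sigma2: residual variance (initialized large);
    lam   : Tikhonov regularization weight; a: volume term of the uniform outlier density.
    """
    N, D = x.shape
    K = rbf_kernel(x, x, beta)
    f = np.zeros_like(y)                           # current field evaluated at x
    if sigma2 is None:
        sigma2 = ((y - x) ** 2).sum() / (N * D)    # start with a large variance
    for _ in range(iters):
        # E-step: posterior probability that each match is an inlier
        r2 = ((y - f) ** 2).sum(axis=1)
        p_in = gamma * np.exp(-r2 / (2 * sigma2)) / (2 * np.pi * sigma2) ** (D / 2)
        p_out = (1 - gamma) / a
        P = p_in / (p_in + p_out)
        # M-step: weighted kernel ridge regression for the field coefficients C
        W = np.diag(P)
        C = np.linalg.solve(W @ K + lam * sigma2 * np.eye(N), W @ y)
        f = K @ C
        # M-step: update noise variance and inlier fraction
        sigma2 = (P * ((y - f) ** 2).sum(axis=1)).sum() / (D * P.sum() + 1e-12)
        gamma = P.mean()
    return P, f, gamma

# Usage: matches with posterior above 0.5 are kept as inliers.
# P, f, gamma = vfc_em_sketch(x_putative, y_putative); inliers = P > 0.5
```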

489 citations


01 Jan 2014
TL;DR: Maximum likelihood estimation is illustrated by replicating Daniel Treisman's (2016) paper, Russia's Billionaires, which connects the number of billionaires in a country to its economic characteristics and concludes that Russia has a higher number of billionaires than economic factors such as market size and tax rate predict.
Abstract: In a previous lecture, we estimated the relationship between dependent and explanatory variables using linear regression. But what if a linear relationship is not an appropriate assumption for our model? One widely used alternative is maximum likelihood estimation, which involves specifying a class of distributions, indexed by unknown parameters, and then using the data to pin down these parameter values. The benefit relative to linear regression is that it allows more flexibility in the probabilistic relationships between variables. Here we illustrate maximum likelihood by replicating Daniel Treisman’s (2016) paper, Russia’s Billionaires, which connects the number of billionaires in a country to its economic characteristics. The paper concludes that Russia has a higher number of billionaires than economic factors such as market size and tax rate predict.
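
The maximum likelihood mechanics described above can be illustrated with a short numerical example; the sketch below fits a Poisson regression (a plausible model for count outcomes such as billionaire counts) by minimizing the negative log-likelihood, using simulated placeholder data rather than Treisman's dataset.

```python
import numpy as np
from scipy.optimize import minimize

def poisson_negloglik(beta, X, y):
    """Negative log-likelihood of a Poisson regression with log link."""
    mu = np.exp(X @ beta)                    # conditional mean for each observation
    return -(y * np.log(mu) - mu).sum()      # constants (log y!) dropped

# Placeholder data: an intercept plus two economic covariates, and a count outcome.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
true_beta = np.array([0.5, 0.8, -0.3])
y = rng.poisson(np.exp(X @ true_beta))

# Maximize the likelihood by minimizing its negative.
fit = minimize(poisson_negloglik, x0=np.zeros(3), args=(X, y), method="BFGS")
print("MLE:", fit.x)   # close to true_beta for this simulated sample
```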

464 citations


Journal ArticleDOI
TL;DR: The expectation maximization algorithm is modified in order to estimate the parameters of the dynamic factor model on a dataset with an arbitrary pattern of missing data and the model is extended to the case with a serially correlated idiosyncratic component.
Abstract: In this paper we modify the expectation maximization algorithm in order to estimate the parameters of the dynamic factor model on a dataset with an arbitrary pattern of missing data. We also extend the model to the case with a serially correlated idiosyncratic component. The framework allows us to handle efficiently and in an automatic manner sets of indicators characterized by different publication delays, frequencies and sample lengths. This can be relevant, for example, for young economies for which many indicators have been compiled only recently. We evaluate the methodology in a Monte Carlo experiment and we apply it to nowcasting of the euro area gross domestic product.

330 citations


Posted Content
TL;DR: In this article, a two-stage efficient algorithm for multi-class crowd labeling problems is proposed, where the first stage uses the spectral method to obtain an initial estimate of parameters, and the second stage refines the estimation by optimizing the objective function of the Dawid-Skene estimator via the EM algorithm.
Abstract: Crowdsourcing is a popular paradigm for effectively collecting labels at low cost. The Dawid-Skene estimator has been widely used for inferring the true labels from the noisy labels provided by non-expert crowdsourcing workers. However, since the estimator maximizes a non-convex log-likelihood function, it is hard to theoretically justify its performance. In this paper, we propose a two-stage efficient algorithm for multi-class crowd labeling problems. The first stage uses the spectral method to obtain an initial estimate of parameters. Then the second stage refines the estimation by optimizing the objective function of the Dawid-Skene estimator via the EM algorithm. We show that our algorithm achieves the optimal convergence rate up to a logarithmic factor. We conduct extensive experiments on synthetic and real datasets. Experimental results demonstrate that the proposed algorithm is comparable to the most accurate empirical approach, while outperforming several other recently proposed methods.
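
A minimal sketch of the EM stage described above, assuming a soft majority-vote initialization in place of the paper's spectral initialization; the array layout and variable names are illustrative.

```python
import numpy as np

def dawid_skene_em(L, n_classes, iters=50):
    """EM for the Dawid-Skene model.

    L : (n_items, n_workers) integer label matrix, -1 where a worker gave no label.
    Returns posterior class probabilities q (n_items, n_classes) and
    per-worker confusion matrices pi (n_workers, n_classes, n_classes).
    """
    n_items, n_workers = L.shape
    # Initialization: soft majority vote (the paper uses a spectral method instead).
    q = np.zeros((n_items, n_classes))
    for k in range(n_classes):
        q[:, k] = (L == k).sum(axis=1)
    q = (q + 1e-6) / (q + 1e-6).sum(axis=1, keepdims=True)

    for _ in range(iters):
        # M-step: class priors and worker confusion matrices from current posteriors.
        rho = q.mean(axis=0)
        pi = np.full((n_workers, n_classes, n_classes), 1e-6)
        for j in range(n_workers):
            for k in range(n_classes):
                mask = L[:, j] == k
                pi[j, :, k] += q[mask].sum(axis=0)
        pi /= pi.sum(axis=2, keepdims=True)
        # E-step: posterior over the true label of each item.
        log_q = np.tile(np.log(rho), (n_items, 1))
        for j in range(n_workers):
            labeled = L[:, j] >= 0
            log_q[labeled] += np.log(pi[j, :, L[labeled, j]])
        q = np.exp(log_q - log_q.max(axis=1, keepdims=True))
        q /= q.sum(axis=1, keepdims=True)
    return q, pi
```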

272 citations


Journal ArticleDOI
TL;DR: The first model-based clustering algorithm for multivariate functional data is proposed; it is based on the assumption of normality of the principal component scores and is able to take into account the dependence among curves.

239 citations


Journal ArticleDOI
TL;DR: Comparisons are presented to illustrate the relative performance of the restricted and unrestricted models, and demonstrate the usefulness of the recently proposed methodology for the unrestricted MST mixture, by some applications to three real datasets.
Abstract: Finite mixtures of multivariate skew t (MST) distributions have proven to be useful in modelling heterogeneous data with asymmetric and heavy tail behaviour. Recently, they have been exploited as an effective tool for modelling flow cytometric data. A number of algorithms for the computation of the maximum likelihood (ML) estimates for the model parameters of mixtures of MST distributions have been put forward in recent years. These implementations use various characterizations of the MST distribution, which are similar but not identical. While exact implementation of the expectation-maximization (EM) algorithm can be achieved for `restricted' characterizations of the component skew t-distributions, Monte Carlo (MC) methods have been used to fit the `unrestricted' models. In this paper, we review several recent fitting algorithms for finite mixtures of multivariate skew t-distributions, at the same time clarifying some of the connections between the various existing proposals. In particular, recent results have shown that the EM algorithm can be implemented exactly for faster computation of ML estimates for mixtures with unrestricted MST components. The gain in computational time is effected by noting that the semi-infinite integrals on the E-step of the EM algorithm can be put in the form of moments of the truncated multivariate non-central t-distribution, similar to the restricted case, which subsequently can be expressed in terms of the non-truncated form of the central t-distribution function for which fast algorithms are available. We present comparisons to illustrate the relative performance of the restricted and unrestricted models, and demonstrate the usefulness of the recently proposed methodology for the unrestricted MST mixture, by some applications to three real datasets.

233 citations


Journal ArticleDOI
23 Jan 2014-Energies
TL;DR: In this paper, the authors proposed a novel RUL prediction method for lithium-ion batteries based on the Wiener process with measurement error (WPME), which used the truncated normal distribution (TND) based modeling approach for the estimated degradation state and obtained an exact and closed-form RUL distribution by simultaneously considering the measurement uncertainty and the distribution of the estimated drift parameter.
Abstract: Remaining useful life (RUL) prediction is central to the prognostics and health management (PHM) of lithium-ion batteries. This paper proposes a novel RUL prediction method for lithium-ion batteries based on the Wiener process with measurement error (WPME). First, we use the truncated normal distribution (TND) based modeling approach for the estimated degradation state and obtain an exact and closed-form RUL distribution by simultaneously considering the measurement uncertainty and the distribution of the estimated drift parameter. Then, the traditional maximum likelihood estimation (MLE) method for population-based parameter estimation is modified to improve the estimation efficiency. Additionally, we analyze the relationship between the classic MLE method and the combination of the Bayesian updating algorithm and the expectation maximization algorithm for real-time RUL prediction. Interestingly, it is found that the result of the combination algorithm is equal to that of the classic MLE method. Inspired by this observation, a heuristic algorithm for real-time parameter updating is presented. Finally, numerical examples and a case study of lithium-ion batteries are provided to substantiate the superiority of the proposed RUL prediction method.
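
For intuition about the Wiener-process ingredient, the sketch below shows a basic maximum likelihood fit of the drift and diffusion parameters of X(t) = mu*t + sigma*B(t) from equally spaced measurements of a single path; it deliberately omits the paper's measurement-error and truncated-normal refinements, and the data are simulated placeholders.

```python
import numpy as np

def wiener_mle(x, dt):
    """MLE of drift mu and diffusion sigma^2 for X(t) = mu*t + sigma*B(t),
    from one degradation path x sampled at a fixed interval dt
    (independent Gaussian increments with mean mu*dt and variance sigma^2*dt)."""
    dx = np.diff(x)
    mu_hat = dx.mean() / dt
    sigma2_hat = ((dx - mu_hat * dt) ** 2).mean() / dt
    return mu_hat, sigma2_hat

# Placeholder capacity-fade path for a single cell.
rng = np.random.default_rng(1)
dt, mu, sigma = 1.0, 0.02, 0.05
x = np.cumsum(mu * dt + sigma * np.sqrt(dt) * rng.normal(size=500))
print(wiener_mle(x, dt))   # approaches (0.02, 0.0025) as the path grows

# A crude first-passage RUL estimate to a failure threshold w, ignoring estimation
# and measurement uncertainty, is then (w - x[-1]) / mu_hat.
```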

202 citations


Journal ArticleDOI
TL;DR: This paper provides a primer on maximum likelihood and some important extensions which have proven useful in epidemiologic research, and which reveal connections between maximum likelihood and Bayesian methods.
Abstract: The method of maximum likelihood is widely used in epidemiology, yet many epidemiologists receive little or no education in the conceptual underpinnings of the approach. Here we provide a primer on maximum likelihood and some important extensions which have proven useful in epidemiologic research, and which reveal connections between maximum likelihood and Bayesian methods. For a given data set and probability model, maximum likelihood finds values of the model parameters that give the observed data the highest probability. As with all inferential statistical methods, maximum likelihood is based on an assumed model and cannot account for bias sources that are not controlled by the model or the study design. Maximum likelihood is nonetheless popular, because it is computationally straightforward and intuitive and because maximum likelihood estimators have desirable large-sample properties in the (largely fictitious) case in which the model has been correctly specified. Here, we work through an example to illustrate the mechanics of maximum likelihood estimation and indicate how improvements can be made easily with commercial software. We then describe recent extensions and generalizations which are better suited to observational health research and which should arguably replace standard maximum likelihood as the default method.
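
As a concrete illustration of choosing the parameter value that gives the observed data the highest probability, here is a minimal sketch comparing the closed-form binomial MLE with a numerical maximization of the same likelihood; the counts are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical data: 35 cases observed among 200 exposed individuals.
k, n = 35, 200

def negloglik(p):
    # Binomial log-likelihood up to a constant: k*log(p) + (n-k)*log(1-p)
    return -(k * np.log(p) + (n - k) * np.log(1 - p))

numeric = minimize_scalar(negloglik, bounds=(1e-6, 1 - 1e-6), method="bounded")
print("closed form:", k / n)          # 0.175
print("numeric MLE:", numeric.x)      # ~0.175, the value giving the data highest probability
```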

154 citations


Journal ArticleDOI
TL;DR: This work marks an important step in the non-Gaussian model-based clustering and classification direction, and a variant of the EM algorithm is developed for parameter estimation by exploiting the relationship with the generalized inverse Gaussian distribution.
Abstract: A mixture of shifted asymmetric Laplace distributions is introduced and used for clustering and classification. A variant of the EM algorithm is developed for parameter estimation by exploiting the relationship with the generalized inverse Gaussian distribution. This approach is mathematically elegant and relatively computationally straightforward. Our novel mixture modelling approach is demonstrated on both simulated and real data to illustrate clustering and classification applications. In these analyses, our mixture of shifted asymmetric Laplace distributions performs favourably when compared to the popular Gaussian approach. This work, which marks an important step in the non-Gaussian model-based clustering and classification direction, concludes with discussion as well as suggestions for future work.

151 citations


Journal ArticleDOI
TL;DR: In this paper, the authors identify an unknown scale parameter ηf that is critical to the identification for consistency and propose a three-step quasi-maximum likelihood procedure with non-Gaussian likelihood functions.
Abstract: The non-Gaussian maximum likelihood estimator is frequently used in GARCH models with the intention of capturing heavy-tailed returns. However, unless the parametric likelihood family contains the true likelihood, the estimator is inconsistent due to density misspecification. To correct this bias, we identify an unknown scale parameter ηf that is critical to the identification for consistency and propose a three-step quasi-maximum likelihood procedure with non-Gaussian likelihood functions. This novel approach is consistent and asymptotically normal under weak moment conditions. Moreover, it achieves better efficiency than the Gaussian alternative, particularly when the innovation error has heavy tails. We also summarize and compare the values of the scale parameter and the asymptotic efficiency for estimators based on different choices of likelihood functions with an increasing level of heaviness in the innovation tails. Numerical studies confirm the advantages of the proposed approach.

Journal ArticleDOI
TL;DR: The logistic regression analysis is applied to EM clusters and the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
Abstract: Clustering is an important means of data mining based on separating data categories by similar features. Unlike classification algorithms, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Logistic regression extends linear regression analysis to category-type dependent variables, using a linear combination of independent variables to predict the probability that an event occurs. However, classifying all data by means of logistic regression analysis alone cannot guarantee the accuracy of the results. In this paper, logistic regression analysis is applied to EM clusters and to K-means clusters for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
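
A minimal sketch of the pipeline described above: cluster the samples with an EM-fitted Gaussian mixture and with K-means, then fit a logistic regression within each cluster; the feature matrix here is a simulated placeholder rather than the red-wine dataset, and the evaluation rule is illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Placeholder feature matrix X (stand-in for physico-chemical measurements)
# and binary quality label y; substitute the real red-wine dataset here.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=600) > 0).astype(int)

for name, clusterer in [("EM (Gaussian mixture)", GaussianMixture(n_components=3, random_state=0)),
                        ("K-means", KMeans(n_clusters=3, n_init=10, random_state=0))]:
    labels = clusterer.fit_predict(X)        # unsupervised cluster assignments
    accs = []
    for c in np.unique(labels):
        idx = labels == c
        if np.bincount(y[idx], minlength=2).min() < 3:   # need both classes for 3-fold CV
            continue
        # Fit a separate logistic regression inside each cluster.
        accs.append(cross_val_score(LogisticRegression(max_iter=1000),
                                    X[idx], y[idx], cv=3).mean())
    print(name, "mean within-cluster accuracy:", np.round(np.mean(accs), 3))
```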

Journal ArticleDOI
TL;DR: An overview of the theory of 4D image reconstruction for emission tomography is given, and maximum likelihood or maximum a posteriori (MAP) estimation of either linear or non-linear model parameters can be achieved in image space after carrying out a conventional expectation maximization update of the dynamic image series, using a Kullback-Leibler distance metric.
Abstract: An overview of the theory of 4D image reconstruction for emission tomography is given along with a review of the current state of the art, covering both positron emission tomography and single photon emission computed tomography (SPECT). By viewing 4D image reconstruction as a matter of either linear or non-linear parameter estimation for a set of spatiotemporal functions chosen to approximately represent the radiotracer distribution, the areas of so-called 'fully 4D' image reconstruction and 'direct kinetic parameter estimation' are unified within a common framework. Many choices of linear and non-linear parameterization of these functions are considered (including the important case where the parameters have direct biological meaning), along with a review of the algorithms which are able to estimate these often non-linear parameters from emission tomography data. The other crucial components to image reconstruction (the objective function, the system model and the raw data format) are also covered, but in less detail due to the relatively straightforward extension from their corresponding components in conventional 3D image reconstruction. The key unifying concept is that maximum likelihood or maximum a posteriori (MAP) estimation of either linear or non-linear model parameters can be achieved in image space after carrying out a conventional expectation maximization (EM) update of the dynamic image series, using a Kullback-Leibler distance metric (comparing the modeled image values with the EM image values), to optimize the desired parameters. For MAP, an image-space penalty for regularization purposes is required. The benefits of 4D and direct reconstruction reported in the literature are reviewed, and furthermore demonstrated with simple simulation examples. It is clear that the future of reconstructing dynamic or functional emission tomography images, which often exhibit high levels of spatially correlated noise, should ideally exploit these 4D approaches.

Journal ArticleDOI
TL;DR: Parsimonious skew-t and skew-normal analogues of the GPCM family that employ an eigenvalue decomposition of a scale matrix are introduced and are compared to existing models in both unsupervised and semi-supervised classification frameworks.

Journal ArticleDOI
TL;DR: A family of multivariate heavy-tailed distributions is proposed that allows variable marginal amounts of tailweight, can account for a variety of shapes, and has a simple tractable form with a closed-form probability density function whatever the dimension.
Abstract: We propose a family of multivariate heavy-tailed distributions that allow variable marginal amounts of tailweight. The originality comes from introducing multidimensional instead of univariate scale variables for the mixture of scaled Gaussian family of distributions. In contrast to most existing approaches, the derived distributions can account for a variety of shapes and have a simple tractable form with a closed-form probability density function whatever the dimension. We examine a number of properties of these distributions and illustrate them in the particular case of Pearson type VII and t tails. For these latter cases, we provide maximum likelihood estimation of the parameters and illustrate their modelling flexibility on simulated and real data clustering examples.

Journal ArticleDOI
TL;DR: This paper addresses the problem of in-network distributed estimation for sparse vectors and develops several distributed sparse recursive least-squares (RLS) algorithms based on the maximum likelihood framework, with the expectation-maximization algorithm used to numerically solve the sparse estimation problem.
Abstract: Distributed estimation over networks has received much attention in recent years due to its broad applicability. Many signals in nature present high level of sparsity, which contain only a few large coefficients among many negligible ones. In this paper, we address the problem of in-network distributed estimation for sparse vectors, and develop several distributed sparse recursive least-squares (RLS) algorithms. The proposed algorithms are based on the maximum likelihood framework, and the expectation-maximization algorithm, with the aid of thresholding operators, is used to numerically solve the sparse estimation problem. To improve the estimation performance, the thresholding operators related to l0- and l1-norms with real-time self-adjustable thresholds are derived. With these thresholding operators, we can exploit the underlying sparsity to implement the distributed estimation with low computational complexity and information exchange amount among neighbors. The sparsity-promoting intensity is also adaptively adjusted so that a good performance of the sparse solution can be achieved. Both theoretical analysis and numerical simulations are presented to show the effectiveness of the proposed algorithms.

Proceedings Article
08 Dec 2014
TL;DR: Experimental results demonstrate that the proposed algorithm for multi-class crowd labeling problems is comparable to the most accurate empirical approach, while outperforming several other recently proposed methods.
Abstract: The Dawid-Skene estimator has been widely used for inferring the true labels from the noisy labels provided by non-expert crowdsourcing workers. However, since the estimator maximizes a non-convex log-likelihood function, it is hard to theoretically justify its performance. In this paper, we propose a two-stage efficient algorithm for multi-class crowd labeling problems. The first stage uses the spectral method to obtain an initial estimate of parameters. Then the second stage refines the estimation by optimizing the objective function of the Dawid-Skene estimator via the EM algorithm. We show that our algorithm achieves the optimal convergence rate up to a logarithmic factor. We conduct extensive experiments on synthetic and real datasets. Experimental results demonstrate that the proposed algorithm is comparable to the most accurate empirical approach, while outperforming several other recently proposed methods.

Journal ArticleDOI
TL;DR: The first maximum likelihood solution to handle the cases where measurements from different participants may be conflicting is provided and is shown to outperform previous work used for corroborating observations, the state-of-the-art fact-finding baselines, as well as simple heuristics such as majority voting.
Abstract: This article addresses the challenge of truth discovery from noisy social sensing data. The work is motivated by the emergence of social sensing as a data collection paradigm of growing interest, where humans perform sensory data collection tasks. Unlike the case with well-calibrated and well-tested infrastructure sensors, humans are less reliable, and the likelihood that participants' measurements are correct is often unknown a priori. Given a set of human participants of unknown trustworthiness together with their sensory measurements, we pose the question of whether one can use this information alone to determine, in an analytically founded manner, the probability that a given measurement is true. In our previous conference paper, we offered the first maximum likelihood solution to the aforesaid truth discovery problem for corroborating observations only. In contrast, this article extends the conference paper and provides the first maximum likelihood solution to handle the cases where measurements from different participants may be conflicting. The article focuses on binary measurements. The approach is shown to outperform our previous work used for corroborating observations, the state-of-the-art fact-finding baselines, as well as simple heuristics such as majority voting.

Book ChapterDOI
06 Sep 2014
TL;DR: A probabilistic generative model and its associated algorithm are proposed to jointly register multiple point sets; all sets are treated as realizations of a Gaussian mixture and the registration is cast into a clustering problem.
Abstract: This paper describes a probabilistic generative model and its associated algorithm to jointly register multiple point sets. The vast majority of state-of-the-art registration techniques select one of the sets as the “model” and perform pairwise alignments between the other sets and this set. The main drawback of this mode of operation is that there is no guarantee that the model-set is free of noise and outliers, which contaminates the estimation of the registration parameters. Unlike previous work, the proposed method treats all the point sets on an equal footing: they are realizations of a Gaussian mixture (GMM) and the registration is cast into a clustering problem. We formally derive an EM algorithm that estimates both the GMM parameters and the rotations and translations that map each individual set onto the “central” model. The mixture means play the role of the registered set of points while the variances provide rich information about the quality of the registration. We thoroughly validate the proposed method with challenging datasets, we compare it with several state-of-the-art methods, and we show its potential for fusing real depth data.

Journal ArticleDOI
TL;DR: In this article, the authors investigated the semiparametric inference of the simple Gamma-process model and a random effects variant, where the maximum likelihood estimates of the parameters were obtained through the EM algorithm and the bootstrap was used to construct confidence intervals.
Abstract: This article investigates the semiparametric inference of the simple Gamma-process model and a random-effects variant. Maximum likelihood estimates of the parameters are obtained through the EM algorithm. The bootstrap is used to construct confidence intervals. A simulation study reveals that an estimation based on the full likelihood method is more efficient than the pseudo likelihood method. In addition, a score test is developed to examine the existence of random effects under the semiparametric scenario. A comparison study using a fatigue-crack growth dataset shows that performance of a semiparametric estimation is comparable to the parametric counterpart. This article has supplementary material online.

Journal ArticleDOI
TL;DR: A one-to-one correspondence between the IRLS algorithms and a class of Expectation-Maximization algorithms for constrained maximum likelihood estimation under a Gaussian scale mixture (GSM) distribution is demonstrated.
Abstract: In this paper, we study the theoretical properties of iteratively re-weighted least squares (IRLS) algorithms and their utility in sparse signal recovery in the presence of noise. We demonstrate a one-to-one correspondence between the IRLS algorithms and a class of Expectation-Maximization (EM) algorithms for constrained maximum likelihood estimation under a Gaussian scale mixture (GSM) distribution. The EM formalism, as well as the connection to GSMs, allows us to establish that the IRLS algorithms minimize smooth versions of the lν 'norms', for 0 < ν ≤ 1. We leverage EM theory to show that the limit points of the sequence of IRLS iterates are stationary points of the smooth lν 'norm' minimization problem on the constraint set. We employ techniques from Compressive Sampling (CS) theory to show that the IRLS algorithm is stable, if the limit point of the iterates coincides with the global minimizer. We further characterize the convergence rate of the IRLS algorithm, which implies global linear convergence for ν = 1 and local super-linear convergence for ν < 1. We demonstrate our results via simulation experiments. The simplicity of IRLS, along with the theoretical guarantees provided in this contribution, make a compelling case for its adoption as a standard tool for sparse signal recovery.
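
A minimal sketch of the noiseless IRLS iteration whose limit points the paper characterizes via the EM/Gaussian-scale-mixture correspondence; the epsilon-smoothing schedule and parameter values are illustrative assumptions.

```python
import numpy as np

def irls_sparse(A, b, nu=1.0, iters=100, eps=1.0):
    """Iteratively re-weighted least squares for min ||x||_nu^nu  s.t.  A x = b.

    Each iteration solves a weighted least-squares problem whose weights come
    from the previous iterate; eps is a smoothing parameter that is shrunk over
    the iterations (this smoothing is what links IRLS to minimizing a smooth
    version of the l_nu 'norm')."""
    m, n = A.shape
    x = A.T @ np.linalg.solve(A @ A.T, b)          # least-norm initialization
    for _ in range(iters):
        w = (x ** 2 + eps) ** (1 - nu / 2)          # weights ~ |x_i|^(2 - nu)
        AW = A * w                                  # equals A @ diag(w)
        x = w * (A.T @ np.linalg.solve(AW @ A.T, b))
        eps = max(eps * 0.8, 1e-10)                 # anneal the smoothing
    return x

# Usage on a small compressive-sampling instance:
rng = np.random.default_rng(0)
A = rng.normal(size=(40, 100))
x_true = np.zeros(100)
x_true[rng.choice(100, 5, replace=False)] = rng.normal(size=5)
b = A @ x_true
x_hat = irls_sparse(A, b, nu=1.0)
print("recovery error:", np.linalg.norm(x_hat - x_true))
```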

Book
02 Sep 2014
TL;DR: In this book, mixture and classification models and their likelihood estimators are used to measure the robustness of the MAP criterion with respect to the number of components and outliers.
Abstract: Introduction Mixture and classification models and their likelihood estimators General consistency and asymptotic normality Local likelihood estimates Maximum likelihood estimates Notes Mixture models and their likelihood estimators Latent distributions Finite mixture models Identifiable mixture models Asymptotic properties of local likelihood maxima Asymptotic properties of the MLE: constrained nonparametric mixture models Asymptotic properties of the MLE: constrained parametric mixture models Notes Classification models and their criteria Probabilistic criteria for general populations Admissibility and size constraints Steady partitions Elliptical models Normal models Geometric considerations Consistency of the MAP criterion Notes Robustification by trimming Outliers and measures of robustness Outliers The sensitivities Sensitivity of ML estimates of mixture models Breakdown points Trimming the mixture model Trimmed likelihood function of the mixture model Normal components Universal breakdown points of covariance matrices, mixing rates, and means Restricted breakdown point of mixing rates and means Notes Trimming the classification model - the TDC Trimmed MAP classification model Normal case - the Trimmed Determinant Criterion, TDC Breakdown robustness of the constrained TDC Universal breakdown point of covariance matrices and means Restricted breakdown point of the means Notes Algorithms EM algorithm for mixtures General mixtures Normal mixtures Mixtures of multivariate t-distributions Trimming - the EMT algorithm Order of Convergence Acceleration of the mixture EM Notes k-Parameters algorithms General and elliptically symmetric models Steady solutions and trimming Using combinatorial optimization Overall algorithms Notes Hierarchical methods for initial solutions Favorite solutions and cluster validation Scale balance and Pareto solutions Number of components of uncontaminated data Likelihood-ratio tests Using cluster criteria as test statistics Model selection criteria Ridgeline manifold Number of components and outliers Classification trimmed likelihood curves Trimmed BIC Adjusted BIC Cluster validation Separation indices Normality and related tests Visualization Measures of agreement of partitions Stability Notes Variable selection in clustering Irrelevance Definition and general properties The normal case Filters Univariate filters Multivariate filters Wrappers Using the likelihood ratio test Using Bayes factors and their BIC approximations Maximum likelihood subset selection Consistency of the MAP cluster criterion with variable selection Practical guidelines Notes Applications Miscellaneous data sets IRIS data SWISS BILLS STONE FLAKES Gene expression data Supervised and unsupervised methods Combining gene selection and profile clustering Application to the LEUKEMIA data Notes Appendix A: Geometry and linear algebra Appendix B: Topology Appendix C: Analysis Appendix D: Measures and probabilities Appendix E: Probability Appendix F: Statistics Appendix G: Optimization

Journal ArticleDOI
TL;DR: The proposed algorithm can potentially determine the model complexity and avoid the over-fitting problem associated with conventional approaches based on expectation maximization; an analytically tractable approximation to the predictive density of the Bayesian mixture model of vMF distributions is also derived.
Abstract: This paper addresses the Bayesian estimation of the von-Mises Fisher (vMF) mixture model with variational inference (VI). The learning task in VI consists of optimization of the variational posterior distribution. However, the exact solution by VI does not lead to an analytically tractable solution due to the evaluation of intractable moments involving functional forms of the Bessel function in their arguments. To derive a closed-form solution, we further lower bound the evidence lower bound where the bound is tight at one point in the parameter distribution. While having the value of the bound guaranteed to increase during maximization, we derive an analytically tractable approximation to the posterior distribution which has the same functional form as the assigned prior distribution. The proposed algorithm requires no iterative numerical calculation in the re-estimation procedure, and it can potentially determine the model complexity and avoid the over-fitting problem associated with conventional approaches based on the expectation maximization. Moreover, we derive an analytically tractable approximation to the predictive density of the Bayesian mixture model of vMF distributions. The performance of the proposed approach is verified by experiments with both synthetic and real data.

Journal ArticleDOI
TL;DR: A novel family of twelve mixture models with random covariates, nested in the linear t cluster-weighted model (CWM), is introduced for model-based clustering; it provides a unified framework that also includes the linear Gaussian CWM as a special case.

Posted Content
TL;DR: Experimental results over multiple images with different ranges of complexity validate the efficiency of the proposed technique with regard to segmentation accuracy, speed, and robustness, and demonstrate the better performance of the proposed algorithm compared with other well-known methods.
Abstract: This paper explores the use of the Artificial Bee Colony (ABC) algorithm to compute threshold selection for image segmentation. ABC is a heuristic algorithm motivated by the intelligent behavior of honey-bees which has been successfully employed to solve complex optimization problems. In this approach, the 1D histogram of an image is approximated through a Gaussian mixture model whose parameters are calculated by the ABC algorithm. For the approximation scheme, each Gaussian function represents a pixel class and therefore a threshold. Unlike the Expectation Maximization (EM) algorithm, the ABC based method shows fast convergence and low sensitivity to initial conditions. Remarkably, it also avoids the complex, time-consuming computations commonly required by gradient-based methods. Experimental results demonstrate the algorithm's ability to perform automatic multi-threshold selection while showing interesting advantages by comparison to other well-known algorithms.
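
For reference, the sketch below shows the conventional EM-based Gaussian-mixture thresholding that the ABC approach is positioned against: fit a 1D mixture to the pixel intensities and derive thresholds from the fitted components; the threshold rule (midpoints between sorted means) and the data are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def em_thresholds(pixels, n_classes=3):
    """Fit a 1D Gaussian mixture to pixel intensities with EM and place a
    threshold between neighbouring classes (approximated here by the midpoint
    between sorted component means)."""
    gmm = GaussianMixture(n_components=n_classes, random_state=0)
    gmm.fit(pixels.reshape(-1, 1))
    means = np.sort(gmm.means_.ravel())
    # Simple illustrative rule: thresholds halfway between neighbouring means.
    return (means[:-1] + means[1:]) / 2

# Placeholder "image": three intensity populations drawn from Gaussians.
rng = np.random.default_rng(0)
pixels = np.concatenate([rng.normal(60, 10, 4000),
                         rng.normal(130, 12, 4000),
                         rng.normal(200, 8, 4000)])
print(em_thresholds(pixels, n_classes=3))   # roughly [95, 165]
```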

Journal ArticleDOI
TL;DR: A robust estimation procedure for mixture linear regression models is proposed by assuming that the error terms follow a Laplace distribution; it is implemented by an EM algorithm which incorporates two types of missing information: the mixture class membership and the latent variable.

Journal ArticleDOI
TL;DR: A modified version of the proposed method, which fits the mixture regression based on the t-distribution to the data after adaptively trimming high leverage points, has high efficiency due to the adaptive choice of degrees of freedom.

Proceedings Article
21 Jun 2014
TL;DR: This paper provides a new initialization procedure for EM, based on finding the leading two eigenvectors of an appropriate matrix, and shows that a re-sampled version of the EM algorithm provably converges to the correct vectors, under natural assumptions on the sampling distribution, and with nearly optimal sample complexity.
Abstract: Mixed linear regression involves the recovery of two (or more) unknown vectors from unlabeled linear measurements; that is, where each sample comes from exactly one of the vectors, but we do not know which one. It is a classic problem, and the natural and empirically most popular approach to its solution has been the EM algorithm. As in other settings, this is prone to bad local minima; however, each iteration is very fast (alternating between guessing labels, and solving with those labels). In this paper we provide a new initialization procedure for EM, based on finding the leading two eigenvectors of an appropriate matrix. We then show that with this, a re-sampled version of the EM algorithm provably converges to the correct vectors, under natural assumptions on the sampling distribution, and with nearly optimal (unimprovable) sample complexity. This provides not only the first characterization of EM's performance, but also much lower sample complexity as compared to both standard (randomly initialized) EM, and other methods for this problem.
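
A minimal sketch of the EM iteration for two-component mixed linear regression, using a random initialization where the paper instead uses the leading two eigenvectors of an appropriate matrix; the variable names and the Gaussian noise model are illustrative.

```python
import numpy as np

def mixed_linreg_em(X, y, iters=100, seed=0):
    """EM for y_i = <x_i, beta_{z_i}> + noise, with z_i in {0, 1} unobserved."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    betas = rng.normal(size=(2, d))        # the paper initializes from the top two
    sigma2, pi = y.var(), 0.5              # eigenvectors of a moment matrix instead
    for _ in range(iters):
        # E-step: responsibility of component 1 for each sample.
        resid = y[:, None] - X @ betas.T                        # (n, 2)
        logp = -resid ** 2 / (2 * sigma2) + np.log([1 - pi, pi])
        w = np.exp(logp - logp.max(axis=1, keepdims=True))
        w = w[:, 1] / w.sum(axis=1)
        # M-step: weighted least squares for each regression vector.
        for k, wk in enumerate([1 - w, w]):
            Xw = X * wk[:, None]
            betas[k] = np.linalg.solve(Xw.T @ X, Xw.T @ y)
        pi = w.mean()
        resid = y[:, None] - X @ betas.T
        sigma2 = ((1 - w) * resid[:, 0] ** 2 + w * resid[:, 1] ** 2).mean()
    return betas, pi, sigma2

# Usage on simulated data with two ground-truth vectors:
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 10))
b0, b1 = rng.normal(size=10), rng.normal(size=10)
z = rng.integers(0, 2, 2000)
y = np.where(z == 0, X @ b0, X @ b1) + 0.1 * rng.normal(size=2000)
betas, pi, sigma2 = mixed_linreg_em(X, y)
```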

Journal ArticleDOI
TL;DR: A novel approach is developed based on both the Monte Carlo expectation-maximization algorithm and importance sampling to calculate the maximum likelihood estimator and asymptotic variance-covariance matrix of the Markov-switching GARCH model.

Journal ArticleDOI
21 Aug 2014-Test
TL;DR: A comprehensive overview of latent Markov (LM) models for the analysis of longitudinal categorical data is provided and methods for selecting the number of states and for path prediction are outlined.
Abstract: We provide a comprehensive overview of latent Markov (LM) models for the analysis of longitudinal categorical data. We illustrate the general version of the LM model which includes individual covariates, and several constrained versions. Constraints make the model more parsimonious and allow us to consider and test hypotheses of interest. These constraints may be put on the conditional distribution of the response variables given the latent process (measurement model) or on the distribution of the latent process (latent model). We also illustrate in detail maximum likelihood estimation through the Expectation–Maximization algorithm, which may be efficiently implemented by recursions taken from the hidden Markov literature. We outline methods for obtaining standard errors for the parameter estimates. We also illustrate methods for selecting the number of states and for path prediction. Finally, we mention issues related to Bayesian inference of LM models. Possibilities for further developments are given among the concluding remarks.
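
A minimal sketch of the forward-backward recursions that implement the E-step of the EM algorithm for a basic latent Markov (hidden Markov) chain with categorical responses; individual covariates and the constrained variants discussed above are omitted, and the transition/emission values are illustrative.

```python
import numpy as np

def forward_backward(obs, pi, A, B):
    """E-step recursions for a latent Markov chain.

    obs: length-T sequence of observed categories
    pi : (k,) initial state probabilities
    A  : (k, k) transition matrix, A[i, j] = P(state j | state i)
    B  : (k, m) emission matrix, B[i, o] = P(observation o | state i)
    Returns the smoothed state posteriors gamma (T, k) and the log-likelihood."""
    T, k = len(obs), len(pi)
    alpha = np.zeros((T, k)); beta = np.zeros((T, k)); c = np.zeros(T)
    # Forward pass with scaling to avoid underflow.
    alpha[0] = pi * B[:, obs[0]]
    c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum(); alpha[t] /= c[t]
    # Backward pass reusing the same scaling constants.
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma, np.log(c).sum()

# Usage: 2 latent states, 3 response categories.
pi = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
gamma, loglik = forward_backward([0, 0, 2, 1, 2], pi, A, B)
```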