
Showing papers on "Expectation–maximization algorithm published in 2012"


Journal ArticleDOI
TL;DR: In this article, the authors show that maximum likelihood estimates of the common factors are consistent as the size of the cross-section (n) and the sample size (T) go to infinity along any path of n and T, and that maximum likelihood is therefore viable for large n.
Abstract: Is maximum likelihood suitable for factor models in large cross-sections of time series? We answer this question from both an asymptotic and an empirical perspective. We show that estimates of the common factors based on maximum likelihood are consistent for the size of the cross-section (n) and the sample size (T) going to infinity along any path of n and T and that therefore maximum likelihood is viable for n large. The estimator is robust to misspecification of the cross-sectional and time series correlation of the idiosyncratic components. In practice, the estimator can be easily implemented using the Kalman smoother and the EM algorithm as in traditional factor analysis.
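For readers who want the mechanics behind such estimators, here is a minimal numpy sketch of EM for a static factor model (the classical regression-style updates); the dynamic factor model in the paper would replace this E-step with a Kalman smoother. The initialization and names below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def factor_em(X, k, n_iters=100):
    """Illustrative EM sketch for the static factor model x_t = Lambda f_t + e_t,
    with e_t ~ N(0, Psi) diagonal and f_t ~ N(0, I).
    X is (T, n) and assumed mean-centered; returns loadings and idiosyncratic variances.
    Not the paper's dynamic-factor / Kalman-smoother implementation."""
    T, n = X.shape
    Lam = np.linalg.svd(X, full_matrices=False)[2][:k].T   # (n, k) PCA-style start
    Psi = X.var(axis=0).copy()                             # diagonal noise variances
    for _ in range(n_iters):
        # E-step: posterior moments of the factors given current (Lambda, Psi)
        beta = Lam.T @ np.linalg.inv(Lam @ Lam.T + np.diag(Psi))   # (k, n)
        Ef = X @ beta.T                                            # (T, k) E[f_t | x_t]
        Eff = T * (np.eye(k) - beta @ Lam) + Ef.T @ Ef             # sum_t E[f_t f_t' | x_t]
        # M-step: regression-style updates of loadings and noise variances
        Lam = (X.T @ Ef) @ np.linalg.inv(Eff)
        Psi = np.diag(X.T @ X - Lam @ Ef.T @ X) / T
    return Lam, Psi
```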

497 citations


Journal ArticleDOI
TL;DR: A novel Bayesian probabilistic method that jointly estimates the scene albedo and depth from a single foggy image by modeling them as statistically independent latent layers and exploiting natural image and depth statistics as priors on these hidden layers.
Abstract: Atmospheric conditions induced by suspended particles, such as fog and haze, severely alter the scene appearance. Restoring the true scene appearance from a single observation made in such bad weather conditions remains a challenging task due to the inherent ambiguity that arises in the image formation process. In this paper, we introduce a novel Bayesian probabilistic method that jointly estimates the scene albedo and depth from a single foggy image by fully leveraging their latent statistical structures. Our key idea is to model the image with a factorial Markov random field in which the scene albedo and depth are two statistically independent latent layers and to jointly estimate them. We show that we may exploit natural image and depth statistics as priors on these hidden layers and estimate the scene albedo and depth with a canonical expectation maximization algorithm with alternating minimization. We experimentally evaluate the effectiveness of our method on a number of synthetic and real foggy images. The results demonstrate that the method achieves accurate factorization even on challenging scenes for past methods that only constrain and estimate one of the latent variables.

397 citations


Journal ArticleDOI
TL;DR: In this paper, the authors study the problem of high-dimensional sparse linear regression with noisy, missing and/or dependent data and show that a simple algorithm based on projected gradient descent converges in polynomial time to a small neighborhood of the set of all global minimizers.
Abstract: Although the standard formulations of prediction problems involve fully-observed and noiseless data drawn in an i.i.d. manner, many applications involve noisy and/or missing data, possibly involving dependence, as well. We study these issues in the context of high-dimensional sparse linear regression, and propose novel estimators for the cases of noisy, missing and/or dependent data. Many standard approaches to noisy or missing data, such as those using the EM algorithm, lead to optimization problems that are inherently nonconvex, and it is difficult to establish theoretical guarantees on practical algorithms. While our approach also involves optimizing nonconvex programs, we are able to both analyze the statistical error associated with any global optimum, and more surprisingly, to prove that a simple algorithm based on projected gradient descent will converge in polynomial time to a small neighborhood of the set of all global minimizers. On the statistical side, we provide nonasymptotic bounds that hold with high probability for the cases of noisy, missing and/or dependent data. On the computational side, we prove that under the same types of conditions required for statistical consistency, the projected gradient descent algorithm is guaranteed to converge at a geometric rate to a near-global minimizer. We illustrate these theoretical predictions with simulations, showing close agreement with the predicted scalings.
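A toy numpy sketch of the projected gradient idea for one of the settings above (additive noise in the covariates): the least-squares objective is replaced by a bias-corrected, possibly nonconvex surrogate, and each gradient step is projected back onto an l1 ball. The step size, noise model, and names are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection of v onto the l1 ball of the given radius."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(u) + 1)
    rho = idx[u - (css - radius) / idx > 0][-1]
    theta = (css[rho - 1] - radius) / rho
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def corrected_projected_gradient(Z, y, sigma_w, radius, step=0.01, n_iters=500):
    """Toy sketch: sparse regression with additively noisy covariates Z = X + W,
    W having i.i.d. N(0, sigma_w^2) entries. Minimize 0.5 b'Gamma b - gamma'b over
    the l1 ball, where Gamma and gamma are bias-corrected surrogates for X'X/n and
    X'y/n (Gamma may be indefinite, hence the projection keeps iterates bounded)."""
    n, p = Z.shape
    Gamma = Z.T @ Z / n - sigma_w ** 2 * np.eye(p)
    gamma = Z.T @ y / n
    beta = np.zeros(p)
    for _ in range(n_iters):
        grad = Gamma @ beta - gamma
        beta = project_l1_ball(beta - step * grad, radius)
    return beta
```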

376 citations


Proceedings ArticleDOI
16 Apr 2012
TL;DR: The approach offers the first optimal solution to this truth discovery problem and is shown to outperform state-of-the-art fact-finding heuristics, as well as simple baselines such as majority voting.
Abstract: This paper addresses the challenge of truth discovery from noisy social sensing data. The work is motivated by the emergence of social sensing as a data collection paradigm of growing interest, where humans perform sensory data collection tasks. A challenge in social sensing applications lies in the noisy nature of data. Unlike the case with well-calibrated and well-tested infrastructure sensors, humans are less reliable, and the likelihood that participants' measurements are correct is often unknown a priori. Given a set of human participants of unknown reliability together with their sensory measurements, this paper poses the question of whether one can use this information alone to determine, in an analytically founded manner, the probability that a given measurement is true. The paper focuses on binary measurements. While some previous work approached the answer in a heuristic manner, we offer the first optimal solution to the above truth discovery problem. Optimality, in the sense of maximum likelihood estimation, is attained by solving an expectation maximization problem that returns the best guess regarding the correctness of each measurement. The approach is shown to outperform the state of the art fact-finding heuristics, as well as simple baselines such as majority voting.
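A compact Dawid-Skene-style EM sketch in the spirit of the maximum likelihood formulation described above: latent binary claim truths and per-participant reliabilities are estimated jointly. The symmetric-error model, initialization, and names are assumptions for illustration rather than the paper's exact formulation.

```python
import numpy as np

def truth_discovery_em(reports, n_iters=50, prior_true=0.5):
    """Toy sketch. reports: (n_participants, n_claims) array of 0/1 votes (no missing
    votes in this simplified model). Returns posterior P(claim is true) and
    per-participant reliabilities, estimated jointly by EM."""
    n_p, n_c = reports.shape
    rel = np.full(n_p, 0.7)                      # initial reliability guesses
    for _ in range(n_iters):
        # E-step: posterior probability that each claim is true
        log_true = np.log(prior_true) + (
            reports * np.log(rel[:, None]) + (1 - reports) * np.log(1 - rel[:, None])
        ).sum(axis=0)
        log_false = np.log(1 - prior_true) + (
            reports * np.log(1 - rel[:, None]) + (1 - reports) * np.log(rel[:, None])
        ).sum(axis=0)
        m = np.maximum(log_true, log_false)      # stabilize the exponentials
        p_true = np.exp(log_true - m) / (np.exp(log_true - m) + np.exp(log_false - m))
        # M-step: reliability = expected fraction of a participant's votes agreeing with the truth
        agree = reports * p_true[None, :] + (1 - reports) * (1 - p_true[None, :])
        rel = np.clip(agree.mean(axis=1), 1e-3, 1 - 1e-3)
    return p_true, rel
```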

372 citations


Proceedings Article
16 Jun 2012
TL;DR: In this article, a method of moments approach is proposed for parameter estimation for a broad class of high-dimensional mixture models with many components, including multi-view mixtures of Gaussians and hidden Markov models.
Abstract: Mixture models are a fundamental tool in applied statistics and machine learning for treating data taken from multiple subpopulations. The current practice for estimating the parameters of such models relies on local search heuristics (e.g., the EM algorithm) which are prone to failure, and existing consistent methods are unfavorable due to their high computational and sample complexity which typically scale exponentially with the number of mixture components. This work develops an efficient method of moments approach to parameter estimation for a broad class of high-dimensional mixture models with many components, including multi-view mixtures of Gaussians (such as mixtures of axis-aligned Gaussians) and hidden Markov models. The new method leads to rigorous unsupervised learning results for mixture models that were not achieved by previous works; and, because of its simplicity, it offers a viable alternative to EM for practical deployment.

363 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present the probabilistic approach to reconstruction and discuss its optimality and robustness, and detail the derivation of the message passing algorithm for reconstruction and expectation maximization learning of signal model parameters.
Abstract: Compressed sensing is a signal processing method that acquires data directly in a compressed form. This allows one to make fewer measurements than were considered necessary to record a signal, enabling faster or more precise measurement protocols in a wide range of applications. Using an interdisciplinary approach, we have recently proposed in Krzakala et al (2012 Phys. Rev. X 2 021005) a strategy that allows compressed sensing to be performed at acquisition rates approaching the theoretical optimal limits. In this paper, we give a more thorough presentation of our approach, and introduce many new results. We present the probabilistic approach to reconstruction and discuss its optimality and robustness. We detail the derivation of the message passing algorithm for reconstruction and expectation maximization learning of signal-model parameters. We further develop the asymptotic analysis of the corresponding phase diagrams with and without measurement noise, for different distributions of signals, and discuss the best possible reconstruction performances regardless of the algorithm. We also present new efficient seeding matrices, test them on synthetic data and analyze their performance asymptotically.

285 citations


Journal ArticleDOI
TL;DR: In this paper, the authors considered the maximum likelihood estimation of factor models of high dimension, where the number of variables (N) is comparable with or even greater than the total number of observations (T) and developed an inferential theory to establish not only consistency but also the rate of convergence and limiting distributions.
Abstract: This paper considers the maximum likelihood estimation of factor models of high dimension, where the number of variables (N) is comparable with or even greater than the number of observations (T). An inferential theory is developed. We establish not only consistency but also the rate of convergence and the limiting distributions. Five different sets of identification conditions are considered. We show that the distributions of the MLE estimators depend on the identification restrictions. Unlike the principal components approach, the maximum likelihood estimator explicitly allows heteroskedasticities, which are jointly estimated with other parameters. Efficiency of MLE relative to the principal components method is also considered.

249 citations


Proceedings ArticleDOI
21 Mar 2012
TL;DR: This work proposes an empirical-Bayesian technique that simultaneously learns the signal distribution while MMSE-recovering the signal, according to the learned distribution, using AMP; the non-zero distribution is modeled as a Gaussian mixture whose parameters are learned through expectation maximization, with AMP implementing the expectation step.
Abstract: When recovering a sparse signal from noisy compressive linear measurements, the distribution of the signal's non-zero coefficients can have a profound effect on recovery mean-squared error (MSE). If this distribution were known a priori, one could use efficient approximate message passing (AMP) techniques for nearly minimum MSE (MMSE) recovery. In practice, though, the distribution is unknown, motivating the use of robust algorithms like Lasso—which is nearly minimax optimal—at the cost of significantly larger MSE for non-least-favorable distributions. As an alternative, we propose an empirical-Bayesian technique that simultaneously learns the signal distribution while MMSE-recovering the signal—according to the learned distribution—using AMP. In particular, we model the non-zero distribution as a Gaussian mixture, and learn its parameters through expectation maximization, using AMP to implement the expectation step. Numerical experiments confirm the state-of-the-art performance of our approach on a range of signal classes.

205 citations


Book ChapterDOI
TL;DR: The second edition of the book chapter attempts to capture advanced developments in EM methodology in recent years, especially in its applications to the related fields of biomedical and health sciences.
Abstract: The Expectation-Maximization (EM) algorithm is a broadly applicable approach to the iterative computation of maximum likelihood estimates in a wide variety of incomplete-data problems. The EM algorithm has a number of desirable properties, such as its numerical stability, reliable global convergence, and simplicity of implementation. There are, however, two main drawbacks of the basic EM algorithm – lack of an in-built procedure to compute the covariance matrix of the parameter estimates and slow convergence. In addition, some complex problems lead to intractable Expectation-steps and Maximization-steps. The first edition of the book chapter published in 2004 covered the basic theoretical framework of the EM algorithm and discussed further extensions of the EM algorithm to handle complex problems. The second edition attempts to capture advanced developments in EM methodology in recent years, especially in its applications to the related fields of biomedical and health sciences.
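As a concrete reminder of the basic E- and M-steps the chapter builds on, the following is a textbook EM for a univariate two-component Gaussian mixture (a standard illustration, not material from the chapter itself).

```python
import numpy as np

def em_two_gaussians(x, n_iters=100):
    """Textbook EM for a univariate two-component Gaussian mixture
    (standard illustration of the E- and M-steps)."""
    # crude initialization from the data quantiles
    mu = np.percentile(x, [25, 75]).astype(float)
    var = np.array([x.var(), x.var()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iters):
        # E-step: responsibilities of each component for each point
        dens = (pi / np.sqrt(2 * np.pi * var)) * \
               np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted maximum likelihood updates of the mixture parameters
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var
```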

178 citations


Book ChapterDOI
01 Jan 2012
TL;DR: This chapter presents methods that make the MAR assumption, including the EM algorithm for covariance matrices, normal-model multiple imputation (MI), and what the author refers to as FIML (full information maximum likelihood) methods.
Abstract: In this chapter, I present older methods for handling missing data. I then turn to the major new approaches for handling missing data. In this chapter, I present methods that make the MAR assumption. Included in this introduction are the EM algorithm for covariance matrices, normal-model multiple imputation (MI), and what I will refer to as FIML (full information maximum likelihood) methods. Before getting to these methods, however, I talk about the goals of analysis.
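For context, a compact numpy sketch of the "EM algorithm for covariance matrices" under MAR: the E-step fills in conditional means of the missing entries (plus their conditional covariance), and the M-step recomputes the mean and covariance from the expected sufficient statistics. This is an illustrative toy implementation, not code from the chapter.

```python
import numpy as np

def em_mvn_missing(X, n_iters=50):
    """Toy sketch: EM estimates of the mean vector and covariance matrix of a
    multivariate normal when entries of X (n x p) are missing at random (np.nan)."""
    n, p = X.shape
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iters):
        S = np.zeros((p, p))
        mean_acc = np.zeros(p)
        for i in range(n):
            obs = ~np.isnan(X[i])
            mis = ~obs
            xi = X[i].copy()
            cov_add = np.zeros((p, p))
            if mis.any():
                # E-step: conditional mean and covariance of missing given observed
                reg = sigma[np.ix_(mis, obs)] @ np.linalg.inv(sigma[np.ix_(obs, obs)])
                xi[mis] = mu[mis] + reg @ (X[i, obs] - mu[obs])
                cov_add[np.ix_(mis, mis)] = sigma[np.ix_(mis, mis)] - reg @ sigma[np.ix_(obs, mis)]
            mean_acc += xi
            S += np.outer(xi, xi) + cov_add
        # M-step: update mean and covariance from expected sufficient statistics
        mu = mean_acc / n
        sigma = S / n - np.outer(mu, mu)
    return mu, sigma
```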

168 citations


Journal ArticleDOI
TL;DR: A cyclic minimization algorithm is developed in which the number of Dirichlet modes is inferred based on the minimum description length principle, while the Dirichlet prior on the abundance fractions automatically enforces the constraints imposed by the acquisition process.
Abstract: This paper introduces a new unsupervised hyperspectral unmixing method conceived for linear but highly mixed hyperspectral data sets, in which the simplex of minimum volume, usually estimated by the purely geometrically based algorithms, is far away from the true simplex associated with the endmembers. The proposed method, an extension of our previous studies, resorts to the statistical framework. The abundance fraction prior is a mixture of Dirichlet densities, thus automatically enforcing the constraints on the abundance fractions imposed by the acquisition process, namely, nonnegativity and sum-to-one. A cyclic minimization algorithm is developed where the following are observed: 1) The number of Dirichlet modes is inferred based on the minimum description length principle; 2) a generalized expectation maximization algorithm is derived to infer the model parameters; and 3) a sequence of augmented Lagrangian-based optimizations is used to compute the signatures of the endmembers. Experiments on simulated and real data are presented to show the effectiveness of the proposed algorithm in unmixing problems beyond the reach of the geometrically based state-of-the-art competitors.

Journal ArticleDOI
TL;DR: A general EM-type algorithm is employed for iteratively computing parameter estimates with emphasis on finite mixtures of skew-normal, skew-t, skew-slash and skew-contaminated normal distributions, and a general information-based method for approximating the asymptotic covariance matrix of the estimates is presented.

Journal ArticleDOI
TL;DR: In this article, the authors propose a new criterion that is based on a non-asymptotic approximation of the marginal likelihood, which can be computed through a variational Bayes EM algorithm.
Abstract: It is now widely accepted that knowledge can be acquired from networks by clustering their vertices according to connection profiles. Many methods have been proposed and in this paper we concentrate on the Stochastic Block Model (SBM). The clustering of vertices and the estimation of SBM model parameters have been subject to previous work and numerous inference strategies such as variational Expectation Maximization (EM) and classification EM have been proposed. However, SBM still suffers from a lack of criteria to estimate the number of components in the mixture. To our knowledge, only one model-based criterion, ICL, has been derived for SBM in the literature. It relies on an asymptotic approximation of the Integrated Complete-data Likelihood and recent studies have shown that it tends to be too conservative in the case of small networks. To tackle this issue, we propose a new criterion that we call ILvb, based on a non-asymptotic approximation of the marginal likelihood. We describe how the criterion can be computed through a variational Bayes EM algorithm.

Journal ArticleDOI
TL;DR: An approach is proposed for initializing the expectation-maximization (EM) algorithm in multivariate Gaussian mixture models with an unknown number of components by choosing points with higher concentrations of neighbors as initial centers and using a truncated normal distribution for the preliminary estimation of dispersion matrices.
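A simplified sketch of the initialization idea: pick starting centers at points surrounded by many neighbors, then hand them to a standard Gaussian-mixture EM. The greedy neighbor-counting rule and the fixed radius are assumptions for illustration, and the truncated-normal estimation of dispersion matrices mentioned above is omitted.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def dense_point_init(X, k, radius):
    """Pick up to k initial centers as points with the most neighbors within `radius`,
    greedily removing a chosen point's neighborhood before selecting the next center.
    Simplified illustration of the density-based idea, not the authors' exact procedure."""
    remaining = np.arange(len(X))
    centers = []
    for _ in range(k):
        d = np.linalg.norm(X[remaining, None, :] - X[None, remaining, :], axis=-1)
        counts = (d < radius).sum(axis=1)
        best = np.argmax(counts)
        centers.append(X[remaining[best]])
        keep = d[best] >= radius           # drop the chosen point's neighborhood
        remaining = remaining[keep]
        if len(remaining) == 0:
            break
    return np.array(centers)

# usage sketch: seed EM for a Gaussian mixture with the density-based centers
# X = ...  # (n_samples, n_features)
# init = dense_point_init(X, k=3, radius=1.0)
# gmm = GaussianMixture(n_components=len(init), means_init=init).fit(X)
```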

Journal ArticleDOI
TL;DR: In this article, the authors show that these MM algorithms can be reinterpreted as special instances of expectation-maximization algorithms associated with suitable sets of latent variables and propose some original extensions.
Abstract: The Bradley–Terry model is a popular approach to describe probabilities of the possible outcomes when elements of a set are repeatedly compared with one another in pairs. It has found many applications including animal behavior, chess ranking, and multiclass classification. Numerous extensions of the basic model have also been proposed in the literature including models with ties, multiple comparisons, group comparisons, and random graphs. From a computational point of view, Hunter has proposed efficient iterative minorization-maximization (MM) algorithms to perform maximum likelihood estimation for these generalized Bradley–Terry models whereas Bayesian inference is typically performed using Markov chain Monte Carlo algorithms based on tailored Metropolis–Hastings proposals. We show here that these MM algorithms can be reinterpreted as special instances of expectation-maximization algorithms associated with suitable sets of latent variables and propose some original extensions. These latent variables all...
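For reference, Hunter's MM update for the basic Bradley-Terry model, which the paper reinterprets as an EM step with suitable latent variables. This sketch assumes complete pairwise win counts and that every player has at least one win and at least one loss.

```python
import numpy as np

def bradley_terry_mm(wins, n_iters=200):
    """Hunter-style MM updates for Bradley-Terry skills (illustrative sketch).
    wins[i, j] = number of times player i beat player j. The same update can be
    derived as an EM step with suitable latent variables, which is the paper's point."""
    n = wins.shape[0]
    comparisons = wins + wins.T                  # N_ij: total games between i and j
    w = wins.sum(axis=1)                         # total wins of each player
    lam = np.ones(n)
    for _ in range(n_iters):
        denom = comparisons / (lam[:, None] + lam[None, :])
        np.fill_diagonal(denom, 0.0)
        lam = w / denom.sum(axis=1)
        lam /= lam.sum()                         # skills are identifiable only up to scale
    return lam
```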

Journal ArticleDOI
TL;DR: It is shown that the proposed EM has a lower computational complexity than the optimum maximum a posteriori estimator and yet incurs only an insignificant loss in performance.
Abstract: In this paper, the problem of joint carrier frequency offset (CFO) and channel estimation for OFDM systems over the fast time-varying frequency-selective channel is explored within the framework of the expectation-maximization (EM) algorithm and parametric channel model. Assuming that the path delays are known, a novel iterative pilot-aided algorithm for joint estimation of the multipath Rayleigh channel complex gains (CG) and the carrier frequency offset (CFO) is introduced. Each CG time-variation, within one OFDM symbol, is approximated by a basis expansion model (BEM) representation. An autoregressive (AR) model is built to statistically characterize the variations of the BEM coefficients across the OFDM blocks. In addition to the algorithm, the derivation of the hybrid Cramer-Rao bound (HCRB) for CFO and CGs estimation in our context of very high mobility is provided. We show that the proposed EM has a lower computational complexity than the optimum maximum a posteriori estimator and yet incurs only an insignificant loss in performance.

Journal ArticleDOI
TL;DR: A new finite Student's-t mixture model (SMM) is proposed that exploits the Dirichlet distribution and Dirichlet law to incorporate the local spatial constraints in an image and is successfully compared to the state-of-the-art finite mixture models.
Abstract: Finite mixture model based on the Student's-t distribution, which is heavily tailed and more robust than Gaussian, has recently received great attention for image segmentation. A new finite Student's-t mixture model (SMM) is proposed in this paper. Existing models do not explicitly incorporate the spatial relationships between pixels. First, our model exploits the Dirichlet distribution and Dirichlet law to incorporate the local spatial constraints in an image. Secondly, we directly deal with the Student's-t distribution in order to estimate the model parameters, whereas the Student's-t distributions in previous models are represented as an infinite mixture of scaled Gaussians that lead to an increase in complexity. Finally, instead of using the expectation maximization (EM) algorithm, the proposed method adopts the gradient method to minimize the higher bound on the data negative log-likelihood and to optimize the parameters. The proposed model is successfully compared to the state-of-the-art finite mixture models. Numerical experiments are presented where the proposed model is tested on various simulated and real medical images.

Journal ArticleDOI
TL;DR: In this article, the intractable expectations needed in the E-step can be written out analytically, bypassing the need for numerical estimation procedures, such as Monte Carlo methods, leading to accurate calculation of maximum likelihood estimates.

Journal ArticleDOI
TL;DR: Estimates of the measurement error are used to weight the input data such that the resulting eigenvectors are more sensitive to the true underlying signal variations rather than being pulled by heteroskedastic measurement noise.
Abstract: We present a method for performing principal component analysis (PCA) on noisy datasets with missing values. Estimates of the measurement error are used to weight the input data such that the resulting eigenvectors, when compared to classic PCA, are more sensitive to the true underlying signal variations rather than being pulled by heteroskedastic measurement noise. Missing data are simply limiting cases of weight = 0. The underlying algorithm is a noise weighted expectation maximization (EM) PCA, which has additional benefits of implementation speed and flexibility for smoothing eigenvectors to reduce the noise contribution. We present applications of this method on simulated data and QSO spectra from the Sloan Digital Sky Survey (SDSS).
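A rough numpy sketch of the noise-weighted EM PCA idea: alternate weighted least-squares solves for the per-sample coefficients and for the eigenvector coordinates, with weight 0 marking missing data. It assumes each row and column has enough nonzero weights, omits the eigenvector smoothing mentioned above, and is not the authors' reference implementation.

```python
import numpy as np

def weighted_empca(X, W, n_vec=2, n_iters=30):
    """Sketch of noise-weighted EM PCA: W holds per-datum inverse-variance weights
    (0 for missing values), X is (n, p) and assumed already mean-subtracted.
    Illustrative only, not the authors' reference implementation."""
    n, p = X.shape
    rng = np.random.default_rng(0)
    P = np.linalg.qr(rng.normal(size=(p, n_vec)))[0].T    # (n_vec, p) orthonormal rows
    C = np.zeros((n, n_vec))
    for _ in range(n_iters):
        # E-step: weighted least-squares coefficients for each sample given P
        for i in range(n):
            A = (P * W[i]) @ P.T
            b = (P * W[i]) @ X[i]
            C[i] = np.linalg.solve(A, b)
        # M-step: weighted least-squares update of each eigenvector coordinate
        for j in range(p):
            A = C.T @ (W[:, j:j + 1] * C)
            b = C.T @ (W[:, j] * X[:, j])
            P[:, j] = np.linalg.solve(A, b)
        P = np.linalg.qr(P.T)[0].T                        # keep the basis orthonormal
    return P, C
```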

Journal ArticleDOI
TL;DR: An efficient EM algorithm for optimization with provable numerical convergence properties is proposed, and the methodology is extended to handle missing values in a sparse regression context.
Abstract: We propose an ℓ1-regularized likelihood method for estimating the inverse covariance matrix in the high-dimensional multivariate normal model in presence of missing data. Our method is based on the assumption that the data are missing at random (MAR), which also covers the missing completely at random case. The implementation of the method is non-trivial as the observed negative log-likelihood generally is a complicated and non-convex function. We propose an efficient EM algorithm for optimization with provable numerical convergence properties. Furthermore, we extend the methodology to handle missing values in a sparse regression context. We demonstrate both methods on simulated and real data.
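A sketch of what such an EM can look like, assuming the standard Gaussian E-step for missing entries and using scikit-learn's graphical lasso as the penalized M-step. This is an illustrative reconstruction under those assumptions, not the authors' algorithm or its convergence analysis.

```python
import numpy as np
from sklearn.covariance import graphical_lasso

def em_glasso_missing(X, alpha, n_iters=20):
    """Illustrative sketch of EM for l1-penalized inverse covariance estimation with
    data missing at random (np.nan in X): the E-step computes expected sufficient
    statistics under the current Gaussian fit, the M-step is a graphical-lasso fit
    of the expected covariance. Not the paper's code."""
    n, p = X.shape
    mu = np.nanmean(X, axis=0)
    cov = np.diag(np.nanvar(X, axis=0)) + 1e-3 * np.eye(p)
    for _ in range(n_iters):
        S = np.zeros((p, p))
        mean_acc = np.zeros(p)
        for i in range(n):
            obs = ~np.isnan(X[i])
            mis = ~obs
            xi = X[i].copy()
            add = np.zeros((p, p))
            if mis.any():
                # E-step: conditional moments of the missing block given the observed block
                reg = cov[np.ix_(mis, obs)] @ np.linalg.inv(cov[np.ix_(obs, obs)])
                xi[mis] = mu[mis] + reg @ (X[i, obs] - mu[obs])
                add[np.ix_(mis, mis)] = cov[np.ix_(mis, mis)] - reg @ cov[np.ix_(obs, mis)]
            mean_acc += xi
            S += np.outer(xi, xi) + add
        mu = mean_acc / n
        emp_cov = S / n - np.outer(mu, mu)
        cov, _ = graphical_lasso(emp_cov, alpha=alpha)    # penalized M-step
    return mu, cov
```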

Journal ArticleDOI
TL;DR: In this paper, the authors carried out robust modeling and influence diagnostics in Birnbaum-Saunders regression models and developed BS-t regression models, including maximum likelihood estimation based on the EM algorithm and diagnostic tools.
Abstract: In this paper, we carry out robust modeling and influence diagnostics in Birnbaum-Saunders (BS) regression models. Specifically, we present some aspects related to BS and log-BS distributions and their generalizations from the Student-t distribution, and develop BS-t regression models, including maximum likelihood estimation based on the EM algorithm and diagnostic tools. In addition, we apply the obtained results to real data from insurance, which shows the uses of the proposed model. Copyright © 2011 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: This paper introduces the generalized exponential-power series (GEPS) class of distributions, obtained by compounding generalized exponential and power series distributions, and derives several properties of the GEPS distributions, such as moments, a maximum likelihood estimation procedure via an EM algorithm, and large-sample inference.

Journal ArticleDOI
TL;DR: This paper exploits a Boltzmann machine, which makes it possible to take a large variety of structures into account, and resorts to a mean-field approximation and the "variational Bayes expectation-maximization" algorithm to solve a marginalized maximum a posteriori problem.
Abstract: Taking advantage of the structures inherent in many sparse decompositions constitutes a promising research axis. In this paper, we address this problem from a Bayesian point of view. We exploit a Boltzmann machine, which makes it possible to take a large variety of structures into account, and focus on the resolution of a marginalized maximum a posteriori problem. To solve this problem, we resort to a mean-field approximation and the "variational Bayes expectation-maximization" algorithm. This approach results in a soft procedure making no hard decision on the support or the values of the sparse representation. We show that this characteristic leads to an improvement of the performance over state-of-the-art algorithms.


Journal ArticleDOI
TL;DR: A robust estimation procedure and an EM-type algorithm are proposed to estimate mixture regression models, and it is demonstrated that the proposed estimation method is robust and works much better than the MLE when there are outliers or the error distribution has heavy tails.

Journal ArticleDOI
TL;DR: A novel foreground object detection scheme that integrates top-down information based on the expectation maximization (EM) framework and uses the detection result of the moving object to incorporate domain knowledge of object shapes into the construction of the top-down information.
Abstract: In this paper, we present a novel foreground object detection scheme that integrates the top-down information based on the expectation maximization (EM) framework. In this generalized EM framework, the top-down information is incorporated in an object model. Based on the object model and the state of each target, a foreground model is constructed. This foreground model can augment the foreground detection for the camouflage problem. Thus, an object's state-specific Markov random field (MRF) model is constructed for detection based on the foreground model and the background model. This MRF model depends on the latent variables that describe each object's state. The maximization of the MRF model is the M-step in the EM framework. Besides fusing spatial information, this MRF model can also adjust the contribution of the top-down information for detection. To obtain detection result using this MRF model, sampling importance resampling is used to sample the latent variable and the EM framework refines the detection iteratively. Besides the proposed generalized EM framework, our method does not need any prior information of the moving object, because we use the detection result of moving object to incorporate the domain knowledge of the object shapes into the construction of top-down information. Moreover, in our method, a kernel density estimation (KDE)—Gaussian mixture model (GMM) hybrid model is proposed to construct the probability density function of background and moving object model. For the background model, it has some advantages over GMM- and KDE-based methods. Experimental results demonstrate the capability of our method, particularly in handling the camouflage problem.

Journal ArticleDOI
TL;DR: In this article, an identification of nonlinear parameter varying systems using a particle filter under the framework of the expectation-maximization (EM) algorithm is described, where particle filters are adopted to deal with the computation of expectation functions.
Abstract: An identification method for nonlinear parameter varying systems using a particle filter under the framework of the expectation-maximization (EM) algorithm is described. In chemical industries, processes are often designed to perform tasks under various operating conditions. To circumvent the modeling difficulties rendered by multiple operating conditions and the transitions between different working points, the EM algorithm, which iteratively increases the likelihood function, is applied. Meanwhile the missing output data problem which is common in real industry is also considered in this work. Particle filters are adopted to deal with the computation of expectation functions. The efficiency of the proposed method is illustrated through simulated examples and a pilot-scale experiment. © 2012 American Institute of Chemical Engineers AIChE J, 2012

Journal ArticleDOI
TL;DR: It is shown that the sensor self-localization problem can be cast as a static parameter estimation problem for Hidden Markov Models, and fully decentralized versions of the Recursive Maximum Likelihood and on-line Expectation-Maximization algorithms are implemented to localize the sensor network simultaneously with target tracking.
Abstract: We show that the sensor self-localization problem can be cast as a static parameter estimation problem for Hidden Markov Models and we implement fully decentralized versions of the Recursive Maximum Likelihood and on-line Expectation-Maximization algorithms to localize the sensor network simultaneously with target tracking. For linear Gaussian models, our algorithms can be implemented exactly using a distributed version of the Kalman filter and a novel message passing algorithm. The latter allows each node to compute the local derivatives of the likelihood or the sufficient statistics needed for Expectation-Maximization. In the non-linear case, a solution based on local linearization in the spirit of the Extended Kalman Filter is proposed. In numerical examples we demonstrate that the developed algorithms are able to learn the localization parameters.

Posted Content
TL;DR: A general method for obtaining more flexible new distributions by compounding the extended Weibull and power series distributions, defining at least 68 new sub-models and including some well-known mixing distributions.
Abstract: In this paper, we introduce a new class of distributions which is obtained by compounding the extended Weibull and power series distributions. The compounding procedure follows the same set-up carried out by Adamidis and Loukas (1998) and defines at least 68 new sub-models. This class includes some well-known mixing distributions, such as the Weibull power series (Morais and Barreto-Souza, 2010) and exponential power series (Chahkandi and Ganjali, 2009) distributions. Some mathematical properties of the new class are studied including moments and generating function. We provide the density function of the order statistics and obtain their moments. The method of maximum likelihood is used for estimating the model parameters and an EM algorithm is proposed for computing the estimates. Special distributions are investigated in some detail. An application to a real data set is given to show the flexibility and potentiality of the new class of distributions.

Posted Content
TL;DR: The EM algorithm is used to extend the representation of CTBNs to allow a much richer class of transition duration distributions, known as phase distributions, which are a highly expressive semi-parametric representation that can approximate any duration distribution arbitrarily closely.
Abstract: Continuous time Bayesian networks (CTBNs) describe structured stochastic processes with finitely many states that evolve over continuous time. A CTBN is a directed (possibly cyclic) dependency graph over a set of variables, each of which represents a finite state continuous time Markov process whose transition model is a function of its parents. We address the problem of learning the parameters and structure of a CTBN from partially observed data. We show how to apply expectation maximization (EM) and structural expectation maximization (SEM) to CTBNs. The availability of the EM algorithm allows us to extend the representation of CTBNs to allow a much richer class of transition duration distributions, known as phase distributions. This class is a highly expressive semi-parametric representation, which can approximate any duration distribution arbitrarily closely. This extension to the CTBN framework addresses one of the main limitations of both CTBNs and DBNs - the restriction to exponentially / geometrically distributed durations. We present experimental results on a real data set of people's life spans, showing that our algorithm learns reasonable models - structure and parameters - from partially observed data, and, with the use of phase distributions, achieves better performance than DBNs.