
Showing papers on "Expectation–maximization algorithm published in 2010"


Book
23 Apr 2010
TL;DR: This book covers missing data theory and missing data handling, including maximum likelihood estimation, the EM algorithm and its extension to multivariate data, multiple imputation, and models for data that are missing not at random.
Abstract: Part 1. An Introduction to Missing Data. 1.1 Introduction. 1.2 Chapter Overview. 1.3 Missing Data Patterns. 1.4 A Conceptual Overview of Missing Data Theory. 1.5 A More Formal Description of Missing Data Theory. 1.6 Why Is the Missing Data Mechanism Important? 1.7 How Plausible Is the Missing at Random Mechanism? 1.8 An Inclusive Analysis Strategy. 1.9 Testing the Missing Completely at Random Mechanism. 1.10 Planned Missing Data Designs. 1.11 The Three-Form Design. 1.12 Planned Missing Data for Longitudinal Designs. 1.13 Conducting Power Analyses for Planned Missing Data Designs. 1.14 Data Analysis Example. 1.15 Summary. 1.16 Recommended Readings.
Part 2. Traditional Methods for Dealing with Missing Data. 2.1 Chapter Overview. 2.2 An Overview of Deletion Methods. 2.3 Listwise Deletion. 2.4 Pairwise Deletion. 2.5 An Overview of Single Imputation Techniques. 2.6 Arithmetic Mean Imputation. 2.7 Regression Imputation. 2.8 Stochastic Regression Imputation. 2.9 Hot-Deck Imputation. 2.10 Similar Response Pattern Imputation. 2.11 Averaging the Available Items. 2.12 Last Observation Carried Forward. 2.13 An Illustrative Simulation Study. 2.14 Summary. 2.15 Recommended Readings.
Part 3. An Introduction to Maximum Likelihood Estimation. 3.1 Chapter Overview. 3.2 The Univariate Normal Distribution. 3.3 The Sample Likelihood. 3.4 The Log-Likelihood. 3.5 Estimating Unknown Parameters. 3.6 The Role of First Derivatives. 3.7 Estimating Standard Errors. 3.8 Maximum Likelihood Estimation with Multivariate Normal Data. 3.9 A Bivariate Analysis Example. 3.10 Iterative Optimization Algorithms. 3.11 Significance Testing Using the Wald Statistic. 3.12 The Likelihood Ratio Test Statistic. 3.13 Should I Use the Wald Test or the Likelihood Ratio Statistic? 3.14 Data Analysis Example 1. 3.15 Data Analysis Example 2. 3.16 Summary. 3.17 Recommended Readings.
Part 4. Maximum Likelihood Missing Data Handling. 4.1 Chapter Overview. 4.2 The Missing Data Log-Likelihood. 4.3 How Do the Incomplete Data Records Improve Estimation? 4.4 An Illustrative Computer Simulation Study. 4.5 Estimating Standard Errors with Missing Data. 4.6 Observed Versus Expected Information. 4.7 A Bivariate Analysis Example. 4.8 An Illustrative Computer Simulation Study. 4.9 An Overview of the EM Algorithm. 4.10 A Detailed Description of the EM Algorithm. 4.11 A Bivariate Analysis Example. 4.12 Extending EM to Multivariate Data. 4.13 Maximum Likelihood Software Options. 4.14 Data Analysis Example 1. 4.15 Data Analysis Example 2. 4.16 Data Analysis Example 3. 4.17 Data Analysis Example 4. 4.18 Data Analysis Example 5. 4.19 Summary. 4.20 Recommended Readings.
Part 5. Improving the Accuracy of Maximum Likelihood Analyses. 5.1 Chapter Overview. 5.2 The Rationale for an Inclusive Analysis Strategy. 5.3 An Illustrative Computer Simulation Study. 5.4 Identifying a Set of Auxiliary Variables. 5.5 Incorporating Auxiliary Variables Into a Maximum Likelihood Analysis. 5.6 The Saturated Correlates Model. 5.7 The Impact of Non-Normal Data. 5.8 Robust Standard Errors. 5.9 Bootstrap Standard Errors. 5.10 The Rescaled Likelihood Ratio Test. 5.11 Bootstrapping the Likelihood Ratio Statistic. 5.12 Data Analysis Example 1. 5.13 Data Analysis Example 2. 5.14 Data Analysis Example 3. 5.15 Summary. 5.16 Recommended Readings.
Part 6. An Introduction to Bayesian Estimation. 6.1 Chapter Overview. 6.2 What Makes Bayesian Statistics Different? 6.3 A Conceptual Overview of Bayesian Estimation. 6.4 Bayes' Theorem. 6.5 An Analysis Example. 6.6 How Does Bayesian Estimation Apply to Multiple Imputation? 6.7 The Posterior Distribution of the Mean. 6.8 The Posterior Distribution of the Variance. 6.9 The Posterior Distribution of a Covariance Matrix. 6.10 Summary. 6.11 Recommended Readings.
Part 7. The Imputation Phase of Multiple Imputation. 7.1 Chapter Overview. 7.2 A Conceptual Description of the Imputation Phase. 7.3 A Bayesian Description of the Imputation Phase. 7.4 A Bivariate Analysis Example. 7.5 Data Augmentation with Multivariate Data. 7.6 Selecting Variables for Imputation. 7.7 The Meaning of Convergence. 7.8 Convergence Diagnostics. 7.9 Time-Series Plots. 7.10 Autocorrelation Function Plots. 7.11 Assessing Convergence from Alternate Starting Values. 7.12 Convergence Problems. 7.13 Generating the Final Set of Imputations. 7.14 How Many Data Sets Are Needed? 7.15 Summary. 7.16 Recommended Readings.
Part 8. The Analysis and Pooling Phases of Multiple Imputation. 8.1 Chapter Overview. 8.2 The Analysis Phase. 8.3 Combining Parameter Estimates in the Pooling Phase. 8.4 Transforming Parameter Estimates Prior to Combining. 8.5 Pooling Standard Errors. 8.6 The Fraction of Missing Information and the Relative Increase in Variance. 8.7 When Is Multiple Imputation Comparable to Maximum Likelihood? 8.8 An Illustrative Computer Simulation Study. 8.9 Significance Testing Using the t Statistic. 8.10 An Overview of Multiparameter Significance Tests. 8.11 Testing Multiple Parameters Using the D1 Statistic. 8.12 Testing Multiple Parameters by Combining Wald Tests. 8.13 Testing Multiple Parameters by Combining Likelihood Ratio Statistics. 8.14 Data Analysis Example 1. 8.15 Data Analysis Example 2. 8.16 Data Analysis Example 3. 8.17 Summary. 8.18 Recommended Readings.
Part 9. Practical Issues in Multiple Imputation. 9.1 Chapter Overview. 9.2 Dealing with Convergence Problems. 9.3 Dealing with Non-Normal Data. 9.4 To Round or Not to Round? 9.5 Preserving Interaction Effects. 9.6 Imputing Multiple-Item Questionnaires. 9.7 Alternate Imputation Algorithms. 9.8 Multiple Imputation Software Options. 9.9 Data Analysis Example 1. 9.10 Data Analysis Example 2. 9.11 Summary. 9.12 Recommended Readings.
Part 10. Models for Missing Not at Random Data. 10.1 Chapter Overview. 10.2 An Ad Hoc Approach to Dealing with MNAR Data. 10.3 The Theoretical Rationale for MNAR Models. 10.4 The Classic Selection Model. 10.5 Estimating the Selection Model. 10.6 Limitations of the Selection Model. 10.7 An Illustrative Analysis. 10.8 The Pattern Mixture Model. 10.9 Limitations of the Pattern Mixture Model. 10.10 An Overview of the Longitudinal Growth Model. 10.11 A Longitudinal Selection Model. 10.12 Random Coefficient Selection Models. 10.13 Pattern Mixture Models for Longitudinal Analyses. 10.14 Identification Strategies for Longitudinal Pattern Mixture Models. 10.15 Delta Method Standard Errors. 10.16 Overview of the Data Analysis Examples. 10.17 Data Analysis Example 1. 10.18 Data Analysis Example 2. 10.19 Data Analysis Example 3. 10.20 Data Analysis Example 4. 10.21 Summary. 10.22 Recommended Readings.
Part 11. Wrapping Things Up: Some Final Practical Considerations. 11.1 Chapter Overview. 11.2 Maximum Likelihood Software Options. 11.3 Multiple Imputation Software Options. 11.4 Choosing between Maximum Likelihood and Multiple Imputation. 11.5 Reporting the Results from a Missing Data Analysis. 11.6 Final Thoughts. 11.7 Recommended Readings.
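For readers who want the mechanics behind the book's bivariate EM examples, here is a minimal sketch (not the book's code) of the EM algorithm for a bivariate normal model with values missing on one variable; all function and variable names are illustrative.

```python
# Minimal EM sketch for a bivariate normal with y partially missing (assumed MAR).
import numpy as np

def em_bivariate_normal(x, y, n_iter=100):
    """x is fully observed; y may contain np.nan."""
    obs = ~np.isnan(y)
    # Starting values from the complete cases.
    mu = np.array([x.mean(), y[obs].mean()])
    cov = np.cov(np.vstack([x[obs], y[obs]]))
    for _ in range(n_iter):
        # E-step: fill missing y with its conditional mean given x, and keep
        # the conditional variance as a correction for the sufficient statistics.
        beta = cov[0, 1] / cov[0, 0]
        y_hat = np.where(obs, y, mu[1] + beta * (x - mu[0]))
        resid_var = cov[1, 1] - beta * cov[0, 1]     # Var(y | x)
        n_mis = np.sum(~obs)
        # M-step: update the mean vector and covariance matrix.
        mu = np.array([x.mean(), y_hat.mean()])
        d = np.vstack([x - mu[0], y_hat - mu[1]])
        cov = d @ d.T / len(x)
        cov[1, 1] += n_mis * resid_var / len(x)      # add conditional variances
    return mu, cov

# Simulated illustration with 30% of y missing completely at random.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.5 * x + rng.normal(scale=0.8, size=500)
y[rng.random(500) < 0.3] = np.nan
print(em_bivariate_normal(x, y))
```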

3,910 citations


Journal ArticleDOI
TL;DR: A probabilistic method, called the Coherent Point Drift (CPD) algorithm, is introduced for both rigid and nonrigid point set registration, along with a fast variant that reduces the method's computational complexity to linear.
Abstract: Point set registration is a key component in many computer vision tasks. The goal of point set registration is to assign correspondences between two sets of points and to recover the transformation that maps one point set to the other. Multiple factors, including an unknown nonrigid spatial transformation, large dimensionality of point set, noise, and outliers, make the point set registration a challenging problem. We introduce a probabilistic method, called the Coherent Point Drift (CPD) algorithm, for both rigid and nonrigid point set registration. We consider the alignment of two point sets as a probability density estimation problem. We fit the Gaussian mixture model (GMM) centroids (representing the first point set) to the data (the second point set) by maximizing the likelihood. We force the GMM centroids to move coherently as a group to preserve the topological structure of the point sets. In the rigid case, we impose the coherence constraint by reparameterization of GMM centroid locations with rigid parameters and derive a closed form solution of the maximization step of the EM algorithm in arbitrary dimensions. In the nonrigid case, we impose the coherence constraint by regularizing the displacement field and using the variational calculus to derive the optimal transformation. We also introduce a fast algorithm that reduces the method computation complexity to linear. We test the CPD algorithm for both rigid and nonrigid transformations in the presence of noise, outliers, and missing points, where CPD shows accurate results and outperforms current state-of-the-art methods.
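The sketch below is a simplified 2-D illustration of GMM-based rigid alignment in the spirit of CPD; it omits the uniform outlier component, scaling, and the fast linear-time computations described in the paper, and is not the authors' implementation.

```python
# GMM-based rigid point set alignment (2-D, no outlier term, no scaling).
import numpy as np

def rigid_gmm_registration(X, Y, n_iter=50):
    """Align point set Y (GMM centroids, M x 2) to data X (N x 2)."""
    R, t = np.eye(2), np.zeros(2)
    sigma2 = np.mean(((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1))
    for _ in range(n_iter):
        TY = Y @ R.T + t                                    # transformed centroids
        # E-step: responsibilities of each centroid for each data point.
        d2 = ((X[:, None, :] - TY[None, :, :]) ** 2).sum(-1)
        P = np.exp(-d2 / (2 * sigma2))
        P /= P.sum(axis=1, keepdims=True) + 1e-12
        # M-step: weighted Procrustes solution for rotation and translation.
        Np = P.sum()
        mu_x = (P.sum(axis=1) @ X) / Np
        mu_y = (P.sum(axis=0) @ Y) / Np
        A = (X - mu_x).T @ P @ (Y - mu_y)
        U, _, Vt = np.linalg.svd(A)
        C = np.diag([1.0, np.sign(np.linalg.det(U @ Vt))])  # avoid reflections
        R = U @ C @ Vt
        t = mu_x - R @ mu_y
        TY = Y @ R.T + t
        sigma2 = np.sum(P * ((X[:, None, :] - TY[None, :, :]) ** 2).sum(-1)) / (2 * Np)
    return R, t
```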

2,429 citations


Journal ArticleDOI
TL;DR: In this article, a general data-driven object-based model of multichannel audio data, assumed generated as a possibly underdetermined convolutive mixture of source signals, is considered.
Abstract: We consider inference in a general data-driven object-based model of multichannel audio data, assumed generated as a possibly underdetermined convolutive mixture of source signals. We work in the short-time Fourier transform (STFT) domain, where convolution is routinely approximated as linear instantaneous mixing in each frequency band. Each source STFT is given a model inspired from nonnegative matrix factorization (NMF) with the Itakura-Saito divergence, which underlies a statistical model of superimposed Gaussian components. We address estimation of the mixing and source parameters using two methods. The first one consists of maximizing the exact joint likelihood of the multichannel data using an expectation-maximization (EM) algorithm. The second method consists of maximizing the sum of individual likelihoods of all channels using a multiplicative update algorithm inspired from NMF methodology. Our decomposition algorithms are applied to stereo audio source separation in various settings, covering blind and supervised separation, music and speech sources, synthetic instantaneous and convolutive mixtures, as well as professionally produced music recordings. Our EM method produces competitive results with respect to state-of-the-art as illustrated on two tasks from the international Signal Separation Evaluation Campaign (SiSEC 2008).
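As a rough illustration of the single-channel building block the paper starts from, the sketch below applies the standard multiplicative updates for NMF with the Itakura-Saito divergence; it does not reproduce the paper's multichannel EM or multiplicative-update algorithms, and the variable names are illustrative.

```python
# Single-channel Itakura-Saito NMF via standard multiplicative updates.
import numpy as np

def is_nmf(V, n_components, n_iter=200, eps=1e-12):
    """V: nonnegative power spectrogram (freq x time)."""
    rng = np.random.default_rng(0)
    F, N = V.shape
    W = rng.random((F, n_components)) + eps
    H = rng.random((n_components, N)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        W *= ((WH ** -2 * V) @ H.T) / ((WH ** -1) @ H.T + eps)
        WH = W @ H + eps
        H *= (W.T @ (WH ** -2 * V)) / (W.T @ (WH ** -1) + eps)
    return W, H
```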

636 citations


Journal ArticleDOI
TL;DR: In this article, the contribution of each source to all mixture channels in the time-frequency domain is modeled as a zero-mean Gaussian random variable whose covariance encodes the spatial characteristics of the source.
Abstract: This paper addresses the modeling of reverberant recording environments in the context of under-determined convolutive blind source separation. We model the contribution of each source to all mixture channels in the time-frequency domain as a zero-mean Gaussian random variable whose covariance encodes the spatial characteristics of the source. We then consider four specific covariance models, including a full-rank unconstrained model. We derive a family of iterative expectation-maximization (EM) algorithms to estimate the parameters of each model and propose suitable procedures adapted from the state-of-the-art to initialize the parameters and to align the order of the estimated sources across all frequency bins. Experimental results over reverberant synthetic mixtures and live recordings of speech data show the effectiveness of the proposed approach.

368 citations


Journal ArticleDOI
TL;DR: This article studies maximum likelihood inference for a class of Wiener processes with random effects for degradation data, in which n independent subjects, each following a Wiener process with random drift and diffusion parameters, are observed at different times.

346 citations


Journal ArticleDOI
TL;DR: This paper describes model-based expectation-maximization source separation and localization (MESSL), a system for separating and localizing multiple sound sources from an underdetermined reverberant two-channel recording; as a byproduct of fitting its mixture model, the algorithm creates probabilistic spectrogram masks that can be used for source separation.
Abstract: This paper describes a system, referred to as model-based expectation-maximization source separation and localization (MESSL), for separating and localizing multiple sound sources from an underdetermined reverberant two-channel recording. By clustering individual spectrogram points based on their interaural phase and level differences, MESSL generates masks that can be used to isolate individual sound sources. We first describe a probabilistic model of interaural parameters that can be evaluated at individual spectrogram points. By creating a mixture of these models over sources and delays, the multi-source localization problem is reduced to a collection of single source problems. We derive an expectation-maximization algorithm for computing the maximum-likelihood parameters of this mixture model, and show that these parameters correspond well with interaural parameters measured in isolation. As a byproduct of fitting this mixture model, the algorithm creates probabilistic spectrogram masks that can be used for source separation. In simulated anechoic and reverberant environments, separations using MESSL produced on average a signal-to-distortion ratio 1.6 dB greater and perceptual evaluation of speech quality (PESQ) results 0.27 mean opinion score units greater than four comparable algorithms.

317 citations


Journal ArticleDOI
TL;DR: This paper studies maximum likelihood estimation of inverse Gaussian process models for degradation data; both subject-to-subject heterogeneity and covariate information can be incorporated into the model in a natural way, and the bootstrap is used to assess the variability of the maximum likelihood estimators.
Abstract: This paper studies the maximum likelihood estimation of a class of inverse Gaussian process models for degradation data. Both the subject-to-subject heterogeneity and covariate information can be incorporated into the model in a natural way. The EM algorithm is used to obtain the maximum likelihood estimators of the unknown parameters, and the bootstrap is used to assess the variability of the maximum likelihood estimators. Simulations are used to validate the method. The model is fitted to laser data and corresponding goodness-of-fit tests are carried out. Failure time distributions in terms of degradation level passages are calculated and illustrated. The supplemental materials for this article are available online.

284 citations


Journal ArticleDOI
TL;DR: A detailed review of mixture models and model-based clustering is provided, which offer a convenient yet formal framework for clustering and classification.
Abstract: Finite mixture models have a long history in statistics, having been used to model population heterogeneity, generalize distributional assumptions, and lately, to provide a convenient yet formal framework for clustering and classification. This paper provides a detailed review of mixture models and model-based clustering. Recent trends in the area, as well as open problems, are also discussed.
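As a concrete reference point for the EM machinery underlying model-based clustering, here is a generic Gaussian mixture EM sketch; it is illustrative only and not tied to any particular method or software reviewed in the paper.

```python
# Generic EM for a K-component Gaussian mixture model.
import numpy as np
from scipy.stats import multivariate_normal

def gmm_em(X, K, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)
    mu = X[rng.choice(n, K, replace=False)]
    cov = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(K)])
    for _ in range(n_iter):
        # E-step: responsibilities.
        dens = np.column_stack([pi[k] * multivariate_normal.pdf(X, mu[k], cov[k])
                                for k in range(K)])
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: mixing proportions, weighted means, and covariances.
        Nk = r.sum(axis=0)
        pi = Nk / n
        mu = (r.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            cov[k] = (r[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(d)
    return pi, mu, cov, r.argmax(axis=1)   # hard cluster labels from soft ones
```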

263 citations


Journal ArticleDOI
TL;DR: This paper presents a robust mixture modeling framework using multivariate skew t distributions, an extension of the multivariate Student's t family with additional shape parameters to regulate skewness; the resulting likelihood is very complicated, and maximum likelihood estimation is carried out with Monte Carlo EM algorithms.
Abstract: This paper presents a robust mixture modeling framework using the multivariate skew t distributions, an extension of the multivariate Student's t family with additional shape parameters to regulate skewness. The proposed model results in a very complicated likelihood. Two variants of Monte Carlo EM algorithms are developed to carry out maximum likelihood estimation of mixture parameters. In addition, we offer a general information-based method for obtaining the asymptotic covariance matrix of maximum likelihood estimates. Some practical issues including the selection of starting values as well as the stopping criterion are also discussed. The proposed methodology is applied to a subset of the Australian Institute of Sport data for illustration.

251 citations


Journal ArticleDOI
TL;DR: It is shown that when the dimensionality is high, MH-RM has advantages over existing methods such as the numerical quadrature-based EM algorithm.
Abstract: A Metropolis–Hastings Robbins–Monro (MH-RM) algorithm for high-dimensional maximum marginal likelihood exploratory item factor analysis is proposed. The sequence of estimates from the MH-RM algorithm converges with probability one to the maximum likelihood solution. Details on the computer implementation of this algorithm are provided. The accuracy of the proposed algorithm is demonstrated with simulations. As an illustration, the proposed algorithm is applied to explore the factor structure underlying a new quality of life scale for children. It is shown that when the dimensionality is high, MH-RM has advantages over existing methods such as numerical quadrature based EM algorithm. Extensions of the algorithm to other modeling frameworks are discussed.

241 citations


Journal ArticleDOI
TL;DR: A two-tier item factor analysis model is developed that reduces the dimensionality of the latent variable space and consequently yields significant computational savings, and an EM algorithm for full-information maximum marginal likelihood estimation is derived.
Abstract: Motivated by Gibbons et al.’s (Appl. Psychol. Meas. 31:4–19, 2007) full-information maximum marginal likelihood item bifactor analysis for polytomous data, and Rijmen, Vansteelandt, and De Boeck’s (Psychometrika 73:167–182, 2008) work on constructing computationally efficient estimation algorithms for latent variable models, a two-tier item factor analysis model is developed in this research. The modeling framework subsumes standard multidimensional IRT models, bifactor IRT models, and testlet response theory models as special cases. Features of the model lead to a reduction in the dimensionality of the latent variable space, and consequently significant computational savings. An EM algorithm for full-information maximum marginal likelihood estimation is developed. Simulations and real data demonstrations confirm the accuracy and efficiency of the proposed methods. Three real data sets from a large-scale educational assessment, a longitudinal public health survey, and a scale development study measuring patient reported quality of life outcomes are analyzed as illustrations of the model’s broad range of applicability.

Journal ArticleDOI
TL;DR: Simulation studies in the context of channel estimation, employing multipath wireless channels, show that the SPARLS algorithm has significant improvement over the conventional widely used recursive least squares (RLS) algorithm in terms of mean squared error (MSE).
Abstract: We develop a recursive L1-regularized least squares (SPARLS) algorithm for the estimation of a sparse tap-weight vector in the adaptive filtering setting. The SPARLS algorithm exploits noisy observations of the tap-weight vector output stream and produces its estimate using an expectation-maximization type algorithm. We prove the convergence of the SPARLS algorithm to a near-optimal estimate in a stationary environment and present analytical results for the steady state error. Simulation studies in the context of channel estimation, employing multipath wireless channels, show that the SPARLS algorithm has significant improvement over the conventional widely used recursive least squares (RLS) algorithm in terms of mean squared error (MSE). Moreover, these simulation studies suggest that the SPARLS algorithm (with slight modifications) can operate with lower computational requirements than the RLS algorithm, when applied to tap-weight vectors with fixed support.
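The SPARLS recursions themselves are not reproduced here; the sketch below only illustrates the batch l1-regularized least-squares objective the algorithm targets, using plain iterative soft-thresholding (proximal gradient) steps rather than the paper's recursive EM-type updates. Names and step sizes are illustrative.

```python
# Batch iterative soft-thresholding for l1-regularized least squares.
import numpy as np

def ista(A, y, lam, n_iter=500):
    """Minimize 0.5*||y - A w||^2 + lam*||w||_1 by proximal gradient steps."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ w - y)
        z = w - grad / L
        w = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
    return w
```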

Journal ArticleDOI
TL;DR: The current research extends the Metropolis-Hastings Robbins-Monro (MH-RM) algorithm to the case of maximum likelihood estimation under user-defined linear restrictions for confirmatory IFA, applying it to the IFA of real data from pediatric quality-of-life (QOL) research.
Abstract: Item factor analysis (IFA), already well established in educational measurement, is increasingly applied to psychological measurement in research settings. However, high-dimensional confirmatory IFA remains a numerical challenge. The current research extends the Metropolis-Hastings Robbins-Monro (MH-RM) algorithm, initially proposed for exploratory IFA, to the case of maximum likelihood estimation under user-defined linear restrictions for confirmatory IFA. MH-RM naturally integrates concepts such as the missing data formulation, data augmentation, the Metropolis algorithm, and stochastic approximation. In a limited simulation study, the accuracy of the MH-RM algorithm is checked against the standard Bock-Aitkin expectation-maximization (EM) algorithm. To demonstrate the efficiency and flexibility of the MH-RM algorithm, it is applied to the IFA of real data from pediatric quality-of-life (QOL) research in comparison with adaptive quadrature-based EM algorithm. The particular data set required a confirmat...

Proceedings Article
01 Jan 2010
TL;DR: In this article, a new class of asymmetric linear mixed models is presented that provides efficient estimation of the parameters in the analysis of longitudinal data; the motivation is that the accuracy of the assumed normal distribution is crucial for valid inference of the parameters.
Abstract: Linear mixed models with normally distributed response are routinely used in longitudinal data. However, the accuracy of the assumed normal distribution is crucial for valid inference of the parameters. We present a new class of asymmetric linear mixed models that provides for an efficient estimation of the parameters in the analysis of longitudinal data. We assume that, marginally, the random effects follow a multivariate skew-normal/independent distribution (Branco and Dey (2001)) and that the random errors follow a symmetric normal/independent distribution (Lange and Sinsheimer (1993)), providing an appealing robust alternative to the usual symmetric normal distribution in linear mixed models. Specific distributions examined include the skew-normal, the skew-t, the skew-slash, and the skew-contaminated normal distribution. We present an efficient EM-type algorithm for the computation of maximum likelihood estimation of parameters. The technique for the prediction of future responses under this class of distributions is also investigated. The methodology is illustrated through an application to Framingham cholesterol data and a simulation study.

Journal ArticleDOI
TL;DR: In this article, a factorial computer experiment was conducted with synthetic measurement data to compare four likelihood functions and three methods of combining likelihood values using the CERES-Maize model of the Decision Support System for Agrotechnology Transfer (DSSAT).

Journal ArticleDOI
TL;DR: IEMGA is a population-based metaheuristic derived from electromagnetism theory that can automatically converge to a good solution; it combines the advantages of the electromagnetism-like (EM) algorithm and the genetic algorithm (GA) while reducing the computational complexity of EM.
Abstract: Based on the electromagnetism-like algorithm, an evolutionary algorithm, the improved EM algorithm with genetic algorithm technique (IEMGA), is proposed in this article for optimization of fractional-order PID (FOPID) controllers. IEMGA is a population-based meta-heuristic algorithm derived from electromagnetism theory. It does not require gradient calculations and can automatically converge to a good solution. For FOPID control optimization, IEMGA simulates the "attraction" and "repulsion" of charged particles by treating each controller parameter as an electrical charge. The random local neighborhood search of the EM algorithm is improved by using GA and a competitive concept. IEMGA combines the advantages of EM and GA while reducing the computational complexity of EM. Finally, several illustrative examples are presented to show the performance and effectiveness of the method.

Journal ArticleDOI
TL;DR: Simulation studies are presented to show the advantage of this flexible class of probability distributions in clustering heterogeneous data, and to show that the maximum likelihood estimates based on the EM-type algorithm have good asymptotic properties.

Journal ArticleDOI
TL;DR: In this paper, a transposable regularized covariance model is proposed to estimate the mean and non-singular covariance matrices of high-dimensional data in the form of a matrix, where rows and columns each have a separate mean vector and covariance matrix.
Abstract: Missing data estimation is an important challenge with high-dimensional data arranged in the form of a matrix. Typically this data matrix is transposable, meaning that either the rows, columns or both can be treated as features. To model transposable data, we present a modification of the matrix-variate normal, the mean-restricted matrix-variate normal, in which the rows and columns each have a separate mean vector and covariance matrix. By placing additive penalties on the inverse covariance matrices of the rows and columns, these so called transposable regularized covariance models allow for maximum likelihood estimation of the mean and non-singular covariance matrices. Using these models, we formulate EM-type algorithms for missing data imputation in both the multivariate and transposable frameworks. We present theoretical results exploiting the structure of our transposable models that allow these models and imputation methods to be applied to high-dimensional data. Simulations and results on microarray data and the Netflix data show that these imputation techniques often outperform existing methods and offer a greater degree of flexibility.

Journal ArticleDOI
TL;DR: Model-based clustering using a family of Gaussian mixture models, with parsimonious factor analysis-like covariance structure, is described and an efficient algorithm for its implementation is presented, showing its effectiveness when compared to existing software.

Journal ArticleDOI
TL;DR: The meta-analysis command metaan, described in this paper, can be used to perform fixed- or random-effects meta-analysis; it reports a variety of heterogeneity measures, including Cochran's Q, I2, H2M, and the between-studies variance estimate τ^2.
Abstract: This article describes the new meta-analysis command metaan, which can be used to perform fixed- or random-effects meta-analysis. Besides the standard DerSimonian and Laird approach, metaan offers a wide choice of available models: maximum likelihood, profile likelihood, restricted maximum likelihood, and a permutation model. The command reports a variety of heterogeneity measures, including Cochran's Q, I2, H2M, and the between-studies variance estimate τ^2. A forest plot and a graph of the maximum likelihood function can also be generated.
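For orientation, the standard DerSimonian and Laird computation behind the command's default random-effects model can be sketched as follows; this is a Python illustration rather than the Stata command, and `yi` and `vi` stand for assumed study effect estimates and within-study variances.

```python
# DerSimonian-Laird random-effects meta-analysis (standard textbook formulas).
import numpy as np

def dersimonian_laird(yi, vi):
    wi = 1.0 / vi                               # fixed-effect weights
    ybar = np.sum(wi * yi) / np.sum(wi)
    Q = np.sum(wi * (yi - ybar) ** 2)           # Cochran's Q
    k = len(yi)
    c = np.sum(wi) - np.sum(wi ** 2) / np.sum(wi)
    tau2 = max(0.0, (Q - (k - 1)) / c)          # between-studies variance
    wstar = 1.0 / (vi + tau2)                   # random-effects weights
    mu = np.sum(wstar * yi) / np.sum(wstar)     # pooled effect
    se = np.sqrt(1.0 / np.sum(wstar))
    I2 = max(0.0, (Q - (k - 1)) / Q) * 100 if Q > 0 else 0.0
    return mu, se, tau2, Q, I2
```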

Journal ArticleDOI
TL;DR: A comparison of three algorithms for computing maximum likelihood estimates of the parameters of these models is proposed: the EM algorithm, the classification EM algorithm, and the stochastic EM algorithm.
Abstract: In most applications, the parameters of a mixture of linear regression models are estimated by maximum likelihood using the expectation maximization (EM) algorithm. In this article, we propose the comparison of three algorithms to compute maximum likelihood estimates of the parameters of these models: the EM algorithm, the classification EM algorithm and the stochastic EM algorithm. The comparison of the three procedures was done through a simulation study of the performance (computational effort, statistical properties of estimators and goodness of fit) of these approaches on simulated data sets. Simulation results show that the choice of the approach depends essentially on the configuration of the true regression lines and the initialization of the algorithms.
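A minimal sketch of the plain EM variant for a two-component mixture of linear regressions is given below; it is illustrative only, and the classification EM and stochastic EM variants compared in the paper differ mainly in how the E-step memberships are used.

```python
# EM for a two-component mixture of simple linear regressions.
import numpy as np
from scipy.stats import norm

def mix_reg_em(x, y, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones_like(x), x])
    beta = rng.normal(size=(2, 2))            # [intercept, slope] per component
    sigma = np.array([y.std(), y.std()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each observation.
        dens = np.column_stack([pi[k] * norm.pdf(y, X @ beta[k], sigma[k])
                                for k in range(2)])
        r = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: weighted least squares per component.
        for k in range(2):
            W = r[:, k]
            XtW = X.T * W
            beta[k] = np.linalg.solve(XtW @ X, XtW @ y)
            resid = y - X @ beta[k]
            sigma[k] = np.sqrt(np.sum(W * resid ** 2) / W.sum())
        pi = r.mean(axis=0)
    return beta, sigma, pi
```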

Proceedings Article
01 Dec 2010
TL;DR: In this paper, the authors propose a new general methodology for inference and learning in nonlinear state-space models that are described probabilistically by non-parametric GP models, applying the expectation maximization algorithm to iterate between inference in the latent state-space and learning the parameters of the underlying GP dynamics model.
Abstract: State-space inference and learning with Gaussian processes (GPs) is an unsolved problem. We propose a new, general methodology for inference and learning in nonlinear state-space models that are described probabilistically by non-parametric GP models. We apply the expectation maximization algorithm to iterate between inference in the latent state-space and learning the parameters of the underlying GP dynamics model. Copyright 2010 by the authors.

Proceedings ArticleDOI
14 Mar 2010
TL;DR: Experimental results are reported for an implementation in a generalized sidelobe canceller-like spatial beamforming configuration for 3 speech sources with significant coherent noise in reverberant environments, demonstrating the usefulness of the novel modeling framework.
Abstract: In this paper we propose to employ directional statistics in a complex vector space to approach the problem of blind speech separation in the presence of spatially correlated noise. We interpret the values of the short time Fourier transform of the microphone signals to be draws from a mixture of complex Watson distributions, a probabilistic model which naturally accounts for spatial aliasing. The parameters of the density are related to the a priori source probabilities, the power of the sources and the transfer function ratios from sources to sensors. Estimation formulas are derived for these parameters by employing the Expectation Maximization (EM) algorithm. The E-step corresponds to the estimation of the source presence probabilities for each time-frequency bin, while the M-step leads to a maximum signal-to-noise ratio (MaxSNR) beamformer in the presence of uncertainty about the source activity. Experimental results are reported for an implementation in a generalized sidelobe canceller (GSC) like spatial beamforming configuration for 3 speech sources with significant coherent noise in reverberant environments, demonstrating the usefulness of the novel modeling framework.

Journal ArticleDOI
TL;DR: The ability of the approach to properly describe trajectories with sudden changes is shown with real data from two different scenarios, a shopping center and a university campus, and a set of human activities in both scenarios is successfully recognized.
Abstract: This paper proposes an approach for recognizing human activities (more specifically, pedestrian trajectories) in video sequences, in a surveillance context. A system for automatic processing of video information for surveillance purposes should be capable of detecting, recognizing, and collecting statistics of human activity, reducing human intervention as much as possible. In the method described in this paper, human trajectories are modeled as a concatenation of segments produced by a set of low level dynamical models. These low level models are estimated in an unsupervised fashion, based on a finite mixture formulation, using the expectation-maximization (EM) algorithm; the number of models is automatically obtained using a minimum message length (MML) criterion. This leads to a parsimonious set of models tuned to the complexity of the scene. We describe the switching among the low-level dynamic models by a hidden Markov chain; thus, the complete model is termed a switched dynamical hidden Markov model (SD-HMM). The performance of the proposed method is illustrated with real data from two different scenarios: a shopping center and a university campus. A set of human activities in both scenarios is successfully recognized by the proposed system. These experiments show the ability of our approach to properly describe trajectories with sudden changes.

Journal ArticleDOI
TL;DR: As this paper explains, the EM algorithm is a special case of a more general algorithm called the MM algorithm, which can be used to solve high-dimensional optimization and estimation problems, with applications to random graph models, discriminant analysis, and image restoration.
Abstract: The EM algorithm is a special case of a more general algorithm called the MM algorithm. Specific MM algorithms often have nothing to do with missing data. The first M step of an MM algorithm creates a surrogate function that is optimized in the second M step. In minimization, MM stands for majorize–minimize; in maximization, it stands for minorize–maximize. This two-step process always drives the objective function in the right direction. Construction of MM algorithms relies on recognizing and manipulating inequalities rather than calculating conditional expectations. This survey walks the reader through the construction of several specific MM algorithms. The potential of the MM algorithm in solving high-dimensional optimization and estimation problems is its most attractive feature. Our applications to random graph models, discriminant analysis and image restoration showcase this ability.
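To make the majorize-minimize pattern concrete, the toy sketch below (not taken from the paper) computes a sample median by majorizing each absolute-value term with a quadratic at the current iterate, so that every surrogate minimization reduces to a weighted mean; no missing-data or EM machinery is involved.

```python
# MM illustration: the sample median via quadratic majorization of |x - m|.
import numpy as np

def mm_median(x, n_iter=100, eps=1e-8):
    m = x.mean()                              # starting value
    for _ in range(n_iter):
        w = 1.0 / (np.abs(x - m) + eps)       # weights from the quadratic majorizer
        m = np.sum(w * x) / np.sum(w)         # minimize the surrogate (weighted mean)
    return m

x = np.array([1.0, 2.0, 2.5, 3.0, 100.0])
print(mm_median(x), np.median(x))             # both close to 2.5
```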

Journal ArticleDOI
TL;DR: An efficient alternating expectation-conditional maximization (AECM) algorithm is presented for computing maximum likelihood estimates of the parameters on the basis of two convenient hierarchical formulations.
Abstract: We consider an extension of linear mixed models by assuming a multivariate skew t distribution for the random effects and a multivariate t distribution for the error terms. The proposed model provides flexibility in capturing the effects of skewness and heavy tails simultaneously among continuous longitudinal data. We present an efficient alternating expectation-conditional maximization (AECM) algorithm for the computation of maximum likelihood estimates of parameters on the basis of two convenient hierarchical formulations. The techniques for the prediction of random effects and intermittent missing values under this model are also investigated. Our methodologies are illustrated through an application to schizophrenia data.

Journal ArticleDOI
TL;DR: A new Bayesian model is proposed for image segmentation based upon Gaussian mixture models (GMM) with spatial smoothness constraints that exploits the Dirichlet compound multinomial (DCM) probability density and a Gauss-Markov random field on the Dirichlet parameters to impose smoothness.
Abstract: A new Bayesian model is proposed for image segmentation based upon Gaussian mixture models (GMM) with spatial smoothness constraints. This model exploits the Dirichlet compound multinomial (DCM) probability density to model the mixing proportions (i.e., the probabilities of class labels) and a Gauss-Markov random field (MRF) on the Dirichlet parameters to impose smoothness. The main advantages of this model are two. First, it explicitly models the mixing proportions as probability vectors and simultaneously imposes spatial smoothness. Second, it results in closed form parameter updates using a maximum a posteriori (MAP) expectation-maximization (EM) algorithm. Previous efforts on this problem used models that did not model the mixing proportions explicitly as probability vectors or could not be solved exactly requiring either time consuming Markov Chain Monte Carlo (MCMC) or inexact variational approximation methods. Numerical experiments are presented that demonstrate the superiority of the proposed model for image segmentation compared to other GMM-based approaches. The model is also successfully compared to state of the art image segmentation methods in clustering both natural images and images degraded by noise.

Proceedings ArticleDOI
25 Jul 2010
TL;DR: In this article, a probabilistic approach for statistical modeling of the loads in distribution networks is presented, where the Expectation Maximization (EM) algorithm is used to obtain the parameters of the mixture components.
Abstract: This paper presents a probabilistic approach for statistical modelling of the loads in distribution networks. In a distribution network, the Probability Density Functions (pdfs) of loads at different buses show a number of variations and cannot be represented by any specific distribution. The approach presented in this paper represents all the load pdfs through Gaussian Mixture Model (GMM). The Expectation Maximization (EM) algorithm is used to obtain the parameters of the mixture components. The performance of the method is demonstrated on a 95-bus generic distribution network model.
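As a toy illustration of the basic idea, the snippet below fits a two-component Gaussian mixture to synthetic one-dimensional "load" data with EM via scikit-learn; the synthetic data and the choice of library are assumptions for illustration, not the paper's network model or tooling.

```python
# Fit a Gaussian mixture model to a synthetic bus-load sample with EM.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Synthetic bimodal "load" data standing in for a bus load profile (in MW, say).
load = np.concatenate([rng.normal(20, 3, 600), rng.normal(45, 6, 400)])

gmm = GaussianMixture(n_components=2, random_state=0).fit(load.reshape(-1, 1))
print(gmm.weights_, gmm.means_.ravel(), np.sqrt(gmm.covariances_.ravel()))
```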

Journal ArticleDOI
TL;DR: This work proposes a general framework of functional mixed effects models in which within-unit and within-subunit variations are modeled through two separate sets of principal components; the subunit-level functions are allowed to be correlated.
Abstract: Hierarchical functional data are widely seen in complex studies where subunits are nested within units, which in turn are nested within treatment groups. We propose a general framework of functional mixed effects model for such data: within-unit and within-subunit variations are modeled through two separate sets of principal components; the subunit level functions are allowed to be correlated. Penalized splines are used to model both the mean functions and the principal components functions, where roughness penalties are used to regularize the spline fit. An expectation–maximization (EM) algorithm is developed to fit the model, while the specific covariance structure of the model is utilized for computational efficiency to avoid storage and inversion of large matrices. Our dimension reduction with principal components provides an effective solution to the difficult tasks of modeling the covariance kernel of a random function and modeling the correlation between functions. The proposed methodology is illus...

Journal ArticleDOI
TL;DR: In this paper, a robust maximum likelihood method for estimating the unbiased mean inclination from inclination-only data is developed; it can evaluate the likelihood function anywhere in the parameter space and for any inclination-only data set.
Abstract: SUMMARY We have developed a new robust maximum likelihood method for estimating the unbiased mean inclination from inclination-only data. In paleomagnetic analysis, the arithmetic mean of inclination-only data is known to introduce a shallowing bias. Several methods have been introduced to estimate the unbiased mean inclination of inclination-only data together with measures of the dispersion. Some inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all the methods require various assumptions and approximations that are often inappropriate. For some steep and dispersed data sets, these methods provide estimates that are significantly displaced from the peak of the likelihood function to systematically shallower inclination. The problem locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest, because some elements of the likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study, we succeeded in analytically cancelling exponential elements from the log-likelihood function, and we are now able to calculate its value anywhere in the parameter space and for any inclination-only data set. Furthermore, we can now calculate the partial derivatives of the log-likelihood function with desired accuracy, and locate the maximum likelihood without the assumptions required by previous methods. To assess the reliability and accuracy of our method, we generated large numbers of random Fisher-distributed data sets, for which we calculated mean inclinations and precision parameters. The comparisons show that our new robust Arason–Levi maximum likelihood method is the most reliable, and the mean inclination estimates are the least biased towards shallow values.