
Showing papers on "Expectation–maximization algorithm published in 2011"


Journal ArticleDOI
TL;DR: A Maximum Likelihood (ML) framework is employed and an Expectation Maximisation (EM) algorithm is derived to compute these ML estimates; the approach lends itself perfectly to the particle smoother, which provides arbitrarily good estimates.

543 citations


Journal ArticleDOI
TL;DR: This work is able to both analyze the statistical error associated with any global optimum, and prove that a simple algorithm based on projected gradient descent will converge in polynomial time to a small neighborhood of the set of all global minimizers.
Abstract: Although the standard formulations of prediction problems involve fully-observed and noiseless data drawn in an i.i.d. manner, many applications involve noisy and/or missing data, possibly involving dependence, as well. We study these issues in the context of high-dimensional sparse linear regression, and propose novel estimators for the cases of noisy, missing and/or dependent data. Many standard approaches to noisy or missing data, such as those using the EM algorithm, lead to optimization problems that are inherently nonconvex, and it is difficult to establish theoretical guarantees on practical algorithms. While our approach also involves optimizing nonconvex programs, we are able to both analyze the statistical error associated with any global optimum, and more surprisingly, to prove that a simple algorithm based on projected gradient descent will converge in polynomial time to a small neighborhood of the set of all global minimizers. On the statistical side, we provide nonasymptotic bounds that hold with high probability for the cases of noisy, missing and/or dependent data. On the computational side, we prove that under the same types of conditions required for statistical consistency, the projected gradient descent algorithm is guaranteed to converge at a geometric rate to a near-global minimizer. We illustrate these theoretical predictions with simulations, showing close agreement with the predicted scalings.
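
As a hedged illustration of the computational side, the sketch below runs projected gradient descent over an l1-ball constraint, the algorithmic template the abstract refers to. It is not the authors' code: the Gram-matrix and linear-term inputs (`gram`, `lin`) are ordinary least-squares quantities here just to keep the example runnable, whereas the paper would replace them with its corrected surrogates for noisy or missing covariates.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection of v onto the l1 ball of given radius (Duchi et al. 2008)."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    ks = np.arange(1, len(u) + 1)
    rho = np.nonzero(u - (css - radius) / ks > 0)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def projected_gradient(gram, lin, radius, step, n_iter=500):
    """Minimize 0.5*b'Gb - g'b subject to ||b||_1 <= radius by projected gradient descent."""
    b = np.zeros(len(lin))
    for _ in range(n_iter):
        grad = gram @ b - lin
        b = project_l1_ball(b - step * grad, radius)
    return b

# Toy usage: with clean data, gram = X'X/n and lin = X'y/n; the paper replaces these
# surrogates with corrected versions when covariates are noisy or partially missing.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
beta_true = np.zeros(p); beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.standard_normal(n)
gram, lin = X.T @ X / n, X.T @ y / n
beta_hat = projected_gradient(gram, lin, radius=np.abs(beta_true).sum(), step=0.1)
```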

465 citations


Journal ArticleDOI
TL;DR: Approximate Bayesian Computing and Pattern-Oriented Modelling are discussed, their principles, advantages and caveats are outlined, and their potential for integrating stochastic simulation models into a unified framework for statistical modelling is demonstrated.
Abstract: Statistical models are the traditional choice to test scientific theories when observations, processes or boundary conditions are subject to stochasticity. Many important systems in ecology and biology, however, are difficult to capture with statistical models. Stochastic simulation models offer an alternative, but they were hitherto associated with a major disadvantage: their likelihood functions can usually not be calculated explicitly, and thus it is difficult to couple them to well-established statistical theory such as maximum likelihood and Bayesian statistics. A number of new methods, among them Approximate Bayesian Computing and Pattern-Oriented Modelling, bypass this limitation. These methods share three main principles: aggregation of simulated and observed data via summary statistics, likelihood approximation based on the summary statistics, and efficient sampling. We discuss principles as well as advantages and caveats of these methods, and demonstrate their potential for integrating stochastic simulation models into a unified framework for statistical modelling.

335 citations


Book
19 Mar 2011
TL;DR: This introduction to the expectation–maximization (EM) algorithm provides an intuitive and mathematically rigorous understanding of EM.
Abstract: This introduction to the expectation–maximization (EM) algorithm provides an intuitive and mathematically rigorous understanding of EM. Two of the most popular applications of EM are described in detail: estimating Gaussian mixture models (GMMs), and estimating hidden Markov models (HMMs). EM solutions are also derived for learning an optimal mixture of fixed models, for estimating the parameters of a compound Dirichlet distribution, and for disentangling superimposed signals. Practical issues that arise in the use of EM are discussed, as well as variants of the algorithm that help deal with these challenges.
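
As a minimal, hedged sketch of the first application mentioned above (not the book's own code), here is EM for a one-dimensional two-component Gaussian mixture:

```python
import numpy as np
from scipy.stats import norm

def em_gmm_1d(x, n_iter=100):
    """EM for a two-component 1-D Gaussian mixture; returns weights, means, stds."""
    # crude initialisation from the data
    w = np.array([0.5, 0.5])
    mu = np.percentile(x, [25, 75]).astype(float)
    sd = np.array([x.std(), x.std()])
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] = P(component k | x_i)
        dens = np.stack([wk * norm.pdf(x, m, s) for wk, m, s in zip(w, mu, sd)], axis=1)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted maximum-likelihood updates
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return w, mu, sd

# Usage on synthetic data drawn from a known mixture
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 700)])
print(em_gmm_1d(x))
```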

314 citations


Journal ArticleDOI
TL;DR: In this article, two approaches are developed for constructing the maximum likelihood estimates (MLE) of the state and measurement noise covariance matrices from operating input-output data when the states and/or parameters are estimated using the extended Kalman filter (EKF).

241 citations


Journal ArticleDOI
14 Jan 2011-Entropy
TL;DR: Owing to more degrees of freedom in tuning the parameters, the proposed family of AB-multiplicative NMF algorithms is shown to improve robustness with respect to noise and outliers.
Abstract: We propose a class of multiplicative algorithms for Nonnegative Matrix Factorization (NMF) which are robust with respect to noise and outliers. To achieve this, we formulate a new family of generalized divergences referred to as the Alpha-Beta-divergences (AB-divergences), which are parameterized by two tuning parameters, alpha and beta, and smoothly connect the fundamental Alpha-, Beta- and Gamma-divergences. By adjusting these tuning parameters, we show that a wide range of standard and new divergences can be obtained. The corresponding learning algorithms for NMF are shown to integrate and generalize many existing ones, including the Lee-Seung, ISRA (Image Space Reconstruction Algorithm), EMML (Expectation Maximization Maximum Likelihood), Alpha-NMF, and Beta-NMF. Owing to more degrees of freedom in tuning the parameters, the proposed family of AB-multiplicative NMF algorithms is shown to improve robustness with respect to noise and outliers. The analysis illuminates the links between the AB-divergence and other divergences, especially the Gamma- and Itakura-Saito divergences.
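
For orientation, the simplest member of the family these algorithms integrate is the classical Lee-Seung multiplicative update for the Frobenius cost; the sketch below shows only that special case, not the AB-divergence algorithm itself.

```python
import numpy as np

def nmf_multiplicative(X, rank, n_iter=200, eps=1e-9):
    """Lee-Seung multiplicative updates for ||X - W H||_F^2 with X, W, H >= 0."""
    rng = np.random.default_rng(0)
    n, m = X.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update H; ratios keep entries nonnegative
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # update W
    return W, H

# Usage: factorize a small nonnegative matrix and check the reconstruction error
X = np.abs(np.random.default_rng(2).random((20, 15)))
W, H = nmf_multiplicative(X, rank=2)
print(np.linalg.norm(X - W @ H))
```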

229 citations


Journal ArticleDOI
TL;DR: An innovative EM-like algorithm, namely, the Expectation Conditional Maximization for Point Registration (ECMPR) algorithm, is introduced, which allows the use of general covariance matrices for the mixture model components and improves over the isotropic covariance case.
Abstract: This paper addresses the issue of matching rigid and articulated shapes through probabilistic point registration. The problem is recast into a missing data framework where unknown correspondences are handled via mixture models. Adopting a maximum likelihood principle, we introduce an innovative EM-like algorithm, namely, the Expectation Conditional Maximization for Point Registration (ECMPR) algorithm. The algorithm allows the use of general covariance matrices for the mixture model components and improves over the isotropic covariance case. We analyze in detail the associated consequences in terms of estimation of the registration parameters, and propose an optimal method for estimating the rotational and translational parameters based on semidefinite positive relaxation. We extend rigid registration to articulated registration. Robustness is ensured by detecting and rejecting outliers through the addition of a uniform component to the Gaussian mixture model at hand. We provide an in-depth analysis of our method and compare it both theoretically and experimentally with other robust methods for point registration.

217 citations


Journal ArticleDOI
TL;DR: An approximation to the prior/posterior distribution of the parameters in the beta distribution is introduced and an analytically tractable (closed form) Bayesian approach to the parameter estimation is proposed.
Abstract: Bayesian estimation of the parameters in beta mixture models (BMM) is analytically intractable. The numerical solutions to simulate the posterior distribution are available, but incur high computational cost. In this paper, we introduce an approximation to the prior/posterior distribution of the parameters in the beta distribution and propose an analytically tractable (closed form) Bayesian approach to the parameter estimation. The approach is based on the variational inference (VI) framework. Following the principles of the VI framework and utilizing the relative convexity bound, the extended factorized approximation method is applied to approximate the distribution of the parameters in BMM. In a fully Bayesian model where all of the parameters of the BMM are considered as variables and assigned proper distributions, our approach can asymptotically find the optimal estimate of the parameters posterior distribution. Also, the model complexity can be determined based on the data. The closed-form solution is proposed so that no iterative numerical calculation is required. Meanwhile, our approach avoids the drawback of overfitting in the conventional expectation maximization algorithm. The good performance of this approach is verified by experiments with both synthetic and real data.

206 citations


Journal ArticleDOI
TL;DR: In this article, it was shown that unbiasedness is enough when the estimated likelihood is used inside a Metropolis-Hastings algorithm, which is perhaps surprising given the celebrated results on maximum simulated likelihood estimation.
Abstract: Suppose we wish to carry out likelihood based inference but we solely have an unbiased simulation based estimator of the likelihood. We note that unbiasedness is enough when the estimated likelihood is used inside a Metropolis-Hastings algorithm. This result has recently been introduced in the statistics literature by Andrieu, Doucet, and Holenstein (2007) and is perhaps surprising given the celebrated results on maximum simulated likelihood estimation. Bayesian inference based on simulated likelihood can be widely applied in microeconomics, macroeconomics and financial econometrics. One way of generating unbiased estimates of the likelihood is by the use of a particle filter. We illustrate these methods on four problems in econometrics, producing rather generic methods. Taken together, these methods imply that if we can simulate from an economic model we can carry out likelihood based inference using its simulations.
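
The key point, that an unbiased likelihood estimate suffices inside Metropolis-Hastings, can be illustrated with a toy sketch (ours, not the authors'): the likelihood is estimated by Monte Carlo averaging over simulated latent variables, standing in for a particle filter, and the noisy estimate at the current state is recycled rather than refreshed.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
y = rng.normal(loc=1.5, scale=np.sqrt(2.0), size=50)   # data from y = theta + z + e with theta = 1.5

def loglik_hat(theta, M=30):
    """Unbiased Monte Carlo estimate of the likelihood (returned on the log scale).

    Each observation's density p(y_i | theta) = integral N(y_i; theta + z, 1) N(z; 0, 1) dz
    is estimated by averaging over M simulated latent z's, mimicking a particle filter."""
    z = rng.normal(size=(M, len(y)))
    per_obs = norm.pdf(y[None, :], loc=theta + z, scale=1.0).mean(axis=0)
    return np.log(per_obs).sum()

def pseudo_marginal_mh(n_iter=2000, step=0.3):
    theta, ll = 0.0, loglik_hat(0.0)
    draws = []
    for _ in range(n_iter):
        prop = theta + step * rng.standard_normal()
        ll_prop = loglik_hat(prop)                  # fresh estimate at the proposal
        if np.log(rng.random()) < ll_prop - ll:     # flat prior; standard MH accept step
            theta, ll = prop, ll_prop               # keep (do NOT refresh) the estimate at the current state
        draws.append(theta)
    return np.array(draws)

print(pseudo_marginal_mh().mean())
```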

197 citations


Journal ArticleDOI
TL;DR: A smoothing method based on the log-sum exponential function is developed and indicates that such a smoothing approach leads to a novel smoothed primal-dual model and suggests labelings with maximum entropy.
Abstract: This paper is devoted to the optimization problem of continuous multi-partitioning, or multi-labeling, which is based on a convex relaxation of the continuous Potts model. In contrast to previous efforts, which are tackling the optimal labeling problem in a direct manner, we first propose a novel dual model and then build up a corresponding duality-based approach. By analyzing the dual formulation, sufficient conditions are derived which show that the relaxation is often exact, i.e. there exists optimal solutions that are also globally optimal to the original nonconvex Potts model. In order to deal with the nonsmooth dual problem, we develop a smoothing method based on the log-sum exponential function and indicate that such a smoothing approach leads to a novel smoothed primal-dual model and suggests labelings with maximum entropy. Such a smoothing method for the dual model also yields a new thresholding scheme to obtain approximate solutions. An expectation maximization like algorithm is proposed based on the smoothed formulation which is shown to be superior in efficiency compared to earlier approaches from continuous optimization. Numerical experiments also show that our method outperforms several competitive approaches in various aspects, such as lower energies and better visual quality.
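
The smoothing device mentioned in the abstract is the standard log-sum-exponential approximation of the maximum; a small numerical sketch of that building block (ours, assuming nothing beyond the definition):

```python
import numpy as np

def smooth_max(x, s):
    """Log-sum-exp approximation of max(x); tends to max(x) as the smoothing parameter s grows."""
    x = np.asarray(x, dtype=float)
    m = x.max()                                # subtract the max for numerical stability
    return m + np.log(np.exp(s * (x - m)).sum()) / s

x = np.array([0.2, 1.0, 0.95])
for s in (1, 10, 100):
    print(s, smooth_max(x, s))                 # approaches max(x) = 1.0 from above
# The gradient of smooth_max is the softmax of s*x, i.e. the maximum-entropy labelling
# alluded to in the abstract.
```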

182 citations


Journal ArticleDOI
TL;DR: This paper presents a new approach to inverting (fitting) models of coupled dynamical systems based on state-of-the-art Kalman filtering, which promises to provide a significant advance in characterizing the functional architectures of distributed neuronal systems, even in the absence of known exogenous input.

Journal ArticleDOI
TL;DR: In this article, the authors proposed an online parameter estimation algorithm that combines two key ideas: reparameterizing the problem using complete-data sufficient statistics and exploiting a purely recursive form of smoothing in HMMs based on an auxiliary recursion.
Abstract: Online (also called “recursive” or “adaptive”) estimation of fixed model parameters in hidden Markov models is a topic of much interest in time series modeling. In this work, we propose an online parameter estimation algorithm that combines two key ideas. The first one, which is deeply rooted in the Expectation-Maximization (EM) methodology, consists in reparameterizing the problem using complete-data sufficient statistics. The second ingredient consists in exploiting a purely recursive form of smoothing in HMMs based on an auxiliary recursion. Although the proposed online EM algorithm resembles a classical stochastic approximation (or Robbins–Monro) algorithm, it is sufficiently different to resist conventional analysis of convergence. We thus provide limited results which identify the potential limiting points of the recursion as well as the large-sample behavior of the quantities involved in the algorithm. The performance of the proposed algorithm is numerically evaluated through simulations in the ca...
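
To make the complete-data sufficient-statistic idea concrete, here is a hedged sketch of the online EM recursion in the simplest i.i.d. setting, a one-dimensional Gaussian mixture rather than the HMM treated in the paper: running sufficient statistics are updated with a decreasing step size and the parameters are re-derived from them after each observation.

```python
import numpy as np
from scipy.stats import norm

def online_em_gmm(y_stream, mu0, sd0, w0, burn_in=50):
    """Online EM recursion for a 1-D Gaussian mixture (i.i.d. data, not an HMM)."""
    w, mu, sd = np.array(w0, float), np.array(mu0, float), np.array(sd0, float)
    s0, s1, s2 = w.copy(), w * mu, w * (sd**2 + mu**2)    # running sufficient statistics
    for n, y in enumerate(y_stream, start=1):
        gamma = 1.0 / (n + 1) ** 0.6                       # decreasing step size
        r = w * norm.pdf(y, mu, sd)
        r /= r.sum()                                       # E-step for the single new observation
        s0 += gamma * (r - s0)                             # stochastic-approximation update of the
        s1 += gamma * (r * y - s1)                         # complete-data sufficient statistics
        s2 += gamma * (r * y * y - s2)
        if n > burn_in:                                    # M-step: re-parameterise from the statistics
            w, mu = s0, s1 / s0
            sd = np.sqrt(np.maximum(s2 / s0 - mu**2, 1e-6))
    return w, mu, sd

rng = np.random.default_rng(4)
stream = np.concatenate([rng.normal(-1, 0.7, 5000), rng.normal(2, 1.0, 5000)])
rng.shuffle(stream)
print(online_em_gmm(stream, mu0=[-0.5, 0.5], sd0=[1.0, 1.0], w0=[0.5, 0.5]))
```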

Proceedings ArticleDOI
01 Nov 2011
TL;DR: This paper embeds the BG-AMP algorithm within an expectation-maximization (EM) framework to simultaneously reconstruct the signal while learning the prior signal and noise parameters, achieving excellent performance on a range of signal types.
Abstract: The approximate message passing (AMP) algorithm originally proposed by Donoho, Maleki, and Montanari yields a computationally attractive solution to the usual l1-regularized least-squares problem faced in compressed sensing, whose solution is known to be robust to the signal distribution. When the signal is drawn i.i.d. from a marginal distribution that is not least-favorable, better performance can be attained using a Bayesian variation of AMP. The latter, however, assumes that the distribution is perfectly known. In this paper, we navigate the space between these two extremes by modeling the signal as i.i.d. Bernoulli-Gaussian (BG) with unknown prior sparsity, mean, and variance, and the noise as zero-mean Gaussian with unknown variance, and we simultaneously reconstruct the signal while learning the prior signal and noise parameters. To accomplish this task, we embed the BG-AMP algorithm within an expectation-maximization (EM) framework. Numerical experiments confirm the excellent performance of our proposed EM-BG-AMP on a range of signal types.

01 Jan 2011
TL;DR: Maximum penalized likelihood estimation is proposed as a method for simultaneously estimating the background rate and the triggering density of Hawkes process intensities that vary over multiple time scales and used to examine self-excitation in Iraq IED event patterns.
Abstract: Estimating the conditional intensity of a self-exciting point process is particularly challenging when both exogenous and endogenous effects play a role in clustering. We propose maximum penalized likelihood estimation as a method for simultaneously estimating the background rate and the triggering density of Hawkes process intensities that vary over multiple time scales. We compare the accuracy of the algorithm with the recently introduced Model Independent Stochastic Declustering (MISD) algorithm and then use the model to examine self-excitation in Iraq IED event patterns.
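
For readers unfamiliar with the model, a Hawkes intensity with constant background rate and exponential triggering kernel is lambda(t) = mu + sum over t_i < t of alpha*beta*exp(-beta*(t - t_i)), and its log-likelihood on [0, T] has a simple closed form. The sketch below (ours, not the paper's penalized estimator or MISD) evaluates that log-likelihood so it could be handed to a generic optimizer.

```python
import numpy as np

def hawkes_exp_loglik(times, mu, alpha, beta, T):
    """Log-likelihood of event times on [0, T] under a Hawkes process with
    intensity lambda(t) = mu + sum_{t_i < t} alpha * beta * exp(-beta*(t - t_i))."""
    times = np.asarray(times)
    ll = 0.0
    state = 0.0                  # recursive sum A_i = sum_{j < i} exp(-beta*(t_i - t_j))
    prev = None
    for t in times:
        state = 0.0 if prev is None else (state + 1.0) * np.exp(-beta * (t - prev))
        ll += np.log(mu + alpha * beta * state)
        prev = t
    # compensator: integral of lambda over [0, T]
    ll -= mu * T + alpha * np.sum(1.0 - np.exp(-beta * (T - times)))
    return ll

# Usage: evaluate at trial parameters for some event times on [0, 10]
events = np.array([0.5, 0.9, 1.0, 3.2, 3.3, 3.35, 7.0])
print(hawkes_exp_loglik(events, mu=0.3, alpha=0.5, beta=2.0, T=10.0))
```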

Journal ArticleDOI
TL;DR: The Weibull power series (WPS) class of distributions is introduced, where the compounding procedure follows the same approach previously carried out by Adamidis and Loukas (1998), and several properties of the WPS distributions such as moments, order statistics, estimation by maximum likelihood and inference for a large sample are obtained.

Journal ArticleDOI
TL;DR: In this paper, the authors develop a dimension-reduced spatio-temporal mixed-effects model and take an empirical-Bayes approach, estimating the unknown parameters by maximum likelihood via an EM algorithm and substituting them into the optimal predictors.
Abstract: The use of satellite measurements in climate studies promises many new scientific insights if those data can be efficiently exploited. Due to sparseness of daily data sets, there is a need to fill spatial gaps and to borrow strength from adjacent days. Nonetheless, these satellites are typically capable of conducting on the order of 100,000 retrievals per day, which makes it impossible to apply traditional spatio-temporal statistical methods, even in supercomputing environments. To overcome these challenges, we make use of a spatio-temporal mixed-effects model. For each massive daily data set, dimension reduction is achieved by essentially modelling the underlying process as a linear combination of spatial basis functions on the globe. The application of a dynamical autoregressive model in time, over the reduced space, allows rapid sequential computation of optimal smoothing predictions via the Kalman smoother; this is known as Fixed Rank Smoothing (FRS). The dimension-reduced mixed-effects model contains a number of unknown parameters, including covariance and propagator matrices, which describe the spatial and temporal dependence structure in the reduced-dimensional process. We take an empirical-Bayes approach to inference, which involves estimating the parameters and substituting them into the optimal predictors. Method-of-moments (MM) parameter estimation (currently used in FRS) is typically inefficient compared to maximum likelihood (ML) estimation and can result in large sampling variability. Here, we develop ML estimation via an expectation-maximization (EM) algorithm, which offers stable computation of valid estimators and makes efficient use of spatial and temporal dependence in the data. The two parameter-estimation approaches, MM and ML, are compared in a simulation study. We also apply our methodology to global satellite CO2 measurements: We optimally smooth the sparse daily CO2 maps obtained by the Atmospheric InfraRed Sounder (AIRS) instrument on the Aqua satellite; then, using FRS with EM-estimated parameters, a complete sequence of the daily global CO2 fields can be obtained, together with their associated prediction uncertainties.

Journal ArticleDOI
TL;DR: This work considers selecting both fixed and random effects in a general class of mixed effects models using maximum penalized likelihood (MPL) estimation along with the smoothly clipped absolute deviation (SCAD) and adaptive least absolute shrinkage and selection operator (ALASSO) penalty functions.
Abstract: Summary We consider selecting both fixed and random effects in a general class of mixed effects models using maximum penalized likelihood (MPL) estimation along with the smoothly clipped absolute deviation (SCAD) and adaptive least absolute shrinkage and selection operator (ALASSO) penalty functions. The MPL estimates are shown to possess consistency and sparsity properties and asymptotic normality. A model selection criterion, called the ICQ statistic, is proposed for selecting the penalty parameters (Ibrahim, Zhu, and Tang, 2008, Journal of the American Statistical Association 103, 1648–1658). The variable selection procedure based on ICQ is shown to consistently select important fixed and random effects. The methodology is very general and can be applied to numerous situations involving random effects, including generalized linear mixed models. Simulation studies and a real data set from a Yale infant growth study are used to illustrate the proposed methodology.

Journal ArticleDOI
TL;DR: In this article, the authors generalize the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation-Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual d-dimensional uncertainty covariance and has unique missing data properties.
Abstract: We generalize the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation–Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual d-dimensional uncertainty covariance and has unique missing data properties. This algorithm reconstructs the error-deconvolved or “underlying” distribution function common to all samples, even when the individual data points are samples from different distributions, obtained by convolving the underlying distribution with the heteroskedastic uncertainty distribution of the data point and projecting out the missing data directions. We show how this basic algorithm can be extended with conjugate priors on all of the model parameters and a “split-and-merge” procedure designed to avoid local maxima of the likelihood. We demonstrate the full method by applying it to the problem of inferring the three-dimensional velocity distribution of stars near the Sun from noisy two-dimensional, transverse velocity measurements from the Hipparcos satellite.
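
In one dimension with known per-point noise variances, the EM described here reduces to a short set of update equations; the following sketch is a simplified re-derivation for illustration, not the authors' released code.

```python
import numpy as np

def em_deconvolved_gmm_1d(w, s2, K=2, n_iter=200):
    """EM for a 1-D Gaussian mixture underlying noisy data w_i with known noise variances s2_i.

    Model: v_i ~ sum_k a_k N(m_k, V_k); observed w_i = v_i + noise, noise ~ N(0, s2_i)."""
    a = np.full(K, 1.0 / K)
    m = np.quantile(w, np.linspace(0.2, 0.8, K))
    V = np.full(K, w.var())
    for _ in range(n_iter):
        T = V[None, :] + s2[:, None]                       # total variance per point/component
        dens = a * np.exp(-0.5 * (w[:, None] - m) ** 2 / T) / np.sqrt(2 * np.pi * T)
        r = dens / dens.sum(axis=1, keepdims=True)         # responsibilities
        b = m + (V[None, :] / T) * (w[:, None] - m)        # posterior mean of v_i under component k
        B = V[None, :] - V[None, :] ** 2 / T               # posterior variance of v_i under component k
        nk = r.sum(axis=0)
        a = nk / len(w)
        m = (r * b).sum(axis=0) / nk
        V = (r * ((b - m) ** 2 + B)).sum(axis=0) / nk
    return a, m, V

# Usage: true underlying mixture convolved with heteroskedastic noise
rng = np.random.default_rng(5)
v = np.concatenate([rng.normal(-2, 0.5, 400), rng.normal(2, 1.0, 600)])
s2 = rng.uniform(0.1, 1.0, size=v.size)
w = v + rng.normal(0, np.sqrt(s2))
print(em_deconvolved_gmm_1d(w, s2))
```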

Proceedings ArticleDOI
06 Dec 2011
TL;DR: This work proposes a 'learning-based' approach, WiGEM, where the received signal strength is modeled as a Gaussian Mixture Model (GMM) where Expectation Maximization (EM) is used to learn the maximum likelihood estimates of the model parameters.
Abstract: We consider the problem of localizing a wireless client in an indoor environment based on the signal strength of its transmitted packets as received on stationary sniffers or access points. Several state-of-the-art indoor localization techniques have the drawback that they rely extensively on a labor-intensive 'training' phase that does not scale well. Use of unmodeled hardware with heterogeneous power levels further reduces the accuracy of these techniques. We propose a 'learning-based' approach, WiGEM, where the received signal strength is modeled as a Gaussian Mixture Model (GMM). Expectation Maximization (EM) is used to learn the maximum likelihood estimates of the model parameters. This approach enables us to localize a transmitting device based on the maximum a posteriori estimate. The key insight is to use the physics of wireless propagation, and exploit the signal strength constraints that exist for different transmit power levels. The learning approach not only avoids the labor-intensive training, but also makes the location estimates considerably robust in the face of heterogeneity and various time varying phenomena. We present evaluations on two different indoor testbeds with multiple WiFi devices. We demonstrate that WiGEM's accuracy is at par with or better than state-of-the-art techniques but without requiring any training.

Journal ArticleDOI
TL;DR: This paper proposes a new variational approximation for infinite mixtures of Gaussian processes that uses variational inference and a truncated stick-breaking representation of the Dirichlet process to approximate the posterior of hidden variables involved in the model.
Abstract: This paper proposes a new variational approximation for infinite mixtures of Gaussian processes. As an extension of the single Gaussian process regression model, mixtures of Gaussian processes can characterize varying covariances or multimodal data and reduce the deficiency of the computationally cubic complexity of the single Gaussian process model. The infinite mixture of Gaussian processes further integrates a Dirichlet process prior, allowing the number of mixture components to be determined automatically from the data. We use variational inference and a truncated stick-breaking representation of the Dirichlet process to approximate the posterior of hidden variables involved in the model. To fix the hyperparameters of the model, the variational EM algorithm and a greedy algorithm are employed. In addition to presenting the variational infinite-mixture model, we apply it to the problem of traffic flow prediction. Experiments with comparisons to other approaches show the effectiveness of the proposed model.

Journal ArticleDOI
TL;DR: In this paper, a parametric fractional imputation (PFI) method is proposed to generate imputed values from the conditional distribution of the missing data given the observed data, where the fractional weights are computed from the current value of the parameter estimates.
Abstract: Under a parametric model for missing data, the EM algorithm is a popular tool for finding the maximum likelihood estimates (MLE) of the parameters of the model. Imputation, when carefully done, can be used to facilitate the parameter estimation by applying the complete-sample estimators to the imputed dataset. The basic idea is to generate the imputed values from the conditional distribution of the missing data given the observed data. Multiple imputation is a Bayesian approach to generate the imputed values from the conditional distribution. In this article, parametric fractional imputation is proposed as a parametric approach for generating imputed values. Using fractional weights, the E-step of the EM algorithm can be approximated by the weighted mean of the imputed data likelihood where the fractional weights are computed from the current value of the parameter estimates. Some computational efficiency can be achieved using the idea of importance sampling in the Monte Carlo approximation of the conditional expectation. The resulting estimator of the specified parameters will be identical to the MLE under missing data if the fractional weights are adjusted using a calibration step. The proposed imputation method provides efficient parameter estimates for the model parameters specified and also provides reasonable estimates for parameters that are not part of the imputation model, for example domain means. Thus, the proposed imputation method is a useful tool for general-purpose data analysis. Variance estimation is covered and results from a limited simulation study are presented.
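
A hedged sketch of the mechanics on a toy example (our construction, not the paper's): (x, y) bivariate normal with x missing for some units, imputed values drawn once from a fixed proposal, fractional importance weights recomputed at each iteration, and a weighted complete-data M-step. The calibration step discussed in the article is omitted.

```python
import numpy as np
from scipy.stats import norm

def conditional_x_given_y(y, mean, cov):
    """Mean and variance of x | y under a bivariate normal with given mean/cov (x first)."""
    m = mean[0] + cov[0, 1] / cov[1, 1] * (y - mean[1])
    v = cov[0, 0] - cov[0, 1] ** 2 / cov[1, 1]
    return m, v

def fractional_imputation_em(x, y, miss, M=50, n_iter=30):
    """Fractional-imputation EM for a bivariate normal (x, y) with x missing where `miss` is True."""
    rng = np.random.default_rng(0)
    obs = ~miss
    # initial estimate from complete cases, also used as the fixed importance proposal
    data0 = np.column_stack([x[obs], y[obs]])
    mean, cov = data0.mean(axis=0), np.cov(data0.T)
    prop_m, prop_v = conditional_x_given_y(y[miss], mean, cov)
    x_imp = prop_m[:, None] + np.sqrt(prop_v) * rng.standard_normal((miss.sum(), M))
    log_h = norm.logpdf(x_imp, prop_m[:, None], np.sqrt(prop_v))     # proposal density, fixed
    for _ in range(n_iter):
        # E-step: fractional weights proportional to current-model density over proposal density
        cur_m, cur_v = conditional_x_given_y(y[miss], mean, cov)
        log_f = norm.logpdf(x_imp, cur_m[:, None], np.sqrt(cur_v))
        w = np.exp(log_f - log_h)
        w /= w.sum(axis=1, keepdims=True)
        # M-step: weighted complete-data MLE over observed rows (weight 1) and imputed rows
        xs = np.concatenate([x[obs], x_imp.ravel()])
        ys = np.concatenate([y[obs], np.repeat(y[miss], M)])
        ws = np.concatenate([np.ones(obs.sum()), w.ravel()])
        mean = np.array([np.average(xs, weights=ws), np.average(ys, weights=ws)])
        d = np.column_stack([xs - mean[0], ys - mean[1]])
        cov = (d.T * ws) @ d / ws.sum()
    return mean, cov

# Usage on synthetic data: x is missing for roughly 40% of units
rng = np.random.default_rng(6)
n = 500
x = rng.normal(1.0, 1.0, n)
y = 0.5 + 0.8 * x + rng.normal(0, 0.6, n)
miss = rng.random(n) < 0.4
print(fractional_imputation_em(np.where(miss, np.nan, x), y, miss))
```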

Book Chapter
01 Jan 2011
TL;DR: Maximum likelihood theory for incomplete data from an exponential family is studied via the EM algorithm, together with the geometry of exponential families.
Abstract: Barndorff-Nielsen OE, Cox DR (1994) Inference and asymptotics. Chapman & Hall, London. Brazzale AR, Davison AC, Reid N (2007) Applied asymptotics: case studies in small-sample statistics. Cambridge University Press, Cambridge. Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm (with discussion). J Roy Stat Soc B. Efron B (1978) The geometry of exponential families. Ann Stat. Sundberg R (1974) Maximum likelihood theory for incomplete data from an exponential family. Scand J Stat.

Journal ArticleDOI
TL;DR: The results show that the proposed method for atlas registration of brain images with gliomas outperforms ORBIT, and the warped templates have better similarity to patient images.
Abstract: This paper investigates the problem of atlas registration of brain images with gliomas. Multiparametric imaging modalities (T1, T1-CE, T2, and FLAIR) are first utilized for segmentations of different tissues, and to compute the posterior probability map (PBM) of membership to each tissue class, using supervised learning. Similar maps are generated in the initially normal atlas, by modeling the tumor growth, using a reaction-diffusion equation. Deformable registration using a demons-like algorithm is used to register the patient images with the tumor-bearing atlas. Joint estimation of the simulated tumor parameters (e.g., location, mass effect and degree of infiltration) and the spatial transformation is achieved by maximization of the log-likelihood of observation. An expectation-maximization algorithm is used in the registration process to estimate the spatial transformation, while other parameters related to tumor simulation are optimized through asynchronous parallel pattern search (APPSPACK). The proposed method has been evaluated on five simulated data sets created by statistically simulated deformations (SSD), and fifteen real multichannel glioma data sets. The performance has been evaluated both quantitatively and qualitatively, and the results have been compared to ORBIT, an alternative method solving a similar problem. The results show that our method outperforms ORBIT, and the warped templates have better similarity to patient images.

Journal ArticleDOI
TL;DR: An efficient expectation-maximization (EM) algorithm for maximum-likelihood (ML) estimation is presented for energy-based multisource localization in WSNs using acoustic sensors and simulation results show that the proposed EM algorithm provides a good tradeoff between estimation accuracy and computational complexity.
Abstract: Energy-based multisource localization is an important research problem in wireless sensor networks (WSNs). Existing algorithms for this problem, such as multiresolution (MR) search and exhaustive search methods, are of either high computational complexity or low estimation accuracy. In this paper, an efficient expectation-maximization (EM) algorithm for maximum-likelihood (ML) estimation is presented for energy-based multisource localization in WSNs using acoustic sensors. The basic idea of the algorithm is to decompose each sensor's energy measurement, which is a superimposition of energy signals emitted from multiple sources, into components, each of which corresponds to an individual source, and then estimate the source parameters, such as source energy and location, as well as the decay factor of the signal during propagation. An efficient sequential dominant-source (SDS) initialization scheme and an incremental parameterized search refinement scheme are introduced to speed up the algorithm and improve the estimation accuracy. Theoretic analyses on the algorithm convergence rate, the Cramer-Rao lower bound (CRLB) for localization accuracy, and the computational complexity of the algorithm are also given. The simulation results show that the proposed EM algorithm provides a good tradeoff between estimation accuracy and computational complexity.

Journal ArticleDOI
TL;DR: It is shown that the EM algorithm may be used for estimating the parameters in a parametric statistical model when the observations are fuzzy and are assumed to be related to underlying crisp realizations of a random sample.

Book
13 Jun 2011
TL;DR: The EM algorithm, variational approximations and expectation propagation for mixtures, and nonparametric mixed membership modelling using the IBP compound Dirichlet process are presented.
Abstract: Preface Acknowledgements List of Contributors
1 The EM algorithm, variational approximations and expectation propagation for mixtures D. Michael Titterington 1.1 Preamble 1.2 The EM algorithm 1.3 Variational approximations 1.4 Expectation-propagation Acknowledgements References
2 Online expectation maximisation Olivier Cappe 2.1 Introduction 2.2 Model and assumptions 2.3 The EM algorithm and the limiting EM recursion 2.4 Online expectation maximisation 2.5 Discussion References
3 The limiting distribution of the EM test of the order of a finite mixture J. Chen and Pengfei Li 3.1 Introduction 3.2 The method and theory of the EM test 3.3 Proofs 3.4 Discussion References
4 Comparing Wald and likelihood regions applied to locally identifiable mixture models Daeyoung Kim and Bruce G. Lindsay 4.1 Introduction 4.2 Background on likelihood confidence regions 4.3 Background on simulation and visualisation of the likelihood regions 4.4 Comparison between the likelihood regions and the Wald regions 4.5 Application to a finite mixture model 4.6 Data analysis 4.7 Discussion References
5 Mixture of experts modelling with social science applications Isobel Claire Gormley and Thomas Brendan Murphy 5.1 Introduction 5.2 Motivating examples 5.3 Mixture models 5.4 Mixture of experts models 5.5 A Mixture of experts model for ranked preference data 5.6 A Mixture of experts latent position cluster model 5.7 Discussion Acknowledgements References
6 Modelling conditional densities using finite smooth mixtures Feng Li, Mattias Villani and Robert Kohn 6.1 Introduction 6.2 The model and prior 6.3 Inference methodology 6.4 Applications 6.5 Conclusions Acknowledgements Appendix: Implementation details for the gamma and log-normal models References
7 Nonparametric mixed membership modelling using the IBP compound Dirichlet process Sinead Williamson, Chong Wang, Katherine A. Heller, and David M. Blei 7.1 Introduction 7.2 Mixed membership models 7.3 Motivation 7.4 Decorrelating prevalence and proportion 7.5 Related models 7.6 Empirical studies 7.7 Discussion References
8 Discovering nonbinary hierarchical structures with Bayesian rose trees Charles Blundell, Yee Whye Teh, and Katherine A. Heller 8.1 Introduction 8.2 Prior work 8.3 Rose trees, partitions and mixtures 8.4 Greedy Construction of Bayesian Rose Tree Mixtures 8.5 Bayesian hierarchical clustering, Dirichlet process models and product partition models 8.6 Results 8.7 Discussion References
9 Mixtures of factor analyzers for the analysis of high-dimensional data Geoffrey J. McLachlan, Jangsun Baek, and Suren I. Rathnayake 9.1 Introduction 9.2 Single-factor analysis model 9.3 Mixtures of factor analyzers 9.4 Mixtures of common factor analyzers (MCFA) 9.5 Some related approaches 9.6 Fitting of factor-analytic models 9.7 Choice of the number of factors q 9.8 Example 9.9 Low-dimensional plots via MCFA approach 9.10 Multivariate t-factor analysers 9.11 Discussion Appendix References
10 Dealing with Label Switching under model uncertainty Sylvia Fruhwirth-Schnatter 10.1 Introduction 10.2 Labelling through clustering in the point-process representation 10.3 Identifying mixtures when the number of components is unknown 10.4 Overfitting heterogeneity of component-specific parameters 10.5 Concluding remarks References
11 Exact Bayesian analysis of mixtures Christian P. Robert and Kerrie L. Mengersen 11.1 Introduction 11.2 Formal derivation of the posterior distribution References
12 Manifold MCMC for mixtures Vassilios Stathopoulos and Mark Girolami 12.1 Introduction 12.2 Markov chain Monte Carlo methods 12.3 Finite Gaussian mixture models 12.4 Experiments 12.5 Discussion Acknowledgements Appendix References
13 How many components in a finite mixture? Murray Aitkin 13.1 Introduction 13.2 The galaxy data 13.3 The normal mixture model 13.4 Bayesian analyses 13.5 Posterior distributions for K (for flat prior) 13.6 Conclusions from the Bayesian analyses 13.7 Posterior distributions of the model deviances 13.8 Asymptotic distributions 13.9 Posterior deviances for the galaxy data 13.10 Conclusion References
14 Bayesian mixture models: a blood-free dissection of a sheep Clair L. Alston, Kerrie L. Mengersen, and Graham E. Gardner 14.1 Introduction 14.2 Mixture models 14.3 Altering dimensions of the mixture model 14.4 Bayesian mixture model incorporating spatial information 14.5 Volume calculation 14.6 Discussion References
Index.

Journal ArticleDOI
TL;DR: The goal of this paper is to perform a segmentation of atherosclerotic plaques in view of evaluating their burden and to provide boundaries for computing properties such as the plaque deformation and elasticity distribution (elastogram and modulogram).
Abstract: The goal of this paper is to perform a segmentation of atherosclerotic plaques in view of evaluating their burden and to provide boundaries for computing properties such as the plaque deformation and elasticity distribution (elastogram and modulogram). The echogenicity of a region of interest comprising the plaque, the vessel lumen, and the adventitia of the artery wall in an ultrasonic B-mode image was modeled by mixtures of three Nakagami distributions, which yielded the likelihood of a Bayesian segmentation model. The main contribution of this paper is the estimation of the motion field and its integration into the prior of the Bayesian model that included a local geometrical smoothness constraint, as well as an original spatiotemporal cohesion constraint. The Maximum A Posteriori of the proposed model was computed with a variant of the exploration/selection algorithm. The starting point is a manual segmentation of the first frame. The proposed method was quantitatively compared with manual segmentations of all frames by an expert technician. Various measures were used for this evaluation, including the mean point-to-point distance and the Hausdorff distance. Results were evaluated on 94 sequences of 33 patients (for a total of 8988 images). We report a mean point-to-point distance of 0.24 ± 0.08 mm and a Hausdorff distance of 1.24 ± 0.40 mm. Our tests showed that the algorithm was not sensitive to the degree of stenosis or calcification.

Proceedings ArticleDOI
20 Jun 2011
TL;DR: This work proposes a method for vector field learning with outliers, called vector field consensus (VFC), which distinguishes inliers from outliers while simultaneously learning a vector field that fits the inliers, and is very robust to outliers.
Abstract: We propose a method for vector field learning with outliers, called vector field consensus (VFC). It can distinguish inliers from outliers and learn a vector field fitting the inliers simultaneously. A prior is taken to force the smoothness of the field, which is based on Tikhonov regularization in a vector-valued reproducing kernel Hilbert space. Under a Bayesian framework, we associate each sample with a latent variable which indicates whether it is an inlier, then formulate the problem as a maximum a posteriori problem and use the Expectation Maximization algorithm to solve it. The proposed method possesses two characteristics: 1) it is robust to outliers, being able to tolerate 90% outliers or even more, and 2) it is computationally efficient. As an application, we apply VFC to the problem of mismatch removal. The results demonstrate that our method outperforms many state-of-the-art methods, and it is very robust.

Journal ArticleDOI
TL;DR: In this paper, an iterative algorithm for high-dimensional linear inverse problems, which is regularized by a differentiable discrete approximation of the total variation (TV) penalty, is presented.
Abstract: This paper describes an iterative algorithm for high-dimensional linear inverse problems, which is regularized by a differentiable discrete approximation of the total variation (TV) penalty. The algorithm is an interlaced iterative method based on optimization transfer with a separable quadratic surrogate for the TV penalty. The surrogate cost function is optimized using the block iterative regularized algebraic reconstruction technique (RSART). A proof of convergence is given and convergence is illustrated by numerical experiments with simulated parallel-beam computerized tomography (CT) data. The proposed method provides a block-iterative and convergent, hence efficient and reliable, algorithm to investigate the effects of TV regularization in applications such as CT.

Journal ArticleDOI
TL;DR: In this article, the authors advocate the use of multivariate t-distributions for more robust inference of graphs and demonstrate that penalized likelihood inference combined with an application of the EM algorithm provides a computationally efficient approach to model selection in the T-distribution case.
Abstract: Graphical Gaussian models have proven to be useful tools for exploring network structures based on multivariate data. Applications to studies of gene expression have generated substantial interest in these models, and resulting recent progress includes the development of fitting methodology involving penalization of the likelihood function. In this paper we advocate the use of multivariate t-distributions for more robust inference of graphs. In particular, we demonstrate that penalized likelihood inference combined with an application of the EM algorithm provides a computationally efficient approach to model selection in the t-distribution case. We consider two versions of multivariate t-distributions, one of which requires the use of approximation techniques. For this distribution, we describe a Markov chain Monte Carlo EM algorithm based on a Gibbs sampler as well as a simple variational approximation that makes the resulting method feasible in large problems.
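
As a sketch of the key ingredient, the E-step for a multivariate t reduces to data-dependent weights that down-weight outlying observations. The unpenalized EM below (ours, not the paper's penalized version) shows where a graphical-lasso step on the weighted scatter matrix would slot in.

```python
import numpy as np

def em_multivariate_t(X, nu=4.0, n_iter=100):
    """EM for the location/scatter of a multivariate t with fixed degrees of freedom nu.

    In the penalized setting of the paper, the plain scatter update in the M-step
    would be replaced by a graphical-lasso (penalized likelihood) estimate."""
    n, p = X.shape
    mu = X.mean(axis=0)
    sigma = np.cov(X.T)
    for _ in range(n_iter):
        diff = X - mu
        d = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(sigma), diff)  # Mahalanobis distances
        w = (nu + p) / (nu + d)                    # E-step: latent precision weights, small for outliers
        mu = (w[:, None] * X).sum(axis=0) / w.sum()
        diff = X - mu
        sigma = (w[:, None] * diff).T @ diff / n   # M-step (unpenalized); penalize here for sparsity
    return mu, sigma

# Usage on contaminated Gaussian data
rng = np.random.default_rng(7)
X = rng.multivariate_normal([0, 0, 0], np.eye(3), size=300)
X[:10] += 8.0                                      # a few gross outliers
print(em_multivariate_t(X))
```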