Topic

Expectation–maximization algorithm

About: Expectation–maximization algorithm is a research topic. Over its lifetime, 11,823 publications have been published within this topic, receiving 528,693 citations. The topic is also known as: EM algorithm & Expectation Maximization.
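For orientation, the iteration that gives the topic its name alternates between an expectation step and a maximization step. The notation below (observed data $x$, latent variables $z$, parameters $\theta$) is the standard textbook formulation rather than the notation of any particular paper listed on this page.

```latex
% E-step: expected complete-data log-likelihood under the current estimate
\[
Q\left(\theta \mid \theta^{(t)}\right)
  = \mathbb{E}_{z \sim p\left(z \mid x,\, \theta^{(t)}\right)}
    \left[ \log p\left(x, z \mid \theta\right) \right]
\]
% M-step: maximize that expectation to obtain the next estimate
\[
\theta^{(t+1)} = \arg\max_{\theta}\, Q\left(\theta \mid \theta^{(t)}\right)
\]
```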


Papers
Journal ArticleDOI
TL;DR: In this paper, a unified approach to treating multivariate linear normal models is presented. The approach is based on a useful extension of the growth curve model, and the paper also considers finding maximum likelihood estimators when linear restrictions are placed on the parameters describing the mean in the growth curve model.
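As a rough sketch of the setting summarised above (the paper's own notation may differ), the growth curve model with a linear restriction on its mean parameters can be written as follows; the symbols A, B, C, G and H are generic placeholders, not taken from the paper.

```latex
% Growth curve (GMANOVA) model: Y is the n x p data matrix,
% A (n x q) and C (r x p) are known design matrices,
% B (q x r) contains the unknown mean parameters, E is matrix-normal noise.
\[
Y = A\,B\,C + E, \qquad \operatorname{vec}(E) \sim \mathcal{N}\!\left(0,\ \Sigma \otimes I_n\right)
\]
% Known matrices G and H impose a linear restriction on the mean parameters:
\[
G\,B\,H = 0
\]
```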

101 citations

Book
13 Jun 2011
TL;DR: The book presents the EM algorithm, variational approximations and expectation propagation for mixtures, along with topics such as nonparametric mixed membership modelling using the IBP compound Dirichlet process.
Abstract: Preface. Acknowledgements. List of Contributors.
1. The EM algorithm, variational approximations and expectation propagation for mixtures (D. Michael Titterington): 1.1 Preamble; 1.2 The EM algorithm; 1.3 Variational approximations; 1.4 Expectation-propagation. Acknowledgements. References.
2. Online expectation maximisation (Olivier Cappe): 2.1 Introduction; 2.2 Model and assumptions; 2.3 The EM algorithm and the limiting EM recursion; 2.4 Online expectation maximisation; 2.5 Discussion. References.
3. The limiting distribution of the EM test of the order of a finite mixture (J. Chen and Pengfei Li): 3.1 Introduction; 3.2 The method and theory of the EM test; 3.3 Proofs; 3.4 Discussion. References.
4. Comparing Wald and likelihood regions applied to locally identifiable mixture models (Daeyoung Kim and Bruce G. Lindsay): 4.1 Introduction; 4.2 Background on likelihood confidence regions; 4.3 Background on simulation and visualisation of the likelihood regions; 4.4 Comparison between the likelihood regions and the Wald regions; 4.5 Application to a finite mixture model; 4.6 Data analysis; 4.7 Discussion. References.
5. Mixture of experts modelling with social science applications (Isobel Claire Gormley and Thomas Brendan Murphy): 5.1 Introduction; 5.2 Motivating examples; 5.3 Mixture models; 5.4 Mixture of experts models; 5.5 A mixture of experts model for ranked preference data; 5.6 A mixture of experts latent position cluster model; 5.7 Discussion. Acknowledgements. References.
6. Modelling conditional densities using finite smooth mixtures (Feng Li, Mattias Villani and Robert Kohn): 6.1 Introduction; 6.2 The model and prior; 6.3 Inference methodology; 6.4 Applications; 6.5 Conclusions. Acknowledgements. Appendix: Implementation details for the gamma and log-normal models. References.
7. Nonparametric mixed membership modelling using the IBP compound Dirichlet process (Sinead Williamson, Chong Wang, Katherine A. Heller, and David M. Blei): 7.1 Introduction; 7.2 Mixed membership models; 7.3 Motivation; 7.4 Decorrelating prevalence and proportion; 7.5 Related models; 7.6 Empirical studies; 7.7 Discussion. References.
8. Discovering nonbinary hierarchical structures with Bayesian rose trees (Charles Blundell, Yee Whye Teh, and Katherine A. Heller): 8.1 Introduction; 8.2 Prior work; 8.3 Rose trees, partitions and mixtures; 8.4 Greedy construction of Bayesian rose tree mixtures; 8.5 Bayesian hierarchical clustering, Dirichlet process models and product partition models; 8.6 Results; 8.7 Discussion. References.
9. Mixtures of factor analyzers for the analysis of high-dimensional data (Geoffrey J. McLachlan, Jangsun Baek, and Suren I. Rathnayake): 9.1 Introduction; 9.2 Single-factor analysis model; 9.3 Mixtures of factor analyzers; 9.4 Mixtures of common factor analyzers (MCFA); 9.5 Some related approaches; 9.6 Fitting of factor-analytic models; 9.7 Choice of the number of factors q; 9.8 Example; 9.9 Low-dimensional plots via MCFA approach; 9.10 Multivariate t-factor analysers; 9.11 Discussion. Appendix. References.
10. Dealing with label switching under model uncertainty (Sylvia Fruhwirth-Schnatter): 10.1 Introduction; 10.2 Labelling through clustering in the point-process representation; 10.3 Identifying mixtures when the number of components is unknown; 10.4 Overfitting heterogeneity of component-specific parameters; 10.5 Concluding remarks. References.
11. Exact Bayesian analysis of mixtures (Christian P. Robert and Kerrie L. Mengersen): 11.1 Introduction; 11.2 Formal derivation of the posterior distribution. References.
12. Manifold MCMC for mixtures (Vassilios Stathopoulos and Mark Girolami): 12.1 Introduction; 12.2 Markov chain Monte Carlo methods; 12.3 Finite Gaussian mixture models; 12.4 Experiments; 12.5 Discussion. Acknowledgements. Appendix. References.
13. How many components in a finite mixture? (Murray Aitkin): 13.1 Introduction; 13.2 The galaxy data; 13.3 The normal mixture model; 13.4 Bayesian analyses; 13.5 Posterior distributions for K (for flat prior); 13.6 Conclusions from the Bayesian analyses; 13.7 Posterior distributions of the model deviances; 13.8 Asymptotic distributions; 13.9 Posterior deviances for the galaxy data; 13.10 Conclusion. References.
14. Bayesian mixture models: a blood-free dissection of a sheep (Clair L. Alston, Kerrie L. Mengersen, and Graham E. Gardner): 14.1 Introduction; 14.2 Mixture models; 14.3 Altering dimensions of the mixture model; 14.4 Bayesian mixture model incorporating spatial information; 14.5 Volume calculation; 14.6 Discussion. References.
Index.
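Since the opening chapters of this volume centre on the EM algorithm for finite mixtures, a minimal illustration of that basic recipe may help. The following sketch fits a one-dimensional Gaussian mixture and is not taken from the book; all names and the synthetic data are invented for the example.

```python
import numpy as np

def em_gaussian_mixture(x, k=2, n_iter=100, seed=0):
    """Minimal EM for a 1-D Gaussian mixture (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    # Crude initialisation: equal weights, random data points as means
    weights = np.full(k, 1.0 / k)
    means = rng.choice(x, size=k, replace=False)
    variances = np.full(k, np.var(x))

    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] = P(component j | x_i)
        dens = np.exp(-0.5 * (x[:, None] - means) ** 2 / variances) \
               / np.sqrt(2 * np.pi * variances)
        r = weights * dens
        r /= r.sum(axis=1, keepdims=True)

        # M-step: re-estimate weights, means, variances from responsibilities
        nk = r.sum(axis=0)
        weights = nk / n
        means = (r * x[:, None]).sum(axis=0) / nk
        variances = (r * (x[:, None] - means) ** 2).sum(axis=0) / nk

    return weights, means, variances

# Toy usage on synthetic data drawn from two Gaussians
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])
print(em_gaussian_mixture(x, k=2))
```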

100 citations

Journal ArticleDOI
TL;DR: The use of conditional maximum-likelihood training for the TSBN is investigated and it is found that this gives rise to improved classification performance over the ML-trained TSBN.
Abstract: We are concerned with the problem of image segmentation, in which each pixel is assigned to one of a predefined finite number of labels. In Bayesian image analysis, this requires fusing together local predictions for the class labels with a prior model of label images. Following the work of Bouman and Shapiro (1994), we consider the use of tree-structured belief networks (TSBNs) as prior models. The parameters in the TSBN are trained using a maximum-likelihood objective function with the EM algorithm, and the resulting model is evaluated by calculating how efficiently it codes label images. A number of authors have used Gaussian mixture models to connect the label field to the image data. We compare this approach to the scaled-likelihood method of Smyth (1994) and Morgan and Bourlard (1995), where local predictions of pixel classification from neural networks are fused with the TSBN prior. Our results show that higher performance is obtained with the neural networks. We evaluate the classification results obtained and emphasize not only the maximum a posteriori segmentation but also the uncertainty, as evidenced, for example, by the pixelwise posterior marginal entropies. We also investigate the use of conditional maximum-likelihood training for the TSBN and find that this gives rise to improved classification performance over the ML-trained TSBN.
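The scaled-likelihood trick referred to in this abstract is usually described as dividing a discriminative model's class posteriors by the class priors to obtain quantities proportional to class-conditional likelihoods, which can then be combined with a separate prior model. A minimal sketch of that step follows; the array names are invented, and the actual TSBN fusion in the paper is more involved than this stand-in.

```python
import numpy as np

def scaled_likelihoods(posteriors, class_priors, eps=1e-12):
    """Convert per-pixel class posteriors P(c | x) from a discriminative model
    into quantities proportional to likelihoods P(x | c) by dividing out the
    class priors (the scaled-likelihood trick)."""
    return posteriors / (class_priors + eps)

def fuse_with_prior(scaled_lik, prior_marginals):
    """Combine scaled likelihoods with a prior model's per-pixel class
    marginals and renormalise; a simple stand-in for full TSBN inference."""
    fused = scaled_lik * prior_marginals
    return fused / fused.sum(axis=-1, keepdims=True)

# Toy usage: 4 pixels, 3 labels
rng = np.random.default_rng(0)
post = rng.dirichlet(np.ones(3), size=4)        # neural-net posteriors per pixel
priors = np.array([0.5, 0.3, 0.2])              # empirical class priors
prior_marg = rng.dirichlet(np.ones(3), size=4)  # marginals from the prior model
print(fuse_with_prior(scaled_likelihoods(post, priors), prior_marg))
```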

100 citations

Journal Article
TL;DR: This work builds on the information bottleneck framework of Tishby et al. (1999) and constructs a learning algorithm that combines an information-theoretic smoothing term with a continuation procedure that bypasses local maxima and achieves superior solutions.
Abstract: A central challenge in learning probabilistic graphical models is dealing with domains that involve hidden variables. The common approach for learning model parameters in such domains is the expectation maximization (EM) algorithm. This algorithm, however, can easily get trapped in sub-optimal local maxima. Learning the model structure is even more challenging. The structural EM algorithm can adapt the structure in the presence of hidden variables, but usually performs poorly without prior knowledge about the cardinality and location of the hidden variables. In this work, we present a general approach for learning Bayesian networks with hidden variables that overcomes these problems. The approach builds on the information bottleneck framework of Tishby et al. (1999). We start by proving a formal correspondence between the information bottleneck objective and the standard parametric EM functional. We then use this correspondence to construct a learning algorithm that combines an information-theoretic smoothing term with a continuation procedure. Intuitively, the algorithm bypasses local maxima and achieves superior solutions by following a continuous path from a solution of an easy and smooth target function to a solution of the desired likelihood function. As we show, our algorithmic framework allows learning of the parameters as well as the structure of a network. It also allows us to introduce new hidden variables during model selection and learn their cardinality. We demonstrate the performance of our procedure on several challenging real-life data sets.
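The continuation idea described in this abstract, tracking the solution of an easy, smooth objective as it is gradually deformed into the target likelihood, is structurally similar to deterministic annealing EM. The sketch below illustrates that general shape on a one-dimensional Gaussian mixture; it is only an illustration of the continuation pattern, not the information-bottleneck algorithm of the paper, and every name in it is invented.

```python
import numpy as np

def annealed_em_gmm(x, k=2, betas=np.linspace(0.2, 1.0, 9),
                    iters_per_beta=30, seed=0):
    """Continuation-style EM for a 1-D Gaussian mixture: at small beta the
    E-step responsibilities are heavily smoothed (an 'easy' objective), and
    beta is raised gradually to 1, where the updates coincide with plain EM.
    This is deterministic annealing, shown only to illustrate the general
    continuation idea; it is not the paper's information-bottleneck EM."""
    rng = np.random.default_rng(seed)
    weights = np.full(k, 1.0 / k)
    means = rng.choice(x, size=k, replace=False)
    variances = np.full(k, np.var(x))

    for beta in betas:
        for _ in range(iters_per_beta):
            # Tempered E-step: log-responsibilities scaled by beta
            log_dens = -0.5 * (x[:, None] - means) ** 2 / variances \
                       - 0.5 * np.log(2 * np.pi * variances)
            logits = beta * (np.log(weights) + log_dens)
            logits -= logits.max(axis=1, keepdims=True)
            r = np.exp(logits)
            r /= r.sum(axis=1, keepdims=True)

            # Standard M-step on the smoothed responsibilities
            nk = r.sum(axis=0)
            weights = nk / len(x)
            means = (r * x[:, None]).sum(axis=0) / nk
            variances = (r * (x[:, None] - means) ** 2).sum(axis=0) / nk
        # The solution at this beta initialises the next, harder objective
    return weights, means, variances

# Toy usage on data from two well-separated components
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3, 1, 250), rng.normal(2, 0.7, 250)])
print(annealed_em_gmm(x))
```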

100 citations

Book ChapterDOI
30 Apr 2007

100 citations


Network Information
Related Topics (5)
Estimator: 97.3K papers, 2.6M citations, 91% related
Deep learning: 79.8K papers, 2.1M citations, 84% related
Support vector machine: 73.6K papers, 1.7M citations, 84% related
Cluster analysis: 146.5K papers, 2.9M citations, 84% related
Artificial neural network: 207K papers, 4.5M citations, 82% related
Performance
Metrics
Number of papers in the topic in previous years:
Year    Papers
2023    114
2022    245
2021    438
2020    410
2019    484
2018    519