Journal ArticleDOI

On the number of components in a Gaussian mixture model

TLDR
This work reviews various methods that have been proposed to answer the question of how many components to include in a normal mixture model, whether the model is used as a semiparametric device for density estimation or to provide a probabilistic clustering of the data into g clusters corresponding to the g components.
Abstract
Mixture distributions, in particular normal mixtures, are applied to data with two main purposes in mind. One is to provide an appealing semiparametric framework in which to model unknown distributional shapes, as an alternative to, say, the kernel density method. The other is to use the mixture model to provide a probabilistic clustering of the data into g clusters corresponding to the g components in the mixture model. In both situations, there is the question of how many components to include in the normal mixture model. We review various methods that have been proposed to answer this question. WIREs Data Mining Knowl Discov 2014, 4:341-355. doi: 10.1002/widm.1135
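
The core question the review addresses, how many components g to use, is most often answered in practice by fitting mixtures over a range of g and comparing information criteria. A minimal sketch of that procedure, assuming scikit-learn and synthetic data (the paper reviews the criteria themselves, not any particular library):

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy data: two well-separated Gaussian clusters in two dimensions.
X = np.vstack([
    rng.normal(loc=-3.0, scale=1.0, size=(200, 2)),
    rng.normal(loc=3.0, scale=1.0, size=(200, 2)),
])

# Fit a g-component normal mixture for each candidate g and report
# the information criteria; lower is better for both in scikit-learn.
for g in range(1, 6):
    gmm = GaussianMixture(n_components=g, random_state=0).fit(X)
    print(f"g={g}  AIC={gmm.aic(X):.1f}  BIC={gmm.bic(X):.1f}")

Choosing the g that minimizes BIC (or AIC) is one of the standard answers reviewed in the article, alongside likelihood-ratio tests and resampling-based approaches.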


Citations
Journal ArticleDOI

mclust 5: Clustering, Classification and Density Estimation Using Gaussian Finite Mixture Models.

TL;DR: This updated version of mclust adds new covariance structures, dimension reduction capabilities for visualisation, model selection criteria, initialisation strategies for the EM algorithm, and bootstrap-based inference, making it a full-featured R package for data analysis via finite mixture modelling.
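
mclust scores each combination of component count and covariance structure by BIC and keeps the best. A rough Python analogue of that search, assuming scikit-learn (mclust itself is an R package with its own model names such as EII or VVV, and it reports BIC with the opposite sign, so that larger is better):

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.5, (150, 2)),
               rng.normal(2.0, 1.5, (150, 2))])

# Grid over covariance structures (a coarse analogue of mclust's model
# family) and component counts; keep the fit with the lowest BIC.
best = None
for cov in ("spherical", "diag", "tied", "full"):
    for g in range(1, 5):
        gmm = GaussianMixture(n_components=g, covariance_type=cov,
                              random_state=0).fit(X)
        bic = gmm.bic(X)
        if best is None or bic < best[0]:
            best = (bic, cov, g)

print("best (BIC, covariance structure, g):", best)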

Mathematical Methods of Statistics

Journal ArticleDOI

Unsupervised Learning Methods for Molecular Simulation Data.

TL;DR: This Review provides a comprehensive overview of the methods of unsupervised learning that have been most commonly used to investigate simulation data and indicates likely directions for further developments in the field.
Proceedings ArticleDOI

Sliced Wasserstein Distance for Learning Gaussian Mixture Models

TL;DR: This work proposes an alternative formulation for estimating the GMM parameters using the sliced Wasserstein distance, which gives rise to a new algorithm that can estimate high-dimensional data distributions more faithfully than the EM algorithm.
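
The sliced Wasserstein distance reduces a d-dimensional comparison to many one-dimensional ones: project both samples onto random directions, where the Wasserstein distance between empirical distributions has a closed form through sorting. A minimal sketch of the distance itself, assuming NumPy and equal sample sizes (the paper's fitting algorithm, which minimizes this quantity over the GMM parameters, is not reproduced here):

import numpy as np

def sliced_wasserstein(X, Y, n_proj=100, seed=0):
    # Monte Carlo estimate of the sliced 1-Wasserstein distance between
    # two equal-sized samples X and Y of shape (n, d).
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=X.shape[1])
        theta /= np.linalg.norm(theta)             # random unit direction
        x_proj = np.sort(X @ theta)                # sorted 1-D projections
        y_proj = np.sort(Y @ theta)
        # 1-D W1 between empirical distributions = mean absolute
        # difference of order statistics.
        total += np.mean(np.abs(x_proj - y_proj))
    return total / n_proj

The one-dimensional closed form is what makes this objective cheap to evaluate compared with the full Wasserstein distance.
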
Journal ArticleDOI

A Gaussian Mixture Model Representation of Endmember Variability in Hyperspectral Unmixing.

TL;DR: In this article, a Gaussian mixture model (GMM) is proposed to represent endmember variability in hyperspectral unmixing; the resulting method estimates not only the abundances and distribution parameters but also the distinct endmember set for each pixel.
References
Journal ArticleDOI

A new look at the statistical model identification

TL;DR: In this article, a new estimate, the minimum AIC estimate (MAICE), is introduced for the purpose of statistical identification; it is free from the ambiguities inherent in the application of conventional hypothesis-testing procedures.
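
For reference, the criterion penalizes the maximized log-likelihood by the number of free parameters k, and MAICE selects the model minimizing

    AIC = -2 log L(theta_hat) + 2k.

In the mixture setting, k counts the mixing proportions, component means, and covariance parameters of a g-component model.
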
Journal ArticleDOI

Estimating the Dimension of a Model

TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
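
The leading terms of that expansion yield Schwarz's criterion, which replaces AIC's penalty 2k with one that grows with the sample size n:

    BIC = -2 log L(theta_hat) + k log n.

Since log n exceeds 2 once n is 8 or more, the BIC penalty is heavier, so BIC tends to select fewer mixture components than AIC.
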
Book

An introduction to the bootstrap

TL;DR: This book presents bootstrap methods for estimation using simple arguments, with Minitab macros for implementing the methods and examples of their application.
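
A minimal sketch of the basic idea in Python/NumPy rather than the book's Minitab macros: resample the data with replacement many times, recompute the statistic, and use the spread of the replicates as a standard-error estimate.

import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=100)    # toy sample

B = 2000
replicates = np.empty(B)
for b in range(B):
    # Draw n observations with replacement and recompute the statistic.
    resample = rng.choice(data, size=data.size, replace=True)
    replicates[b] = np.median(resample)

print("bootstrap SE of the median:", replicates.std(ddof=1))

In the mixture context the same idea underlies bootstrapping the likelihood-ratio test statistic for g versus g + 1 components, one of the approaches reviewed in the main article.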

Proceedings Article

Information Theory and an Extension of the Maximum Likelihood Principle

H. Akaike
TL;DR: The classical maximum likelihood principle can be viewed as a method of asymptotic realization of an optimum estimate with respect to a very general information-theoretic criterion, providing answers to many practical problems of statistical model fitting.
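
The information-theoretic criterion in question is the Kullback-Leibler divergence between the true density f and a model g(.; theta):

    D(f || g) = E_f[log f(X)] - E_f[log g(X; theta)].

Since the first term does not depend on the model, maximizing the expected log-likelihood is equivalent to minimizing this divergence; AIC arises as an approximately unbiased estimate of the expected log-likelihood that corrects for using the same data to fit and to evaluate the model.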