
Showing papers on "Mixture model published in 2012"


Journal ArticleDOI
TL;DR: This article provides an overview of progress and represents the shared views of four research groups that have had recent successes in using DNNs for acoustic modeling in speech recognition.
Abstract: Most current speech recognition systems use hidden Markov models (HMMs) to deal with the temporal variability of speech and Gaussian mixture models (GMMs) to determine how well each state of each HMM fits a frame or a short window of frames of coefficients that represents the acoustic input. An alternative way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks (DNNs) that have many hidden layers and are trained using new methods have been shown to outperform GMMs on a variety of speech recognition benchmarks, sometimes by a large margin. This article provides an overview of this progress and represents the shared views of four research groups that have had recent successes in using DNNs for acoustic modeling in speech recognition.
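
As a rough illustration of the two scoring schemes contrasted above (not the systems evaluated in the article), the sketch below scores one acoustic frame against a handful of HMM states in both ways: per-state diagonal-covariance GMM log-likelihoods, and state posteriors from a small feed-forward network. All sizes and parameter values are made up for the example.

    import numpy as np

    rng = np.random.default_rng(0)
    n_states, n_mix, dim = 4, 2, 13            # toy sizes: HMM states, Gaussians per state, feature dim
    frame = rng.normal(size=dim)               # one frame of acoustic coefficients

    # GMM scoring: log p(frame | state) for each HMM state (diagonal covariances).
    weights = rng.dirichlet(np.ones(n_mix), size=n_states)
    means = rng.normal(size=(n_states, n_mix, dim))
    variances = rng.uniform(0.5, 1.5, size=(n_states, n_mix, dim))

    def gmm_loglik(x, w, mu, var):
        # log-sum-exp over the mixture components of one state
        comp = -0.5 * (np.sum(np.log(2 * np.pi * var), axis=1)
                       + np.sum((x - mu) ** 2 / var, axis=1)) + np.log(w)
        m = comp.max()
        return m + np.log(np.sum(np.exp(comp - m)))

    gmm_scores = np.array([gmm_loglik(frame, weights[s], means[s], variances[s])
                           for s in range(n_states)])

    # DNN scoring: posterior p(state | context of frames) from a tiny feed-forward net.
    context = rng.normal(size=dim * 5)         # several stacked frames as input
    W1, b1 = 0.1 * rng.normal(size=(64, dim * 5)), np.zeros(64)
    W2, b2 = 0.1 * rng.normal(size=(n_states, 64)), np.zeros(n_states)
    h = np.maximum(W1 @ context + b1, 0.0)     # one hidden layer (real systems use many)
    logits = W2 @ h + b2
    posteriors = np.exp(logits - logits.max())
    posteriors /= posteriors.sum()

    print("GMM log-likelihoods per state:", np.round(gmm_scores, 2))
    print("DNN state posteriors:         ", np.round(posteriors, 3))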

9,091 citations


Journal ArticleDOI
TL;DR: A pre-trained deep neural network hidden Markov model (DNN-HMM) hybrid architecture that trains the DNN to produce a distribution over senones (tied triphone states) as its output, and that can significantly outperform conventional context-dependent Gaussian mixture model (GMM)-HMMs.
Abstract: We propose a novel context-dependent (CD) model for large-vocabulary speech recognition (LVSR) that leverages recent advances in using deep belief networks for phone recognition. We describe a pre-trained deep neural network hidden Markov model (DNN-HMM) hybrid architecture that trains the DNN to produce a distribution over senones (tied triphone states) as its output. The deep belief network pre-training algorithm is a robust and often helpful way to initialize deep neural networks generatively that can aid in optimization and reduce generalization error. We illustrate the key components of our model, describe the procedure for applying CD-DNN-HMMs to LVSR, and analyze the effects of various modeling choices on performance. Experiments on a challenging business search dataset demonstrate that CD-DNN-HMMs can significantly outperform the conventional context-dependent Gaussian mixture model (GMM)-HMMs, with an absolute sentence accuracy improvement of 5.8% and 9.2% (or relative error reduction of 16.0% and 23.2%) over the CD-GMM-HMMs trained using the minimum phone error rate (MPE) and maximum-likelihood (ML) criteria, respectively.

3,120 citations


Journal Article
TL;DR: This paper provides an overview of this progress and represents the shared views of four research groups who have had recent successes in using deep neural networks for acoustic modeling in speech recognition.
Abstract: Most current speech recognition systems use hidden Markov models (HMMs) to deal with the temporal variability of speech and Gaussian mixture models (GMMs) to determine how well each state of each HMM fits a frame or a short window of frames of coefficients that represents the acoustic input. An alternative way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks (DNNs) that have many hidden layers and are trained using new methods have been shown to outperform GMMs on a variety of speech recognition benchmarks, sometimes by a large margin. This article provides an overview of this progress and represents the shared views of four research groups that have had recent successes in using DNNs for acoustic modeling in speech recognition.

2,527 citations


Journal ArticleDOI
TL;DR: It is shown that better phone recognition on the TIMIT dataset can be achieved by replacing Gaussian mixture models by deep neural networks that contain many layers of features and a very large number of parameters.
Abstract: Gaussian mixture models are currently the dominant technique for modeling the emission distribution of hidden Markov models for speech recognition. We show that better phone recognition on the TIMIT dataset can be achieved by replacing Gaussian mixture models by deep neural networks that contain many layers of features and a very large number of parameters. These networks are first pre-trained as a multi-layer generative model of a window of spectral feature vectors without making use of any discriminative information. Once the generative pre-training has designed the features, we perform discriminative fine-tuning using backpropagation to adjust the features slightly to make them better at predicting a probability distribution over the states of monophone hidden Markov models.

1,767 citations


Posted Content
TL;DR: A detailed analysis of a robust tensor power method is provided, establishing an analogue of Wedin's perturbation theorem for the singular vectors of matrices, which implies a robust and computationally tractable estimation approach for several popular latent variable models.
Abstract: This work considers a computationally and statistically efficient parameter estimation method for a wide class of latent variable models---including Gaussian mixture models, hidden Markov models, and latent Dirichlet allocation---which exploits a certain tensor structure in their low-order observable moments (typically, of second- and third-order). Specifically, parameter estimation is reduced to the problem of extracting a certain (orthogonal) decomposition of a symmetric tensor derived from the moments; this decomposition can be viewed as a natural generalization of the singular value decomposition for matrices. Although tensor decompositions are generally intractable to compute, the decomposition of these specially structured tensors can be efficiently obtained by a variety of approaches, including power iterations and maximization approaches (similar to the case of matrices). A detailed analysis of a robust tensor power method is provided, establishing an analogue of Wedin's perturbation theorem for the singular vectors of matrices. This implies a robust and computationally tractable estimation approach for several popular latent variable models.
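
The following is a minimal sketch of the basic tensor power iteration with deflation on a symmetric third-order tensor, in the spirit of (but much simpler than) the robust method analyzed in the paper; whitening of the moments, the robustness modifications, and multiple random restarts are all omitted, and the toy tensor is constructed directly from orthonormal factors.

    import numpy as np

    def tensor_power_method(T, n_factors, n_iter=200, seed=0):
        """Extract (eigenvalue, eigenvector) pairs of a symmetric 3rd-order tensor
        T of shape (d, d, d) by power iteration with deflation (basic version)."""
        rng = np.random.default_rng(seed)
        d = T.shape[0]
        T = T.copy()
        pairs = []
        for _ in range(n_factors):
            v = rng.normal(size=d)
            v /= np.linalg.norm(v)
            for _ in range(n_iter):
                v_new = np.einsum('ijk,j,k->i', T, v, v)    # multilinear map T(I, v, v)
                v = v_new / np.linalg.norm(v_new)
            lam = np.einsum('ijk,i,j,k->', T, v, v, v)      # T(v, v, v)
            pairs.append((lam, v))
            T = T - lam * np.einsum('i,j,k->ijk', v, v, v)  # deflate the recovered factor
        return pairs

    # Toy check: T = sum_r w_r * a_r (x) a_r (x) a_r with orthonormal factors a_r.
    d, r = 5, 3
    rng = np.random.default_rng(1)
    A, _ = np.linalg.qr(rng.normal(size=(d, r)))            # orthonormal columns
    w = np.array([3.0, 2.0, 1.0])
    T = np.einsum('r,ir,jr,kr->ijk', w, A, A, A)
    for lam, v in tensor_power_method(T, r):
        print(round(lam, 3), np.round(v, 3))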

842 citations


Journal ArticleDOI
TL;DR: This tutorial is a high-level introduction to Bayesian nonparametric methods and contains several examples of their application.

549 citations


Journal ArticleDOI
TL;DR: A dual mathematical interpretation of the proposed framework with a structured sparse estimation is described, which shows that the resulting piecewise linear estimate stabilizes the estimation when compared with traditional sparse inverse problem techniques.
Abstract: A general framework for solving image inverse problems with piecewise linear estimations is introduced in this paper. The approach is based on Gaussian mixture models, which are estimated via a maximum a posteriori expectation-maximization algorithm. A dual mathematical interpretation of the proposed framework with a structured sparse estimation is described, which shows that the resulting piecewise linear estimate stabilizes the estimation when compared with traditional sparse inverse problem techniques. We demonstrate that, in a number of image inverse problems, including interpolation, zooming, and deblurring of narrow kernels, the same simple and computationally efficient algorithm yields results in the same ballpark as that of the state of the art.
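
Below is a minimal sketch of the per-component linear (Wiener) estimation and model-selection step suggested by the description above, assuming the Gaussian mixture prior is already known; the MAP-EM re-estimation of the mixture and the specific degradation operators studied in the paper (interpolation, zooming, deblurring) are not reproduced, and all sizes and values are illustrative.

    import numpy as np

    def gmm_map_estimate(y, A, noise_var, weights, means, covs):
        """Piecewise linear estimate of x from y = A x + noise under a Gaussian
        mixture prior: select the component that best explains y and apply that
        component's Wiener filter."""
        best_score, best_x = -np.inf, None
        for w, mu, S in zip(weights, means, covs):
            # Marginal of y under this component: N(A mu, A S A^T + noise_var I)
            C = A @ S @ A.T + noise_var * np.eye(len(y))
            diff = y - A @ mu
            score = (np.log(w)
                     - 0.5 * (diff @ np.linalg.solve(C, diff)
                              + np.linalg.slogdet(C)[1]))
            # Per-component linear (Wiener) estimate of x given y
            x_hat = mu + S @ A.T @ np.linalg.solve(C, diff)
            if score > best_score:
                best_score, best_x = score, x_hat
        return best_x

    # Toy usage: an 8-dimensional signal observed through a random 4x8 operator.
    rng = np.random.default_rng(0)
    d, m, K = 8, 4, 3
    weights = np.ones(K) / K
    means = rng.normal(size=(K, d))
    covs = [s * np.eye(d) for s in (0.5, 1.0, 2.0)]
    A = rng.normal(size=(m, d))
    x_true = means[1] + 0.3 * rng.normal(size=d)
    y = A @ x_true + 0.05 * rng.normal(size=m)
    print(np.round(gmm_map_estimate(y, A, 0.05 ** 2, weights, means, covs), 2))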

505 citations


Journal ArticleDOI
TL;DR: The proposed framework employs local Fisher's discriminant analysis to reduce the dimensionality of the data while preserving its multimodal structure, while a subsequent Gaussian mixture model or support vector machine provides effective classification of the reduced-dimension multimodal data.
Abstract: Hyperspectral imagery typically provides a wealth of information captured in a wide range of the electromagnetic spectrum for each pixel in the image; however, when used in statistical pattern-classification tasks, the resulting high-dimensional feature spaces often tend to result in ill-conditioned formulations. Popular dimensionality-reduction techniques such as principal component analysis, linear discriminant analysis, and their variants typically assume a Gaussian distribution. The quadratic maximum-likelihood classifier commonly employed for hyperspectral analysis also assumes single-Gaussian class-conditional distributions. Departing from this single-Gaussian assumption, a classification paradigm designed to exploit the rich statistical structure of the data is proposed. The proposed framework employs local Fisher's discriminant analysis to reduce the dimensionality of the data while preserving its multimodal structure, while a subsequent Gaussian mixture model or support vector machine provides effective classification of the reduced-dimension multimodal data. Experimental results on several different multiple-class hyperspectral-classification tasks demonstrate that the proposed approach significantly outperforms several traditional alternatives.
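
A small sketch of the overall pipeline shape on synthetic data follows: supervised dimensionality reduction followed by per-class Gaussian mixture likelihood classification. Since local Fisher's discriminant analysis is not available in scikit-learn, ordinary linear discriminant analysis is substituted purely as a stand-in for the reduction step, so this is an illustration of the structure rather than of the paper's method.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.mixture import GaussianMixture
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for hyperspectral pixels: 60 bands, 3 classes.
    X, y = make_classification(n_samples=1500, n_features=60, n_informative=20,
                               n_classes=3, n_clusters_per_class=2, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Step 1: supervised dimensionality reduction (plain LDA here, as a stand-in
    # for the local Fisher's discriminant analysis used in the paper).
    reducer = LinearDiscriminantAnalysis(n_components=2).fit(X_tr, y_tr)
    Z_tr, Z_te = reducer.transform(X_tr), reducer.transform(X_te)

    # Step 2: one Gaussian mixture per class in the reduced space; classify each
    # pixel by the class whose mixture assigns it the highest log-likelihood.
    classes = np.unique(y_tr)
    gmms = {c: GaussianMixture(n_components=2, random_state=0).fit(Z_tr[y_tr == c])
            for c in classes}
    scores = np.column_stack([gmms[c].score_samples(Z_te) for c in classes])
    y_pred = classes[scores.argmax(axis=1)]
    print("test accuracy:", round(float((y_pred == y_te).mean()), 3))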

408 citations


Proceedings Article
16 Jun 2012
TL;DR: In this article, a method of moments approach is proposed for parameter estimation for a broad class of high-dimensional mixture models with many components, including multi-view mixtures of Gaussians and hidden Markov models.
Abstract: Mixture models are a fundamental tool in applied statistics and machine learning for treating data taken from multiple subpopulations. The current practice for estimating the parameters of such models relies on local search heuristics (e.g., the EM algorithm) which are prone to failure, and existing consistent methods are unfavorable due to their high computational and sample complexity which typically scale exponentially with the number of mixture components. This work develops an efficient method of moments approach to parameter estimation for a broad class of high-dimensional mixture models with many components, including multi-view mixtures of Gaussians (such as mixtures of axis-aligned Gaussians) and hidden Markov models. The new method leads to rigorous unsupervised learning results for mixture models that were not achieved by previous works; and, because of its simplicity, it offers a viable alternative to EM for practical deployment.

363 citations


Proceedings ArticleDOI
03 Jun 2012
TL;DR: A novel approach for trajectory prediction is proposed which has the capability to predict the vehicle's trajectory several seconds in advance, the so-called long-term prediction.
Abstract: In the context of driver assistance, an accurate and reliable prediction of the vehicle's trajectory is beneficial. This can be useful either to increase the flexibility of comfort systems or, in the more interesting case, to detect potentially dangerous situations as early as possible. In this contribution, a novel approach for trajectory prediction is proposed which has the capability to predict the vehicle's trajectory several seconds in advance, the so-called long-term prediction. To achieve this, previously observed motion patterns are used to infer a joint probability distribution as a motion model. Using this distribution, a trajectory can be predicted by calculating the probability of the future motion, conditioned on the currently observed history motion pattern. The advantage of the probabilistic modeling is that the result is not only a single prediction but a whole distribution over future trajectories; a specific prediction can be made by evaluating the statistical properties, e.g., the mean of this conditional distribution. Additionally, an evaluation of the variance can be used to examine the reliability of the prediction.
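
A generic Gaussian-mixture-regression sketch of this idea follows: fit a joint mixture model over concatenated history and future motion features, then condition on the observed history to obtain a distribution (here just its mean) over the future. The features, sizes, and motion model below are synthetic stand-ins, not the representation used in the paper.

    import numpy as np
    from scipy.stats import multivariate_normal
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)

    # Toy data: 4 history features (e.g. speeds/headings) and a 2-D future position.
    n, h_dim, f_dim = 2000, 4, 2
    speed = rng.uniform(5, 15, size=n)
    heading = rng.uniform(-0.3, 0.3, size=n)
    history = np.column_stack([speed, heading,
                               speed + rng.normal(0, 0.5, n),
                               heading + rng.normal(0, 0.05, n)])
    future = np.column_stack([3 * speed * np.cos(3 * heading),
                              3 * speed * np.sin(3 * heading)]) + rng.normal(0, 1, (n, 2))

    # Motion model: a joint GMM over concatenated [history, future] features.
    gmm = GaussianMixture(n_components=5, covariance_type='full', random_state=0)
    gmm.fit(np.hstack([history, future]))

    def predict_future(x_hist):
        """Mean of p(future | history) under the joint GMM (Gaussian mixture regression)."""
        means, covs, w = gmm.means_, gmm.covariances_, gmm.weights_
        resp = np.array([w[k] * multivariate_normal.pdf(x_hist, means[k, :h_dim],
                                                        covs[k][:h_dim, :h_dim])
                         for k in range(len(w))])
        resp /= resp.sum()
        pred = np.zeros(f_dim)
        for k in range(len(w)):
            S_hh, S_fh = covs[k][:h_dim, :h_dim], covs[k][h_dim:, :h_dim]
            cond_mean = means[k, h_dim:] + S_fh @ np.linalg.solve(S_hh, x_hist - means[k, :h_dim])
            pred += resp[k] * cond_mean
        return pred

    print("predicted:", np.round(predict_future(history[0]), 2),
          " observed:", np.round(future[0], 2))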

322 citations


Journal ArticleDOI
TL;DR: This paper presents an alternative approach to pseudo measurement modeling in the context of distribution system state estimation (DSSE), where pseudo measurements are generated from a few real measurements using artificial neural networks in conjunction with typical load profiles.
Abstract: This paper presents an alternative approach to pseudo measurement modeling in the context of distribution system state estimation (DSSE). In the proposed approach, pseudo measurements are generated from a few real measurements using artificial neural networks (ANNs) in conjunction with typical load profiles. The error associated with the generated pseudo measurements is made suitable for use in the weighted least squares (WLS) state estimation by decomposition into several components through the Gaussian mixture model (GMM). The effect of ANN-based pseudo measurement modeling on the quality of state estimation is demonstrated on a 95-bus section of the U.K. generic distribution system (UKGDS) model.
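
The sketch below illustrates only the error-decomposition idea on synthetic residuals: fit a Gaussian mixture to the (non-Gaussian) pseudo-measurement errors and use the per-component variances as weights in a WLS estimator. The ANN load modeling, the UKGDS network, and the actual state estimation are not reproduced, and all numbers are invented.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)

    # Stand-in for the errors of ANN-generated pseudo measurements: mostly small
    # deviations plus an occasional larger, biased error (clearly non-Gaussian).
    errors = np.concatenate([rng.normal(0.00, 0.02, 800),
                             rng.normal(0.05, 0.10, 200)]).reshape(-1, 1)

    # Decompose the error distribution into a few Gaussian components.
    gmm = GaussianMixture(n_components=3, random_state=0).fit(errors)
    for w, m, v in zip(gmm.weights_, gmm.means_.ravel(), gmm.covariances_.ravel()):
        print(f"weight={w:.2f}  mean={m:+.3f}  variance={v:.5f}")

    # In a WLS state estimator, each pseudo measurement would then be weighted by
    # the reciprocal of the variance of the component it is assigned to.
    assignments = gmm.predict(errors)
    wls_weights = 1.0 / gmm.covariances_.ravel()[assignments]
    print("example WLS weights:", np.round(wls_weights[:5], 1))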

Journal ArticleDOI
03 Dec 2012
TL;DR: This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of multi-view models and topic models, including latent Dirichlet allocation (LDA).
Abstract: Topic modeling is a generalization of clustering that posits that observations (words in a document) are generated by multiple latent factors (topics), as opposed to just one. The increased representational power comes at the cost of a more challenging unsupervised learning problem for estimating the topic-word distributions when only words are observed, and the topics are hidden. This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of multi-view models and topic models, including latent Dirichlet allocation (LDA). For LDA, the procedure correctly recovers both the topic-word distributions and the parameters of the Dirichlet prior over the topic mixtures, using only trigram statistics (i.e., third order moments, which may be estimated with documents containing just three words). The method is based on an efficiently computable orthogonal tensor decomposition of low-order moments.

DOI
21 May 2012
TL;DR: A new Stata command, traj, is demonstrated for fitting to longitudinal data finite (discrete) mixture models designed to identify clusters of individuals following similar progressions of some behavior or outcome over age or time.
Abstract: Group-based trajectory models are used to investigate population differences in the developmental courses of behaviors or outcomes . This article demonstrates a new Stata command, traj, for fitting to longitudinal data finite (discrete) mixture models designed to identify clusters of individuals following similar progressions of some behavior or outcome over age or time. Censored normal, Poisson, zero-inflated Poisson, and Bernoulli distributions are supported. Applications to psychometric scale data, count data, and a dichotomous prevalence measure are illustrated

Proceedings ArticleDOI
16 Jun 2012
TL;DR: This work proposes a patch based approach, where it is shown that the light field patches with the same disparity value lie on a low-dimensional subspace and that the dimensionality of such subspaces varies quadratically with the disparity value.
Abstract: With the recent availability of commercial light field cameras, we can foresee a future in which light field signals will be as commonplace as images. Hence, there is an imminent need to address the problem of light field processing. We provide a common framework for addressing many of the light field processing tasks, such as denoising, angular and spatial superresolution, etc. (in essence, all processing tasks whose observation models are linear). We propose a patch-based approach, where we model the light field patches using a Gaussian mixture model (GMM). We use the "disparity pattern" of the light field data to design the patch prior. We show that the light field patches with the same disparity value (i.e., at the same depth from the focal plane) lie on a low-dimensional subspace and that the dimensionality of such subspaces varies quadratically with the disparity value. We then model the patches as Gaussian random variables conditioned on their disparity value, thus effectively leading to a GMM model. During inference, we first find the disparity value of a patch by a fast subspace projection technique and then reconstruct it using the LMMSE algorithm. With this prior and inference algorithm, we show that we can perform many different processing tasks under a common framework.
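
The following toy sketch shows only the fast subspace-projection step described above: a patch is assigned to the candidate disparity whose subspace leaves the smallest projection residual. The subspaces here are generated synthetically rather than learned from light-field data, and the GMM prior and LMMSE reconstruction steps are omitted.

    import numpy as np

    rng = np.random.default_rng(0)
    patch_dim, sub_dim, n_disp = 64, 6, 4

    # Toy setup: one low-dimensional subspace per candidate disparity, standing in
    # for the subspaces spanned by light-field patches at a given depth.
    bases = []
    for _ in range(n_disp):
        B, _ = np.linalg.qr(rng.normal(size=(patch_dim, sub_dim)))
        bases.append(B)

    def disparity_by_projection(patch, bases):
        """Assign a patch to the disparity whose subspace leaves the smallest
        projection residual (the fast subspace-projection step)."""
        residuals = [np.linalg.norm(patch - B @ (B.T @ patch)) for B in bases]
        return int(np.argmin(residuals))

    # A new patch generated from subspace 2 (plus a little noise) should map to 2.
    test_patch = bases[2] @ rng.normal(size=sub_dim) + 0.01 * rng.normal(size=patch_dim)
    print("assigned disparity index:", disparity_by_projection(test_patch, bases))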

Posted Content
Daniel Hsu, Sham M. Kakade
TL;DR: In this paper, a simple spectral decomposition technique was used to obtain consistent parameter estimates from low-order observable moments, without additional minimum separation assumptions needed by previous computationally efficient estimation procedures.
Abstract: This work provides a computationally efficient and statistically consistent moment-based estimator for mixtures of spherical Gaussians. Under the condition that component means are in general position, a simple spectral decomposition technique yields consistent parameter estimates from low-order observable moments, without additional minimum separation assumptions needed by previous computationally efficient estimation procedures. Thus computational and information-theoretic barriers to efficient estimation in mixture models are precluded when the mixture components have means in general position and spherical covariances. Some connections are made to estimation problems related to independent component analysis.

Journal ArticleDOI
TL;DR: An adaptive image equalization algorithm that automatically enhances the contrast in an input image; it is free of parameter setting for a given dynamic range of the enhanced image and can be applied to a wide range of image types.
Abstract: In this paper, we propose an adaptive image equalization algorithm that automatically enhances the contrast in an input image. The algorithm uses the Gaussian mixture model to model the image gray-level distribution, and the intersection points of the Gaussian components in the model are used to partition the dynamic range of the image into input gray-level intervals. The contrast-equalized image is generated by transforming the pixels' gray levels in each input interval to the appropriate output gray-level interval according to the dominant Gaussian component and the cumulative distribution function of the input interval. To take account of the hypothesis that homogeneous regions in the image correspond to homogeneous sets of Gaussian components in the image histogram, the Gaussian components with small variances are weighted with smaller values than the Gaussian components with larger variances, and the gray-level distribution is also used to weight the components in the mapping of the input interval to the output interval. Experimental results show that the proposed algorithm produces enhanced images that are better than or comparable to those of several state-of-the-art algorithms. Unlike the other algorithms, the proposed algorithm is free of parameter setting for a given dynamic range of the enhanced image and can be applied to a wide range of image types.
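
A minimal sketch of the partitioning step described above follows: fit a one-dimensional Gaussian mixture to the gray-level distribution and locate the intersection points of adjacent weighted components, which define the input gray-level intervals. The interval-to-interval mapping and the variance-based weighting of the paper are not reproduced, and the gray-level data are synthetic.

    import numpy as np
    from scipy.stats import norm
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)

    # Stand-in gray-level data: a dark region, a mid-tone region, a bright region.
    gray = np.concatenate([rng.normal(50, 10, 4000),
                           rng.normal(120, 15, 5000),
                           rng.normal(200, 12, 3000)])
    gray = np.clip(gray, 0, 255)

    # Fit a 1-D Gaussian mixture to the gray-level distribution.
    gmm = GaussianMixture(n_components=3, random_state=0).fit(gray.reshape(-1, 1))
    order = np.argsort(gmm.means_.ravel())
    w = gmm.weights_[order]
    mu = gmm.means_.ravel()[order]
    sd = np.sqrt(gmm.covariances_.ravel()[order])

    # Locate the intersection of each pair of adjacent weighted components on a
    # dense grid; these points partition the dynamic range into input intervals.
    grid = np.linspace(0, 255, 2561)
    cuts = []
    for k in range(len(mu) - 1):
        diff = w[k] * norm.pdf(grid, mu[k], sd[k]) - w[k + 1] * norm.pdf(grid, mu[k + 1], sd[k + 1])
        between = (grid > mu[k]) & (grid < mu[k + 1])
        crossings = np.where(np.diff(np.sign(diff[between])) != 0)[0]
        if crossings.size:
            cuts.append(grid[between][crossings[0]])
    print("interval boundaries (gray levels):", np.round(cuts, 1))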

Book ChapterDOI
07 Oct 2012
TL;DR: This paper presents both a novel domain transform mixture model which outperforms a single transform model when multiple domains are present, and a novel constrained clustering method that successfully discovers latent domains.
Abstract: Recent domain adaptation methods successfully learn cross-domain transforms to map points between source and target domains. Yet, these methods are either restricted to a single training domain, or assume that the separation into source domains is known a priori. However, most available training data contains multiple unknown domains. In this paper, we present both a novel domain transform mixture model which outperforms a single transform model when multiple domains are present, and a novel constrained clustering method that successfully discovers latent domains. Our discovery method is based on a novel hierarchical clustering technique that uses available object category information to constrain the set of feasible domain separations. To illustrate the effectiveness of our approach we present experiments on two commonly available image datasets with and without known domain labels: in both cases our method outperforms baseline techniques which use no domain adaptation or domain adaptation methods that presume a single underlying domain shift.

Posted Content
TL;DR: Excess correlation analysis (ECA) as mentioned in this paper is based on a spectral decomposition of low order moments (third and fourth order) via two singular value decompositions (SVDs).
Abstract: The problem of topic modeling can be seen as a generalization of the clustering problem, in that it posits that observations are generated due to multiple latent factors (e.g., the words in each document are generated as a mixture of several active topics, as opposed to just one). This increased representational power comes at the cost of a more challenging unsupervised learning problem of estimating the topic probability vectors (the distributions over words for each topic), when only the words are observed and the corresponding topics are hidden. We provide a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of mixture models, including the popular latent Dirichlet allocation (LDA) model. For LDA, the procedure correctly recovers both the topic probability vectors and the prior over the topics, using only trigram statistics (i.e., third order moments, which may be estimated with documents containing just three words). The method, termed Excess Correlation Analysis (ECA), is based on a spectral decomposition of low order moments (third and fourth order) via two singular value decompositions (SVDs). Moreover, the algorithm is scalable since the SVD operations are carried out on $k\times k$ matrices, where $k$ is the number of latent factors (e.g. the number of topics), rather than in the $d$-dimensional observed space (typically $d \gg k$).

Journal ArticleDOI
TL;DR: A robust EM clustering algorithm for Gaussian mixture models is developed, first creating a new way to solve initialization problems, and then constructing a schema to automatically obtain an optimal number of clusters.

Proceedings ArticleDOI
16 Jun 2012
TL;DR: This paper introduces a novel model, the Robust Boltzmann Machine (RoBM), which allows BoltZmann Machines to be robust to corruptions and is significantly better at recognition and denoising on several face databases.
Abstract: While Boltzmann Machines have been successful at unsupervised learning and density modeling of images and speech data, they can be very sensitive to noise in the data. In this paper, we introduce a novel model, the Robust Boltzmann Machine (RoBM), which allows Boltzmann Machines to be robust to corruptions. In the domain of visual recognition, the RoBM is able to accurately deal with occlusions and noise by using multiplicative gating to induce a scale mixture of Gaussians over pixels. Image denoising and in-painting correspond to posterior inference in the RoBM. Our model is trained in an unsupervised fashion with unlabeled noisy data and can learn the spatial structure of the occluders. Compared to standard algorithms, the RoBM is significantly better at recognition and denoising on several face databases.

Journal ArticleDOI
Jie Yu
TL;DR: The proposed NKGMM approach outperforms the ICA and GMM methods in early detection of process faults, minimization of false alarms, and isolation of faulty variables of nonlinear and non-Gaussian multimode processes.

Journal ArticleDOI
TL;DR: A new statistical sharpness measure is proposed by exploiting the spreading of the wavelet coefficients distribution to measure the degree of the image's blur and is exploited to perform adaptive image fusion in wavelet domain.

Proceedings Article
03 Dec 2012
TL;DR: A probabilistic model, the Logistic Stick-Breaking Conditional Multinomial Model (LSB-CMM), is proposed, derived from the logistic stick-breaking process, to solve the superset label learning problem by maximizing the likelihood of the candidate label sets of training instances.
Abstract: In the superset label learning problem (SLL), each training instance provides a set of candidate labels of which one is the true label of the instance. As in ordinary regression, the candidate label set is a noisy version of the true label. In this work, we solve the problem by maximizing the likelihood of the candidate label sets of training instances. We propose a probabilistic model, the Logistic Stick-Breaking Conditional Multinomial Model (LSB-CMM), to do the job. The LSB-CMM is derived from the logistic stick-breaking process. It first maps data points to mixture components and then assigns to each mixture component a label drawn from a component-specific multinomial distribution. The mixture components can capture underlying structure in the data, which is very useful when the model is weakly supervised. This advantage comes at little cost, since the model introduces few additional parameters. Experimental tests on several real-world problems with superset labels show results that are competitive or superior to the state of the art. The discovered underlying structures also provide improved explanations of the classification predictions.

Book ChapterDOI
07 Oct 2012
TL;DR: A new hierarchical spatial model that can capture an exponential number of poses with a compact mixture representation on each part using latent nodes so that it can represent high-order spatial relationship among parts with exact inference.
Abstract: Human pose estimation requires a versatile yet well-constrained spatial model for grouping locally ambiguous parts together to produce a globally consistent hypothesis. Previous works either use local deformable models deviating from a certain template, or use a global mixture representation in the pose space. In this paper, we propose a new hierarchical spatial model that can capture an exponential number of poses with a compact mixture representation on each part. Using latent nodes, it can represent high-order spatial relationship among parts with exact inference. Different from recent hierarchical models that associate each latent node to a mixture of appearance templates (like HoG), we use the hierarchical structure as a pure spatial prior avoiding the large and often confounding appearance space. We verify the effectiveness of this model in three ways. First, samples representing human-like poses can be drawn from our model, showing its ability to capture high-order dependencies of parts. Second, our model achieves accurate reconstruction of unseen poses compared to a nearest neighbor pose representation. Finally, our model achieves state-of-art performance on three challenging datasets, and substantially outperforms recent hierarchical models.

Journal ArticleDOI
TL;DR: A novel family of mixture models wherein each component is modeled using a multivariate t-distribution with an eigen-decomposed covariance structure is put forth, known as the tEIGEN family.
Abstract: The last decade has seen an explosion of work on the use of mixture models for clustering. The use of the Gaussian mixture model has been common practice, with constraints sometimes imposed upon the component covariance matrices to give families of mixture models. Similar approaches have also been applied, albeit with less fecundity, to classification and discriminant analysis. In this paper, we begin with an introduction to model-based clustering and a succinct account of the state-of-the-art. We then put forth a novel family of mixture models wherein each component is modeled using a multivariate t-distribution with an eigen-decomposed covariance structure. This family, which is largely a t-analogue of the well-known MCLUST family, is known as the tEIGEN family. The efficacy of this family for clustering, classification, and discriminant analysis is illustrated with both real and simulated data. The performance of this family is compared to its Gaussian counterpart on three real data sets.

Journal ArticleDOI
TL;DR: A statistical model is developed that incorporates an adjusted limited dependent variable approach to reflect the upper bound and the large gap in feasible EQ-5D questionnaire values, and demonstrates superior performance in a rheumatoid arthritis setting.

Book ChapterDOI
07 Oct 2012
TL;DR: This paper proposes a new image representation for texture categorization and facial analysis, relying on the use of higher-order local differential statistics as features, which consistently achieves state-of-the-art performance on challenging texture and facial analysis datasets, outperforming contemporary methods.
Abstract: This paper proposes a new image representation for texture categorization and facial analysis, relying on the use of higher-order local differential statistics as features. In contrast with models based on the global structure of textures and faces, it has been shown recently that small local pixel pattern distributions can be highly discriminative. Motivated by such works, the proposed model employs higher-order statistics of local non-binarized pixel patterns for the image description. Hence, in addition to being remarkably simple, it requires neither any user specified quantization of the space (of pixel patterns) nor any heuristics for discarding low occupancy volumes of the space. This leads to a more expressive representation which, when combined with discriminative SVM classifier, consistently achieves state-of-the-art performance on challenging texture and facial analysis datasets outperforming contemporary methods (with similar powerful classifiers).

Journal ArticleDOI
TL;DR: An update to the Cyber-T web server is presented, incorporating several useful new additions and improvements; several methods for multiple-test correction, including standard frequentist methods and a probabilistic mixture model treatment, are available.
Abstract: The Bayesian regularization method for high-throughput differential analysis, described in Baldi and Long (A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics 2001: 17: 509-519) and implemented in the Cyber-T web server, is one of the most widely validated. Cyber-T implements a t-test using a Bayesian framework to compute a regularized variance of the measurements associated with each probe under each condition. This regularized estimate is derived by flexibly combining the empirical measurements with a prior, or background, derived from pooling measurements associated with probes in the same neighborhood. This approach flexibly addresses problems associated with low replication levels and technology biases, not only for DNA microarrays, but also for other technologies, such as protein arrays, quantitative mass spectrometry and next-generation sequencing (RNA-seq). Here we present an update to the Cyber-T web server, incorporating several useful new additions and improvements. Several preprocessing data-normalization options, including logarithmic and variance-stabilizing normalization (VSN) transforms, are included. To augment two-sample t-tests, a one-way analysis of variance is implemented. Several methods for multiple-test correction, including standard frequentist methods and a probabilistic mixture model treatment, are available. Diagnostic plots allow visual assessment of the results. The web server provides comprehensive documentation and example data sets. The Cyber-T web server, with R source code and data sets, is publicly available at http://cybert.ics.uci.edu/.
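
The snippet below sketches the general shape of a regularized two-sample t-statistic in which each group's sample variance is shrunk toward a background variance pooled from neighboring probes; the specific weighting shown (a pseudo-count v0 and a v0 + n - 2 denominator) is one common convention and may not match Cyber-T's exact implementation, and the data are invented.

    import numpy as np

    def regularized_tstat(x1, x2, bg_var1, bg_var2, v0=10):
        """Two-sample t-statistic with regularized variances: each group's sample
        variance is shrunk toward a background variance (e.g. pooled from probes
        with similar expression), weighted by a pseudo-count v0. The weighting
        convention here is illustrative and may differ from Cyber-T's."""
        def reg_var(x, bg_var):
            n, s2 = len(x), np.var(x, ddof=1)
            return (v0 * bg_var + (n - 1) * s2) / (v0 + n - 2)
        v1, v2 = reg_var(x1, bg_var1), reg_var(x2, bg_var2)
        se = np.sqrt(v1 / len(x1) + v2 / len(x2))
        return (np.mean(x1) - np.mean(x2)) / se

    # Low replication (3 vs 3), as is common in microarray experiments.
    rng = np.random.default_rng(0)
    control = rng.normal(8.0, 0.4, 3)
    treated = rng.normal(9.0, 0.4, 3)
    print(round(regularized_tstat(control, treated, bg_var1=0.2, bg_var2=0.2), 2))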

Proceedings ArticleDOI
26 Aug 2012
TL;DR: Two novel unsupervised methods based on the notions of Non-Localness and Geometric-Localness to prune noisy data from tweet messages are proposed; they improve the baselines significantly and show comparable results with the supervised state-of-the-art method.
Abstract: We study the problem of predicting home locations of Twitter users using contents of their tweet messages. Using three probability models for locations, we compare both the Gaussian Mixture Model (GMM) and the Maximum Likelihood Estimation (MLE). In addition, we propose two novel unsupervised methods based on the notions of Non-Localness and Geometric-Localness to prune noisy data from tweet messages. In the experiments, our unsupervised approach improves the baselines significantly and shows comparable results with the supervised state-of-the-art method. For 5,113 Twitter users in the test set, on average, our approach with only 250 selected local words or less is able to predict their home locations (within 100 miles) with the accuracy of 0.499, or has 509.3 miles of average error distance at best.
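
As a toy illustration of the GMM-based scoring idea (not the paper's models, word-selection methods, or data), the sketch below fits a small geographic Gaussian mixture per "local" word and predicts a user's home location as the candidate point that maximizes the summed log-density of the words they use.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)

    # Toy geo-tagged training data: each "local" word has its own (lat, lon) cloud.
    word_centers = {"bagel": (40.7, -74.0), "gumbo": (30.0, -90.1), "cablecar": (37.8, -122.4)}
    word_models = {}
    for word, (lat, lon) in word_centers.items():
        coords = np.column_stack([rng.normal(lat, 0.8, 500), rng.normal(lon, 0.8, 500)])
        word_models[word] = GaussianMixture(n_components=2, random_state=0).fit(coords)

    def predict_home(user_words, candidates):
        """Score each candidate location by the summed log-density of the user's
        words under their geographic mixtures; return the best-scoring candidate."""
        cand = np.asarray(candidates, dtype=float)
        scores = np.zeros(len(cand))
        for w in user_words:
            if w in word_models:
                scores += word_models[w].score_samples(cand)
        return candidates[int(np.argmax(scores))]

    candidates = [(40.7, -74.0), (30.0, -90.1), (37.8, -122.4)]
    print(predict_home(["gumbo", "gumbo", "bagel"], candidates))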

Journal ArticleDOI
TL;DR: Expectation-maximization algorithms for fitting multivariate Gaussian mixture models to data that are truncated, censored or truncated and censored are presented.