
Showing papers on "Expectation–maximization algorithm published in 2007"


01 Jan 2007
TL;DR: A number of features of the software have been changed in this version, and the functionality has been expanded to include regularization for normal mixture models via a Bayesian prior.
Abstract: MCLUST is a contributed R package for normal mixture modeling and model-based clustering. It provides functions for parameter estimation via the EM algorithm for normal mixture models with a variety of covariance structures, and functions for simulation from these models. Also included are functions that combine model-based hierarchical clustering, EM for mixture estimation and the Bayesian Information Criterion (BIC) in comprehensive strategies for clustering, density estimation and discriminant analysis. There is additional functionality for displaying and visualizing the models along with clustering and classification results. A number of features of the software have been changed in this version, and the functionality has been expanded to include regularization for normal mixture models via a Bayesian prior. A web page with related links including license information can be found at http://www.stat.washington.edu/mclust.

494 citations
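
As a rough illustration of the same workflow (EM fitting of normal mixtures under several covariance structures, plus BIC-based model selection), here is a minimal Python sketch using scikit-learn; it is not the mclust R package itself, and the data and parameter choices are made up for the example.

```python
# Minimal sketch: EM for Gaussian mixtures over several covariance structures,
# with BIC used to pick the number of components and the parameterization
# (analogous to what MCLUST automates; scikit-learn stands in for the R package).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(4, 0.5, (80, 2))])  # toy data

best_model, best_bic = None, np.inf
for k in range(1, 6):
    for cov in ("spherical", "diag", "full"):          # a few covariance structures
        gm = GaussianMixture(n_components=k, covariance_type=cov,
                             random_state=0).fit(X)    # EM parameter estimation
        bic = gm.bic(X)                                # lower BIC is better here
        if bic < best_bic:
            best_model, best_bic = gm, bic

labels = best_model.predict(X)                         # model-based clustering result
```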


Journal ArticleDOI
TL;DR: A modified version of BIC is proposed, where the likelihood is evaluated at the MAP instead of the MLE, and the resulting method avoids degeneracies and singularities, but when these are not present it gives similar results to the standard method using MLE.
Abstract: Normal mixture models are widely used for statistical modeling of data, including cluster analysis. However, maximum likelihood estimation (MLE) for normal mixtures using the EM algorithm may fail as the result of singularities or degeneracies. To avoid this, we propose replacing the MLE by a maximum a posteriori (MAP) estimator, also found by the EM algorithm. For choosing the number of components and the model parameterization, we propose a modified version of BIC, where the likelihood is evaluated at the MAP instead of the MLE. We use a highly dispersed proper conjugate prior, containing a small fraction of one observation's worth of information. The resulting method avoids degeneracies and singularities, but when these are not present it gives similar results to the standard method using MLE, EM and BIC.

434 citations
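
To make the idea concrete, here is a hedged sketch (not the paper's exact prior or update) of how an inverse-Wishart-type conjugate prior regularizes the covariance update in the M-step, so that a component sitting on very few points cannot produce a singular covariance matrix; the function and parameter names are illustrative.

```python
# Sketch of a MAP-regularized M-step covariance update for one mixture component.
# prior_scale and prior_df play the role of a weak conjugate (inverse-Wishart-type)
# prior; the exact prior used in the paper may differ.
import numpy as np

def map_covariance(X, resp_k, mu_k, prior_scale, prior_df):
    n_k = resp_k.sum()                                   # effective count for component k
    diff = X - mu_k
    scatter = (resp_k[:, None] * diff).T @ diff          # responsibility-weighted scatter
    d = X.shape[1]
    # Adding the prior scale matrix keeps the estimate positive definite even when
    # the component collapses onto one or two points (the degeneracies MLE suffers from).
    return (prior_scale + scatter) / (prior_df + n_k + d + 2)
```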


Proceedings Article
22 Jul 2007
TL;DR: This paper proposes a novel transfer-learning algorithm for text classification based on an EM-based Naive Bayes classifier and shows that the algorithm outperforms the traditional supervised and semi-supervised learning algorithms when the distributions of the training and test sets are increasingly different.
Abstract: A basic assumption in traditional machine learning is that the training and test data distributions should be identical. This assumption may not hold in many situations in practice, but we may be forced to rely on data from a different distribution to learn a prediction model. For example, this may be the case when it is expensive to label the data in a domain of interest, although in a related but different domain there may be plenty of labeled data available. In this paper, we propose a novel transfer-learning algorithm for text classification based on an EM-based Naive Bayes classifier. Our solution is to first estimate the initial probabilities under a distribution Dl of one labeled data set, and then use an EM algorithm to revise the model for a different distribution Du of the test data which are unlabeled. We show that our algorithm is very effective in several different pairs of domains, where the distances between the different distributions are measured using the Kullback-Leibler (KL) divergence. Moreover, KL-divergence is used to decide the trade-off parameters in our algorithm. In the experiments, our algorithm outperforms the traditional supervised and semi-supervised learning algorithms when the distributions of the training and test sets are increasingly different.

392 citations
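
A rough Python sketch of this EM-plus-Naive-Bayes scheme is given below, assuming word-count feature matrices: it initializes on the labeled source set and then alternates soft labeling of the unlabeled target documents with refitting. It simplifies the paper's method (no KL-based trade-off weighting), and all names are illustrative.

```python
# Hedged sketch: Naive Bayes initialized on labeled source data (D_l), then revised
# by EM on unlabeled target data (D_u). X_src, X_tgt are nonnegative count matrices.
import numpy as np
from sklearn.naive_bayes import MultinomialNB

def em_transfer_nb(X_src, y_src, X_tgt, n_iter=10):
    clf = MultinomialNB().fit(X_src, y_src)              # initial model from D_l
    classes = clf.classes_
    for _ in range(n_iter):
        post = clf.predict_proba(X_tgt)                   # E-step: soft labels on D_u
        # M-step: refit with each target document replicated once per class and
        # weighted by its posterior probability for that class.
        X_rep = np.vstack([X_tgt] * len(classes))
        y_rep = np.repeat(classes, X_tgt.shape[0])
        clf = MultinomialNB().fit(X_rep, y_rep, sample_weight=post.T.ravel())
    return clf
```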


Journal ArticleDOI
TL;DR: In this paper, the authors present several classes of semiparametric regression models, which extend the existing models in important directions, and construct appropriate likelihood functions involving both finite dimensional and infinite dimensional parameters.
Abstract: Summary. Semiparametric regression models play a central role in formulating the effects of covariates on potentially censored failure times and in the joint modelling of incomplete repeated measures and failure times in longitudinal studies. The presence of infinite dimensional parameters poses considerable theoretical and computational challenges in the statistical analysis of such models. We present several classes of semiparametric regression models, which extend the existing models in important directions. We construct appropriate likelihood functions involving both finite dimensional and infinite dimensional parameters. The maximum likelihood estimators are consistent and asymptotically normal with efficient variances. We develop simple and stable numerical techniques to implement the corresponding inference procedures. Extensive simulation experiments demonstrate that the inferential and computational methods proposed perform well in practical settings. Applications to three medical studies yield important new insights. We conclude that there is no reason, theoretical or numerical, not to use maximum likelihood estimation for semiparametric regression models. We discuss several areas that need further research.

314 citations


Journal Article
TL;DR: A penalized likelihood approach with an L1 penalty function is proposed, automatically realizing variable selection via thresholding and delivering a sparse solution in model-based clustering analysis with a common diagonal covariance matrix.
Abstract: Variable selection in clustering analysis is both challenging and important. In the context of model-based clustering analysis with a common diagonal covariance matrix, which is especially suitable for "high dimension, low sample size" settings, we propose a penalized likelihood approach with an L1 penalty function, automatically realizing variable selection via thresholding and delivering a sparse solution. We derive an EM algorithm to fit our proposed model, and propose a modified BIC as a model selection criterion to choose the number of components and the penalization parameter. A simulation study and an application to gene function prediction with gene expression profiles demonstrate the utility of our method.

307 citations
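
The key computational point is that, with a common diagonal covariance and an L1 penalty on the (standardized) component means, the M-step mean update reduces to a soft-thresholding operation, which is what zeroes out uninformative variables. A hedged sketch of that single update (names and the exact thresholding constant are illustrative) follows.

```python
# Sketch of the penalized M-step for one component's mean under an L1 penalty:
# the ordinary weighted mean is shrunk towards zero by soft thresholding.
import numpy as np

def soft_threshold(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def penalized_mean_update(X, resp_k, sigma2, lam):
    n_k = resp_k.sum()
    mu_hat = (resp_k @ X) / n_k                          # standard EM mean update
    return soft_threshold(mu_hat, lam * sigma2 / n_k)    # small means are set exactly to 0
```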


Journal ArticleDOI
Coşkun Kuş
TL;DR: The EM algorithm is used to determine the maximum likelihood estimates, the asymptotic variances and covariances of these estimates are obtained, and the convergence of the proposed EM scheme is investigated.

297 citations


Journal ArticleDOI
TL;DR: A rigorous Bayesian framework is proposed for which it is proved asymptotic consistency of the maximum a posteriori estimate and which leads to an effective iterative estimation algorithm of the geometric and photometric parameters in the small sample setting.
Abstract: Summary. The problem of estimating probabilistic deformable template models in the field of computer vision or of probabilistic atlases in the field of computational anatomy has not yet received a coherent statistical formulation and remains a challenge. We provide a careful definition and analysis of a well-defined statistical model based on dense deformable templates for grey level images of deformable objects. We propose a rigorous Bayesian framework for which we prove asymptotic consistency of the maximum a posteriori estimate and which leads to an effective iterative estimation algorithm of the geometric and photometric parameters in the small sample setting. The model is extended to mixtures of finite numbers of such components leading to a fine description of the photometric and geometric variations of an object class. We illustrate some of the ideas with images of handwritten digits and apply the estimated models to classification through maximum likelihood.

261 citations


Book ChapterDOI
06 Sep 2007
TL;DR: A new hill climbing procedure for Gaussian kernels, which adjusts the step size automatically at no extra cost, is introduced, and it is proved that the procedure converges exactly towards a local maximum by reducing it to a special case of the expectation maximization algorithm.
Abstract: The Denclue algorithm employs a cluster model based on kernel density estimation. A cluster is defined by a local maximum of the estimated density function. Data points are assigned to clusters by hill climbing, i.e., points going to the same local maximum are put into the same cluster. A disadvantage of Denclue 1.0 is that the hill climbing used may make unnecessarily small steps in the beginning and never converges exactly to the maximum, it just comes close. We introduce a new hill climbing procedure for Gaussian kernels, which adjusts the step size automatically at no extra cost. We prove that the procedure converges exactly towards a local maximum by reducing it to a special case of the expectation maximization algorithm. We show experimentally that the new procedure needs many fewer iterations and can be accelerated by sampling-based methods while sacrificing only a small amount of accuracy.

242 citations


Journal ArticleDOI
TL;DR: A novel parametric and global image histogram thresholding method based on the estimation of the statistical parameters of "object" and "background" classes by the expectation-maximization (EM) algorithm, under the assumption that these two classes follow a generalized Gaussian (GG) distribution.

238 citations


Journal Article
TL;DR: In this paper, the problem of analyzing a mixture of skew normal distributions from the likelihood-based and Bayesian perspectives is addressed, and a fully Bayesian approach using the Markov chain Monte Carlo method is developed to carry out posterior analyses.
Abstract: Normal mixture models provide the most popular framework for modelling heterogeneity in a population with continuous outcomes arising in a variety of subclasses. In the last two decades, the skew normal distribution has been shown beneficial in dealing with asymmetric data in various theoretic and applied problems. In this article, we address the problem of analyzing a mixture of skew normal distributions from the likelihood-based and Bayesian perspectives, respectively. Computational techniques using EM-type algorithms are employed for iteratively computing maximum likelihood estimates. Also, a fully Bayesian approach using the Markov chain Monte Carlo method is developed to carry out posterior analyses. Numerical results are illustrated through two examples.

205 citations


Journal ArticleDOI
TL;DR: It is shown that, when the kernel is Gaussian, mean-shift is an expectation-maximization (EM) algorithm and, when the kernel is non-Gaussian, mean-shift is a generalized EM algorithm, and that, in general, its convergence is of linear order.
Abstract: The mean-shift algorithm, based on ideas proposed by Fukunaga and Hostetler, is a hill-climbing algorithm on the density defined by a finite mixture or a kernel density estimate. Mean-shift can be used as a nonparametric clustering method and has attracted recent attention in computer vision applications such as image segmentation or tracking. We show that, when the kernel is Gaussian, mean-shift is an expectation-maximization (EM) algorithm and, when the kernel is non-Gaussian, mean-shift is a generalized EM algorithm. This implies that mean-shift converges from almost any starting point and that, in general, its convergence is of linear order. For Gaussian mean-shift, we show: 1) the rate of linear convergence approaches 0 (superlinear convergence) for very narrow or very wide kernels, but is often close to 1 (thus, extremely slow) for intermediate widths and exactly 1 (sublinear convergence) for widths at which modes merge; 2) the iterates approach the mode along the local principal component of the data points from the inside of the convex hull of the data points; and 3) the convergence domains are nonconvex, can be disconnected, and show fractal behavior. We suggest ways of accelerating mean-shift based on the EM interpretation.
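
The Gaussian mean-shift iteration itself is short enough to state as code; the sketch below (with illustrative names, not the authors' implementation) shows the fixed-point step that the paper identifies with an E-step (computing responsibilities) followed by an M-step (taking the weighted mean).

```python
# Hedged sketch of Gaussian mean-shift hill climbing on a kernel density estimate.
import numpy as np

def gaussian_mean_shift(x, data, bandwidth, n_iter=500, tol=1e-8):
    for _ in range(n_iter):
        d2 = ((data - x) ** 2).sum(axis=1)
        w = np.exp(-0.5 * d2 / bandwidth ** 2)                # "E-step": kernel weights
        x_new = (w[:, None] * data).sum(axis=0) / w.sum()     # "M-step": weighted mean
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x                                                  # approximate mode location
```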

Journal Article
TL;DR: It is shown that, given data from a mixture of k well-separated spherical Gaussians in ℝ^d, a simple two-round variant of EM will, with high probability, learn the parameters of the Gaussians to near-optimal precision, if the dimension is high.
Abstract: We show that, given data from a mixture of k well-separated spherical Gaussians in ℝ^d, a simple two-round variant of EM will, with high probability, learn the parameters of the Gaussians to near-optimal precision, if the dimension is high (d >> ln k). We relate this to previous theoretical and empirical work on the EM algorithm.

Posted Content
TL;DR: Two new algorithms for solving problems with at least a thousand nodes in the Gaussian case are presented, based on Nesterov's first order method, which yields a complexity estimate with a better dependence on problem size than existing interior point methods.
Abstract: We consider the problem of estimating the parameters of a Gaussian or binary distribution in such a way that the resulting undirected graphical model is sparse. Our approach is to solve a maximum likelihood problem with an added l_1-norm penalty term. The problem as formulated is convex, but the memory requirements and complexity of existing interior point methods are prohibitive for problems with more than tens of nodes. We present two new algorithms for solving problems with at least a thousand nodes in the Gaussian case. Our first algorithm uses block coordinate descent, and can be interpreted as recursive l_1-norm penalized regression. Our second algorithm, based on Nesterov's first order method, yields a complexity estimate with a better dependence on problem size than existing interior point methods. Using a log determinant relaxation of the log partition function (Wainwright & Jordan (2006)), we show that these same algorithms can be used to solve an approximate sparse maximum likelihood problem for the binary case. We test our algorithms on synthetic data, as well as on gene expression and senate voting records data.
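
The paper's own solvers are block coordinate descent and Nesterov's first-order method; purely as an illustration of the optimization problem being solved (l_1-penalized Gaussian maximum likelihood), scikit-learn's GraphicalLasso fits the same objective on small examples. The data and penalty weight below are arbitrary.

```python
# Illustration of the l1-penalized Gaussian maximum likelihood ("sparse inverse
# covariance") problem, solved here with scikit-learn rather than the paper's solvers.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                   # synthetic data with 10 variables

model = GraphicalLasso(alpha=0.1).fit(X)         # alpha is the l1 penalty weight
precision = model.precision_                     # estimated (sparse) inverse covariance
print(int((np.abs(precision) > 1e-8).sum()), "nonzero precision entries")
```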

Journal ArticleDOI
TL;DR: This article proposes a robust mixture framework based on the skew t distribution to efficiently deal with heavy-tailedness, extra skewness and multimodality in a wide range of settings and presents analytically simple EM-type algorithms for iteratively computing maximum likelihood estimates.
Abstract: A finite mixture model using the Student's t distribution has been recognized as a robust extension of normal mixtures. Recently, a mixture of skew normal distributions has been found to be effective in the treatment of heterogeneous data involving asymmetric behaviors across subclasses. In this article, we propose a robust mixture framework based on the skew t distribution to efficiently deal with heavy-tailedness, extra skewness and multimodality in a wide range of settings. Statistical mixture modeling based on normal, Student's t and skew normal distributions can be viewed as special cases of the skew t mixture model. We present analytically simple EM-type algorithms for iteratively computing maximum likelihood estimates. The proposed methodology is illustrated by analyzing a real data example.

Proceedings Article
03 Dec 2007
TL;DR: This paper presents an efficient, principled way to inject rich constraints on the posteriors of latent variables into the EM algorithm, and shows that simple, intuitive posterior constraints can greatly improve the performance over standard baselines and be competitive with more complex, intractable models.
Abstract: The expectation maximization (EM) algorithm is a widely used maximum likelihood estimation procedure for statistical models when the values of some of the variables in the model are not observed. Very often, however, our aim is primarily to find a model that assigns values to the latent variables that have intended meaning for our data and maximizing expected likelihood only sometimes accomplishes this. Unfortunately, it is typically difficult to add even simple a-priori information about latent variables in graphical models without making the models overly complex or intractable. In this paper, we present an efficient, principled way to inject rich constraints on the posteriors of latent variables into the EM algorithm. Our method can be used to learn tractable graphical models that satisfy additional, otherwise intractable constraints. Focusing on clustering and the alignment problem for statistical machine translation, we show that simple, intuitive posterior constraints can greatly improve the performance over standard baselines and be competitive with more complex, intractable models.

Journal ArticleDOI
TL;DR: This work examines two natural bivariate von Mises distributions--referred to as Sine and Cosine models--which have five parameters and, for concentrated data, tend to a bivariate normal distribution, and sees that the Cosine model may be preferred.
Abstract: Summary. A fundamental problem in bioinformatics is to characterize the secondary structure of a protein, which has traditionally been carried out by examining a scatterplot (Ramachandran plot) of the conformational angles. We examine two natural bivariate von Mises distributions—referred to as Sine and Cosine models—which have five parameters and, for concentrated data, tend to a bivariate normal distribution. These are analyzed and their main properties derived. Conditions on the parameters are established which result in bimodal behavior for the joint density and the marginal distribution, and we note an interesting situation in which the joint density is bimodal but the marginal distributions are unimodal. We carry out comparisons of the two models, and it is seen that the Cosine model may be preferred. Mixture distributions of the Cosine model are fitted to two representative protein datasets using the expectation maximization algorithm, which results in an objective partition of the scatterplot into a number of components. Our results are consistent with empirical observations; new insights are discussed.
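
For reference, the Sine and Cosine bivariate von Mises densities are commonly written in the literature, up to normalizing constants, as below; the paper's own parameterization may differ in sign conventions, so treat this as a reminder rather than the authors' exact notation.

```latex
% Sine model (five parameters: \mu, \nu, \kappa_1, \kappa_2, \lambda)
f_{\mathrm{sine}}(\phi,\psi) \propto
  \exp\{\kappa_1\cos(\phi-\mu) + \kappa_2\cos(\psi-\nu)
        + \lambda\sin(\phi-\mu)\sin(\psi-\nu)\}

% Cosine model (five parameters: \mu, \nu, \kappa_1, \kappa_2, \kappa_3)
f_{\mathrm{cosine}}(\phi,\psi) \propto
  \exp\{\kappa_1\cos(\phi-\mu) + \kappa_2\cos(\psi-\nu)
        - \kappa_3\cos(\phi-\mu-\psi+\nu)\}
```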

Journal ArticleDOI
TL;DR: In this article, a generic online version of the Expectation-Maximization (EM) algorithm is proposed for latent variable models of independent observations, which is more directly connected to the usual EM algorithm and does not rely on integration with respect to the complete data distribution.
Abstract: In this contribution, we propose a generic online (also sometimes called adaptive or recursive) version of the Expectation-Maximisation (EM) algorithm applicable to latent variable models of independent observations. Compared to the algorithm of Titterington (1984), this approach is more directly connected to the usual EM algorithm and does not rely on integration with respect to the complete data distribution. The resulting algorithm is usually simpler and is shown to achieve convergence to the stationary points of the Kullback-Leibler divergence between the marginal distribution of the observation and the model distribution at the optimal rate, i.e., that of the maximum likelihood estimator. In addition, the proposed approach is also suitable for conditional (or regression) models, as illustrated in the case of the mixture of linear regressions model.
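
A hedged sketch of this online EM idea for a simple univariate Gaussian mixture is given below: running averages of the complete-data sufficient statistics are updated with a decreasing step size after each observation, and the parameters are re-read from those averages. The step-size schedule and initialization are illustrative choices, not the paper's.

```python
# Sketch of online EM for a univariate K-component Gaussian mixture.
import numpy as np

def online_em_gmm(stream, K=2, rate=0.6):
    rng = np.random.default_rng(0)
    w, mu, var = np.full(K, 1.0 / K), rng.normal(size=K), np.ones(K)
    s0, s1, s2 = w.copy(), w * mu, w * (var + mu ** 2)        # running sufficient statistics
    for n, y in enumerate(stream, start=1):
        gamma = n ** (-rate)                                   # decreasing step size
        # E-step for the single new observation: posterior responsibilities
        logp = np.log(w) - 0.5 * (np.log(var) + (y - mu) ** 2 / var)
        r = np.exp(logp - logp.max()); r /= r.sum()
        # Stochastic-approximation update of the sufficient statistics
        s0 = (1 - gamma) * s0 + gamma * r
        s1 = (1 - gamma) * s1 + gamma * r * y
        s2 = (1 - gamma) * s2 + gamma * r * y ** 2
        # M-step: map the averaged statistics back to parameters
        w, mu = s0 / s0.sum(), s1 / s0
        var = np.maximum(s2 / s0 - mu ** 2, 1e-6)
    return w, mu, var
```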

01 Jan 2007
TL;DR: A systematic exploration of expectation maximization methods based both on the Lanczos algorithm and power iteration for recommenders based solely on low rank approximations of the rating matrix.
Abstract: We compare recommenders based solely on low rank approximations of the rating matrix. The key difficulty lies in the sparseness of the known ratings within the matrix, which causes expectation maximization algorithms to converge very slowly. Among the prior publicly known attempts at this problem, a gradient boosting approach proved most successful, in spite of the fact that the resulting vectors are nonorthogonal and prone to numeric errors. We systematically explore expectation maximization methods based both on the Lanczos algorithm and power iteration; novel in this paper is the efficient handling of the dense estimate matrix used as input to a next iteration. We also compare sequence transformation methods to speed up convergence.

Journal ArticleDOI
TL;DR: A maximum likelihood/expectation maximization tomographic reconstruction algorithm designed for the technique which exploits the multiple Coulomb scattering of muon particles to perform nondestructive inspection without the use of artificial radiation.
Abstract: Highly penetrating cosmic ray muons constantly shower the earth at a rate of about 1 muon per cm2 per minute. We have developed a technique which exploits the multiple Coulomb scattering of these particles to perform nondestructive inspection without the use of artificial radiation. In prior work, we have described heuristic methods for processing muon data to create reconstructed images. In this paper, we present a maximum likelihood/expectation maximization tomographic reconstruction algorithm designed for the technique. This algorithm borrows much from techniques used in medical imaging, particularly emission tomography, but the statistics of muon scattering dictates differences. We describe the statistical model for multiple scattering, derive the reconstruction algorithm, and present simulated examples. We also propose methods to improve the robustness of the algorithm to experimental errors and events departing from the statistical model.

Journal ArticleDOI
TL;DR: A new family of smoothness priors for the label probabilities in spatially variant mixture models with Gauss-Markov random field-based priors is proposed, which allow all their parameters to be estimated in closed form via the maximum a posteriori (MAP) estimation using the expectation-maximization methodology.
Abstract: We propose a new approach for image segmentation based on a hierarchical and spatially variant mixture model. According to this model, the pixel labels are random variables and a smoothness prior is imposed on them. The main novelty of this work is a new family of smoothness priors for the label probabilities in spatially variant mixture models. These Gauss-Markov random field-based priors allow all their parameters to be estimated in closed form via the maximum a posteriori (MAP) estimation using the expectation-maximization methodology. Thus, it is possible to introduce priors with multiple parameters that adapt to different aspects of the data. Numerical experiments are presented where the proposed MAP algorithms were tested in various image segmentation scenarios. These experiments demonstrate that the proposed segmentation scheme compares favorably to both standard and previous spatially constrained mixture model-based segmentation methods.

Journal ArticleDOI
TL;DR: An EM-based algorithm is developed for the fitting of mixtures of t-factor analyzers and its application is demonstrated in the clustering of some microarray gene-expression data.

Journal ArticleDOI
TL;DR: A multivariate point-process model in which the observed activity of a network of neurons depends on three terms: the experimentally-controlled stimulus; the spiking history of the observed neurons; and a hidden term that corresponds, for example, to common input from an unobserved population of neurons that is presynaptic to two or more cells in the observed population.
Abstract: Recent developments in multi-electrode recordings enable the simultaneous measurement of the spiking activity of many neurons. Analysis of such multineuronal data is one of the key challenges in computational neuroscience today. In this work, we develop a multivariate point-process model in which the observed activity of a network of neurons depends on three terms: (1) the experimentally-controlled stimulus; (2) the spiking history of the observed neurons; and (3) a hidden term that corresponds, for example, to common input from an unobserved population of neurons that is presynaptic to two or more cells in the observed population. We consider two models for the network firing-rates, one of which is computationally and analytically tractable but can lead to unrealistically high firing-rates, while the other with reasonable firing-rates imposes a greater computational burden. We develop an expectation-maximization algorithm for fitting the parameters of both the models. For the analytically tractable model the expectation step is based on a continuous-time implementation of the extended Kalman smoother, and the maximization step involves two concave maximization problems which may be solved in parallel. The other model that we consider necessitates the use of Monte Carlo methods for the expectation as well as the maximization step. We discuss the trade-off involved in choosing between the two models and the associated methods. The techniques developed allow us to solve a variety of inference problems in a straightforward, computationally efficient fashion; for example, we may use the model to predict network activity given an arbitrary stimulus, infer a neuron's firing rate given the stimulus and the activity of the other observed neurons, and perform optimal stimulus decoding and prediction. We present several detailed simulation studies which explore the strengths and limitations of our approach.

Journal ArticleDOI
TL;DR: In this article, a particle filter approach for approximating the first-order moment of a joint, or probability hypothesis density (PHD), has demonstrated a feasible suboptimal method for tracking a time-varying number of targets in real-time.
Abstract: Particle filter approaches for approximating the first-order moment of a joint, or probability hypothesis density (PHD), have demonstrated a feasible suboptimal method for tracking a time-varying number of targets in real-time. We consider two techniques for estimating the target states at each iteration, namely k-means clustering and mixture modelling via the expectation-maximization (EM) algorithm. We present novel techniques for associating the targets between frames to enable track continuity.

Journal Article
TL;DR: A comparison of the two techniques for missing data imputation using datasets of an industrial power plant, an industrial winding process and HIV sero-prevalence survey data shows that the EM algorithm is more suitable and performs better in cases where there is little or no interdependency between the input variables.
Abstract: Two techniques have emerged from the recent literature as candidate solutions to the problem of missing data imputation. These are the expectation maximization (EM) algorithm and the auto-associative neural network and genetic algorithm (GA) combination. Both these techniques have been discussed individually and their merits discussed at length in the available literature. However, they have not been compared with each other. This article provides a comparison of the two techniques using datasets of an industrial power plant, an industrial winding process and HIV sero-prevalence survey data. Results show that the EM algorithm is more suitable and performs better in cases where there is little or no interdependency between the input variables, whereas the auto-associative neural network and GA combination is suitable when there are inherent nonlinear relationships between some of the given variables.
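
For context, one standard EM imputation scheme under a multivariate normal model (which may differ in detail from the implementation compared in this article) alternates between filling missing entries with their conditional means and re-estimating the mean and covariance; a minimal sketch follows.

```python
# Hedged sketch of EM-style imputation under a multivariate Gaussian model.
# X is a 2-D float array with np.nan marking missing entries.
import numpy as np

def em_impute(X, n_iter=50):
    X = np.array(X, dtype=float)
    miss = np.isnan(X)
    X_filled = np.where(miss, np.nanmean(X, axis=0), X)       # start from column means
    for _ in range(n_iter):
        mu = X_filled.mean(axis=0)
        cov = np.cov(X_filled, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        for i in range(X.shape[0]):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            # E-step: conditional mean of the missing entries given the observed ones
            coef = cov[np.ix_(m, o)] @ np.linalg.inv(cov[np.ix_(o, o)])
            X_filled[i, m] = mu[m] + coef @ (X_filled[i, o] - mu[o])
        # (a full EM would also carry the conditional covariances into the M-step)
    return X_filled
```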

Proceedings ArticleDOI
20 Jun 2007
TL;DR: Parameter estimators and an efficient EM algorithm for unsupervised inference are derived for the ranking mixture model and demonstrate significantly improved parameter estimates on heterogeneous data when the incomplete rankings are included in the inference process.
Abstract: Cluster analysis of ranking data, which occurs in consumer questionnaires, voting forms or other inquiries of preferences, attempts to identify typical groups of rank choices. Empirically measured rankings are often incomplete, i.e. different numbers of filled rank positions cause heterogeneity in the data. We propose a mixture approach for clustering of heterogeneous rank data. Rankings of different lengths can be described and compared by means of a single probabilistic model. A maximum entropy approach avoids hidden assumptions about missing rank positions. Parameter estimators and an efficient EM algorithm for unsupervised inference are derived for the ranking mixture model. Experiments on both synthetic data and real-world data demonstrate significantly improved parameter estimates on heterogeneous data when the incomplete rankings are included in the inference process.

Journal ArticleDOI
TL;DR: A new approach is proposed for estimating 3D head pose from a monocular image that employs general prior knowledge of face structure and the corresponding geometrical constraints provided by the location of a certain vanishing point to determine the pose of human faces.

Journal ArticleDOI
TL;DR: A novel spatially constrained generative model and an expectation-maximization (EM) algorithm for model-based image segmentation that achieves competitive segmentation results compared to other Markov-based methods and is in general faster.
Abstract: In this paper, we present a novel spatially constrained generative model and an expectation-maximization (EM) algorithm for model-based image segmentation. The generative model assumes that the unobserved class labels of neighboring pixels in the image are generated by prior distributions with similar parameters, where similarity is defined by entropic quantities relating to the neighboring priors. In order to estimate model parameters from observations, we derive a spatially constrained EM algorithm that iteratively maximizes a lower bound on the data log-likelihood, where the penalty term is data-dependent. Our algorithm is very easy to implement and is similar to the standard EM algorithm for Gaussian mixtures, with the main difference that the label posteriors are "smoothed" over pixels between each E- and M-step by a standard image filter. Experiments on synthetic and real images show that our algorithm achieves competitive segmentation results compared to other Markov-based methods, and is in general faster.
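
Below is a hedged sketch of the mechanism described above: a plain EM for a Gaussian mixture over pixel intensities in which each class's posterior map is smoothed by a standard image filter between the E- and M-steps. The filter, its width and the initialization are illustrative stand-ins, not the paper's exact choices.

```python
# Sketch: EM segmentation of a grayscale image with spatially smoothed responsibilities.
import numpy as np
from scipy.ndimage import gaussian_filter

def smoothed_em_segment(img, K=3, n_iter=30, sigma_smooth=1.5):
    h, w = img.shape
    x = img.ravel().astype(float)
    mu = np.quantile(x, np.linspace(0.1, 0.9, K))              # crude initialization
    var = np.full(K, x.var() + 1e-6)
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: per-pixel class responsibilities under the current Gaussians
        logp = -0.5 * ((x[:, None] - mu) ** 2 / var + np.log(var)) + np.log(pi)
        resp = np.exp(logp - logp.max(axis=1, keepdims=True))
        resp /= resp.sum(axis=1, keepdims=True)
        # Smooth each class's posterior map over the image, then renormalize
        resp = np.stack([gaussian_filter(resp[:, k].reshape(h, w), sigma_smooth).ravel()
                         for k in range(K)], axis=1)
        resp = np.clip(resp, 1e-12, None)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: usual Gaussian mixture updates from the smoothed responsibilities
        Nk = resp.sum(axis=0)
        pi, mu = Nk / Nk.sum(), (resp * x[:, None]).sum(axis=0) / Nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / Nk + 1e-6
    return resp.argmax(axis=1).reshape(h, w)                   # hard segmentation labels
```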

Journal ArticleDOI
TL;DR: A novel method for Bayesian denoising of magnetic resonance (MR) images that bootstraps itself by inferring the prior, i.e., the uncorrupted-image statistics, from the corrupted input data and the knowledge of the Rician noise model is presented.
Abstract: This paper presents a novel method for Bayesian denoising of magnetic resonance (MR) images that bootstraps itself by inferring the prior, i.e., the uncorrupted-image statistics, from the corrupted input data and the knowledge of the Rician noise model. The proposed method relies on principles from empirical Bayes (EB) estimation. It models the prior in a nonparametric Markov random field (MRF) framework and estimates this prior by optimizing an information-theoretic metric using the expectation-maximization algorithm. The generality and power of nonparametric modeling, coupled with the EB approach for prior estimation, avoids imposing ill-fitting prior models for denoising. The results demonstrate that, unlike typical denoising methods, the proposed method preserves most of the important features in brain MR images. Furthermore, this paper presents a novel Bayesian-inference algorithm on MRFs, namely iterated conditional entropy reduction (ICER). This paper also extends the application of the proposed method for denoising diffusion-weighted MR images. Validation results and quantitative comparisons with the state of the art in MR-image denoising clearly depict the advantages of the proposed method.

Journal ArticleDOI
TL;DR: In this article, the authors examined finite mixtures of multivariate Poisson distributions as an alternative class of models for multivariate count data, allowing for both overdispersion in the marginal distributions and negative correlation, while they are computationally tractable using standard ideas from finite mixture modelling.

Journal ArticleDOI
TL;DR: A systematic probabilistic framework that leads to both optimal and near-optimal OFDM detection schemes in the presence of unknown PHN is presented and it is pointed out that the expectation-maximization algorithm is a special case of the variational-inference-based joint estimator.
Abstract: This paper studies the mitigation of phase noise (PHN) in orthogonal frequency-division multiplexing (OFDM) data detection. We present a systematic probabilistic framework that leads to both optimal and near-optimal OFDM detection schemes in the presence of unknown PHN. In contrast to the conventional approach that cancels the common (average) PHN, our aim is to jointly estimate the complete PHN sequence and the data symbol sequence. We derive a family of low-complexity OFDM detectors for this purpose. The theoretical foundation on which these detectors are based is called variational inference, an approximate probabilistic inference technique associated with the minimization of variational free energy. In deriving the proposed schemes, we also point out that the expectation-maximization algorithm is a special case of the variational-inference-based joint estimator. Further complexity reduction is obtained using the conjugate gradient (CG) method, and only a few CG iterations are needed to closely approach the ideal joint estimator output.