
Showing papers on "Expectation–maximization algorithm published in 2003"


Dissertation
01 Jan 2003
TL;DR: A unified variational Bayesian (VB) framework is presented which approximates computations in models with latent variables using a lower bound on the marginal likelihood; the VB approximation is compared to other methods including sampling, Cheeseman-Stutz, and asymptotic approximations such as BIC.
Abstract: The Bayesian framework for machine learning allows for the incorporation of prior knowledge in a coherent way, avoids overfitting problems, and provides a principled basis for selecting between alternative models. Unfortunately the computations required are usually intractable. This thesis presents a unified variational Bayesian (VB) framework which approximates these computations in models with latent variables using a lower bound on the marginal likelihood. Chapter 1 presents background material on Bayesian inference, graphical models, and propagation algorithms. Chapter 2 forms the theoretical core of the thesis, generalising the expectation-maximisation (EM) algorithm for learning maximum likelihood parameters to the VB EM algorithm which integrates over model parameters. The algorithm is then specialised to the large family of conjugate-exponential (CE) graphical models, and several theorems are presented to pave the road for automated VB derivation procedures in both directed and undirected graphs (Bayesian and Markov networks, respectively). Chapters 3–5 derive and apply the VB EM algorithm to three commonly-used and important models: mixtures of factor analysers, linear dynamical systems, and hidden Markov models. It is shown how model selection tasks such as determining the dimensionality, cardinality, or number of variables are possible using VB approximations. Also explored are methods for combining sampling procedures with variational approximations, to estimate the tightness of VB bounds and to obtain more effective sampling algorithms. Chapter 6 applies VB learning to a long-standing problem of scoring discrete-variable directed acyclic graphs, and compares the performance to annealed importance sampling amongst other methods. Throughout, the VB approximation is compared to other methods including sampling, Cheeseman-Stutz, and asymptotic approximations such as BIC. The thesis concludes with a discussion of evolving directions for model selection including infinite models and alternative approximations to the marginal likelihood.
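
For orientation, the central quantity throughout is the variational lower bound on the marginal likelihood; a schematic form (with notation chosen here for illustration, not quoted from the thesis) is:

```latex
\ln p(\mathbf{y} \mid m)
  \;\ge\; \int q(\mathbf{x})\, q(\boldsymbol{\theta})\,
  \ln \frac{p(\mathbf{y}, \mathbf{x}, \boldsymbol{\theta} \mid m)}
           {q(\mathbf{x})\, q(\boldsymbol{\theta})}
  \, d\mathbf{x}\, d\boldsymbol{\theta} \;=\; \mathcal{F}(q)
```

The VB EM algorithm alternately updates the approximate posterior over latent variables (VB-E step) and over parameters (VB-M step), and neither update can decrease the bound.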

1,930 citations


Journal ArticleDOI
TL;DR: In this paper, the application of the GHK simulation method for maximum likelihood estimation of the multivariate probit regression model is discussed, and a Stata program mvprobit is described.
Abstract: We discuss the application of the GHK simulation method for maximum likelihood estimation of the multivariate probit regression model and describe and illustrate a Stata program mvprobit for this purpose.
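
As a rough illustration of the GHK idea (a generic Python sketch in our own notation, not the mvprobit implementation), the simulator estimates a multivariate normal rectangle probability by drawing sequentially truncated variates along the Cholesky factor:

```python
import numpy as np
from scipy.stats import norm

def ghk_prob(lower, upper, Sigma, n_draws=1000, seed=0):
    """GHK estimate of P(lower < Z < upper) for Z ~ N(0, Sigma).

    Illustrative only: in the multivariate probit setting the bounds are
    built from X @ beta and the signs of the observed binary outcomes."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(Sigma)
    d = len(lower)
    prob = np.ones(n_draws)
    eta = np.zeros((n_draws, d))
    for j in range(d):
        # conditional truncation bounds for the j-th standardized variate
        shift = eta[:, :j] @ L[j, :j]
        lo = norm.cdf((lower[j] - shift) / L[j, j])
        hi = norm.cdf((upper[j] - shift) / L[j, j])
        prob *= hi - lo
        u = np.clip(lo + rng.random(n_draws) * (hi - lo), 1e-12, 1 - 1e-12)
        eta[:, j] = norm.ppf(u)  # truncated normal draw via inverse CDF
    return prob.mean()
```

The simulated probabilities then enter the log-likelihood, which is maximized over the regression coefficients and correlation parameters.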

962 citations


Journal ArticleDOI
TL;DR: Simple methods for choosing sensible starting values for the EM algorithm for maximum likelihood parameter estimation in mixture models are compared; simple random initialization, probably the most common way of initiating EM, is often outperformed by strategies using CEM, SEM, or short runs of EM before running EM itself.
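
A minimal sketch of the "short runs of EM" strategy using scikit-learn (our choice of tooling; the CEM and SEM variants compared in the paper are not shown):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def em_with_short_runs(X, k, n_candidates=10, short_iter=5, seed=0):
    """Run several short EM trials from random starts, keep the best one,
    then run EM to convergence from that start (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    best, best_ll = None, -np.inf
    for _ in range(n_candidates):
        gm = GaussianMixture(n_components=k, max_iter=short_iter,
                             init_params="random",
                             random_state=int(rng.integers(1 << 31)))
        gm.fit(X)  # may stop before convergence; that is the point
        ll = gm.score(X)  # mean per-sample log-likelihood
        if ll > best_ll:
            best, best_ll = gm, ll
    # long run warm-started from the best short run
    final = GaussianMixture(n_components=k, max_iter=500,
                            weights_init=best.weights_,
                            means_init=best.means_,
                            precisions_init=np.linalg.inv(best.covariances_))
    return final.fit(X)
```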

619 citations


Journal ArticleDOI
TL;DR: This article compares six missing data techniques (MDTs) and supports maximum likelihood and MI approaches, which particularly outperform listwise deletion for parameters involving many recouped cases.
Abstract: For organizational research on individual change, missing data can greatly reduce longitudinal sample size and potentially bias parameter estimates. Within the structural equation modeling framework, this article compares six missing data techniques (MDTs): listwise deletion, pairwise deletion, stochastic regression imputation, the expectation-maximization (EM) algorithm, full information maximum likelihood (FIML), and multiple imputation (MI). The rationale for each technique is reviewed, followed by Monte Carlo analysis based on a three-wave simulation of organizational commitment and turnover intentions. Parameter estimates and standard errors for each MDT are contrasted with complete-data estimates, under three mechanisms of missingness (completely random, random, and nonrandom) and three levels of missingness (25%, 50%, and 75%; all monotone missing). Results support maximum likelihood and MI approaches, which particularly outperform listwise deletion for parameters involving many recouped cases....

556 citations


Journal ArticleDOI
TL;DR: A new algorithm for acquiring occupancy grid maps with mobile robots that employs the expectation maximization algorithm to search for maps that maximize the likelihood of the sensor measurements; the resulting maps are often more accurate than those generated using traditional techniques.
Abstract: This article describes a new algorithm for acquiring occupancy grid maps with mobile robots. Existing occupancy grid mapping algorithms decompose the high-dimensional mapping problem into a collection of one-dimensional problems, where the occupancy of each grid cell is estimated independently. This induces conflicts that may lead to inconsistent maps, even for noise-free sensors. This article shows how to solve the mapping problem in the original, high-dimensional space, thereby maintaining all dependencies between neighboring cells. As a result, maps generated by our approach are often more accurate than those generated using traditional techniques. Our approach relies on a statistical formulation of the mapping problem using forward models. It employs the expectation maximization algorithm for searching maps that maximize the likelihood of the sensor measurements.

550 citations


16 Sep 2003
TL;DR: This method constructs and optimises a lower bound on the marginal likelihood using variational calculus, resulting in an iterative algorithm which generalises the EM algorithm by maintaining posterior distributions over both latent variables and parameters.
Abstract: We present an efficient procedure for estimating the marginal likelihood of probabilistic models with latent variables or incomplete data. This method constructs and optimises a lower bound on the marginal likelihood using variational calculus, resulting in an iterative algorithm which generalises the EM algorithm by maintaining posterior distributions over both latent variables and parameters. We define the family of conjugate-exponential models—which includes finite mixtures of exponential family models, factor analysis, hidden Markov models, linear state-space models, and other models of interest—for which this bound on the marginal likelihood can be computed very simply through a modification of the standard EM algorithm. In particular, we focus on applying these bounds to the problem of scoring discrete directed graphical model structures (Bayesian networks). Extensive simulations comparing the variational bounds to the usual approach based on the Bayesian Information Criterion (BIC) and to a sampling-based gold standard method known as Annealed Importance Sampling (AIS) show that variational bounds substantially outperform BIC in finding the correct model structure at relatively little computational cost, while approaching the performance of the much more costly AIS procedure. Using AIS allows us to provide the first serious case study of the tightness of variational bounds. We also analyse the performance of AIS through a variety of criteria, discuss the use of other variational approaches to estimating marginal likelihoods based on Bethe and Kikuchi approximations, and outline directions in which this work can be extended.

527 citations


Journal ArticleDOI
27 Sep 2003
TL;DR: This article deals with the identification of gene regulatory networks from experimental data using a statistical machine learning approach; the proposed stochastic model can be described as a dynamic Bayesian network particularly well suited to tackle the stochastic nature of gene regulation and gene expression measurement.
Abstract: This article deals with the identification of gene regulatory networks from experimental data using a statistical machine learning approach. A stochastic model of gene interactions capable of handling missing variables is proposed. It can be described as a dynamic Bayesian network particularly well suited to tackle the stochastic nature of gene regulation and gene expression measurement. Parameters of the model are learned through a penalized likelihood maximization implemented through an extended version of the EM algorithm. Our approach is tested against experimental data on the S.O.S. DNA repair network of the Escherichia coli bacterium. It appears to be able to extract the main regulations between the genes involved in this network. An added missing variable is found to model the main protein of the network. Good prediction abilities on unlearned data are observed. These first results are very promising: they show the power of the learning algorithm and the ability of the model to capture gene interactions.

462 citations


Journal ArticleDOI
TL;DR: Inspired by neurophysiology experiments in which neural spiking activity is induced by an implicit (latent) stimulus, an algorithm to estimate a state-space model observed through point process measurements is developed.
Abstract: A widely used signal processing paradigm is the state-space model. The state-space model is defined by two equations: an observation equation that describes how the hidden state or latent process is observed and a state equation that defines the evolution of the process through time. Inspired by neurophysiology experiments in which neural spiking activity is induced by an implicit (latent) stimulus, we develop an algorithm to estimate a state-space model observed through point process measurements. We represent the latent process modulating the neural spiking activity as a gaussian autoregressive model driven by an external stimulus. Given the latent process, neural spiking activity is characterized as a general point process defined by its conditional intensity function. We develop an approximate expectation-maximization (EM) algorithm to estimate the unobservable state-space process, its parameters, and the parameters of the point process. The EM algorithm combines a point process recursive nonlinear filter algorithm, the fixed interval smoothing algorithm, and the state-space covariance algorithm to compute the complete data log likelihood efficiently. We use a Kolmogorov-Smirnov test based on the time-rescaling theorem to evaluate agreement between the model and point process data. We illustrate the model with two simulated data examples: an ensemble of Poisson neurons driven by a common stimulus and a single neuron whose conditional intensity function is approximated as a local Bernoulli process.
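
Schematically (symbols chosen here for illustration), the model pairs a gaussian AR(1) state equation driven by the stimulus with a point process observation equation given by the conditional intensity:

```latex
x_k = \rho\, x_{k-1} + \alpha I_k + \varepsilon_k,
\qquad \varepsilon_k \sim \mathcal{N}(0, \sigma_\varepsilon^2),
\qquad
\Pr\{\text{spike in } (t_k, t_k + \Delta] \mid x_k\}
  \approx \lambda(t_k \mid x_k)\,\Delta,
\quad \text{e.g. } \lambda(t_k \mid x_k) = \exp(\mu + \beta x_k).
```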

407 citations


Journal ArticleDOI
TL;DR: In this article, parametric and non-parametric random-coefficient latent class models are proposed that relax the assumption that observations are independent; the models can be used for the analysis of data collected with complex sampling designs, data with a multilevel structure, and multiple-group data for more than a few groups.
Abstract: The latent class (LC) models that have been developed so far assume that observations are independent. Parametric and non-parametric random-coefficient LC models are proposed here, which will make it possible to modify this assumption. For example, the models can be used for the analysis of data collected with complex sampling designs, data with a multilevel structure, and multiple-group data for more than a few groups. An adapted EM algorithm is presented that makes maximum-likelihood estimation feasible. The new model is illustrated with examples from organizational, educational, and cross-national comparative research.

388 citations


Journal ArticleDOI
TL;DR: A heuristic for searching for the optimal component to insert in the greedy learning of gaussian mixtures is proposed and can be particularly useful when the optimal number of mixture components is unknown.
Abstract: This article concerns the greedy learning of gaussian mixtures. In the greedy approach, mixture components are inserted into the mixture one after the other. We propose a heuristic for searching for the optimal component to insert. In a randomized manner, a set of candidate new components is generated. For each of these candidates, we find the locally optimal new component and insert it into the existing mixture. The resulting algorithm resolves the sensitivity to initialization of state-of-the-art methods, like expectation maximization, and has running time linear in the number of data points and quadratic in the (final) number of mixture components. Due to its greedy nature, the algorithm can be particularly useful when the optimal number of mixture components is unknown. Experimental results comparing the proposed algorithm to other methods on density estimation and texture segmentation are provided.
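
A stripped-down sketch of the greedy scheme (our simplification; the paper's candidate generation and partial EM updates are more refined):

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

def loglik(X, w, mu, cov):
    dens = np.column_stack([mvn.pdf(X, mu[m], cov[m]) for m in range(len(w))])
    return np.log(dens @ w + 1e-300).sum()

def em_steps(X, w, mu, cov, n_iter=5):
    """A few standard EM updates for a full-covariance gaussian mixture."""
    n, d = X.shape
    for _ in range(n_iter):
        dens = np.column_stack([w[m] * mvn.pdf(X, mu[m], cov[m])
                                for m in range(len(w))])
        r = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)  # E-step
        nk = r.sum(axis=0) + 1e-12
        w = nk / n                                             # M-step
        mu = (r.T @ X) / nk[:, None]
        cov = [(r[:, m, None] * (X - mu[m])).T @ (X - mu[m]) / nk[m]
               + 1e-6 * np.eye(d) for m in range(len(w))]
    return w, mu, cov

def greedy_gmm(X, k_max, n_cand=10, seed=0):
    """Grow the mixture one component at a time from randomized candidates."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, mu = np.array([1.0]), X.mean(axis=0, keepdims=True)
    cov = [np.cov(X, rowvar=False) + 1e-6 * np.eye(d)]
    for k in range(2, k_max + 1):
        best, best_ll = None, -np.inf
        for _ in range(n_cand):
            # candidate new component centred on a random data point
            mu_c = np.vstack([mu, X[rng.integers(n)]])
            w_c = np.append(w * (1 - 1.0 / k), 1.0 / k)
            cov_c = list(cov) + [np.cov(X, rowvar=False) / k + 1e-6 * np.eye(d)]
            cand = em_steps(X, w_c, mu_c, cov_c)
            ll = loglik(X, *cand)
            if ll > best_ll:
                best, best_ll = cand, ll
        w, mu, cov = best
    return w, mu, cov
```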

380 citations


Journal ArticleDOI
TL;DR: An adaptation of a new expectation-maximization based competitive mixture decomposition algorithm is introduced and it is shown that it efficiently and reliably performs mixture decompositions of t-distributions.

Proceedings Article
09 Dec 2003
TL;DR: The resulting IM algorithm is analogous to the EM algorithm, yet maximises mutual information, as opposed to likelihood, in the maximisation of information transmission over noisy channels.
Abstract: The maximisation of information transmission over noisy channels is a common, albeit generally computationally difficult problem. We approach the difficulty of computing the mutual information for noisy channels by using a variational approximation. The resulting IM algorithm is analogous to the EM algorithm, yet maximises mutual information, as opposed to likelihood. We apply the method to several practical examples, including linear compression, population encoding and CDMA.
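
The bound at the heart of the method can be written in a common form (notation ours): replacing the intractable posterior with a variational decoder q gives

```latex
I(X;Y) \;=\; H(X) - H(X \mid Y)
\;\ge\; H(X) + \big\langle \ln q(x \mid y) \big\rangle_{p(x,y)},
```

and the IM algorithm alternates between tightening the bound with respect to q and maximising it with respect to the encoder/channel parameters, mirroring the E and M steps of EM.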

Journal ArticleDOI
TL;DR: This paper compares the efficacy of five current and promising methods that can be used to deal with missing data and concludes that MI, because of its theoretical and distributional underpinnings, is probably most promising for future applications in this field.

Journal ArticleDOI
TL;DR: Several methods for choosing initial values for the EM algorithm in the case of finite mixtures are compared, and some new methods based on modifications of existing ones are proposed.

Journal ArticleDOI
TL;DR: The 2-step approach using EM consistently yielded the most accurate reliability estimates and produced coverage rates close to the advertised 95% rate.
Abstract: A 2-step approach for obtaining internal consistency reliability estimates with item-level missing data is outlined. In the 1st step, a covariance matrix and mean vector are obtained using the expectation maximization (EM) algorithm. In the 2nd step, reliability analyses are carried out in the usual fashion using the EM covariance matrix as input. A Monte Carlo simulation examined the impact of 6 variables (scale length, response categories, item correlations, sample size, missing data, and missing data technique) on 3 different outcomes: estimation bias, mean errors, and confidence interval coverage. The 2-step approach using EM consistently yielded the most accurate reliability estimates and produced coverage rates close to the advertised 95% rate. An easy method of implementing the procedure is outlined.
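
Step 2 reduces to an ordinary reliability computation on the EM output; a minimal sketch (any EM routine for the multivariate normal with missing data can supply the matrix in step 1):

```python
import numpy as np

def cronbach_alpha(S):
    """Cronbach's alpha from a k x k item covariance matrix S; in the
    two-step approach above, S is the EM-estimated covariance matrix."""
    k = S.shape[0]
    return (k / (k - 1.0)) * (1.0 - np.trace(S) / S.sum())
```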

Journal ArticleDOI
TL;DR: It is shown how the expectation maximization algorithm can be used to jointly learn clusters, while at the same time inferring the transformation associated with each input, by approximating the nonlinear transformation manifold by a discrete set of points.
Abstract: Clustering is a simple, effective way to derive useful representations of data, such as images and videos. Clustering explains the input as one of several prototypes, plus noise. In situations where each input has been randomly transformed (e.g., by translation, rotation, and shearing in images and videos), clustering techniques tend to extract cluster centers that account for variations in the input due to transformations, instead of more interesting and potentially useful structure. For example, if images from a video sequence of a person walking across a cluttered background are clustered, it would be more useful for the different clusters to represent different poses and expressions, instead of different positions of the person and different configurations of the background clutter. We describe a way to add transformation invariance to mixture models, by approximating the nonlinear transformation manifold by a discrete set of points. We show how the expectation maximization algorithm can be used to jointly learn clusters, while at the same time inferring the transformation associated with each input. We compare this technique with other methods for filtering noisy images obtained from a scanning electron microscope, clustering images from videos of faces into different categories of identity and pose, and removing foreground obstructions from video. We also demonstrate that the new technique is quite insensitive to initial conditions and works better than standard techniques, even when the standard techniques are provided with extra data.
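
In mixture form (notation ours), the model sums over both cluster labels and a discrete set of transformations,

```latex
p(\mathbf{x}) \;=\; \sum_{c} \sum_{T \in \mathcal{T}} p(c)\, p(T)\, p(\mathbf{x} \mid c, T),
```

so the E step infers a joint posterior over cluster and transformation for each input.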

Journal ArticleDOI
TL;DR: An unsupervised learning algorithm that can obtain a probabilistic model of an object composed of a collection of parts automatically from unlabeled training data is presented.
Abstract: An unsupervised learning algorithm that can obtain a probabilistic model of an object composed of a collection of parts (a moving human body in our examples) automatically from unlabeled training data is presented. The training data include both useful "foreground" features as well as features that arise from irrelevant background clutter - the correspondence between parts and detected features is unknown. The joint probability density function of the parts is represented by a mixture of decomposable triangulated graphs which allow for fast detection. To learn the model structure as well as model parameters, an EM-like algorithm is developed where the labeling of the data (part assignments) is treated as hidden variables. The unsupervised learning technique is not limited to decomposable triangulated graphs. The efficiency and effectiveness of our algorithm is demonstrated by applying it to generate models of human motion automatically from unlabeled image sequences, and testing the learned models on a variety of sequences.

Journal ArticleDOI
TL;DR: A new statistical wideband indoor channel model which incorporates both the clustering of multipath components (MPCs) and the correlation between the spatial and temporal domains is proposed and the model validity is confirmed by comparison with two existing models reported in the literature.
Abstract: In this paper, a new statistical wideband indoor channel model which incorporates both the clustering of multipath components (MPCs) and the correlation between the spatial and temporal domains is proposed. The model is derived based on measurement data collected at a carrier frequency of 5.2 GHz in three different indoor scenarios and is suitable for performance analysis of HIPERLAN/2 and IEEE 802.11a systems that employ smart antenna architectures. MPC parameters are estimated using the super-resolution frequency domain space-alternating generalized expectation maximization (FD-SAGE) algorithm and clusters are identified in the spatio-temporal domain by a nonparametric density estimation procedure. The description of the clustering observed within the channel relies on two classes of parameters, namely, intercluster and intracluster parameters which characterize the cluster and MPC, respectively. All parameters are described by a set of empirical probability density functions (pdfs) derived from the measured data. The correlation properties are incorporated in two joint pdfs for cluster and MPC positions, respectively. The clustering effect also gives rise to two classes of channel power density spectra (PDS), intercluster and intracluster PDS, which are shown to exhibit exponential and Laplacian functions in the delay and angular domains, respectively. Finally, the model validity is confirmed by comparison with two existing models reported in the literature.

Proceedings ArticleDOI
18 Jun 2003
TL;DR: An approach is proposed for clustering time-series data that allows each sequence to belong to more than a single HMM with some probability, and the hard decision about the sequence class membership can be deferred until a later time when such a decision is required.
Abstract: An approach is proposed for clustering time-series data. The approach can be used to discover groupings of similar object motions that were observed in a video collection. A finite mixture of hidden Markov models (HMMs) is fitted to the motion data using the expectation maximization (EM) framework. Previous approaches for HMM-based clustering employ a k-means formulation, where each sequence is assigned to only a single HMM. In contrast, the formulation presented in this paper allows each sequence to belong to more than a single HMM with some probability, and the hard decision about the sequence class membership can be deferred until a later time when such a decision is required. Experiments with simulated data demonstrate the benefit of using this EM-based approach when there is more "overlap" in the processes generating the data. Experiments with real data show the promising potential of HMM-based motion clustering in a number of applications.
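
The soft membership computed in the E step takes the usual mixture form (notation ours), with each sequence's likelihood under an HMM evaluated by the forward algorithm:

```latex
\gamma_{im} \;=\; \frac{\pi_m\, P(\mathbf{s}_i \mid \Lambda_m)}
                       {\sum_{m'} \pi_{m'}\, P(\mathbf{s}_i \mid \Lambda_{m'})}.
```

The k-means-style alternative corresponds to hard-assigning each sequence to the HMM with the largest responsibility.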

Journal ArticleDOI
TL;DR: The method demonstrates improved image quality in all cases when compared to the conventional FBP and EM methods presently used for clinical data (which do not include resolution modeling).
Abstract: Methodology for PET system modeling using image-space techniques in the expectation maximization (EM) algorithm is presented. The approach, applicable to both list-mode data and projection data, is of particular significance to EM algorithm implementations which otherwise only use basic system models (such as those which calculate the system matrix elements on the fly). A basic version of the proposed technique can be implemented using image-space convolution, in order to include resolution effects into the system matrix, so that the EM algorithm gradually recovers the modeled resolution with each update. The improved system modeling (achieved by inclusion of two convolutions per iteration) results in both enhanced resolution and lower noise, and there is often no need for regularization other than to limit the number of iterations. Tests have been performed with simulated list-mode data and also with measured projection data from a GE Advance PET scanner, for both [¹⁸F]-FDG and [¹²⁴I]-NaI. The method demonstrates improved image quality in all cases when compared to the conventional FBP and EM methods presently used for clinical data (which do not include resolution modeling). The benefits of this approach for ¹²⁴I (which has a low positron yield and a large positron range, usually resulting in noisier and poorer resolution images) are particularly noticeable.
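
A minimal sketch of image-space resolution modeling inside MLEM (our own simplification, with a gaussian kernel standing in for a measured resolution model):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mlem_resolution(y, A, shape, sigma=1.0, n_iter=20):
    """MLEM with two image-space convolutions per iteration.

    y: measured data; A: system matrix (n_bins x n_voxels); sigma: width of
    an assumed gaussian resolution kernel (illustrative, not the paper's)."""
    blur = lambda img: gaussian_filter(img.reshape(shape), sigma).ravel()
    x = np.ones(A.shape[1])
    sens = blur(A.T @ np.ones(A.shape[0]))  # resolution-modeled sensitivity
    for _ in range(n_iter):
        proj = A @ blur(x)                  # blur, then project
        ratio = y / np.maximum(proj, 1e-12)
        x *= blur(A.T @ ratio) / np.maximum(sens, 1e-12)  # backproject, then blur
    return x.reshape(shape)
```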

Journal ArticleDOI
TL;DR: A new forward-backward algorithm is proposed whose complexity is similar to that of the Viterbi algorithm in terms of sequence length (quadratic in the worst case in time and linear in space) and opens the way to the maximum likelihood estimation of hidden semi-Markov chains from long sequences.
Abstract: This article addresses the estimation of hidden semi-Markov chains from nonstationary discrete sequences. Hidden semi-Markov chains are particularly useful to model the succession of homogeneous zones or segments along sequences. A discrete hidden semi-Markov chain is composed of a nonobservable state process, which is a semi-Markov chain, and a discrete output process. Hidden semi-Markov chains generalize hidden Markov chains and enable the modeling of various durational structures. From an algorithmic point of view, a new forward-backward algorithm is proposed whose complexity is similar to that of the Viterbi algorithm in terms of sequence length (quadratic in the worst case in time and linear in space). This opens the way to the maximum likelihood estimation of hidden semi-Markov chains from long sequences. This statistical modeling approach is illustrated by the analysis of branching and flowering patterns in plants.

Journal ArticleDOI
TL;DR: A pseudo likelihood approach is proposed to solve a group of network tomography problems and some statistical properties of the pseudo likelihood estimator, such as consistency and asymptotic normality, are established.
Abstract: Network monitoring and diagnosis are key to improving network performance. The difficulties of performance monitoring lie in today's fast growing Internet, accompanied by increasingly heterogeneous and unregulated structures. Moreover, these tasks become even harder since one cannot rely on the collaboration of individual routers and servers to measure network traffic directly. Even though the aggregative nature of possible network measurements gives rise to inverse problems, existing methods for solving inverse problems are usually computationally intractable or statistically inefficient. A pseudo likelihood approach is proposed to solve a group of network tomography problems. The basic idea of pseudo likelihood is to form simple subproblems and ignore the dependences among the subproblems to form a product likelihood of the subproblems. As a result, this approach keeps a good balance between the computational complexity and the statistical efficiency of the parameter estimation. Some statistical properties of the pseudo likelihood estimator, such as consistency and asymptotic normality, are established. A pseudo expectation-maximization (EM) algorithm is developed to maximize the pseudo log-likelihood function. Two examples, with simulated or real data, are used to illustrate the pseudo likelihood proposal: 1) inference of the internal link delay distributions through multicast end-to-end measurements; 2) origin-destination matrix estimation through link traffic counts.
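
Schematically, the pseudo likelihood treats the subproblems as if they were independent,

```latex
L_{\mathrm{ps}}(\theta) \;=\; \prod_{s=1}^{S} L_s(\theta),
```

and the pseudo-EM algorithm maximizes the resulting pseudo log-likelihood rather than the full likelihood, trading some statistical efficiency for computational tractability.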

Journal ArticleDOI
TL;DR: In this article, an extension of the EM algorithm reintroduced additive separability, thus allowing one to estimate parameters sequentially during each maximization step, and they showed that, relative to full information maximum likelihood, their sequential estimator can generate large computational savings with little loss of efficiency.
Abstract: A popular way to account for unobserved heterogeneity is to assume that the data are drawn from a finite mixture distribution. A barrier to using finite mixture models is that parameters that could previously be estimated in stages must now be estimated jointly: using mixture distributions destroys any additive separability of the log-likelihood function. We show, however, that an extension of the EM algorithm reintroduces additive separability, thus allowing one to estimate parameters sequentially during each maximization step. In establishing this result, we develop a broad class of estimators for mixture models. Returning to the likelihood problem, we show that, relative to full information maximum likelihood, our sequential estimator can generate large computational savings with little loss of efficiency.
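
The separability point can be made concrete (notation ours). The mixture log-likelihood and the EM complete-data objective are

```latex
\ln L(\theta) = \sum_{i} \ln \sum_{m} \pi_m f_m(y_i; \theta_m),
\qquad
Q(\theta) = \sum_{i} \sum_{m} \gamma_{im}\,\big[\ln \pi_m + \ln f_m(y_i; \theta_m)\big];
```

the former does not split over components, but Q is additively separable in the component parameters, so each one can be updated in turn within an M step.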

Journal ArticleDOI
TL;DR: This paper presents a novel methodology for inferring the queuing delay distributions across internal links in the network based solely on unicast, end-to-end measurements, and develops a new estimation methodology based on a recently proposed nonparametric, wavelet-based density estimation method.
Abstract: The substantial overhead of performing internal network monitoring motivates techniques for inferring spatially localized information about performance using only end-to-end measurements. In this paper, we present a novel methodology for inferring the queuing delay distributions across internal links in the network based solely on unicast, end-to-end measurements. The major contributions are: 1) we formulate a measurement procedure for estimation and localization of delay distribution based on end-to-end packet pairs; 2) we develop a simple way to compute maximum likelihood estimates (MLEs) using the expectation-maximization (EM) algorithm; 3) we develop a new estimation methodology based on a recently proposed nonparametric, wavelet-based density estimation method; and 4) we optimize the computational complexity of the EM algorithm by developing a new fast Fourier transform implementation. Realistic network simulations are carried out using network-level simulator ns-2 to demonstrate the accuracy of the estimation procedure.

Proceedings ArticleDOI
11 May 2003
TL;DR: It is shown how maximum-likelihood estimation of those synchronization parameters can be implemented by means of the iterative expectation-maximization (EM) algorithm, and how the EM algorithm iterations can be combined with those of a turbo receiver, leading to a general theoretical framework for turbo synchronization.
Abstract: This paper is devoted to turbo synchronization, that is to say the use of soft information to estimate parameters like carrier phase, frequency offset or timing within a turbo receiver. It is shown how maximum-likelihood estimation of those synchronization parameters can be implemented by means of the iterative expectation-maximization (EM) algorithm [A.P. Dempster, et al., 1977]. Then we show that the EM algorithm iterations can be combined with those of a turbo receiver. This leads to a general theoretical framework for turbo synchronization. The soft decision-directed ad-hoc algorithm proposed in [V. Lottici and M. Luise, 2002] for carrier phase recovery turns out to be a particular instance of this implementation. The proposed mathematical framework is illustrated by simulations reported for the particular case of carrier phase estimation combined with iterative demodulation and decoding [S. ten Brink, et al., 1998].
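
In EM form (symbols illustrative), with received signal r, transmitted symbols a treated as the latent data, and synchronization parameter b (e.g. the carrier phase), each iteration computes

```latex
\hat{b}^{(n+1)} \;=\; \arg\max_{b} \sum_{\mathbf{a}}
  P\big(\mathbf{a} \mid \mathbf{r}, \hat{b}^{(n)}\big)\,
  \ln p\big(\mathbf{r} \mid \mathbf{a}, b\big),
```

with the symbol posteriors supplied by the turbo decoder's soft outputs; this is what lets the EM iterations be interleaved with the turbo iterations.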

Proceedings ArticleDOI
06 Apr 2003
TL;DR: A novel recursive Bayesian estimation algorithm that combines an importance sampling based measurement update step with a bank of sigma-point Kalman filters for the time-update and proposal distribution generation is presented.
Abstract: For sequential probabilistic inference in nonlinear non-Gaussian systems, approximate solutions must be used. We present a novel recursive Bayesian estimation algorithm that combines an importance sampling based measurement update step with a bank of sigma-point Kalman filters for the time-update and proposal distribution generation. The posterior state density is represented by a Gaussian mixture model that is recovered from the weighted particle set of the measurement update step by means of a weighted EM algorithm. This step replaces the resampling stage needed by most particle filters and mitigates the "sample depletion" problem. We show that this new approach has an improved estimation performance and reduced computational complexity compared to other related algorithms.

Journal ArticleDOI
TL;DR: A new modified EM algorithm, developed using the singular value decomposition and the Cholesky decomposition, is constructed for efficient unsupervised color image segmentation.

Journal ArticleDOI
TL;DR: In this paper, a general modeling framework is proposed that allows mixtures of count, categorical, and continuous response variables, and each response is related to age-specific latent traits through a generalized linear model that accommodates item-specific measurement errors.
Abstract: This article presents a new approach for analysis of multidimensional longitudinal data, motivated by studies using an item response battery to measure traits of an individual repeatedly over time. A general modeling framework is proposed that allows mixtures of count, categorical, and continuous response variables. Each response is related to age-specific latent traits through a generalized linear model that accommodates item-specific measurement errors. A transition model allows the latent traits at a given age to depend on observed predictors and on previous latent traits for that individual. Following a Bayesian approach to inference, a Markov chain Monte Carlo algorithm is proposed for posterior computation. The methods are applied to data from a neurotoxicity study of the pesticide methoxychlor, and evidence of a dose-dependent increase in motor activity is presented.

Book ChapterDOI
22 Apr 2003
TL;DR: Maximum likelihood (ML) estimation with an expectation-maximization (EM) solution and a projection solution is proposed to solve this energy-based source location (EBL) problem, and results show that energy-based acoustic source localization algorithms are accurate and robust.
Abstract: A novel source localization approach using acoustic energy measurements from the individual sensors in the sensor field is presented. This new approach is based on the acoustic energy decay model that acoustic energy decays as the inverse of squared distance, under the conditions that the sound propagates in free and homogeneous space and the targets are pre-detected to be in a certain region of the sensor field. This new approach is power efficient and needs low communication bandwidth and therefore is suitable for source localization in a distributed sensor network system. Maximum likelihood (ML) estimation with an expectation-maximization (EM) solution and a projection solution is proposed to solve this energy-based source location (EBL) problem. The Cramer-Rao bound (CRB) is derived and used for the sensor deployment analysis. Experiments and simulations are conducted to evaluate the ML algorithm with different solutions and to compare it with the nonlinear least squares (NLS) algorithm using the energy ratio function that we proposed previously. Results show that energy-based acoustic source localization algorithms are accurate and robust.
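
The decay model underlying the approach can be written (notation ours) as

```latex
y_i \;=\; g_i\, \frac{S}{\lVert \mathbf{r}_i - \mathbf{r}_s \rVert^{2}} + \varepsilon_i,
```

where y_i is the energy reading at sensor i, g_i a sensor gain, S the source energy, r_s the unknown source location, and ε_i measurement noise; both the ML solutions (EM and projection) and the earlier energy-ratio NLS method start from this relation.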

Proceedings ArticleDOI
13 Oct 2003
TL;DR: A framework for texture recognition based on local affine-invariant descriptors and their spatial layout is presented and initial probabilities computed from the generative model are refined using a relaxation step that incorporates co-occurrence statistics.
Abstract: We present a framework for texture recognition based on local affine-invariant descriptors and their spatial layout. At modelling time, a generative model of local descriptors is learned from sample images using the EM algorithm. The EM framework allows the incorporation of unsegmented multitexture images into the training set. The second modelling step consists of gathering co-occurrence statistics of neighboring descriptors. At recognition time, initial probabilities computed from the generative model are refined using a relaxation step that incorporates co-occurrence statistics. Performance is evaluated on images of an indoor scene and pictures of wild animals.