
Showing papers on "Expectation–maximization algorithm published in 1997"


Journal ArticleDOI
TL;DR: An ordered sequence of events or observations having a time component is called a time series; familiar examples include daily opening and closing stock prices, daily humidity, temperature and pressure readings, and the annual gross domestic product of a country.
Abstract: Contents: Preface. 1. Difference Equations. 2. Lag Operators. 3. Stationary ARMA Processes. 4. Forecasting. 5. Maximum Likelihood Estimation. 6. Spectral Analysis. 7. Asymptotic Distribution Theory. 8. Linear Regression Models. 9. Linear Systems of Simultaneous Equations. 10. Covariance-Stationary Vector Processes. 11. Vector Autoregressions. 12. Bayesian Analysis. 13. The Kalman Filter. 14. Generalized Method of Moments. 15. Models of Nonstationary Time Series. 16. Processes with Deterministic Time Trends. 17. Univariate Processes with Unit Roots. 18. Unit Roots in Multivariate Time Series. 19. Cointegration. 20. Full-Information Maximum Likelihood Analysis of Cointegrated Systems. 21. Time Series Models of Heteroskedasticity. 22. Modeling Time Series with Changes in Regime. Appendices: A. Mathematical Review. B. Statistical Tables. C. Answers to Selected Exercises. D. Greek Letters and Mathematical Symbols Used in the Text. Author Index. Subject Index.

10,011 citations


Journal ArticleDOI
TL;DR: In this article, the authors present a technique for constructing random fields from a set of training samples, where each feature has a weight that is trained by minimizing the Kullback-Leibler divergence between the model and the empirical distribution of the training data.
Abstract: We present a technique for constructing random fields from a set of training samples. The learning paradigm builds increasingly complex fields by allowing potential functions, or features, that are supported by increasingly large subgraphs. Each feature has a weight that is trained by minimizing the Kullback-Leibler divergence between the model and the empirical distribution of the training data. A greedy algorithm determines how features are incrementally added to the field and an iterative scaling algorithm is used to estimate the optimal values of the weights. The random field models and techniques introduced in this paper differ from those common to much of the computer vision literature in that the underlying random fields are non-Markovian and have a large number of parameters that must be estimated. Relations to other learning approaches, including decision trees, are given. As a demonstration of the method, we describe its application to the problem of automatic word classification in natural language processing.

998 citations
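
The random-field learning described above fits feature weights by minimizing the Kullback-Leibler divergence between the model and the empirical distribution. Below is a minimal sketch of that idea under strong simplifying assumptions: a tiny discrete sample space, three hypothetical features, and plain gradient ascent on the log-likelihood (whose gradient is E_emp[f] - E_model[f]) in place of the paper's improved iterative scaling and greedy feature induction.

```python
import numpy as np

# Toy sample space: binary strings of length 3; hypothetical "features".
space = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
features = [
    lambda x: float(x[0] == x[1]),          # agreement of first two sites
    lambda x: float(sum(x) >= 2),           # majority of ones
    lambda x: float(x[2]),                  # last site is one
]
F = np.array([[f(x) for f in features] for x in space])   # |X| x K

# Hypothetical empirical distribution over the sample space.
rng = np.random.default_rng(0)
counts = rng.integers(1, 20, size=len(space)).astype(float)
p_emp = counts / counts.sum()

def model_dist(w):
    """Log-linear (Gibbs) distribution p_w(x) proportional to exp(w . f(x))."""
    logits = F @ w
    logits -= logits.max()
    p = np.exp(logits)
    return p / p.sum()

# Minimising KL(p_emp || p_w) is equivalent to maximising expected
# log-likelihood; the gradient is E_emp[f] - E_model[f].
w = np.zeros(F.shape[1])
target = p_emp @ F
for _ in range(500):
    w += 0.5 * (target - model_dist(w) @ F)

print("fitted weights:", np.round(w, 3))
print("KL divergence :", float(np.sum(p_emp * np.log(p_emp / model_dist(w)))))
```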


Journal ArticleDOI
TL;DR: The maximum likelihood method is described, along with how likelihood ratio tests of a variety of biological hypotheses can be formulated and tested using computer simulation to generate the null distribution of the likelihood ratio test statistic.
Abstract: One of the strengths of the maximum likelihood method of phylogenetic estimation is the ease with which hypotheses can be formulated and tested. Maximum likelihood analysis of DNA and amino acid sequence data has been made practical with recent advances in models of DNA substitution, computer programs, and computational speed. Here, we describe the maximum likelihood method and the recent improvements in models of substitution. We also describe how likelihood ratio tests of a variety of biological hypotheses can be formulated and tested using computer simulation to generate the null distribution of the likelihood ratio test statistic.

967 citations
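
The abstract's point about generating the null distribution of the likelihood ratio statistic by computer simulation can be illustrated outside phylogenetics. A hedged sketch follows, using an exponential null nested inside a gamma alternative as a stand-in for nested substitution models; the data and replicate count are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def lrt_stat(x):
    """2 * (logL_alternative - logL_null) for exponential (null) vs gamma."""
    # Null: exponential; the MLE of the scale is the sample mean.
    ll0 = stats.expon.logpdf(x, scale=x.mean()).sum()
    # Alternative: gamma with free shape (exponential is shape == 1).
    shape, _, scale = stats.gamma.fit(x, floc=0)
    ll1 = stats.gamma.logpdf(x, shape, loc=0, scale=scale).sum()
    return 2.0 * (ll1 - ll0)

# Observed data (hypothetical): does the gamma fit significantly better?
x_obs = rng.gamma(shape=1.4, scale=2.0, size=80)
t_obs = lrt_stat(x_obs)

# Parametric bootstrap: simulate under the *fitted null* model and
# recompute the statistic to build its null distribution.
scale0 = x_obs.mean()
t_null = np.array([lrt_stat(rng.exponential(scale0, size=x_obs.size))
                   for _ in range(500)])

p_value = np.mean(t_null >= t_obs)
print(f"LRT statistic = {t_obs:.2f}, simulated p-value = {p_value:.3f}")
```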


Journal ArticleDOI
TL;DR: A general alternating expectation–conditional maximization (AECM) algorithm is formulated that couples flexible data augmentation schemes with model reduction schemes to achieve efficient computations, and the potential for a dramatic reduction in computational time with little increase in human effort is shown.
Abstract: Celebrating the 20th anniversary of the presentation of the paper by Dempster, Laird and Rubin which popularized the EM algorithm, we investigate, after a brief historical account, strategies that aim to make the EM algorithm converge faster while maintaining its simplicity and stability (e.g. automatic monotone convergence in likelihood). First we introduce the idea of a ‘working parameter’ to facilitate the search for efficient data augmentation schemes and thus fast EM implementations. Second, summarizing various recent extensions of the EM algorithm, we formulate a general alternating expectation–conditional maximization (AECM) algorithm that couples flexible data augmentation schemes with model reduction schemes to achieve efficient computations. We illustrate these methods using multivariate t-models with known or unknown degrees of freedom and Poisson models for image reconstruction. We show, through both empirical and theoretical evidence, the potential for a dramatic reduction in computational time with little increase in human effort. We also discuss the intrinsic connection between EM-type algorithms and the Gibbs sampler, and the possibility of using the techniques presented here to speed up the latter. The main conclusion of the paper is that, with the help of statistical considerations, it is possible to construct algorithms that are simple, stable and fast.

775 citations
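
One of the paper's running examples is the multivariate t-model with known degrees of freedom. The sketch below implements only the plain EM iteration for that model (E-step precision weights, weighted M-step) on simulated data; the working-parameter and AECM accelerations that are the paper's actual contribution are not reproduced.

```python
import numpy as np

def t_em(X, nu=4.0, n_iter=100):
    """Plain EM for the location and scatter of a multivariate t
    distribution with known degrees of freedom `nu`."""
    n, p = X.shape
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    for _ in range(n_iter):
        # E-step: expected precision weights given current (mu, Sigma).
        diff = X - mu
        d2 = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(Sigma), diff)
        w = (nu + p) / (nu + d2)
        # M-step: weighted mean and weighted scatter matrix.
        mu = (w[:, None] * X).sum(axis=0) / w.sum()
        diff = X - mu
        Sigma = (w[:, None] * diff).T @ diff / n
    return mu, Sigma

rng = np.random.default_rng(2)
X = rng.standard_t(df=4, size=(500, 2)) @ np.array([[1.0, 0.3], [0.0, 0.8]]) + [1.0, -2.0]
mu_hat, Sigma_hat = t_em(X)
print("estimated location:", np.round(mu_hat, 3))
print("estimated scatter:\n", np.round(Sigma_hat, 3))
```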


Journal ArticleDOI
TL;DR: In this article, the author constructs a Monte Carlo version of the EM algorithm for generalized linear mixed models, proposes a Monte Carlo Newton-Raphson algorithm, and uses the Newton-Raphson algorithm as a framework to compare maximum likelihood to the joint-maximization or penalized quasi-likelihood methods and explain why the latter can perform poorly.
Abstract: Maximum likelihood algorithms are described for generalized linear mixed models. I show how to construct a Monte Carlo version of the EM algorithm, propose a Monte Carlo Newton-Raphson algorithm, and evaluate and improve the use of importance sampling ideas. Calculation of the maximum likelihood estimates is feasible for a wide variety of problems where it previously was not. I also use the Newton-Raphson algorithm as a framework to compare maximum likelihood to the “joint-maximization” or penalized quasi-likelihood methods and explain why the latter can perform poorly.

765 citations
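
A rough sketch of the Monte Carlo EM idea on a deliberately simple generalized linear mixed model: a Poisson log-link model with one Gaussian random intercept per group. The E-step is approximated by random-walk Metropolis draws of the random effects, and the M-step happens to be closed form for this model. The data, step sizes and sample counts are arbitrary choices, and neither the Monte Carlo Newton-Raphson algorithm nor the importance-sampling refinements of the paper are shown.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: m groups of n Poisson counts with a log link and a
# Gaussian random intercept b_i per group:  y_ij ~ Poisson(exp(beta + b_i)).
m, n = 40, 10
beta_true, sigma_true = 0.5, 0.8
b_true = rng.normal(0.0, sigma_true, m)
y = rng.poisson(np.exp(beta_true + b_true)[:, None], size=(m, n))
y_sum = y.sum(axis=1)                        # per-group total counts

def sample_b(beta, sigma2, b_start, n_samp=200, step=0.5):
    """Random-walk Metropolis draws from p(b_i | y_i, beta, sigma2),
    run independently (vectorised) for every group."""
    def logpost(b):
        return y_sum * (beta + b) - n * np.exp(beta + b) - b**2 / (2.0 * sigma2)
    b = b_start.copy()
    draws = np.empty((n_samp, m))
    for s in range(n_samp):
        prop = b + step * rng.standard_normal(m)
        accept = np.log(rng.random(m)) < logpost(prop) - logpost(b)
        b = np.where(accept, prop, b)
        draws[s] = b
    return draws[n_samp // 2:]               # discard burn-in

beta, sigma2 = 0.0, 1.0
b_cur = np.zeros(m)
for it in range(30):
    draws = sample_b(beta, sigma2, b_cur)    # Monte Carlo E-step
    b_cur = draws[-1]
    # Closed-form M-step for this particular model.
    sigma2 = np.mean(draws**2)
    beta = np.log(y.sum() / (n * np.exp(draws).mean(axis=0).sum()))

print(f"beta : true {beta_true:.2f}, MCEM {beta:.2f}")
print(f"sigma: true {sigma_true:.2f}, MCEM {np.sqrt(sigma2):.2f}")
```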


Journal ArticleDOI
TL;DR: A posteriori blockmodeling for graphs is proposed and it is shown that when the number of vertices tends to infinity while the probabilities remain constant, the block structure can be recovered correctly with probability tending to 1.
Abstract: A statistical approach to a posteriori blockmodeling for graphs is proposed. The model assumes that the vertices of the graph are partitioned into two unknown blocks and that the probability of an edge between two vertices depends only on the blocks to which they belong. Statistical procedures are derived for estimating the probabilities of edges and for predicting the block structure from observations of the edge pattern only. ML estimators can be computed using the EM algorithm, but this strategy is practical only for small graphs. A Bayesian estimator, based on Gibbs sampling, is proposed. This estimator is practical also for large graphs. When ML estimators are used, the block structure can be predicted based on predictive likelihood. When Gibbs sampling is used, the block structure can be predicted from posterior predictive probabilities. A side result is that when the number of vertices tends to infinity while the probabilities remain constant, the block structure can be recovered correctly with probability tending to 1.

697 citations
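
As the abstract notes, ML estimation via EM is practical only for small graphs, because the E-step sums over all block assignments. The sketch below does exactly that for a hypothetical 10-vertex graph with two blocks; the Gibbs-sampling estimator recommended for large graphs is not implemented.

```python
import itertools
import numpy as np

rng = np.random.default_rng(4)

# Simulate a small undirected graph from a two-block model (hypothetical).
n = 10
z_true = rng.random(n) < 0.5
P_true = np.array([[0.7, 0.1], [0.1, 0.6]])
A = np.zeros((n, n), dtype=int)
for i in range(n):
    for j in range(i + 1, n):
        A[i, j] = A[j, i] = rng.random() < P_true[int(z_true[i]), int(z_true[j])]

iu = np.triu_indices(n, k=1)
assignments = np.array(list(itertools.product([0, 1], repeat=n)))  # 2^n labelings

def log_lik(z, pi, P):
    """Complete-data log-likelihood of one labeling z."""
    pij = P[z[iu[0]], z[iu[1]]]
    edges = A[iu]
    ll = np.sum(edges * np.log(pij) + (1 - edges) * np.log(1 - pij))
    return ll + np.sum(z * np.log(pi) + (1 - z) * np.log(1 - pi))

pi, P = 0.5, np.array([[0.6, 0.3], [0.3, 0.5]])
for _ in range(50):
    # E-step: posterior weight of every labeling (feasible only for small n).
    ll = np.array([log_lik(z, pi, P) for z in assignments])
    w = np.exp(ll - ll.max())
    w /= w.sum()
    # M-step: weighted block frequency and weighted edge densities.
    pi = float(w @ assignments.mean(axis=1))
    P_num, P_den = np.zeros((2, 2)), np.zeros((2, 2))
    for wk, z in zip(w, assignments):
        blocks = (z[iu[0]], z[iu[1]])
        np.add.at(P_num, blocks, wk * A[iu])
        np.add.at(P_den, blocks, wk)
    P = (P_num + P_num.T) / (P_den + P_den.T)   # symmetrise the between-block cells

print("estimated within/between-block edge probabilities:\n", np.round(P, 2))
```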


Journal ArticleDOI
TL;DR: In this article, a latent variable model for mixed discrete and continuous outcomes is proposed to model any mixture of outcomes from an exponential family and allow for arbitrary covariate effects, as well as direct modelling of covariates on the latent variable.
Abstract: We propose a latent variable model for mixed discrete and continuous outcomes. The model accommodates any mixture of outcomes from an exponential family and allows for arbitrary covariate effects, as well as direct modelling of covariates on the latent variable. An EM algorithm is proposed for parameter estimation and estimates of the latent variables are produced as a by-product of the analysis. A generalized likelihood ratio test can be used to test the significance of covariates affecting the latent outcomes. This method is applied to birth defects data, where the outcomes of interest are continuous measures of size and binary indicators of minor physical anomalies. Infants who were exposed in utero to anticonvulsant medications are compared with controls.

332 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider the maximum likelihood estimation of the parameters in the proportional odds model with right-censored data and show that the estimator of the regression coefficient is asymptotically normal with efficient variance.
Abstract: We consider maximum likelihood estimation of the parameters in the proportional odds model with right-censored data. The estimator of the regression coefficient is shown to be asymptotically normal with efficient variance. The maximum likelihood estimator of the unknown monotonic transformation of the survival time converges uniformly at a parametric rate to the true transformation. Estimates for the standard errors of the estimated regression coefficients are obtained by differentiation of the profile likelihood and are shown to be consistent. A likelihood ratio test for the regression coefficient is also considered.

278 citations


Journal ArticleDOI
TL;DR: A forward–backward recursive procedure is developed for efficient computation of the likelihood function and its derivatives with respect to the model parameters based on the calculated forward and backward vectors.
Abstract: We present a maximum likelihood method for the modelling of aggregated Markov processes. The method utilizes the joint probability density of the observed dwell time sequence as likelihood. A forward–backward recursive procedure is developed for efficient computation of the likelihood function and its derivatives with respect to the model parameters, based on the calculated forward and backward vectors. ...

277 citations
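
The paper's forward-backward recursion is formulated for dwell-time sequences of continuous-time aggregated Markov processes. As a simplified, hedged analogue, the sketch below computes the likelihood of a discretely observed aggregated Markov chain (only the class of the current state is seen) with a scaled forward recursion; the derivative recursions are omitted, and all transition values are hypothetical.

```python
import numpy as np

# Hypothetical 3-state Markov chain whose states are aggregated into two
# observable classes: states {0, 1} are seen as "A", state {2} as "B".
P = np.array([[0.80, 0.15, 0.05],
              [0.10, 0.70, 0.20],
              [0.05, 0.25, 0.70]])
pi0 = np.array([0.5, 0.3, 0.2])
classes = {"A": np.array([0, 1]), "B": np.array([2])}

def forward_loglik(obs, P, pi0, classes):
    """Log-likelihood of an observed class sequence via the scaled forward
    recursion; alpha_t(j) is proportional to P(obs_1..t, state_t = j)."""
    k = len(pi0)
    alpha = pi0.copy()
    loglik = 0.0
    for t, o in enumerate(obs):
        mask = np.zeros(k)
        mask[classes[o]] = 1.0
        alpha = (alpha if t == 0 else alpha @ P) * mask
        c = alpha.sum()                      # rescale to avoid underflow
        loglik += np.log(c)
        alpha /= c
    return loglik

# Simulate a state path and record only the class labels.
rng = np.random.default_rng(5)
states = [rng.choice(3, p=pi0)]
for _ in range(199):
    states.append(rng.choice(3, p=P[states[-1]]))
obs = ["A" if s in (0, 1) else "B" for s in states]

print("log-likelihood:", round(forward_loglik(obs, P, pi0, classes), 3))
```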


Journal ArticleDOI
TL;DR: This paper investigates the application of the EM algorithm to sequence estimation in the presence of random disturbances and additive white Gaussian noise, and shows that a formulation of the sequence estimation problem can provide a means of obtaining ML sequence estimates.
Abstract: The expectation-maximization (EM) algorithm was first introduced in the statistics literature as an iterative procedure that under some conditions produces maximum-likelihood (ML) parameter estimates. In this paper we investigate the application of the EM algorithm to sequence estimation in the presence of random disturbances and additive white Gaussian noise. As examples of the use of the EM algorithm, we look at the random-phase and fading channels, and show that a formulation of the sequence estimation problem based on the EM algorithm can provide a means of obtaining ML sequence estimates, a task that has been previously too complex to perform.

248 citations


Journal ArticleDOI
TL;DR: It is proposed to approximate the inverse of the observed information matrix by using auxiliary output from the new hybrid accelerator; a numerical evaluation of these approximations indicates that they may be useful, at least for exploratory purposes.
Abstract: The EM algorithm is a popular method for maximum likelihood estimation. Its simplicity in many applications and desirable convergence properties make it very attractive. Its sometimes slow convergence, however, has prompted researchers to propose methods to accelerate it. We review these methods, classifying them into three groups: pure, hybrid and EM-type accelerators. We propose a new pure and a new hybrid accelerator both based on quasi-Newton methods and numerically compare these and two other quasi-Newton accelerators. For this we use examples in each of three areas: Poisson mixtures, the estimation of covariance from incomplete data and multivariate normal mixtures. In these comparisons, the new hybrid accelerator was fastest on most of the examples and often dramatically so. In some cases it accelerated the EM algorithm by factors of over 100. The new pure accelerator is very simple to implement and competed well with the other accelerators. It accelerated the EM algorithm in some cases by factors of over 50. To obtain standard errors, we propose to approximate the inverse of the observed information matrix by using auxiliary output from the new hybrid accelerator. A numerical evaluation of these approximations indicates that they may be useful at least for exploratory purposes.
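
For a feel of what "pure" acceleration means, the sketch below applies the classical Aitken/Steffensen delta-squared extrapolation to a one-parameter EM map (the mixing proportion of a two-component Gaussian mixture with known components). This is only a simple relative of the quasi-Newton pure and hybrid accelerators proposed in the paper; the data and tolerances are arbitrary.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)

# Two-component Gaussian mixture with known components; only the mixing
# proportion is unknown, so the EM map is a scalar fixed-point map.
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(3, 1, 700)])
f0, f1 = norm.pdf(x, 0, 1), norm.pdf(x, 3, 1)

def em_map(p):
    """One EM step for the mixing proportion of component 0."""
    resp = p * f0 / (p * f0 + (1 - p) * f1)
    return resp.mean()

def run_em(p, tol=1e-10, max_iter=10_000):
    for it in range(1, max_iter + 1):
        p_new = em_map(p)
        if abs(p_new - p) < tol:
            break
        p = p_new
    return p_new, it

def run_aitken(p, tol=1e-10, max_iter=10_000):
    """Steffensen/Aitken delta-squared acceleration of the EM fixed point."""
    for it in range(1, max_iter + 1):
        p1 = em_map(p)
        p2 = em_map(p1)
        denom = p2 - 2.0 * p1 + p
        p_new = p2 if abs(denom) < 1e-15 else p - (p1 - p) ** 2 / denom
        p_new = float(np.clip(p_new, 1e-6, 1 - 1e-6))
        if abs(p_new - p) < tol:
            break
        p = p_new
    return p_new, it

print("plain EM (estimate, iterations):", run_em(0.5))
print("Aitken   (estimate, iterations):", run_aitken(0.5))
```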

Journal ArticleDOI
TL;DR: In this article, the authors consider fitting categorical regression models to data obtained by either stratified or nonstratified case-control, or response selective, sampling from a finite population with known population totals in each response category.
Abstract: We consider fitting categorical regression models to data obtained by either stratified or nonstratified case-control, or response selective, sampling from a finite population with known population totals in each response category. With certain models, such as the logistic with appropriate constant terms, a method variously known as conditional maximum likelihood (Breslow & Cain, 1988) or pseudo-conditional likelihood (Wild, 1991), which involves the prospective fitting of a pseudo-model, results in maximum likelihood estimates for case-control data. We extend these results by showing that the maximum likelihood estimates for any model can be found by iterating this process with a simple updating of offset parameters. Attention is also paid to estimation of the asymptotic covariance matrix. One benefit of the results of this paper is the ability to obtain maximum likelihood estimates of the parameters of logistic models for stratified case-control studies (cf. Breslow & Cain, 1988; Scott & Wild, 1991) using an ordinary logistic regression program, even when the stratum constants are modelled.

Journal ArticleDOI
TL;DR: In this paper, various types of finite mixtures of confirmatory factor-analysis models are proposed for handling data heterogeneity, and three different sampling schemes for these mixture models are distinguished.
Abstract: In this paper, various types of finite mixtures of confirmatory factor-analysis models are proposed for handling data heterogeneity. Under the proposed mixture approach, observations are assumed to be drawn from mixtures of distinct confirmatory factor-analysis models. But each observation does not need to be identified to a particular model prior to model fitting. Several classes of mixture models are proposed. These models differ by their unique representations of data heterogeneity. Three different sampling schemes for these mixture models are distinguished. A mixture of these three sampling schemes is considered throughout this article. The proposed mixture approach reduces to regular multiple-group confirmatory factor-analysis under a restrictive sampling scheme, in which the structural equation model for each observation is assumed to be known. By assuming a mixture of multivariate normals for the data, maximum likelihood estimation using the EM (Expectation-Maximization) algorithm and the AS (Approximate-Scoring) method are developed, respectively. Some mixture models were fitted to a real data set to illustrate the application of the theory. Although the EM algorithm and the AS method gave similar sets of parameter estimates, the AS method was found computationally more efficient than the EM algorithm. Some comments on applying the mixture approach to structural equation modeling are made.

Journal ArticleDOI
TL;DR: The hybrid algorithm uses a composite algorithmic mapping combining the expectation-maximization algorithm and the (modified) iterative convex minorant algorithm for nonparametric maximum likelihood estimation from censored data when the log-likelihood is concave.
Abstract: We present a hybrid algorithm for nonparametric maximum likelihood estimation from censored data when the log-likelihood is concave. The hybrid algorithm uses a composite algorithmic mapping combining the expectation-maximization (EM) algorithm and the (modified) iterative convex minorant (ICM) algorithm. Global convergence of the hybrid algorithm is proven; the iterates generated by the hybrid algorithm are shown to converge to the nonparametric maximum likelihood estimator (NPMLE) unambiguously. Numerical simulations demonstrate that the hybrid algorithm converges more rapidly than either of the EM or the naive ICM algorithm for doubly censored data. The speed of the hybrid algorithm makes it possible to accompany the NPMLE with bootstrap confidence bands.

Proceedings ArticleDOI
07 Jul 1997
TL;DR: This paper shows how PCA can be derived from a maximum-likelihood procedure based on a specialisation of factor analysis; this is extended to develop a well-defined mixture model of principal component analyzers, and an expectation-maximisation algorithm for estimating all the model parameters is given.
Abstract: Principal component analysis (PCA) is a ubiquitous technique for data analysis but one whose effective application is restricted by its global linear character. While global nonlinear variants of PCA have been proposed, an alternative paradigm is to capture data nonlinearity by a mixture of local PCA models. However, existing techniques are limited by the absence of a probabilistic formalism with an appropriate likelihood measure and so require an arbitrary choice of implementation strategy. This paper shows how PCA can be derived from a maximum-likelihood procedure, based on a specialisation of factor analysis. This is then extended to develop a well-defined mixture model of principal component analyzers, and an expectation-maximisation algorithm for estimating all the model parameters is given.
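
The maximum-likelihood solution of probabilistic PCA referred to above has a closed form in terms of the sample covariance eigendecomposition (loadings built from the top q eigenvectors, with the noise variance equal to the average discarded eigenvalue). A small sketch of that solution on simulated data follows; the mixture-of-analyzers extension and its EM algorithm are not shown, and the data dimensions are hypothetical.

```python
import numpy as np

def ppca_ml(X, q):
    """Closed-form maximum-likelihood probabilistic PCA.
    Returns the loading matrix W, noise variance sigma2 and the data mean."""
    n, d = X.shape
    mu = X.mean(axis=0)
    S = np.cov(X - mu, rowvar=False)
    eigval, eigvec = np.linalg.eigh(S)          # ascending order
    eigval, eigvec = eigval[::-1], eigvec[:, ::-1]
    sigma2 = eigval[q:].mean()                  # average discarded eigenvalue
    W = eigvec[:, :q] * np.sqrt(np.maximum(eigval[:q] - sigma2, 0.0))
    return W, sigma2, mu

rng = np.random.default_rng(7)
# Hypothetical data: 2 latent dimensions embedded in 5 observed dimensions.
Z = rng.standard_normal((1000, 2))
A = rng.standard_normal((2, 5))
X = Z @ A + 0.1 * rng.standard_normal((1000, 5)) + np.array([1, 2, 3, 4, 5])

W, sigma2, mu = ppca_ml(X, q=2)
C = W @ W.T + sigma2 * np.eye(5)                # implied model covariance
print("noise variance estimate:", round(float(sigma2), 4))
print("max |model cov - sample cov|:",
      round(float(np.abs(C - np.cov(X - mu, rowvar=False)).max()), 4))
```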

Journal ArticleDOI
TL;DR: In this paper, the equations of the maximum likelihood principle have been rewritten in a suitable generalized form to allow the use of any number of implicit constraints in the determination of model parameters from experimental data and from the associated experimental uncertainties.
Abstract: The equations of the method based on the maximum likelihood principle have been rewritten in a suitable generalized form to allow the use of any number of implicit constraints in the determination of model parameters from experimental data and from the associated experimental uncertainties. In addition to the use of any number of constraints, this method also allows data with different numbers of constraints to be reduced simultaneously. Application of the method is illustrated in the reduction of liquid-liquid equilibrium data of binary, ternary and quaternary systems simultaneously.

Journal ArticleDOI
TL;DR: In this paper, the authors extended the random-effects model for a single characteristic to the case of multiple characteristics, allowing for arbitrary patterns of observed data, and derived the set of equations for this estimation procedure, appropriately modified to deal with missing data.
Abstract: The use of random-effects models for the analysis of longitudinal data with missing responses has been discussed by several authors. This article extends the random-effects model for a single characteristic to the case of multiple characteristics, allowing for arbitrary patterns of observed data. Two different structures for the covariance matrix of measurement error are considered: uncorrelated error between responses and correlation of error terms at the same measurement times. Parameters for this model are estimated via the EM algorithm. The set of equations for this estimation procedure is derived; these equations are appropriately modified to deal with missing data. The methodology is illustrated with an example from clinical trials.

Journal ArticleDOI
TL;DR: The I(2) model as discussed by the authors is defined as a submodel of the general vector autoregressive model, by two reduced rank conditions, and describes stochastic processes with stationary second difference.
Abstract: The I(2) model is defined as a submodel of the general vector autoregressive model, by two reduced rank conditions. The model describes stochastic processes with stationary second difference. A parametrization is suggested which makes likelihood inference feasible. Consistency of the maximum likelihood estimator is proved, and the asymptotic distribution of the maximum likelihood estimator is given. It is shown that the asymptotic distribution is either Gaussian, mixed Gaussian or, in some cases, even more complicated.

Journal ArticleDOI
TL;DR: It is shown that a particular case of the Bayesian Ying–Yang learning system and theory reduces to the maximum likelihood learning of a finite mixture, from which the EM algorithm for its parameter estimation and its various approximate but fast algorithms for clustering in general cases are obtained.
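
Since the result reduces a special case of Bayesian Ying-Yang learning to maximum likelihood estimation of a finite mixture, the standard EM algorithm for a univariate Gaussian mixture is the relevant baseline. A routine sketch on simulated data follows; the BYY-specific criteria and fast approximate algorithms are not reproduced.

```python
import numpy as np
from scipy.stats import norm

def gmm_em(x, k=2, n_iter=200, seed=0):
    """Standard EM for a univariate Gaussian mixture with k components."""
    rng = np.random.default_rng(seed)
    w = np.full(k, 1.0 / k)
    mu = rng.choice(x, size=k, replace=False)
    sd = np.full(k, x.std())
    for _ in range(n_iter):
        # E-step: posterior responsibilities.
        dens = w * norm.pdf(x[:, None], mu, sd)        # n x k
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted proportions, means and standard deviations.
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return w, mu, sd

rng = np.random.default_rng(8)
x = np.concatenate([rng.normal(-2, 1, 400), rng.normal(3, 0.7, 600)])
w, mu, sd = gmm_em(x)
print("weights:", np.round(w, 2), "means:", np.round(mu, 2), "sds:", np.round(sd, 2))
```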

Journal ArticleDOI
TL;DR: The notion of a generalized mixture is introduced and some methods for estimating it are proposed, along with applications to unsupervised statistical image segmentation and adaptations of traditional parameter estimation algorithms allowing the estimation of generalized mixtures corresponding to Pearson's system.
Abstract: We introduce the notion of a generalized mixture and propose some methods for estimating it, along with applications to unsupervised statistical image segmentation. A distribution mixture is said to be "generalized" when the exact nature of the components is not known, but each belongs to a finite known set of families of distributions. For instance, we can consider a mixture of three distributions, each being exponential or Gaussian. The problem of estimating such a mixture thus contains a new difficulty: we have to label each of the three components (there are eight possibilities). We show that the classical mixture estimation algorithms, namely expectation-maximization (EM), stochastic EM (SEM), and iterative conditional estimation (ICE), can be adapted to such situations once we have a method for recognizing each component separately; that is, when we know that a sample comes from one family of the set considered, we have a decision rule for deciding which one. Considering the Pearson system, which is a set of eight families, this decision rule is defined by the use of "skewness" and "kurtosis". The different algorithms so obtained are then applied to the problem of unsupervised Bayesian image segmentation. We propose adaptive versions of SEM, EM, and ICE in the case of "blind", i.e., "pixel by pixel", segmentation. "Global" segmentation methods require modeling by hidden Markov random fields, and we propose adaptations of two traditional parameter estimation algorithms, Gibbsian EM (GEM) and ICE, allowing the estimation of generalized mixtures corresponding to Pearson's system. The efficiency of the different methods is compared via numerical studies, and the results of unsupervised segmentation of three real radar images by the different methods are presented.
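
The per-component "recognition" step described above can be illustrated with a toy decision rule: classify a sample as Gaussian or exponential by the distance of its empirical (skewness, excess kurtosis) point from the theoretical values (0, 0) and (2, 6). This is only a caricature of the Pearson-system rule, and none of the SEM/ICE or segmentation machinery is shown.

```python
import numpy as np
from scipy.stats import skew, kurtosis

# Theoretical (skewness, excess kurtosis) of the candidate families.
FAMILIES = {"gaussian": (0.0, 0.0), "exponential": (2.0, 6.0)}

def recognise_family(sample):
    """Pick the family whose theoretical (skewness, excess kurtosis) point
    lies closest to the sample's empirical point."""
    point = np.array([skew(sample), kurtosis(sample)])   # kurtosis = excess
    return min(FAMILIES,
               key=lambda name: np.linalg.norm(point - np.array(FAMILIES[name])))

rng = np.random.default_rng(9)
for name, sample in [("gaussian", rng.normal(5.0, 2.0, 2000)),
                     ("exponential", rng.exponential(3.0, 2000))]:
    print(f"true family: {name:11s} -> recognised as: {recognise_family(sample)}")
```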

Book
24 Jan 1997
TL;DR: Causality and Path Models: Embedding common factors in a Path Model, Measurement, Causation and Local Independence in Latent Variable Models, On the Identifiability of Nonparametric Structural Models, Estimating the Causal effects of Time Varying Endogeneous Treatments by G-Estimation of Structural Nested Models, Latent Variables- Model as Instruments, with Applications to Moment Structure Analysis as discussed by the authors.
Abstract: Causality and Path Models- Embedding Common factors in a Path Model- Measurement, Causation and Local Independence in Latent Variable Models- On the Identifiability of Nonparametric Structural Models- Estimating the Causal effects of Time Varying Endogeneous Treatments by G-Estimation of Structural Nested Models- Latent Variables- Model as Instruments, with Applications to Moment Structure Analysis- Bias and Mean Square Error of the Maximum Likelihood Estimators of the Parameters of the Intraclass Correlation Model- Latent Variable Growth Modeling with Multilevel Data- High-Dimensional Full-Information Item Factor Analysis- Dynamic Factor Models for the Analysis of Ordered Categorical Panel data- Model Fitting Procedures for Nonlinear Factor Analysis Using the Errors-in-Variables Parameterization- Multivariate Regression with Errors in Variables: Issues on Asymptotic Robustness- Non-Iterative fitting of the Direct Product Model for Multitrait-Multimethod Correlation Matrices- An EM Algorithm for ML Factor Analysis with Missing Data- Optimal Conditionally Unbiased Equivariant Factor Score Estimators

Proceedings Article
21 Jun 1997
TL;DR: The specificity of the approach lies in the neighborhood structure used in the local search algorithms which has been inspired by an analogy between the marker ordering problem and the famous traveling salesman problem.
Abstract: Genetic mapping is an important step in the study of any organism. An accurate genetic map is extremely valuable for locating genes or, more generally, either qualitative or quantitative trait loci (QTL). This paper presents a new approach to two important problems in genetic mapping: automatically ordering markers to obtain a multipoint maximum likelihood map, and building a multipoint maximum likelihood map using pooled data from several crosses. The approach is embodied in a hybrid algorithm that mixes the statistical optimization algorithm EM with local search techniques developed in the artificial intelligence and operations research communities. An efficient implementation of the EM algorithm provides maximum likelihood recombination fractions, while the local search techniques look for orders that maximize this maximum likelihood. The specificity of the approach lies in the neighborhood structure used in the local search algorithms, which is inspired by an analogy between the marker ordering problem and the famous traveling salesman problem. The approach has been used to build joined maps for the wasp Trichogramma brassicae and on random pooled data sets. In both cases, it compares quite favorably with existing software as far as maximum likelihood is taken as a significant criterion.

Journal ArticleDOI
TL;DR: Of the two algorithms tested, the Gauss-EM method is superior in noise reduction (up to 50%), whereas the Markov-GEM algorithm proved to be stable with a small change of recovery coefficients between 0.5 and 3%.
Abstract: Using statistical methods the reconstruction of positron emission tomography (PET) images can be improved by high-resolution anatomical information obtained from magnetic resonance (MR) images. The authors implemented two approaches that utilize MR data for PET reconstruction. The anatomical MR information is modeled as a priori distribution of the PET image and combined with the distribution of the measured PET data to generate the a posteriori function from which the expectation maximization (EM)-type algorithm with a maximum a posteriori (MAP) estimator is derived. One algorithm (Markov-GEM) uses a Gibbs function to model interactions between neighboring pixels within the anatomical regions. The other (Gauss-EM) applies a Gauss function with the same mean for all pixels in a given anatomical region. A basic assumption of these methods is that the radioactivity is homogeneously distributed inside anatomical regions. Simulated and phantom data are investigated under the following aspects: count density, object size, missing anatomical information, and misregistration of the anatomical information. Compared with the maximum likelihood-expectation maximization (ML-EM) algorithm the results of both algorithms show a large reduction of noise with a better delineation of borders. Of the two algorithms tested, the Gauss-EM method is superior in noise reduction (up to 50%). Regarding incorrect a priori information the Gauss-EM algorithm is very sensitive, whereas the Markov-GEM algorithm proved to be stable with a small change of recovery coefficients between 0.5 and 3%.
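
Both algorithms in the paper build on the emission-tomography ML-EM update. The sketch below shows only that unregularized multiplicative update, lambda_j <- (lambda_j / sum_i a_ij) * sum_i a_ij * y_i / (A lambda)_i, on a tiny random system; the MR-derived Markov-GEM and Gauss-EM priors are not implemented, and all dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(10)

# Tiny hypothetical system: 60 detector bins, 25 image pixels.
n_bins, n_pix = 60, 25
A = rng.random((n_bins, n_pix))              # system (projection) matrix
lam_true = rng.gamma(2.0, 2.0, n_pix)        # true activity image
y = rng.poisson(A @ lam_true)                # Poisson projection data

lam = np.ones(n_pix)                         # uniform initial image
sens = A.sum(axis=0)                         # sensitivity: sum_i a_ij
for _ in range(200):
    expected = A @ lam                       # forward projection
    ratio = y / np.maximum(expected, 1e-12)  # measured / expected counts
    lam = lam / sens * (A.T @ ratio)         # ML-EM multiplicative update

print("relative error:",
      round(float(np.linalg.norm(lam - lam_true) / np.linalg.norm(lam_true)), 3))
```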

Journal ArticleDOI
TL;DR: A framework of quasi-Bayes (QB) learning of the parameters of the continuous density hidden Markov model (CDHMM) with Gaussian mixture state observation densities with simple forgetting mechanism to adjust the contribution of previously observed sample utterances is presented.
Abstract: We present a framework of quasi-Bayes (QB) learning of the parameters of the continuous density hidden Markov model (CDHMM) with Gaussian mixture state observation densities. The QB formulation is based on the theory of recursive Bayesian inference. The QB algorithm is designed to incrementally update the hyperparameters of the approximate posterior distribution and the CDHMM parameters simultaneously. By further introducing a simple forgetting mechanism to adjust the contribution of previously observed sample utterances, the algorithm is adaptive in nature and capable of performing an online adaptive learning using only the current sample utterance. It can, thus, be used to cope with the time-varying nature of some acoustic and environmental variabilities, including mismatches caused by changing speakers, channels, and transducers. As an example, the QB learning framework is applied to on-line speaker adaptation and its viability is confirmed in a series of comparative experiments using a 26-letter English alphabet vocabulary.


Patent
22 Sep 1997
TL;DR: In this paper, an iterative process is provided for cone-beam tomography (parallel-beam and fan-beam geometries are considered as its special cases), and applied to metal artifact reduction and local reconstruction from truncated data, as well as image noise reduction.
Abstract: In the present invention, an iterative process is provided for cone-beam tomography (parallel-beam and fan-beam geometries are considered as its special cases), and applied to metal artifact reduction and local reconstruction from truncated data, as well as image noise reduction. In different embodiments, these iterative processes may be based upon the emission computerized tomography (CT) expectation maximization (EM) formula and/or the algebraic reconstruction technique (ART). In one embodiment, generation of a projection mask and computation of a 3D spatially varying relaxation factor are utilized to compensate for beam divergence, data inconsistency and incompleteness.

Journal ArticleDOI
TL;DR: Two new approaches to multivariate calibration are described that, for the first time, allow information on measurement uncertainties to be included in the calibration process in a statistically meaningful way, based on principles of maximum likelihood parameter estimation.
Abstract: Two new approaches to multivariate calibration are described that, for the first time, allow information on measurement uncertainties to be included in the calibration process in a statistically meaningful way. The new methods, referred to as maximum likelihood principal components regression (MLPCR) and maximum likelihood latent root regression (MLLRR), are based on principles of maximum likelihood parameter estimation. MLPCR and MLLRR are generalizations of principal components regression (PCR), which has been widely used in chemistry, and latent root regression (LRR), which has been virtually ignored in this field. Both of the new methods are based on decomposition of the calibration data matrix by maximum likelihood principal component analysis (MLPCA), which has been recently described (Wentzell, P. D.; et al. J. Chemom., in press). By using estimates of the measurement error variance, MLPCR and MLLRR are able to extract the optimum amount of information from each measurement and, thereby, exhibit superior performance over conventional multivariate calibration methods such as PCR and partial least-squares regression (PLS) when there is a nonuniform error structure. The new techniques reduce to PCR and LRR when assumptions of uniform noise are valid. Comparisons of MLPCR, MLLRR, PCR, and PLS are carried out using simulated and experimental data sets consisting of three-component mixtures. In all cases of nonuniform errors examined, the predictive ability of the maximum likelihood methods is superior to that of PCR and PLS, with PLS performing somewhat better than PCR. MLLRR generally performed better than MLPCR, but in most cases the improvement was marginal. The differences between PCR and MLPCR are elucidated by examining the multivariate sensitivity of the two methods.

Proceedings Article
01 Aug 1997
TL;DR: In this paper, a unified framework for parameter estimation in Bayesian networks with missing values and hidden variables is proposed, where the model is continuously adapted to new data cases as they arrive, and the more traditional batch learning, where a pre-accumulated set of samples is used in a one-time model selection process.
Abstract: This paper re-examines the problem of parameter estimation in Bayesian networks with missing values and hidden variables from the perspective of recent work in on-line learning [13]. We provide a unified framework for parameter estimation that encompasses both on-line learning, where the model is continuously adapted to new data cases as they arrive, and the more traditional batch learning, where a pre-accumulated set of samples is used in a one-time model selection process. In the batch case, our framework encompasses both the gradient projection algorithm [2, 3] and the EM algorithm [15] for Bayesian networks. The framework also leads to new on-line and batch parameter update schemes, including a parameterized version of EM. We provide both empirical and theoretical results indicating that parameterized EM allows faster convergence to the maximum likelihood parameters than does standard EM.
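
The on-line flavour of this framework can be illustrated with a generic stepwise (online) EM, in which expected sufficient statistics are updated with a decaying step size after each observation. The sketch below does this for a univariate two-component Gaussian mixture; it illustrates online EM in general, not the paper's parameterized EM for Bayesian networks, and the step-size schedule and starting values are arbitrary.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(11)
stream = np.concatenate([rng.normal(-1, 0.5, 2000), rng.normal(2, 1.0, 3000)])
rng.shuffle(stream)

# Two-component univariate Gaussian mixture; rough hypothetical start values.
w = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
sd = np.array([1.0, 1.0])
# Expected sufficient statistics per component: [resp, resp*x, resp*x^2].
stats = np.column_stack([w, w * mu, w * (sd**2 + mu**2)])

for t, x in enumerate(stream, start=1):
    # E-step for a single observation: responsibilities under current params.
    dens = w * norm.pdf(x, mu, sd)
    resp = dens / dens.sum()
    # Stochastic-approximation update of the running sufficient statistics.
    eta = (t + 10.0) ** -0.7
    stats = (1 - eta) * stats + eta * np.column_stack([resp, resp * x, resp * x**2])
    # M-step: parameters as functions of the running statistics.
    w = stats[:, 0]
    mu = stats[:, 1] / stats[:, 0]
    sd = np.sqrt(np.maximum(stats[:, 2] / stats[:, 0] - mu**2, 1e-6))

print("weights:", np.round(w, 2), "means:", np.round(mu, 2), "sds:", np.round(sd, 2))
```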

Journal Article
TL;DR: In this paper, the problem of providing standard errors of the component means in normal mixture models fitted to univariate or multivariate data by maximum likelihood via the EM algorithm is considered.
Abstract: In this paper we consider the problem of providing standard errors of the component means in normal mixture models fitted to univariate or multivariate data by maximum likelihood via the EM algorithm. Two methods of estimation of the standard errors are considered: the standard information-based method and the computationally intensive bootstrap method. They are compared empirically by their application to three real data sets and by a small-scale Monte Carlo experiment.
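
A sketch of the bootstrap approach to these standard errors follows, under the assumption that an off-the-shelf EM fitter (scikit-learn's GaussianMixture, which postdates the paper) is acceptable; components are matched across replicates simply by sorting the fitted means, and all data are simulated.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(12)
x = np.concatenate([rng.normal(0, 1, 150), rng.normal(4, 1.5, 100)])

def fitted_means(sample):
    """Component means of a 2-component normal mixture fitted by EM,
    sorted to give a consistent labelling across bootstrap replicates."""
    gm = GaussianMixture(n_components=2, n_init=3, random_state=0)
    gm.fit(sample.reshape(-1, 1))
    return np.sort(gm.means_.ravel())

mean_hat = fitted_means(x)

# Nonparametric bootstrap: resample the data, refit by EM, record the means.
B = 200
boot = np.array([fitted_means(rng.choice(x, size=x.size, replace=True))
                 for _ in range(B)])
se = boot.std(axis=0, ddof=1)

print("component means:", np.round(mean_hat, 3))
print("bootstrap SEs  :", np.round(se, 3))
```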

Journal ArticleDOI
TL;DR: A theoretical perspective clarifies the operation of the EM algorithm and suggests novel generalizations that lead to highly stable algorithms with well-understood local and global convergence properties in medical statistics.
Abstract: Most problems in computational statistics involve optimization of an objective function such as a loglikelihood, a sum of squares, or a log posterior function. The EM algorithm is one of the most effective algorithms for maximization because it iteratively transfers maximization from a complex function to a simple, surrogate function. This theoretical perspective clarifies the operation of the EM algorithm and suggests novel generalizations. Besides simplifying maximization, optimization transfer usually leads to highly stable algorithms with well-understood local and global convergence properties. Although convergence can be excruciatingly slow, various devices exist for accelerating it. Beginning with the EM algorithm, we review in this paper several optimization transfer algorithms of substantial utility in medical statistics.
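
The "optimization transfer" idea can be stated compactly. The LaTeX fragment below records the standard minorization conditions and the resulting ascent property in the notation commonly used for MM algorithms; it paraphrases the idea rather than quoting the paper.

```latex
% Optimization transfer: a surrogate g minorizes the objective f at the
% current iterate \theta^{(t)} if
%   g(\theta \mid \theta^{(t)}) \le f(\theta)  for all \theta, and
%   g(\theta^{(t)} \mid \theta^{(t)}) = f(\theta^{(t)}).
% Maximizing the surrogate then forces the objective uphill:
\begin{align*}
  \theta^{(t+1)} &= \arg\max_{\theta}\, g(\theta \mid \theta^{(t)}),\\
  f(\theta^{(t+1)}) &\ge g(\theta^{(t+1)} \mid \theta^{(t)})
                      \ge g(\theta^{(t)} \mid \theta^{(t)}) = f(\theta^{(t)}).
\end{align*}
% EM is the special case in which g is the E-step expectation
% Q(\theta \mid \theta^{(t)}) = \mathrm{E}\{\log p(Y_{\mathrm{com}};\theta)
% \mid Y_{\mathrm{obs}}, \theta^{(t)}\}, up to an additive term that does
% not depend on \theta.
```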