
Showing papers on "Expectation–maximization algorithm published in 1998"


Journal Article
TL;DR: In this paper, the authors describe the EM algorithm for finding the parameters of a mixture of Gaussian densities and a hidden Markov model (HMM) for both discrete and Gaussian mixture observation models.
Abstract: We describe the maximum-likelihood parameter estimation problem and how the Expectation-Maximization (EM) algorithm can be used for its solution. We first describe the abstract form of the EM algorithm as it is often given in the literature. We then develop the EM parameter estimation procedure for two applications: 1) finding the parameters of a mixture of Gaussian densities, and 2) finding the parameters of a hidden Markov model (HMM) (i.e., the Baum-Welch algorithm) for both discrete and Gaussian mixture observation models. We derive the update equations in fairly explicit detail but we do not prove any convergence properties. We try to emphasize intuition rather than mathematical rigor.

2,455 citations
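Since this tutorial's Gaussian-mixture derivation is the one most often re-implemented, a minimal sketch of the resulting E and M steps may help fix ideas. The one-dimensional version below follows the standard update equations; the function name, initialization, and fixed iteration count are our own illustrative choices, not the tutorial's.

```python
import numpy as np

def em_gmm_1d(x, k, n_iter=100, seed=0):
    """Minimal EM for a one-dimensional Gaussian mixture (illustrative)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    mu = rng.choice(x, size=k, replace=False)   # random data points as means
    var = np.full(k, x.var())                   # pooled variance to start
    pi = np.full(k, 1.0 / k)                    # uniform mixing weights
    for _ in range(n_iter):
        # E step: responsibilities r[i, j] = P(component j | x_i).
        log_p = (-0.5 * (x[:, None] - mu) ** 2 / var
                 - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)   # numerical stabilization
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M step: weighted maximum-likelihood updates.
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var
```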


Book ChapterDOI
26 Mar 1998
TL;DR: In this paper, an incremental variant of the EM algorithm is proposed in which the distribution for only one of the unobserved variables is recalculated in each E step; this variant is shown empirically to give faster convergence in a mixture estimation problem.
Abstract: The EM algorithm performs maximum likelihood estimation for data in which some variables are unobserved. We present a function that resembles negative free energy and show that the M step maximizes this function with respect to the model parameters and the E step maximizes it with respect to the distribution over the unobserved variables. From this perspective, it is easy to justify an incremental variant of the EM algorithm in which the distribution for only one of the unobserved variables is recalculated in each E step. This variant is shown empirically to give faster convergence in a mixture estimation problem. A variant of the algorithm that exploits sparse conditional distributions is also described, and a wide range of other variant algorithms are also seen to be possible.

2,093 citations
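The incremental variant is easy to sketch for a Gaussian mixture: keep global sufficient statistics, refresh one point's responsibilities at a time, correct the statistics, and re-maximize after each visit. The sketch below is our own reading of the idea (names, initialization, and sweep schedule are ours), not the authors' code.

```python
import numpy as np

def incremental_em(x, k, n_sweeps=20, seed=0):
    """Incremental EM sketch for a 1-D Gaussian mixture."""
    rng = np.random.default_rng(seed)
    n = len(x)
    mu = rng.choice(x, size=k, replace=False)
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    r = np.full((n, k), 1.0 / k)                    # last responsibilities per point
    S0, S1, S2 = r.sum(0), r.T @ x, r.T @ x ** 2    # global sufficient statistics
    for _ in range(n_sweeps):
        for i in range(n):
            # Partial E step: refresh responsibilities for point i only.
            log_p = (-0.5 * (x[i] - mu) ** 2 / var
                     - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
            r_i = np.exp(log_p - log_p.max())
            r_i /= r_i.sum()
            # Swap point i's contribution in the sufficient statistics.
            d = r_i - r[i]
            S0 += d; S1 += d * x[i]; S2 += d * x[i] ** 2
            r[i] = r_i
            # M step from the current statistics (cheap, closed form).
            pi = S0 / n
            mu = S1 / S0
            var = S2 / S0 - mu ** 2
    return pi, mu, var
```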


Journal ArticleDOI
TL;DR: A form of nonlinear latent variable model called the generative topographic mapping, for which the parameters of the model can be determined using the expectation-maximization algorithm, is introduced.
Abstract: Latent variable models represent the probability density of data in a space of several dimensions in terms of a smaller number of latent, or hidden, variables. A familiar example is factor analysis, which is based on a linear transformation between the latent space and the data space. In this paper we introduce a form of nonlinear latent variable model called the Generative Topographic Mapping, for which the parameters of the model can be determined using the EM algorithm. GTM provides a principled alternative to the widely used Self-Organizing Map (SOM) of Kohonen (1982), and overcomes most of the significant limitations of the SOM. We demonstrate the performance of the GTM algorithm on a toy problem and on simulated data from flow diagnostics for a multi-phase oil pipeline.

1,469 citations


Journal ArticleDOI
TL;DR: In this paper, the posterior distribution is simulated by Markov chain Monte Carlo methods and maximum likelihood estimates are obtained by a Monte Carlo version of the EM algorithm using the multivariate probit model.
Abstract: This paper provides a practical simulation-based Bayesian and non-Bayesian analysis of correlated binary data using the multivariate probit model. The posterior distribution is simulated by Markov chain Monte Carlo methods and maximum likelihood estimates are obtained by a Monte Carlo version of the EM algorithm. A practical approach for the computation of Bayes factors from the simulation output is also developed. The methods are applied to a dataset with a bivariate binary response, to a four-year longitudinal dataset from the Six Cities study of the health effects of air pollution and to a seven-variate binary response dataset on the labour supply of married women from the Panel Survey of Income Dynamics.

782 citations


Proceedings Article
24 Jul 1998
TL;DR: This paper extends Structural EM to deal directly with Bayesian model selection, proves the convergence of the resulting algorithm, and shows how to apply it for learning a large class of probabilistic models, including Bayesian networks and some variants thereof.
Abstract: In recent years there has been a flurry of work on learning Bayesian networks from data. One of the hard problems in this area is how to effectively learn the structure of a belief network from incomplete data--that is, in the presence of missing values or hidden variables. In a recent paper, I introduced an algorithm called Structural EM that combines the standard Expectation Maximization (EM) algorithm, which optimizes parameters, with structure search for model selection. That algorithm learns networks based on penalized likelihood scores, which include the BIC/MDL score and various approximations to the Bayesian score. In this paper, I extend Structural EM to deal directly with Bayesian model selection. I prove the convergence of the resulting algorithm and show how to apply it for learning a large class of probabilistic models, including Bayesian networks and some variants thereof.

637 citations


Journal ArticleDOI
TL;DR: A deterministic annealing EM (DAEM) algorithm for maximum likelihood estimation is presented that overcomes the local-maxima problem of the conventional EM algorithm and yields estimates that are largely insensitive to the initial parameter values.

503 citations
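The core of the DAEM idea is a tempered E step: the usual posterior responsibilities are computed with the weighted component densities raised to an inverse temperature beta, annealed from a small value up to 1. A sketch for a one-dimensional Gaussian mixture, with our own names and schedule, might look like this:

```python
import numpy as np

def daem_e_step(x, pi, mu, var, beta):
    """Tempered E step (sketch): responsibilities computed with the
    weighted component densities raised to the power beta. At beta = 1
    this reduces to the ordinary EM E step."""
    log_p = (-0.5 * (x[:, None] - mu) ** 2 / var
             - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
    log_p *= beta
    log_p -= log_p.max(axis=1, keepdims=True)
    r = np.exp(log_p)
    return r / r.sum(axis=1, keepdims=True)

# Typical use: run EM to convergence at each temperature on a schedule
# such as beta = 0.1, 0.3, ..., 1.0, warm-starting from the previous fit.
```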


Journal ArticleDOI
TL;DR: In this paper, the problem of detecting features, such as minefields or seismic faults, in spatial point processes with substantial clutter is addressed by model-based clustering based on a mixture model for the process, in which features are assumed to generate points according to highly linear multivariate normal densities.
Abstract: We consider the problem of detecting features, such as minefields or seismic faults, in spatial point processes when there is substantial clutter. We use model-based clustering based on a mixture model for the process, in which features are assumed to generate points according to highly linear multivariate normal densities, and the clutter arises according to a spatial Poisson process. Nonlinear features are represented by several densities, giving a piecewise linear representation. Hierarchical model-based clustering provides a first estimate of the features, and this is then refined using the EM algorithm. The number of features is estimated from an approximation to its posterior distribution. The method gives good results for the minefield and seismic fault problems. Software to implement it is available on the World Wide Web.

441 citations


Journal ArticleDOI
TL;DR: In this paper, a class of models for an additive decomposition of groups of curves stratified by crossed and nested factors is introduced, and the model parameters are estimated using a highly efficient implementation of the EM algorithm for restricted maximum likelihood (REML) estimation based on a preliminary eigenvector decomposition.
Abstract: We introduce a class of models for an additive decomposition of groups of curves stratified by crossed and nested factors, generalizing smoothing splines to such samples by associating them with a corresponding mixed-effects model. The models are also useful for imputation of missing data and exploratory analysis of variance. We prove that the best linear unbiased predictors (BLUPs) from the extended mixed-effects model correspond to solutions of a generalized penalized regression where smoothing parameters are directly related to variance components, and we show that these solutions are natural cubic splines. The model parameters are estimated using a highly efficient implementation of the EM algorithm for restricted maximum likelihood (REML) estimation based on a preliminary eigenvector decomposition. Variability of computed estimates can be assessed with asymptotic techniques or with a novel hierarchical bootstrap resampling scheme for nested mixed-effects models. Our methods are applied to me...

425 citations


Journal ArticleDOI
01 Dec 1998
TL;DR: A split-and-merge expectation-maximization (SMEM) algorithm is proposed to overcome the local maxima problem in parameter estimation of finite mixture models; it is applied to the training of Gaussian mixtures and mixtures of factor analyzers, and its practical usefulness is demonstrated on image compression and pattern recognition problems.
Abstract: We present a split-and-merge expectation-maximization (SMEM) algorithm to overcome the local maxima problem in parameter estimation of finite mixture models. In the case of mixture models, local maxima often involve having too many components of a mixture model in one part of the space and too few in another, widely separated part of the space. To escape from such configurations, we repeatedly perform simultaneous split-and-merge operations using a new criterion for efficiently selecting the split-and-merge candidates. We apply the proposed algorithm to the training of Gaussian mixtures and mixtures of factor analyzers using synthetic and real data and show the effectiveness of using the split-and-merge operations to improve the likelihood of both the training data and of held-out test data. We also show the practical usefulness of the proposed algorithm by applying it to image compression and pattern recognition problems.

422 citations
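One natural reading of the merge criterion is that two components whose posterior-responsibility vectors over the data are strongly correlated are probably modeling the same cluster. A sketch of that candidate-ranking step, under our own names and with the split criterion and the partial-EM acceptance test omitted:

```python
import numpy as np

def merge_candidates(r):
    """Rank component pairs for a merge move: pairs whose posterior-
    responsibility vectors over the data have a large inner product are
    plausibly modeling the same cluster. r has shape (n_points, k)."""
    k = r.shape[1]
    scores = {(i, j): float(r[:, i] @ r[:, j])
              for i in range(k) for j in range(i + 1, k)}
    return sorted(scores, key=scores.get, reverse=True)
```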


Journal ArticleDOI
TL;DR: This parameter-expanded EM (PX-EM) algorithm shares the simplicity and stability of ordinary EM, but has a faster rate of convergence since its M step performs a more efficient analysis.
Abstract: The EM algorithm and its extensions are popular tools for modal estimation but are often criticised for their slow convergence. We propose a new method that can often make EM much faster. The intuitive idea is to use a 'covariance adjustment' to correct the analysis of the M step, capitalising on extra information captured in the imputed complete data. The way we accomplish this is by parameter expansion; we expand the complete-data model while preserving the observed-data model and use the expanded complete-data model to generate EM. This parameter-expanded EM (PX-EM) algorithm shares the simplicity and stability of ordinary EM, but has a faster rate of convergence since its M step performs a more efficient analysis. The PX-EM algorithm is illustrated for the multivariate t distribution, a random effects model, factor analysis, probit regression and a Poisson imaging model.

393 citations
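The flavor of PX-EM is easiest to see in the univariate t case: the expansion leaves the E step unchanged but rescales the M step for the variance, so only one line differs from ordinary EM. The sketch below assumes known degrees of freedom; the function name and iteration count are ours.

```python
import numpy as np

def t_location_scale_pxem(x, nu, n_iter=50):
    """PX-EM sketch for the location/scale of a univariate t-distribution
    with known degrees of freedom nu. Only the scale update differs from
    ordinary EM: divide the weighted sum of squares by the total weight
    rather than by n."""
    mu, s2 = x.mean(), x.var()
    for _ in range(n_iter):
        # E step: expected precision weights under the current fit.
        w = (nu + 1.0) / (nu + (x - mu) ** 2 / s2)
        # M step with the PX-EM covariance adjustment.
        mu = (w * x).sum() / w.sum()
        s2 = (w * (x - mu) ** 2).sum() / w.sum()   # ordinary EM would use / len(x)
    return mu, s2
```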


Journal ArticleDOI
TL;DR: In this paper, a two-parameter distribution with decreasing failure rate is introduced and its properties are discussed. Estimation of the parameters by the method of maximum likelihood is studied; the estimates are obtained via the EM algorithm, and expressions for their asymptotic variances and covariances are derived.

Journal ArticleDOI
TL;DR: This paper formulates a corresponding expectation-maximization (EM) algorithm, as well as a method for estimating noise properties at the ML estimate, for an idealized two-dimensional positron emission tomography (2-D PET) detector.
Abstract: Using a theory of list-mode maximum-likelihood (ML) source reconstruction presented recently by Barrett et al. (1997), this paper formulates a corresponding expectation-maximization (EM) algorithm, as well as a method for estimating noise properties at the ML estimate. List-mode ML is of interest in cases where the dimensionality of the measurement space impedes a binning of the measurement data. It can be advantageous in cases where a better forward model can be obtained by including more measurement coordinates provided by a given detector. Different figures of merit for the detector performance can be computed from the Fisher information matrix (FIM). This paper uses the observed FIM, which requires a single data set, thus avoiding costly ensemble statistics. The proposed techniques are demonstrated for an idealized two-dimensional (2-D) positron emission tomography (PET) detector. The authors compute from simulation data the improved image quality obtained by including the time of flight of the coincident quanta.
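For orientation, the list-mode EM update has the same multiplicative form as the ordinary EM-ML update, except that the sum runs over recorded events rather than projection bins, and a separately computed sensitivity image normalizes the result. A sketch under our own notation (a dense event-response matrix, which a real system would never materialize):

```python
import numpy as np

def list_mode_em_update(lam, P_events, sens):
    """One list-mode EM-ML update (sketch). P_events[e, j]: probability
    that an emission from voxel j yields recorded event e (one row per
    detected event); sens[j]: total detection sensitivity of voxel j,
    summed over all possible events. Both come from the system model."""
    proj = P_events @ lam                # expected intensity for each event
    back = P_events.T @ (1.0 / proj)     # backproject the event ratios
    return lam * back / sens
```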

Journal ArticleDOI
TL;DR: McLachlan and Krishnan as discussed by the authors presented a unified account of the theory, methodology, and applications of the Expectation-Maximization (EM) algorithm and its extensions, and illustrated applications in many statistical contexts.
Abstract: The first unified account of the theory, methodology, and applications of the EM algorithm and its extensions. Since its inception in 1977, the Expectation-Maximization (EM) algorithm has been the subject of intense scrutiny, dozens of applications, numerous extensions, and thousands of publications. The algorithm and its extensions are now standard tools applied to incomplete data problems in virtually every field in which statistical methods are used. Until now, however, no single source offered a complete and unified treatment of the subject. The EM Algorithm and Extensions describes the formulation of the EM algorithm, details its methodology, discusses its implementation, and illustrates applications in many statistical contexts. Employing numerous examples, Geoffrey McLachlan and Thriyambakam Krishnan examine applications both in evidently incomplete data situations, where data are missing, distributions are truncated, or observations are censored or grouped, and in a broad variety of situations in which incompleteness is neither natural nor evident. They point out the algorithm's shortcomings and explain how these are addressed in the various extensions. Areas of application discussed include regression, medical imaging, categorical data analysis, finite mixture analysis, factor analysis, robust statistical modeling, variance-components estimation, survival analysis, and repeated-measures designs. For theoreticians, practitioners, and graduate students in statistics as well as researchers in the social and physical sciences, The EM Algorithm and Extensions opens the door to the tremendous potential of this remarkably versatile statistical tool.

Proceedings ArticleDOI
TL;DR: Experimental results show that the estimated Gaussian mixture model fits skin images from a large database and applications of the estimated density function in image and video databases are presented.
Abstract: This paper is concerned with estimating a probability density function of human skin color, using a finite Gaussian mixture model, whose parameters are estimated through the EM algorithm. Hawkins' statistical test on the normality and homoscedasticity (common covariance matrix) of the estimated Gaussian mixture models is performed and McLachlan's bootstrap method is used to test the number of components in a mixture. Experimental results show that the estimated Gaussian mixture model fits skin images from a large database. Applications of the estimated density function in image and video databases are presented.
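As a present-day illustration of the estimation step only (the paper's normality tests and bootstrap test for the number of components are separate procedures not shown), a finite Gaussian mixture can be fitted to pixel color vectors by EM with a few lines of scikit-learn; the random array here merely stands in for real skin-pixel data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
skin_pixels = rng.random((5000, 2))            # stand-in for chrominance values
gmm = GaussianMixture(n_components=4, covariance_type="full").fit(skin_pixels)
log_density = gmm.score_samples(rng.random((10, 2)))   # log p(pixel | skin model)
```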

Proceedings Article
01 Dec 1998
TL;DR: A generalization of the EM algorithm for parameter estimation in nonlinear dynamical systems is presented; if Gaussian radial basis function (RBF) approximators are used to model the nonlinearities, the integrals become tractable and the maximization step can be solved via systems of linear equations.
Abstract: The Expectation-Maximization (EM) algorithm is an iterative procedure for maximum likelihood parameter estimation from data sets with missing or hidden variables [2]. It has been applied to system identification in linear stochastic state-space models, where the state variables are hidden from the observer and both the state and the parameters of the model have to be estimated simultaneously [9]. We present a generalization of the EM algorithm for parameter estimation in nonlinear dynamical systems. The "expectation" step makes use of Extended Kalman Smoothing to estimate the state, while the "maximization" step re-estimates the parameters using these uncertain state estimates. In general, the nonlinear maximization step is difficult because it requires integrating out the uncertainty in the states. However, if Gaussian radial basis function (RBF) approximators are used to model the nonlinearities, the integrals become tractable and the maximization step can be solved via systems of linear equations.

Journal Article
TL;DR: In this paper, a probabilistic framework for learning models of temporal data is presented, which uses the Bayesian network formalism, a marriage of probability theory and graph theory in which dependencies between variables are expressed graphically.
Abstract: This paper presents a probabilistic framework for learning models of temporal data. We express these models using the Bayesian network formalism, a marriage of probability theory and graph theory in which dependencies between variables are expressed graphically. The graph not only allows the user to understand which variables affect which other ones, but also serves as the backbone for efficiently computing marginal and conditional probabilities that may be required for inference and learning.

Journal ArticleDOI
TL;DR: A new approach to matching geometric structure in 2D point-sets is described to unify the tasks of estimating transformation geometry and identifying point-correspondence matches by constructing a mixture model over the bipartite graph representing the correspondence match.
Abstract: This paper describes a new approach to matching geometric structure in 2D point-sets. The novel feature is to unify the tasks of estimating transformation geometry and identifying point-correspondence matches. Unification is realized by constructing a mixture model over the bipartite graph representing the correspondence match and by effecting optimization using the EM algorithm. According to our EM framework, the probabilities of structural correspondence gate contributions to the expected likelihood function used to estimate maximum likelihood transformation parameters. These gating probabilities measure the consistency of the matched neighborhoods in the graphs. The recovery of transformational geometry and hard correspondence matches are interleaved and are realized by applying coupled update operations to the expected log-likelihood function. We evaluate the technique on two real-world problems.
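A stripped-down sketch conveys the interleaving: the E step computes soft correspondence probabilities from current alignment residuals, and the M step fits transformation parameters to the expected matches. This generic version omits the paper's structural gating from the bipartite correspondence graph, and the affine map and fixed kernel width are our own choices.

```python
import numpy as np

def em_affine_match(X, Y, n_iter=30, sigma2=1.0):
    """Generic EM point matching (sketch). E step: soft correspondences
    between transformed model points X and data points Y under an
    isotropic Gaussian of width sigma2. M step: least-squares affine fit
    to the expected matches. X: (m, 2), Y: (n, 2)."""
    Xh = np.hstack([X, np.ones((len(X), 1))])   # homogeneous coordinates
    A = np.eye(3)[:2]                           # 2x3 affine map, identity start
    for _ in range(n_iter):
        TX = Xh @ A.T                           # transformed model points
        d2 = ((TX[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
        R = np.exp(-0.5 * d2 / sigma2)
        R /= R.sum(axis=1, keepdims=True)       # P(y_j | x_i)
        target = R @ Y                          # expected match for each x_i
        M, *_ = np.linalg.lstsq(Xh, target, rcond=None)
        A = M.T
    return A
```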

Journal ArticleDOI
TL;DR: This paper addresses the problem of building large-scale geometric maps of indoor environments with mobile robots by posing the map building problem as a constrained, probabilistic maximum-likelihood...
Abstract: This paper addresses the problem of building large-scale geometric maps of indoor environments with mobile robots. It poses the map building problem as a constrained, probabilistic maximum-likelihood...

Journal ArticleDOI
TL;DR: A spatially variant finite mixture model is proposed for pixel labeling and image segmentation and an expectation-maximization (EM) algorithm is derived for maximum likelihood estimation of the pixel labels and the parameters of the mixture densities.
Abstract: A spatially variant finite mixture model is proposed for pixel labeling and image segmentation. For the case of spatially varying mixtures of Gaussian density functions with unknown means and variances, an expectation-maximization (EM) algorithm is derived for maximum likelihood estimation of the pixel labels and the parameters of the mixture densities. An a priori density function is formulated for the spatially variant mixture weights. A generalized EM algorithm for maximum a posteriori estimation of the pixel labels based upon these prior densities is derived. This algorithm incorporates a variation of gradient projection in the maximization step and the resulting algorithm takes the form of grouped coordinate ascent. Gaussian densities have been used for simplicity, but the algorithm can easily be modified to incorporate other appropriate models for the mixture model component densities. The accuracy of the algorithm is quantitatively evaluated through Monte Carlo simulation, and its performance is qualitatively assessed via experimental images from computerized tomography (CT) and magnetic resonance imaging (MRI).

Journal ArticleDOI
TL;DR: In this article, a unified approach to selecting a bandwidth and constructing confidence intervals in local maximum likelihood estimation is presented, which is then applied to least squares nonparametric regression and to non-parametric logistic regression.
Abstract: Local maximum likelihood estimation is a nonparametric counterpart of the widely used parametric maximum likelihood technique. It extends the scope of the parametric maximum likelihood method to a much wider class of parametric spaces. Associated with this nonparametric estimation scheme is the issue of bandwidth selection and bias and variance assessment. This paper provides a unified approach to selecting a bandwidth and constructing confidence intervals in local maximum likelihood estimation. The approach is then applied to least squares nonparametric regression and to nonparametric logistic regression. Our experiences in these two settings show that the general idea outlined here is powerful and encouraging.

Journal ArticleDOI
TL;DR: A computationally efficient scheme is proposed for estimating the parameters of a general class of MRF models, both directly and when the true image is unavailable, together with specific methods of parameter estimation for the MRF model known as the generalized Gaussian MRF (GGMRF).
Abstract: Markov random fields (MRFs) have been widely used to model images in Bayesian frameworks for image reconstruction and restoration. Typically, these MRF models have parameters that allow the prior model to be adjusted for best performance. However, optimal estimation of these parameters (sometimes referred to as hyperparameters) is difficult in practice for two reasons: (i) direct parameter estimation for MRFs is known to be mathematically and numerically challenging; (ii) parameters cannot be directly estimated because the true image cross section is unavailable. We propose a computationally efficient scheme to address both these difficulties for a general class of MRF models, and we derive specific methods of parameter estimation for the MRF model known as the generalized Gaussian MRF (GGMRF). We derive methods of direct estimation of scale and shape parameters for a general continuously valued MRF. For the GGMRF case, we show that the ML estimate of the scale parameter, σ, has a simple closed-form solution, and we present an efficient scheme for computing the ML estimate of the shape parameter, p, by an off-line numerical computation of the dependence of the partition function on p. We present a fast algorithm for computing ML parameter estimates when the true image is unavailable. To do this, we use the expectation maximization (EM) algorithm. We develop a fast simulation method to replace the E-step, and a method to improve the parameter estimates when the simulations are terminated prior to convergence. Experimental results indicate that our fast algorithms substantially reduce the computation and result in good scale estimates for real tomographic data sets.

Journal ArticleDOI
TL;DR: A fast accurate iterative reconstruction (FAIR) method suitable for low-statistics positron volume imaging has been developed and is shown to offer improved resolution, contrast and noise properties as a direct result of using improved spatial sampling, limited only by hardware specifications.
Abstract: A fast accurate iterative reconstruction (FAIR) method suitable for low-statistics positron volume imaging has been developed. The method, based on the expectation maximization-maximum likelihood (EM-ML) technique, operates on list-mode data rather than histogrammed projection data and can, in just one pass through the data, generate images with the same characteristics as several ML iterations. Use of list-mode data preserves maximum sampling accuracy and implicitly ignores lines of response (LORs) in which no counts were recorded. The method is particularly suited to systems where sampling accuracy can be lost by histogramming events into coarse LOR bins, and also to sparse data situations such as fast whole-body and dynamic imaging where sampling accuracy may be compromised by storage requirements and where reconstruction time can be wasted by including LORs with no counts. The technique can be accelerated by operating on subsets of list-mode data which also allows scope for simultaneous data acquisition and iterative reconstruction. The method is compared with a standard implementation of the EM-ML technique and is shown to offer improved resolution, contrast and noise properties as a direct result of using improved spatial sampling, limited only by hardware specifications.

Journal ArticleDOI
TL;DR: In this paper an exposition is given of likelihood based frequentist inference that shows in particular which aspects of such inference cannot be separated from consideration of the missing value mechanism.
Abstract: One of the most often quoted results from the original work of Rubin and Little on the classification of missing value processes is the validity of likelihood based inferences under missing at random (MAR) mechanisms. Although the sense in which this result holds was precisely defined by Rubin, and explored by him in later work, it appears to be now used by some authors in a general and rather imprecise way, particularly with respect to the use of frequentist modes of inference. In this paper an exposition is given of likelihood based frequentist inference under an MAR mechanism that shows in particular which aspects of such inference cannot be separated from consideration of the missing value mechanism. The development is illustrated with three simple setups: a bivariate binary outcome, a bivariate Gaussian outcome and a two-stage sequential procedure with Gaussian outcome and with real longitudinal examples, involving both categorical and continuous outcomes. In particular, it is shown that the classical expected information matrix is biased and the use of the observed information matrix is recommended.

Book ChapterDOI
TL;DR: In this paper, a more robust approach is proposed to fit mixtures of multivariate t-distributions, which have longer tails than the normal components, using the expectation-maximization (EM) algorithm.
Abstract: Normal mixture models are being increasingly used as a way of clustering sets of continuous multivariate data. They provide a probabilistic (soft) clustering of the data in terms of their fitted posterior probabilities of membership of the mixture components corresponding to the clusters. An outright (hard) clustering can be subsequently obtained by assigning each observation to the component to which it has the highest fitted posterior probability of belonging. However, outliers in the data can affect the estimates of the parameters in the normal component densities, and hence the implied clustering. A more robust approach is to fit mixtures of multivariate t-distributions, which have longer tails than the normal components. The expectation-maximization (EM) algorithm can be used to fit mixtures of t-distributions by maximum likelihood. The application of this model to provide a robust approach to clustering is illustrated on a real data set. It is demonstrated how the use of t-components provides less extreme estimates of the posterior probabilities of cluster membership.

Proceedings Article
01 Dec 1998
TL;DR: This paper proposes an annealed version of the standard EM algorithm for model fitting which is empirically evaluated on a variety of data sets from different domains.
Abstract: Dyadic data refers to a domain with two finite sets of objects in which observations are made for dyads, i.e., pairs with one element from either set. This type of data arises naturally in many applications ranging from computational linguistics and information retrieval to preference analysis and computer vision. In this paper, we present a systematic, domain-independent framework of learning from dyadic data by statistical mixture models. Our approach covers different models with flat and hierarchical latent class structures. We propose an annealed version of the standard EM algorithm for model fitting which is empirically evaluated on a variety of data sets from different domains.
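For the flat latent-class case, the annealed ("tempered") E step simply raises the likelihood factors to a power beta < 1 before normalizing, which smooths the class posteriors. A sketch for a matrix of dyadic counts, with our own variable names and a fixed beta rather than a full annealing schedule:

```python
import numpy as np

def tempered_em_aspect(N, k, beta=0.8, n_iter=200, seed=0):
    """Tempered EM sketch for a flat latent-class model of dyadic counts:
    P(x, y) = sum_a P(a) P(x|a) P(y|a). The E step raises the likelihood
    part to the power beta, smoothing the class posteriors."""
    rng = np.random.default_rng(seed)
    nx, ny = N.shape
    Pa = np.full(k, 1.0 / k)
    Px = rng.dirichlet(np.ones(nx), size=k)     # P(x|a), shape (k, nx)
    Py = rng.dirichlet(np.ones(ny), size=k)     # P(y|a), shape (k, ny)
    for _ in range(n_iter):
        # E step: q[a, x, y] proportional to P(a) * (P(x|a) P(y|a)) ** beta.
        q = Pa[:, None, None] * (Px[:, :, None] * Py[:, None, :]) ** beta
        q /= q.sum(axis=0, keepdims=True)
        # M step: re-estimate from expected counts.
        C = q * N[None, :, :]
        Pa = C.sum(axis=(1, 2)); Pa /= Pa.sum()
        Px = C.sum(axis=2); Px /= Px.sum(axis=1, keepdims=True)
        Py = C.sum(axis=1); Py /= Py.sum(axis=1, keepdims=True)
    return Pa, Px, Py
```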

Journal ArticleDOI
TL;DR: An algorithm is suggested and shown to converge to the solution of the minimization problem; a simulation study shows its superiority over the EM algorithm in the interval censoring case 2 setting.
Abstract: The problem of minimizing a smooth convex function over a specific cone in R^n is frequently encountered in nonparametric statistics. For that type of problem we suggest an algorithm and show that this algorithm converges to the solution of the minimization problem. Moreover, a simulation study is presented, showing the superiority of our algorithm compared to the EM algorithm in the interval censoring case 2 setting.

Journal ArticleDOI
TL;DR: In this paper, a finite mixed Poisson regression model with covariates in both Poisson rates and mixing probabilities is used to analyze the relationship between patents and research and development spending at the firm level.
Abstract: Count-data models are used to analyze the relationship between patents and research and development spending at the firm level, accounting for overdispersion using a finite mixed Poisson regression model with covariates in both Poisson rates and mixing probabilities. Maximum likelihood estimation using the EM and quasi-Newton algorithms is discussed. Monte Carlo studies suggest that (a) penalized likelihood criteria are a reliable basis for model selection and can be used to determine whether continuous or finite support for the mixing distribution is more appropriate and (b) when the mixing distribution is incorrectly specified, parameter estimates remain unbiased but have inflated variances.

Journal ArticleDOI
TL;DR: In this article, a state space model for long-range dependent data is developed and the exact likelihood function can be computed recursively in a finite number of steps using the Kalman filter, and an approximation to the likelihood function based on truncated state space equation is considered.
Abstract: This paper develops a state space model for long-range dependent data. Although a long-range dependent process has an infinite-dimensional state space representation, it is shown that by using the Kalman filter, the exact likelihood function can be computed recursively in a finite number of steps. Furthermore, an approximation to the likelihood function based on the truncated state space equation is considered. Asymptotic properties of these approximate maximum likelihood estimates are established for a class of long-range dependent models, namely, the fractional autoregressive moving average models. Simulation studies show the rapid convergence of the approximate maximum likelihood approach.

Journal ArticleDOI
TL;DR: This note compares two choices of basis for models parameterized by probabilities, showing that it is possible to improve on the traditional choice, the probability simplex, by transforming to the 'softmax' basis.
Abstract: Maximum a posteriori optimization of parameters and the Laplace approximation for the marginal likelihood are both basis-dependent methods. This note compares two choices of basis for models parameterized by probabilities, showing that it is possible to improve on the traditional choice, the probability simplex, by transforming to the ‘softmax’ basis.

Proceedings ArticleDOI
08 Nov 1998
TL;DR: Reconstructions of computer simulations and patient data show that the proposed iterative Bayesian reconstruction algorithm has the capacity to smooth noise and maintain sharp edges without introducing over- and undershoots or ripples around the edges.
Abstract: An iterative Bayesian reconstruction algorithm based on the total variation (TV) norm constraint is proposed. The motivation for using TV regularization is that it is extremely effective for recovering edges of images. The TV norm minimization, introduced in 1992, was shown to be effective for restoring blurred images with a Gaussian noise model and was demonstrated to be effective for noise suppression and edge preservation. In that approach, the images were diffused according to a set of nonlinear anisotropic diffusion partial differential equations, which suffered from computational difficulties. This paper extends the TV norm minimization constraint to the field of SPECT image reconstruction with a Poisson noise model. The regularization norm is included in the ML-EM (maximum likelihood expectation maximization) algorithm. The partial differential equation approach is not utilized here. Reconstructions of computer simulations and patient data show that the proposed algorithm has the capacity to smooth the noise and maintain sharp edges without introducing over- and undershoots or ripples around the edges.
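The abstract does not spell out how the TV norm enters the ML-EM iteration; one common construction for folding a smoothness penalty into EM-ML is Green's one-step-late (OSL) rule, sketched below under our own notation (a system matrix A acting on flattened images, measured counts y, sensitivity sens = A^T 1). This illustrates the general idea and is not necessarily the paper's exact scheme.

```python
import numpy as np

def tv_grad(img, eps=1e-3):
    """Gradient of a smoothed TV norm sum(sqrt(|grad u|^2 + eps))."""
    dx = np.diff(img, axis=0, append=img[-1:, :])   # forward differences,
    dy = np.diff(img, axis=1, append=img[:, -1:])   # replicated boundary
    mag = np.sqrt(dx ** 2 + dy ** 2 + eps)
    px, py = dx / mag, dy / mag
    # Adjoint of the forward difference applied to the normalized gradient.
    gx = np.vstack([np.zeros((1, img.shape[1])), px[:-1]]) - px
    gy = np.hstack([np.zeros((img.shape[0], 1)), py[:, :-1]]) - py
    return gx + gy

def osl_em_tv_update(lam, A, y, sens, beta):
    """One MAP-EM update with a TV penalty via the one-step-late rule:
    the prior gradient, evaluated at the current image, augments the
    sensitivity term in the denominator."""
    proj = A @ lam.ravel()                 # forward projection
    back = A.T @ (y / proj)                # backprojected measurement ratios
    denom = sens + beta * tv_grad(lam).ravel()
    return (lam.ravel() * back / denom).reshape(lam.shape)
```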