
Showing papers on "Maximum a posteriori estimation published in 2006"


Journal ArticleDOI
TL;DR: It is found that on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully.
Abstract: Learning visual models of object categories notoriously requires hundreds or thousands of training examples. We show that it is possible to learn much information about a category from just one, or a handful, of images. The key insight is that, rather than learning from scratch, one can take advantage of knowledge coming from previously learned categories, no matter how different these categories might be. We explore a Bayesian implementation of this idea. Object categories are represented by probabilistic models. Prior knowledge is represented as a probability density function on the parameters of these models. The posterior model for an object category is obtained by updating the prior in the light of one or more observations. We test a simple implementation of our algorithm on a database of 101 diverse object categories. We compare category models learned by an implementation of our Bayesian approach to models learned by maximum likelihood (ML) and maximum a posteriori (MAP) methods. We find that on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully.
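As a hedged illustration of why a prior helps in the few-example regime (a minimal toy sketch with made-up numbers, not the paper's object-category model): estimating a Gaussian feature mean from a single example, ML uses only that example, while MAP shrinks it toward a prior mean that could have been distilled from previously learned categories.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a scalar appearance feature for a new category.
true_mean = 2.0                    # unknown quantity we want to recover
noise_sd = 1.0                     # observation noise
prior_mean, prior_sd = 1.5, 0.5    # prior assumed to come from previously learned categories

x = rng.normal(true_mean, noise_sd, size=1)   # a single training example

# ML estimate: just the sample mean of the (single) observation.
ml_estimate = x.mean()

# MAP estimate for a Gaussian likelihood with a conjugate Gaussian prior:
# posterior precision is the sum of prior and data precisions.
n = len(x)
post_prec = 1.0 / prior_sd**2 + n / noise_sd**2
map_estimate = (prior_mean / prior_sd**2 + x.sum() / noise_sd**2) / post_prec

print(f"ML  estimate from one example: {ml_estimate:.2f}")
print(f"MAP estimate from one example: {map_estimate:.2f} (shrunk toward prior {prior_mean})")
```

With more training examples the data term dominates and the two estimates converge, which is consistent with the paper's observation that the Bayesian approach matters most when examples are scarce.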

2,976 citations


Journal ArticleDOI
TL;DR: Using simulated datasets, the Bayesian method generally fares better than the ML approach in accuracy and coverage, although for some values the two approaches are equal in performance.
Abstract: Comparison of the performance and accuracy of different inference methods, such as maximum likelihood (ML) and Bayesian inference, is difficult because the inference methods are implemented in different programs, often written by different authors. Both methods were implemented in the program MIGRATE, which estimates population genetic parameters, such as population sizes and migration rates, using coalescence theory. Both inference methods use the same Markov chain Monte Carlo algorithm and differ from each other in only two aspects: parameter proposal distribution and maximization of the likelihood function. Using simulated datasets, the Bayesian method generally fares better than the ML approach in accuracy and coverage, although for some values the two approaches are equal in performance. Motivation: The Markov chain Monte Carlo-based ML framework can fail on sparse data and can deliver non-conservative support intervals. A Bayesian framework with an appropriate prior distribution is able to remedy some of these problems. Results: The program MIGRATE was extended to allow not only for maximum likelihood (ML) estimation of population genetics parameters but also for the use of a Bayesian framework. Comparisons between the Bayesian approach and the ML approach are facilitated because both modes estimate the same parameters under the same population model and assumptions. Availability: The program is available from http://popgen.csit.fsu.edu/ Contact: beerli@csit.fsu.edu

811 citations


Journal ArticleDOI
TL;DR: A fast and robust hybrid method of super-resolution and demosaicing is proposed, based on a maximum a posteriori estimation technique that minimizes a multiterm cost function.
Abstract: In the last two decades, two related categories of problems have been studied independently in the image restoration literature: super-resolution and demosaicing. A closer look at these problems reveals the relation between them, and, as conventional color digital cameras suffer from both low spatial resolution and color filtering, it is reasonable to address them in a unified context. In this paper, we propose a fast and robust hybrid method of super-resolution and demosaicing, based on a maximum a posteriori estimation technique that minimizes a multiterm cost function. The L1 norm is used for measuring the difference between the projected estimate of the high-resolution image and each low-resolution image, removing outliers in the data and errors due to possibly inaccurate motion estimation. Bilateral regularization is used for spatially regularizing the luminance component, resulting in sharp edges and forcing interpolation along the edges and not across them. Simultaneously, Tikhonov regularization is used to smooth the chrominance components. Finally, an additional regularization term is used to force similar edge location and orientation in different color channels. We show that the minimization of the total cost function is relatively easy and fast. Experimental results on synthetic and real data sets confirm the effectiveness of our method.
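A hedged sketch of the kind of multiterm cost function described above (notation assumed here, not taken from the paper): an L1 data-fidelity term over the low-resolution frames plus bilateral, Tikhonov, and inter-channel regularizers.

```latex
\hat{\mathbf{X}} = \arg\min_{\mathbf{X}}
\sum_{k} \bigl\| D_k H_k F_k \mathbf{X} - \mathbf{Y}_k \bigr\|_{1}
+ \lambda_{1}\, \Psi_{\mathrm{bilateral}}(\mathbf{X}_{L})
+ \lambda_{2}\, \bigl\| \Lambda \mathbf{X}_{C} \bigr\|_{2}^{2}
+ \lambda_{3}\, \Psi_{\mathrm{edge}}(\mathbf{X}_{R}, \mathbf{X}_{G}, \mathbf{X}_{B})
```

Here F_k, H_k, and D_k would denote warp, blur, and decimation operators for the k-th low-resolution frame Y_k, X_L and X_C the luminance and chrominance components of the high-resolution estimate, and Λ a high-pass (Tikhonov) operator.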

459 citations


Journal ArticleDOI
TL;DR: It is shown that the maximum likelihood estimator (MLE) using only LOS estimates and the maximum a posteriori probability (MAP) estimator using both LOS and NLOS data can asymptotically achieve the CRLB and the G-CRLB, respectively.
Abstract: We present an analysis of the time-of-arrival (TOA), time-difference-of-arrival (TDOA), angle-of-arrival (AOA) and signal strength (SS) based positioning methods in a non-line-of-sight (NLOS) environment. Single path (line-of-sight (LOS) or NLOS) propagation is assumed. The best geolocation accuracy is evaluated in terms of the Cramer-Rao lower bound (CRLB) or the generalized CRLB (G-CRLB), depending on whether prior statistics of NLOS induced errors are unavailable or available. We then show that the maximum likelihood estimator (MLE) using only LOS estimates and the maximum a posteriori probability (MAP) estimator using both LOS and NLOS data can asymptotically achieve the CRLB and the G-CRLB, respectively. Hybrid schemes that adopt more than one type of position-pertaining data and the relationship among the four methods in terms of their positioning accuracy are also investigated.

428 citations


Journal ArticleDOI
TL;DR: The methodology is based on the maximum a posteriori estimate, which mathematically requires the minimization of the difference between observed spectral radiances and a nonlinear model of radiative transfer of the atmospheric state subject to the constraint that the estimated state must be consistent with an a priori probability distribution for that state.
Abstract: We describe the approach for the estimation of the atmospheric state, e.g., temperature, water, ozone, from calibrated spectral radiances measured by the Tropospheric Emission Spectrometer (TES) onboard the Aura spacecraft. The methodology is based on the maximum a posteriori estimate, which mathematically requires the minimization of the difference between observed spectral radiances and a nonlinear model of radiative transfer of the atmospheric state, subject to the constraint that the estimated state must be consistent with an a priori probability distribution for that state. The minimization techniques employed here are based on the trust-region Levenberg-Marquardt algorithm. An analysis of the errors for this estimate includes smoothing, random, spectroscopic, "cross-state", representation, and systematic errors. In addition, several metrics and diagnostics are introduced that assess the resolution, quality, and statistical significance of the retrievals. We illustrate this methodology for the retrieval of atmospheric and surface temperature, water vapor, and ozone over the Gulf of Mexico on November 3, 2004.
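A hedged sketch of the type of cost function such a maximum a posteriori retrieval minimizes (standard notation assumed here, not copied from the paper):

```latex
C(\mathbf{x}) = \bigl[\mathbf{y} - \mathbf{F}(\mathbf{x})\bigr]^{T} \mathbf{S}_{e}^{-1} \bigl[\mathbf{y} - \mathbf{F}(\mathbf{x})\bigr]
+ \bigl(\mathbf{x} - \mathbf{x}_{a}\bigr)^{T} \mathbf{S}_{a}^{-1} \bigl(\mathbf{x} - \mathbf{x}_{a}\bigr)
```

with y the observed spectral radiances, F the nonlinear radiative transfer model, x_a and S_a the a priori mean state and covariance, and S_e the measurement error covariance; a trust-region Levenberg-Marquardt iteration would then minimize C(x).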

267 citations


Journal ArticleDOI
TL;DR: In this paper, a flexible, non-parametric inversion method for the interpretation of the integrated light spectra of galaxies is described, based on synthetic spectra of single stellar populations (SSPs).
Abstract: In this paper we describe STECMAP (STEllar Content via Maximum A Posteriori), a flexible, non-parametric inversion method for the interpretation of the integrated light spectra of galaxies, based on synthetic spectra of single stellar populations (SSPs). We focus on the recovery of a galaxy's star formation history and stellar age–metallicity relation. We use the high-resolution SSPs produced by pegase-hr to quantify the informational content of the wavelength range λλ = 4000–6800 Å. Regularization of the inversion is achieved by requiring that the solutions are relatively smooth functions of age. The smoothness parameter is set automatically via generalized cross validation. A detailed investigation of the properties of the corresponding simplified linear problem is performed using singular value decomposition. It turns out to be a powerful tool for explaining and predicting the behaviour of the inversion, and may help in designing SSP models in the future. We provide means of quantifying the fundamental limitations of the problem, considering the intrinsic properties of the SSPs in the spectral range of interest, as well as the noise in these models and in the data. We demonstrate that the information relative to the stellar content is relatively evenly distributed within the optical spectrum. We show that one should not attempt to recover more than about eight characteristic episodes in the star formation history from the wavelength domain we consider. STECMAP preserves optimal (in the cross-validation sense) freedom in the characterization of these episodes for each spectrum. We performed a systematic simulation campaign and found that, when the time elapsed between two bursts of star formation is larger than 0.8 dex, the properties of each episode can be constrained with a precision of 0.02 dex in age and 0.04 dex in metallicity from high-quality data [R = 10 000, signal-to-noise ratio (SNR) = 100 per pixel], not taking model errors into account. We also found that the spectral resolution has little effect on population separation provided low- and high-resolution experiments are performed with the same SNR per Å. However, higher spectral resolution does improve the accuracy of metallicity and age estimates in double-burst separation experiments. When the fluxes of the data are properly calibrated, extinction can be estimated; otherwise the continuum can be discarded or used to estimate flux correction factors. The described methods and error estimates will be useful in the design and in the analysis of extragalactic spectroscopic surveys.
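A hedged sketch of the simplified linear problem underlying such a smoothness-regularized inversion, with generalized cross validation (GCV) selecting the smoothing parameter (notation assumed here, not taken from the paper):

```latex
\hat{\mathbf{x}}(\mu) = \arg\min_{\mathbf{x}} \bigl\| \mathbf{y} - \mathbf{B}\mathbf{x} \bigr\|^{2} + \mu \bigl\| \mathbf{L}\mathbf{x} \bigr\|^{2},
\qquad
\mathrm{GCV}(\mu) = \frac{N \bigl\| \bigl(\mathbf{I} - \mathbf{A}(\mu)\bigr)\mathbf{y} \bigr\|^{2}}{\bigl[\operatorname{tr}\bigl(\mathbf{I} - \mathbf{A}(\mu)\bigr)\bigr]^{2}}
```

where B maps the discretized star formation history x onto a model spectrum through the SSP basis, L penalizes roughness in age, and A(μ) = B(BᵀB + μLᵀL)⁻¹Bᵀ is the influence matrix of the regularized problem; a singular value decomposition of this linear system is the kind of analysis the paper uses to predict the behaviour of the inversion.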

254 citations


Journal ArticleDOI
TL;DR: The asymptotic distribution of the maximum likelihood estimator of R is obtained, from which a confidence interval for R can be constructed; two bootstrap confidence intervals are also proposed.
Abstract: This paper deals with the estimation of R = P[Y < X]. The asymptotic distribution of the maximum likelihood estimator of R is obtained and used to construct a confidence interval for R; two bootstrap confidence intervals are also proposed.

226 citations


Book ChapterDOI
01 Oct 2006
TL;DR: A strategy for filtering diffusion tensor magnetic resonance images that accounts for Rician noise through a data likelihood term that is combined with a spatial smoothing prior and compares favorably with several other approaches from the literature.
Abstract: Rician noise introduces a bias into MRI measurements that can have a significant impact on the shapes and orientations of tensors in diffusion tensor magnetic resonance images. This is less of a problem in structural MRI, because this bias is signal dependent and it does not seriously impair tissue identification or clinical diagnoses. However, diffusion imaging is used extensively for quantitative evaluations, and the tensors used in those evaluations are biased in ways that depend on orientation and signal levels. This paper presents a strategy for filtering diffusion tensor magnetic resonance images that addresses these issues. The method is a maximum a posteriori estimation technique that operates directly on the diffusion weighted images and accounts for the biases introduced by Rician noise. We account for Rician noise through a data likelihood term that is combined with a spatial smoothing prior. The method compares favorably with several other approaches from the literature, including methods that filter diffusion weighted imagery and those that operate directly on the diffusion tensors.
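For context, the Rician likelihood that such a data term would be built on has the standard form below (a sketch with assumed notation; the paper combines a term of this kind with a spatial smoothing prior over the diffusion-weighted images):

```latex
p(m \mid A, \sigma) = \frac{m}{\sigma^{2}} \exp\!\left(-\frac{m^{2} + A^{2}}{2\sigma^{2}}\right) I_{0}\!\left(\frac{A m}{\sigma^{2}}\right),
\qquad
\hat{A} = \arg\max_{A} \sum_{i} \log p(m_{i} \mid A_{i}, \sigma) + \log p_{\mathrm{prior}}(A)
```

where m_i are the measured magnitudes, A_i the underlying noise-free intensities, and I_0 the zeroth-order modified Bessel function of the first kind; at low signal-to-noise the expected Rician magnitude exceeds A, which is the bias the filter is designed to account for.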

222 citations


Journal ArticleDOI
TL;DR: This paper addresses the problem of audio source separation with a single sensor, using a statistical model of the sources based on a learning step in which Gaussian scaled mixture models (GSMMs) are trained on samples of each source separately.
Abstract: In this paper, we address the problem of audio source separation with a single sensor, using a statistical model of the sources. The approach is based on a learning step performed on samples of each source separately, during which we train Gaussian scaled mixture models (GSMMs). During the separation step, we derive maximum a posteriori (MAP) and/or posterior mean (PM) estimates of the sources, given the observed audio mixture (Bayesian framework). Finally, we test and evaluate the method on real audio examples.
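As a hedged illustration of what such Bayesian estimates reduce to (a minimal sketch with assumed notation, ignoring the gain factors of the full GSMM): for a single-channel mixture x = s1 + s2 with zero-mean Gaussian source models of variances v1(q1) and v2(q2) in a given time-frequency bin, the per-state posterior mean is a Wiener-like weighting, and the PM estimate averages it over the posterior state probabilities.

```latex
\mathbb{E}\bigl[s_{1} \mid x, q_{1}, q_{2}\bigr] = \frac{v_{1}(q_{1})}{v_{1}(q_{1}) + v_{2}(q_{2})}\, x,
\qquad
\hat{s}_{1}^{\mathrm{PM}} = \sum_{q_{1}, q_{2}} p(q_{1}, q_{2} \mid x)\; \mathbb{E}\bigl[s_{1} \mid x, q_{1}, q_{2}\bigr]
```

A MAP-style variant would instead keep only the most probable state pair (q1, q2).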

204 citations


Journal ArticleDOI
TL;DR: A new generalized expectation maximization (GEM) algorithm is proposed, in which the missing variables are the scale factors of the GSM densities and the maximization step of the underlying expectation maximization algorithm is replaced with a linear stationary second-order iterative method.
Abstract: Image deconvolution is formulated in the wavelet domain under the Bayesian framework. The well-known sparsity of the wavelet coefficients of real-world images is modeled by heavy-tailed priors belonging to the Gaussian scale mixture (GSM) class; i.e., priors given by a linear (finite or infinite) combination of Gaussian densities. This class includes, among others, the generalized Gaussian, the Jeffreys, and the Gaussian mixture priors. Necessary and sufficient conditions are stated under which the prior induced by a thresholding/shrinking denoising rule is a GSM. This result is then used to show that the prior induced by the "nonnegative garrote" thresholding/shrinking rule, herein termed the garrote prior, is a GSM. To compute the maximum a posteriori estimate, we propose a new generalized expectation maximization (GEM) algorithm, where the missing variables are the scale factors of the GSM densities. The maximization step of the underlying expectation maximization algorithm is replaced with a linear stationary second-order iterative method. The result is a GEM algorithm of O(N log N) computational complexity. In a series of benchmark tests, the proposed approach outperforms or performs similarly to state-of-the-art methods, demanding comparable (in some cases, much less) computational complexity.
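For reference, the Gaussian scale mixture form assumed for each wavelet coefficient θ can be written as (with a generic mixing density p(z) assumed here):

```latex
p(\theta) = \int_{0}^{\infty} \mathcal{N}(\theta \mid 0, z)\, p(z)\, dz
```

i.e., θ is Gaussian conditionally on a hidden scale factor z; these scale factors are exactly the missing variables over which the GEM algorithm described above takes expectations.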

201 citations


Proceedings ArticleDOI
25 Jun 2006
TL;DR: This work proposes efficient particle smoothing methods for generalized state-space models by integrating dual tree recursions and fast multipole techniques with forward-backward smoothers, a new generalized two-filter smoother, and a maximum a posteriori (MAP) smoother.
Abstract: We propose efficient particle smoothing methods for generalized state-space models. Particle smoothing is an expensive O(N²) algorithm, where N is the number of particles. We overcome this problem by integrating dual tree recursions and fast multipole techniques with forward-backward smoothers, a new generalized two-filter smoother and a maximum a posteriori (MAP) smoother. Our experiments show that these improvements can substantially increase the practicality of particle smoothing.

Journal ArticleDOI
TL;DR: This paper proposes a novel adaptive despeckling filter and derives a maximum a posteriori (MAP) estimator for the radar cross section (RCS) using the recently introduced heavy-tailed Rayleigh density function.
Abstract: Synthetic aperture radar (SAR) images are inherently affected by a signal-dependent noise known as speckle, which is due to the radar wave coherence. In this paper, we propose a novel adaptive despeckling filter and derive a maximum a posteriori (MAP) estimator for the radar cross section (RCS). We first employ a logarithmic transformation to change the multiplicative speckle into additive noise. We model the RCS using the recently introduced heavy-tailed Rayleigh density function, which was derived based on the assumption that the real and imaginary parts of the received complex signal are best described using the alpha-stable family of distributions. We estimate model parameters from noisy observations by means of second-kind statistics theory, which relies on the Mellin transform. Finally, we compare the proposed algorithm with several classical speckle filters applied on actual SAR images. Experimental results show that the homomorphic MAP filter based on the heavy-tailed Rayleigh prior for the RCS is among the best for speckle removal.

Journal ArticleDOI
TL;DR: A Bayesian probabilistic approach is introduced, in which a shape is assumed to have “grown” from a skeleton by a stochastic generative process, and Bayesian estimation is used to identify the skeleton most likely to have produced the shape, called the maximum a posteriori skeleton.
Abstract: Skeletal representations of shape have attracted enormous interest ever since their introduction by Blum [Blum H (1973) J Theor Biol 38:205-287], because of their potential to provide a compact, but meaningful, shape representation, suitable for both neural modeling and computational applications. But effective computation of the shape skeleton remains a notorious unsolved problem; existing approaches are extremely sensitive to noise and give counterintuitive results with simple shapes. In conventional approaches, the skeleton is defined by a geometric construction and computed by a deterministic procedure. We introduce a Bayesian probabilistic approach, in which a shape is assumed to have "grown" from a skeleton by a stochastic generative process. Bayesian estimation is used to identify the skeleton most likely to have produced the shape, i.e., that best "explains" it, called the maximum a posteriori skeleton. Even with natural shapes with substantial contour noise, this approach provides a robust skeletal representation whose branches correspond to the natural parts of the shape.

Journal ArticleDOI
TL;DR: An optimal (maximum a posteriori) joint estimator for the channel impulse response (CIR), CFO, and PHN is introduced, utilizing prior statistical knowledge of PHN that can be obtained from measurements or data sheets.
Abstract: Accurate channel estimates are needed in orthogonal frequency-division multiplexing (OFDM), and easily obtained under the assumption of perfect phase and frequency synchronization. However, the practical receiver encounters nonnegligible phase noise (PHN) and carrier frequency offset (CFO), which create substantial intercarrier interference that a conventional OFDM channel estimator cannot account for. In this paper, we introduce an optimal (maximum a posteriori) joint estimator for the channel impulse response (CIR), CFO, and PHN, utilizing prior statistical knowledge of PHN that can be obtained from measurements or data sheets. In addition, in cases where a training symbol consists of two identical halves in the time domain, we propose a variant to Moose's CFO estimation algorithm that optimally removes the effect of PHN with lower complexity than with a nonrepeating training symbol. To further reduce the complexity of the proposed algorithms, simplified implementations based on the conjugate gradient method are also introduced such that the estimators studied in this paper can be realized efficiently using the fast Fourier transform with only minor performance degradation. EDICS: SPC-MULT, SPC-CEST, SPC-DETC

Journal ArticleDOI
TL;DR: This work proposes two algorithms for the problem of obtaining a single high-resolution image from multiple noisy, blurred, and undersampled images based on a Bayesian formulation that is implemented via the expectation maximization algorithm and a maximum a posteriori formulation.
Abstract: Using a stochastic framework, we propose two algorithms for the problem of obtaining a single high-resolution image from multiple noisy, blurred, and undersampled images. The first is based on a Bayesian formulation that is implemented via the expectation maximization algorithm. The second is based on a maximum a posteriori formulation. In both of our formulations, the registration, noise, and image statistics are treated as unknown parameters. These unknown parameters and the high-resolution image are estimated jointly based on the available observations. We present an efficient implementation of these algorithms in the frequency domain that allows their application to large images. Simulations are presented that test and compare the proposed algorithms.

Book ChapterDOI
22 Jun 2006
TL;DR: It is proved that the dual of approximate maximum entropy estimation yields maximum a posteriori estimation as a special case; this treatment leads to stability and convergence bounds for many statistical learning problems, and an algorithm by Zhang can be used to solve this class of optimization problems efficiently.
Abstract: In this paper we unify divergence minimization and statistical inference by means of convex duality. In the process of doing so, we prove that the dual of approximate maximum entropy estimation is maximum a posteriori estimation as a special case. Moreover, our treatment leads to stability and convergence bounds for many statistical learning problems. Finally, we show how an algorithm by Zhang can be used to solve this class of optimization problems efficiently.
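A hedged sketch of the kind of duality involved (notation assumed here; the paper's statement is more general): relaxing the moment-matching constraint of maximum entropy estimation turns its dual, which is maximum likelihood in an exponential family, into a penalized likelihood.

```latex
\min_{p} \mathrm{KL}(p \,\|\, p_{0}) \;\; \text{s.t.} \;\; \bigl\| \mathbb{E}_{p}[\phi(x)] - \tilde{\mu} \bigr\| \le \varepsilon
\quad \Longleftrightarrow \quad
\max_{\lambda} \; \langle \lambda, \tilde{\mu} \rangle - \log \int p_{0}(x)\, e^{\langle \lambda, \phi(x) \rangle}\, dx - \varepsilon \| \lambda \|_{*}
```

For a squared-norm relaxation the penalty becomes quadratic in λ, so the dual estimate can be read as maximum a posteriori estimation under a Gaussian prior on the natural parameters.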

Journal ArticleDOI
TL;DR: This article evaluates the performance, robustness and complexity of GMM and HMM-based approaches, using both manual and automatic face localization on the relatively difficult BANCA database, and extends the GMM approach through the use of local features with embedded positional information, increasing performance without sacrificing its low complexity.
Abstract: It has been previously demonstrated that systems based on local features and relatively complex statistical models, namely, one-dimensional (1-D) hidden Markov models (HMMs) and pseudo-two-dimensional (2-D) HMMs, are suitable for face recognition. Recently, a simpler statistical model, namely, the Gaussian mixture model (GMM), was also shown to perform well. In much of the literature devoted to these models, the experiments were performed with controlled images (manual face localization, controlled lighting, background, pose, etc.). However, a practical recognition system has to be robust to more challenging conditions. In this article we evaluate, on the relatively difficult BANCA database, the performance, robustness and complexity of GMM and HMM-based approaches, using both manual and automatic face localization. We extend the GMM approach through the use of local features with embedded positional information, increasing performance without sacrificing its low complexity. Furthermore, we show that the traditionally used maximum likelihood (ML) training approach has problems estimating robust model parameters when there are only a few training images available. Considerably more precise models can be obtained through the use of maximum a posteriori probability (MAP) training. We also show that face recognition techniques which obtain good performance on manually located faces do not necessarily obtain good performance on automatically located faces, indicating that recognition techniques must be designed from the ground up to handle imperfect localization. Finally, we show that while the pseudo-2-D HMM approach has the best overall performance, authentication time on current hardware makes it impractical. The best tradeoff in terms of authentication time, robustness and discrimination performance is achieved by the extended GMM approach.
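For orientation, the widely used relevance-style MAP update for GMM means has the form below (a sketch with assumed notation; the exact adaptation variant used in the paper may differ):

```latex
\hat{\mu}_{k} = \frac{n_{k}\,\bar{x}_{k} + \tau\,\mu_{k}^{\mathrm{prior}}}{n_{k} + \tau},
\qquad
n_{k} = \sum_{i} \gamma_{k}(x_{i}), \quad
\bar{x}_{k} = \frac{1}{n_{k}} \sum_{i} \gamma_{k}(x_{i})\, x_{i}
```

where γ_k(x_i) is the posterior responsibility of component k for feature vector x_i, μ_k^prior the corresponding prior-model mean, and τ a relevance factor; with few training vectors the estimate stays close to the prior mean instead of collapsing onto the scarce data, which is the robustness advantage noted above.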

Journal ArticleDOI
TL;DR: Experimental results demonstrate that MAP filtering can be successfully applied to SAR images represented in the shift-invariant wavelet domain, without resorting to a logarithmic transformation.
Abstract: In this paper, a new despeckling method based on undecimated wavelet decomposition and maximum a posteriori (MAP) estimation is proposed. Such a method relies on the assumption that the probability density function (pdf) of each wavelet coefficient is generalized Gaussian (GG). The major novelty of the proposed approach is that the parameters of the GG pdf are taken to be space-varying within each wavelet frame. Thus, they may be adjusted to spatial image context, not only to scale and orientation. Since the MAP equation to be solved is a function of the parameters of the assumed pdf model, the variance and shape factor of the GG function are derived from the theoretical moments, which depend on the moments and joint moments of the observed noisy signal and on the statistics of speckle. The solution of the MAP equation yields the MAP estimate of the wavelet coefficients of the noise-free image. The restored SAR image is synthesized from such coefficients. Experimental results, carried out on both synthetic speckled images and true SAR images, demonstrate that MAP filtering can be successfully applied to SAR images represented in the shift-invariant wavelet domain, without resorting to a logarithmic transformation

Journal ArticleDOI
TL;DR: Experimental studies show that the RiIG MAP filter has excellent filtering performance in the sense that it smooths homogeneous regions, and at the same time preserves details.
Abstract: In this paper, a new statistical model for representing the amplitude statistics of ultrasonic images is presented. The model is called the Rician inverse Gaussian (RiIG) distribution, due to the fact that it is constructed as a mixture of the Rice distribution and the Inverse Gaussian distribution. The probability density function (pdf) of the RiIG model is given in closed form as a function of three parameters. Some theoretical background on this new model is discussed, and an iterative algorithm for estimating its parameters from data is given. Then, the appropriateness of the RiIG distribution as a model for the amplitude statistics of medical ultrasound images is experimentally studied. It is shown that the new distribution can fit to the various shapes of local histograms of linearly scaled ultrasound data better than existing models. A log-likelihood cross-validation comparison of the predictive performance of the RiIG, the K, and the generalized Nakagami models turns out in favor of the new model. Furthermore, a maximum a posteriori (MAP) filter is developed based on the RiIG distribution. Experimental studies show that the RiIG MAP filter has excellent filtering performance in the sense that it smooths homogeneous regions, and at the same time preserves details.

Journal ArticleDOI
TL;DR: The 3-D pose tracking task is formulated in a Bayesian framework which fuses feature correspondence information from both the previous frame and selected key-frames into the posterior distribution of pose; the maximum a posteriori estimate of pose is then obtained via stochastic sampling to achieve stable and drift-free tracking.
Abstract: In this paper, we propose a novel approach for real-time 3-D tracking of object pose from a single camera. We formulate the 3-D pose tracking task in a Bayesian framework which fuses feature correspondence information from both the previous frame and some selected key-frames into the posterior distribution of pose. We also develop an inter-frame motion inference algorithm that obtains reliable inter-frame feature correspondences and relative pose. Finally, the maximum a posteriori estimate of pose is obtained via stochastic sampling to achieve stable and drift-free tracking. Experiments show significant improvement of our algorithm over existing algorithms, especially in the cases of tracking agile motion, severe occlusion, drastic illumination change, and large object scale change.

Journal ArticleDOI
TL;DR: The combination of region segmentation and edge detection proved to be a robust technique, as adequate clusters were automatically identified, regardless of the noise level and bias, in this work.
Abstract: Brain magnetic resonance imaging segmentation is accomplished in this work by applying nonparametric density estimation, using the mean shift algorithm in the joint spatial-range domain. The quality of the class boundaries is improved by including an edge confidence map, that represents the confidence of truly being in the presence of a border between adjacent regions; an adjacency graph is then constructed with the labeled regions, and analyzed and pruned to merge adjacent regions. In order to assign image regions to a cerebral tissue type, a spatial normalization between image data and standard probability maps is carried out, so that for each structure a maximum a posteriori probability criterion is applied. The method was applied to synthetic and real images, keeping all parameters constant throughout the process for each type of data. The combination of region segmentation and edge detection proved to be a robust technique, as adequate clusters were automatically identified, regardless of the noise level and bias. In a comparison with reference segmentations, average Tanimoto indexes of 0.90-0.99 were obtained for synthetic data and of 0.59-0.99 for real data, considering gray matter, white matter, and background.

Journal ArticleDOI
TL;DR: The maximum a posteriori (MAP) adaptation is introduced to the problem of SIS estimation, and it is demonstrated that SIS varies significantly from person to person, and most SISs are not similar to AIS.

Journal ArticleDOI
TL;DR: A class of image restoration algorithms based on the Bayesian approach and a new hierarchical spatially adaptive image prior that preserves edges and generalizes the on/off (binary) line process idea used in previous image priors within the context of Markov random fields (MRFs).
Abstract: In this paper, we propose a class of image restoration algorithms based on the Bayesian approach and a new hierarchical spatially adaptive image prior. The proposed prior has the following two desirable features. First, it models the local image discontinuities in different directions with a model which is continuous valued. Thus, it preserves edges and generalizes the on/off (binary) line process idea used in previous image priors within the context of Markov random fields (MRFs). Second, it is Gaussian in nature and provides estimates that are easy to compute. Using this new hierarchical prior, two restoration algorithms are derived. The first is based on the maximum a posteriori principle and the second on the Bayesian methodology. Numerical experiments are presented that compare the proposed algorithms among themselves and with previous stationary and nonstationary MRF-based algorithms with line processes. These experiments demonstrate the advantages of the proposed prior.

Journal ArticleDOI
01 Jul 2006
TL;DR: The reconstruction performance of the considered ML and MAP statistical height estimation methods is evaluated in terms of the Cramer-Rao lower bound (CRLB) of the estimated height values.
Abstract: Interferometric synthetic aperture radar (InSAR) systems allow the estimation of the height profile of the Earth surface. When the height profile of the observed scene is characterized by high slopes or exhibits strong height discontinuities, the height reconstruction obtained from a single interferogram is ambiguous, since the solution of the estimation problem is not unique. To solve this ambiguity and restore the solution uniqueness, multiple interferograms, obtained with different baselines and/or with different frequencies, have to be used (multichannel InSAR). The height profile can then be estimated from multiple interferograms using maximum likelihood (ML) estimation techniques or by means of maximum a posteriori (MAP) estimation techniques, which take into account the relation between adjacent pixels. In this paper, the height estimation accuracy achievable with a given multibaseline interferometric configuration using the aforementioned estimation techniques is analyzed and discussed, in terms of the Cramer-Rao lower bound for the ML technique and of an error lower bound for the MAP technique. It is shown that the MAP technique outperforms the ML one and that its attainable accuracy is not sensitive to the choice of baselines, but depends mainly on the ground slopes.

Journal ArticleDOI
TL;DR: A combined discriminative/generative formulation is derived that leverages the complementary strengths of both models in a principled framework for articulated pose inference, and two efficient MAP pose estimation algorithms are derived from this formulation.
Abstract: We develop a method for the estimation of articulated pose, such as that of the human body or the human hand, from a single (monocular) image. Pose estimation is formulated as a statistical inference problem, where the goal is to find a posterior probability distribution over poses as well as a maximum a posteriori (MAP) estimate. The method combines two modeling approaches, one discriminative and the other generative. The discriminative model consists of a set of mapping functions that are constructed automatically from a labeled training set of body poses and their respective image features. The discriminative formulation allows for modeling ambiguous, one-to-many mappings (through the use of multi-modal distributions) that may yield multiple valid articulated pose hypotheses from a single image. The generative model is defined in terms of a computer graphics rendering of poses. While the generative model offers an accurate way to relate observed (image features) and hidden (body pose) random variables, it is difficult to use it directly in pose estimation, since inference is computationally intractable. In contrast, inference with the discriminative model is tractable, but considerably less accurate for the problem of interest. A combined discriminative/generative formulation is derived that leverages the complementary strengths of both models in a principled framework for articulated pose inference. Two efficient MAP pose estimation algorithms are derived from this formulation; the first is deterministic and the second non-deterministic. Performance of the framework is quantitatively evaluated in estimating articulated pose of both the human hand and human body.

Journal ArticleDOI
TL;DR: A variational Bayes (VB) framework for learning continuous hidden Markov models (CHMMs), and the VB framework within active learning is examined, demonstrating that all of these active learning methods can significantly reduce the amount of required labeling, compared to random selection of samples for labeling.
Abstract: In this paper, we present a variational Bayes (VB) framework for learning continuous hidden Markov models (CHMMs), and we examine the VB framework within active learning. Unlike a maximum likelihood or maximum a posteriori training procedure, which yield a point estimate of the CHMM parameters, VB-based training yields an estimate of the full posterior of the model parameters. This is particularly important for small training sets since it gives a measure of confidence in the accuracy of the learned model. This is utilized within the context of active learning, for which we acquire labels for those feature vectors for which knowledge of the associated label would be most informative for reducing model-parameter uncertainty. Three active learning algorithms are considered in this paper: 1) query by committee (QBC), with the goal of selecting data for labeling that minimize the classification variance, 2) a maximum expected information gain method that seeks to label data with the goal of reducing the entropy of the model parameters, and 3) an error-reduction-based procedure that attempts to minimize classification error over the test data. The experimental results are presented for synthetic and measured data. We demonstrate that all of these active learning methods can significantly reduce the amount of required labeling, compared to random selection of samples for labeling.

Proceedings Article
04 Dec 2006
TL;DR: This paper presents a new method, called COMPOSE, for exploiting combinatorial optimization for sub-networks within the context of a max-product belief propagation algorithm, and describes highly efficient methods for computing max-marginals for subnetworks corresponding both to bipartite matchings and to regular networks.
Abstract: In general, the problem of computing a maximum a posteriori (MAP) assignment in a Markov random field (MRF) is computationally intractable. However, in certain subclasses of MRF, an optimal or close-to-optimal assignment can be found very efficiently using combinatorial optimization algorithms: certain MRFs with mutual exclusion constraints can be solved using bipartite matching, and MRFs with regular potentials can be solved using minimum cut methods. However, these solutions do not apply to the many MRFs that contain such tractable components as sub-networks, but also other non-complying potentials. In this paper, we present a new method, called COMPOSE, for exploiting combinatorial optimization for sub-networks within the context of a max-product belief propagation algorithm. COMPOSE uses combinatorial optimization for computing exact max-marginals for an entire sub-network; these can then be used for inference in the context of the network as a whole. We describe highly efficient methods for computing max-marginals for subnetworks corresponding both to bipartite matchings and to regular networks. We present results on both synthetic and real networks encoding correspondence problems between images, which involve both matching constraints and pairwise geometric constraints. We compare to a range of current methods, showing that the ability of COMPOSE to transmit information globally across the network leads to improved convergence, decreased running time, and higher-scoring assignments.

Journal ArticleDOI
TL;DR: A decomposition-enabled edge-preserving image restoration algorithm for maximizing the likelihood function that exploits the sparsity of edges to define an FFT-based iteration that requires few iterations and is guaranteed to converge to the MAP estimate.
Abstract: The regularization of the least-squares criterion is an effective approach in image restoration to reduce noise amplification. To avoid the smoothing of edges, edge-preserving regularization using a Gaussian Markov random field (GMRF) model is often used to allow realistic edge modeling and provide stable maximum a posteriori (MAP) solutions. However, this approach is computationally demanding because the introduction of a non-Gaussian image prior makes the restoration problem shift-variant. In this case, a direct solution using fast Fourier transforms (FFTs) is not possible, even when the blurring is shift-invariant. We consider a class of edge-preserving GMRF functions that are convex and have nonquadratic regions that impose less smoothing on edges. We propose a decomposition-enabled edge-preserving image restoration algorithm for maximizing the likelihood function. By decomposing the problem into two subproblems, with one shift-invariant and the other shift-variant, our algorithm exploits the sparsity of edges to define an FFT-based iteration that requires few iterations and is guaranteed to converge to the MAP estimate

Journal ArticleDOI
TL;DR: In this paper, the authors compare the quality of various types of posterior mode point and interval estimates for the parameters of latent class models with both the classical maximum likelihood estimates and the bootstrap estimates proposed by De Menezes.
Abstract: In maximum likelihood estimation of latent class models, it often occurs that one or more of the parameter estimates are on the boundary of the parameter space; that is, that estimated probabilities equal 0 (or 1) or, equivalently, that logit coefficients equal minus (or plus) infinity. This not only causes numerical problems in the computation of the variance-covariance matrix, it also makes the reported confidence intervals and significance tests for the parameters concerned meaningless. Boundary estimates can, however, easily be prevented by the use of prior distributions for the model parameters, yielding a Bayesian procedure called posterior mode or maximum a posteriori estimation. This approach is implemented in, for example, the Latent GOLD software packages for latent class analysis (Vermunt & Magidson, 2005). Little is, however, known about the quality of posterior mode estimates of the parameters of latent class models, nor about their sensitivity for the choice of the prior distribution. In this paper, we compare the quality of various types of posterior mode point and interval estimates for the parameters of latent class models with both the classical maximum likelihood estimates and the bootstrap estimates proposed by De Menezes (1999). Our simulation study shows that parameter estimates and standard errors obtained by the Bayesian approach are more reliable than the corresponding parameter estimates and standard errors obtained by maximum likelihood and parametric bootstrapping.
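A minimal sketch of how such a prior removes boundary estimates (assumed notation; Latent GOLD's exact priors may differ): with a Dirichlet(α) prior on a probability parameter, the M-step of the EM algorithm becomes a posterior mode update,

```latex
\hat{\pi}_{k}^{\mathrm{MAP}} = \frac{n_{k} + \alpha_{k} - 1}{N + \sum_{j}(\alpha_{j} - 1)}
```

where n_k is the expected count allocated to category k in the E-step and N the total; for α_k > 1 the estimate stays strictly inside (0, 1), so logit coefficients remain finite and standard errors stay computable.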

Journal ArticleDOI
TL;DR: This work addresses the problem of robust normal reconstruction by dense photometric stereo by forming the problem as a Markov network and investigates two important inference algorithms for Markov random fields (MRFs) - graph cuts and belief propagation - to optimize for the most likely setting for each node in the network.
Abstract: We address the problem of robust normal reconstruction by dense photometric stereo, in the presence of complex geometry, shadows, highlight, transparencies, variable attenuation in light intensities, and inaccurate estimation in light directions. The input is a dense set of noisy photometric images, conveniently captured by using a very simple set-up consisting of a digital video camera, a reflective mirror sphere, and a handheld spotlight. We formulate the dense photometric stereo problem as a Markov network and investigate two important inference algorithms for Markov random fields (MRFs) - graph cuts and belief propagation - to optimize for the most likely setting for each node in the network. In the graph cut algorithm, the MRF formulation is translated into one of energy minimization. A discontinuity-preserving metric is introduced as the compatibility function, which allows α-expansion to efficiently perform the maximum a posteriori (MAP) estimation. Using the identical dense input and the same MRF formulation, our tensor belief propagation algorithm recovers faithful normal directions, preserves underlying discontinuities, improves the normal estimation from one of discrete to continuous, and drastically reduces the storage requirement and running time. Both algorithms produce comparable and very faithful normals for complex scenes. Although the discontinuity-preserving metric in graph cuts permits efficient inference of optimal discrete labels with a theoretical guarantee, our estimation algorithm using tensor belief propagation converges to comparable results, but runs faster because very compact messages are passed and combined. We present very encouraging results on normal reconstruction. A simple algorithm is proposed to reconstruct a surface from a normal map recovered by our method. With the reconstructed surface, an inverse process, known as relighting in computer graphics, is proposed to synthesize novel images of the given scene under user-specified light source and direction. The synthesis is made to run in real time by exploiting the state-of-the-art graphics processing unit (GPU). Our method offers many unique advantages over previous relighting methods and can handle a wide range of novel light sources and directions.