Showing papers on "Bayesian probability published in 2002"


Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined, and derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest; adding pD to the posterior mean deviance gives a deviance information criterion which is related to other information criteria and has an approximate decision-theoretic justification.
Abstract: Summary. We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. Using an information theoretic argument we derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest. In general pD approximately corresponds to the trace of the product of Fisher's information and the posterior covariance, which in normal models is the trace of the ‘hat’ matrix projecting observations onto fitted values. Its properties in exponential families are explored. The posterior mean deviance is suggested as a Bayesian measure of fit or adequacy, and the contributions of individual observations to the fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. Adding pD to the posterior mean deviance gives a deviance information criterion for comparing models, which is related to other information criteria and has an approximate decision theoretic justification. The procedure is illustrated in some examples, and comparisons are drawn with alternative Bayesian and classical proposals. Throughout it is emphasized that the quantities required are trivial to compute in a Markov chain Monte Carlo analysis.
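
As a quick illustration of how trivially these quantities fall out of MCMC output, here is a minimal sketch (in Python, with a toy normal model and simulated draws standing in for a real sampler):

```python
# Sketch: computing pD and DIC from MCMC output, assuming a normal
# likelihood with known variance. Data and draws are illustrative.
import numpy as np

def deviance(theta, y, sigma=1.0):
    # Deviance = -2 * log-likelihood for y ~ Normal(theta, sigma^2).
    return np.sum((y - theta) ** 2) / sigma**2 + len(y) * np.log(2 * np.pi * sigma**2)

rng = np.random.default_rng(0)
y = rng.normal(1.0, 1.0, size=50)                # observed data (simulated)
theta_draws = rng.normal(y.mean(), 1 / np.sqrt(len(y)), size=5000)  # stand-in posterior draws

dbar = np.mean([deviance(t, y) for t in theta_draws])  # posterior mean deviance
dhat = deviance(theta_draws.mean(), y)                 # deviance at posterior mean
p_d = dbar - dhat                                      # effective number of parameters
dic = dbar + p_d                                       # deviance information criterion
print(f"pD = {p_d:.2f}, DIC = {dic:.2f}")              # pD should be near 1 here
```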

11,691 citations


Journal Article
29 Nov 2002-Genomics
TL;DR: In this paper, the authors proposed a new method for approximate Bayesian statistical inference on the basis of summary statistics, which is suited to complex problems that arise in population genetics, extending ideas developed in this setting by earlier authors.
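
The core idea of summary-statistic-based approximate Bayesian inference can be conveyed with a minimal rejection sampler; the model, prior, tolerance, and data below are illustrative stand-ins, not the authors' population-genetic setting:

```python
# Minimal rejection-ABC sketch: draw parameters from the prior, simulate
# data, keep draws whose summary statistic falls close to the observed one.
import numpy as np

rng = np.random.default_rng(1)
observed = rng.normal(2.0, 1.0, size=100)   # stand-in "observed" data
s_obs = observed.mean()                     # summary statistic

accepted = []
for _ in range(100_000):
    mu = rng.uniform(-5, 5)                 # draw from the prior
    sim = rng.normal(mu, 1.0, size=100)     # simulate under the model
    if abs(sim.mean() - s_obs) < 0.05:      # tolerance on the summary
        accepted.append(mu)

print(f"approx. posterior mean of mu: {np.mean(accepted):.3f}")
```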

2,218 citations


01 Jul 2002
TL;DR: In this paper, a parallel algorithm for Metropolis-coupled Markov chain Monte Carlo (MCMC) is proposed that retains the ability to explore multiple peaks in the posterior distribution of trees while maintaining a fast execution time.
Abstract: Bayesian estimation of phylogeny is based on the posterior probability distribution of trees. Currently, the only numerical method that can effectively approximate posterior probabilities of trees is Markov Chain Monte Carlo (MCMC). Standard implementations of MCMC can be prone to entrapment in local optima. A variant of MCMC, known as Metropolis-Coupled MCMC, allows multiple peaks in the landscape of trees to be more readily explored, but at the cost of increased execution time. This paper presents a parallel algorithm for Metropolis-Coupled MCMC. The proposed parallel algorithm retains the ability to explore multiple peaks in the posterior distribution of trees while maintaining a fast execution time. The algorithm has been implemented using two parallel programming models: the Message Passing Interface (MPI) and the Cashmere software distributed shared memory protocol. Performance results indicate nearly linear speed improvement in both programming models for small and large data sets. (MrBayes v3.0 is available at http://morphbank.ebc.uu.se/mrbayes/.)
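
A minimal sketch of the Metropolis-coupling idea (not the MrBayes implementation) on a toy bimodal density; the temperature ladder and proposal scale are arbitrary choices for illustration:

```python
# Metropolis-coupled MCMC sketch: heated chains flatten the landscape,
# and occasional state swaps let the cold chain escape local optima.
import numpy as np

rng = np.random.default_rng(2)
log_post = lambda x: np.logaddexp(-0.5 * (x + 3) ** 2, -0.5 * (x - 3) ** 2)

temps = [1.0, 0.5, 0.25]                  # chain 0 is the "cold" chain
states = np.zeros(len(temps))
samples = []
for it in range(20_000):
    for i, beta in enumerate(temps):      # within-chain Metropolis update
        prop = states[i] + rng.normal(0, 1)
        if np.log(rng.uniform()) < beta * (log_post(prop) - log_post(states[i])):
            states[i] = prop
    i, j = rng.choice(len(temps), 2, replace=False)   # propose a swap
    log_r = (temps[i] - temps[j]) * (log_post(states[j]) - log_post(states[i]))
    if np.log(rng.uniform()) < log_r:
        states[i], states[j] = states[j], states[i]
    samples.append(states[0])             # only the cold chain is recorded

print("cold-chain mean (near 0 for this symmetric target):",
      round(np.mean(samples[5000:]), 2))
```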

965 citations


Journal ArticleDOI
TL;DR: An approach for sparse representations of Gaussian process (GP) models (which are Bayesian types of kernel machines) is developed to overcome their limitations for large data sets, based on a combination of a Bayesian on-line algorithm and a sequential construction of a relevant subsample of the data that fully specifies the prediction of the GP model.
Abstract: We develop an approach for sparse representations of gaussian process (GP) models (which are Bayesian types of kernel machines) in order to overcome their limitations for large data sets. The method is based on a combination of a Bayesian on-line algorithm and a sequential construction of a relevant subsample of the data that fully specifies the prediction of the GP model. By using an appealing parameterization and projection techniques in a reproducing kernel Hilbert space, recursions for the effective parameters and a sparse gaussian approximation of the posterior process are obtained. This allows for both a propagation of predictions and Bayesian error measures. The significance and robustness of our approach are demonstrated on a variety of experiments.
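
A much simpler stand-in for the paper's recursive on-line scheme is a greedy subset-of-data approximation, which at least conveys the idea of a small subsample that fully specifies the prediction; everything below (kernel, noise level, subset size) is an illustrative assumption:

```python
# Greedy subset-of-data sparse GP sketch: grow an active set by posterior
# variance, then do exact GP regression on that subset only. This is NOT
# the authors' recursive on-line scheme, only an illustration.
import numpy as np

def rbf(a, b, ell=0.5):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

rng = np.random.default_rng(3)
x = np.linspace(0, 5, 400)
y = np.sin(2 * x) + 0.1 * rng.normal(size=x.size)

active = [0]
for _ in range(14):                        # grow the active set greedily
    K = rbf(x[active], x[active]) + 1e-2 * np.eye(len(active))
    k_star = rbf(x, x[active])
    var = 1.0 - np.sum(k_star @ np.linalg.inv(K) * k_star, axis=1)
    active.append(int(np.argmax(var)))     # most uncertain point next

K = rbf(x[active], x[active]) + 1e-2 * np.eye(len(active))
mean = rbf(x, x[active]) @ np.linalg.solve(K, y[active])
print(f"{len(active)} basis points, RMSE = {np.sqrt(np.mean((mean - np.sin(2*x))**2)):.3f}")
```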

802 citations


Journal ArticleDOI
TL;DR: A series of models exemplifying the diversity of problems that can be addressed within the empirical Bayesian framework is presented, using PET data to show how priors can be derived from the between-voxel distribution of activations over the brain.

744 citations


Journal ArticleDOI
TL;DR: The procedures used in conventional data analysis are formulated in terms of hierarchical linear models, and a connection between classical inference and parametric empirical Bayes (PEB) is established through covariance component estimation.

647 citations


Journal ArticleDOI
TL;DR: It is shown by computer simulation that posterior probabilities in Bayesian analysis can be excessively liberal when concatenated gene sequences are used, whereas bootstrap probabilities in neighbor-joining and maximum likelihood analyses are generally slightly conservative.
Abstract: Bayesian phylogenetics has recently been proposed as a powerful method for inferring molecular phylogenies, and it has been reported that the mammalian and some plant phylogenies were resolved by using this method. The statistical confidence of interior branches as judged by posterior probabilities in Bayesian analysis is generally higher than that as judged by bootstrap probabilities in maximum likelihood analysis, and this difference has been interpreted as an indication that bootstrap support may be too conservative. However, it is possible that the posterior probabilities are too high or too liberal instead. Here, we show by computer simulation that posterior probabilities in Bayesian analysis can be excessively liberal when concatenated gene sequences are used, whereas bootstrap probabilities in neighbor-joining and maximum likelihood analyses are generally slightly conservative. These results indicate that bootstrap probabilities are more suitable for assessing the reliability of phylogenetic trees than posterior probabilities and that the mammalian and plant phylogenies may not have been fully resolved.

624 citations



Book
06 May 2002
TL;DR: This book presents Bayesian methods for nonlinear classification and regression, covering curve fitting, surface fitting, classification using generalised nonlinear models, Bayesian tree models, partition models, nearest-neighbour models, and multiple response models.
Abstract: Preface. Acknowledgements. Introduction. Bayesian Modelling. Curve Fitting. Surface Fitting. Classification using Generalised Nonlinear Models. Bayesian Tree Models. Partition Models. Nearest-Neighbour Models. Multiple Response Models. Appendix A: Probability Distributions. Appendix B: Inferential Processes. References. Index. Author Index.

423 citations


Journal ArticleDOI
TL;DR: In this paper, a Markov chain Monte Carlo (MCMC) algorithm is applied to the nonlinear problem of inverting DC resistivity sounding data to infer characteristics of a 1-D earth model.
Abstract: Summary. A key element in the solution of a geophysical inverse problem is the quantification of non-uniqueness, that is, how much parameters of an inferred earth model can vary while fitting a set of measurements. A widely used approach is that of Bayesian inference, where Bayes' rule is used to determine the uncertainty of the earth model parameters a posteriori given the data. I describe here a natural extension of Bayesian parameter estimation that accounts for the posterior probability of how complex an earth model is (specifically, how many layers it contains). This approach has a built-in parsimony criterion: among all earth models that fit the data, those with fewer parameters (fewer layers) have higher posterior probabilities. To implement this approach in practice, I use a Markov chain Monte Carlo (MCMC) algorithm applied to the nonlinear problem of inverting DC resistivity sounding data to infer characteristics of a 1-D earth model. The earth model is parametrized as a layered medium, where the number of layers and their resistivities and thicknesses are poorly known a priori. The algorithm obtains a sample of layered media from the posterior distribution; this sample measures non-uniqueness in terms of how many layers are effectively resolved by the data and of the range of layer thicknesses and resistivities consistent with the data. Because the complexity of the model is effectively determined by the data, the solution does not need to be regularized. This is a desirable feature, because requiring the solution to be smooth beyond what is implied by prior information can lead to underestimating posterior uncertainty. Letting the number of layers be a free parameter, as done here, broadens the space of earth models possible a priori and makes the determination of posterior uncertainty less dependent on the parametrization.
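
The built-in parsimony can be illustrated with a crude stand-in: a toy piecewise-constant forward model (not real DC resistivity sounding) and a prior Monte Carlo estimate of the evidence for each number of layers; all models that fit get posterior mass, but those with unnecessary layers are down-weighted automatically:

```python
# Toy illustration of built-in parsimony: estimate the evidence
# p(data | k layers) by crude Monte Carlo over the prior, then form a
# posterior over k (uniform prior on k). All settings are hypothetical.
import numpy as np

rng = np.random.default_rng(4)
depth = np.linspace(0, 1, 60)
true = np.where(depth < 0.4, 1.0, 3.0)            # a 2-layer "earth"
data = true + 0.3 * rng.normal(size=depth.size)

def forward(bounds, values):
    # piecewise-constant model: values[i] between consecutive interfaces
    return np.asarray(values)[np.searchsorted(bounds, depth)]

def log_evidence(k, n_prior=20_000):
    ll = np.empty(n_prior)
    for s in range(n_prior):
        bounds = np.sort(rng.uniform(0, 1, size=k - 1))   # k-1 interfaces
        values = rng.uniform(0, 5, size=k)                # layer values
        resid = data - forward(bounds, values)
        ll[s] = -0.5 * np.sum((resid / 0.3) ** 2)         # constants cancel across k
    return np.logaddexp.reduce(ll) - np.log(n_prior)      # log mean likelihood

log_ev = np.array([log_evidence(k) for k in (1, 2, 3, 4)])
post = np.exp(log_ev - np.logaddexp.reduce(log_ev))
print(dict(zip((1, 2, 3, 4), post.round(3))))             # mass concentrates on k = 2
```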

391 citations



Journal ArticleDOI
TL;DR: The stated objectives—to offer statistical methodology for use by laymen outside the grasp of supporting principles—are achieved commendably by the authors, and the extensive tables are the result of computer-intensive optimization algorithms seeking optimal precision.
Abstract: implementing these tools. Supporting developments are given in Part II. The printed tables and access to the CD-ROM are given in Part III as needed to implement the methods. Detailed case studies are developed in Part IV, illustrating the range of data analyses supported by the tables. The stated objectives—to offer statistical methodology for use by laymen outside the grasp of supporting principles—are achieved commendably by the authors. The tables, both printed and electronic, are easily accessed by the novice through a self-paced study following step-by-step examples, especially as given in Part IV. At the same time, knowledgeable users deserve some explanation as to the statistical principles on which the methodology rests. Comments on these issues constitute much of the remainder of this review. Let X be the sample space containing outcomes X(n) = [X_1, ..., X_n] of independent Bernoulli trials having parameter p̃ ∈ [0, 1] with value p to be determined, and let X_S = X_1 + X_2 + ... + X_n. Procedures are based in principle on either X(n) or X_S, but in practice on the latter. The authors trace developments back to Jacob Bernoulli, and draw heavily on the foundations of Jerzy Neyman for "scientific statistics." The authors break ranks with conventional statistics on two essential grounds: (1) the range of p̃ and (2) the use of asymptotics in a practice often typified by small to moderate samples. Regarding (1), they confidently assert that users can accurately stipulate a proper subset [p̲, p̄] of [0, 1], called the measurement space, wherein p̃ is known to lie with certainty. Accordingly, they seek solutions to problems of inference that conform to the measurement space. Classical estimation using X_S/n is faulted heavily here in giving often nonconforming values. With regard to (2), the computer-intensive developments on which the tables rest are essentially exact for small samples. In rough analogy with scales for weighing mice and elephants, to each measurement space there corresponds a variety of data-analytic tools, listed as (a)–(d) in the second paragraph of this report, that are supported through the printed tables and the CD-ROM. Technical developments begin with the dual issues of β-measurement intervals for p̃ in [p̲, p̄] and β-prediction regions in X, with confidence level ≥ β as a gauge of their reliability. These constitute the "β-measurement & prediction space." Other methods build on these. Point estimation focuses on β-estimators in [p̲, p̄] depending on the sample size as well as on the confidence level β. These include (a) the "minimum MSE β-estimator," designed to minimize the conditional mean squared error, given the β-measurement & prediction space, and (b) the "midpoint β-estimator" as the midpoint of the β-measurement interval for p̃ in [p̲, p̄]. The aim of exclusion, in lieu of hypothesis testing, is "to show that the actual value p of p̃ is different from any value in [p̲₀, p̄₀] ⊂ [p̲, p̄], where the reliability of the exclusion procedure is specified by the significance level α." Thus H₀: p̃ ∈ [p̲₀, p̄₀] is excluded from [p̲, p̄]. There appear to be no errors of the second kind, because p̃ always belongs to [p̲, p̄], and thus no concept of the power of an exclusion procedure to exclude. The extensive tables are the result of computer-intensive optimization algorithms seeking optimal precision for each nominal reliability level, while reducing excess reliability arising from discreteness of the problem.
Procedures based on X(n) are shown by inclusion to be superior to ones based on X_S. Nonetheless, the available tables use X_S owing to apparent practical constraints. In particular, typical input variables are the measurement range [p̲, p̄], the sample size n, the confidence level β, the realization X_S, and allied information pertaining to exclusion, for example. Output in turn consists of β-measurement intervals and other quantities of use in assessing the data. Principles undergirding the analyses are ostensibly non-Bayesian. Nonetheless, p̃ does become a random variable during the course of the authors' developments, essentially through the assignment of a Bayesian uniform prior over [p̲, p̄]. Despite the careful development of these methodologies, and extensive tables for their implementation, this reviewer sees serious impediments to their effective use. These reservations focus largely on the assumption that the measurement range [p̲, p̄] can itself be stipulated accurately by users. This concern pervades every stage of the scientific method. New experiments, unless strictly confirmatory, do chart new paths, so that past experience regarding earlier measurement spaces need not carry over without modification. At issue are problems with misspecification of the range, the consequences of such misspecification, and possible robustness of procedures to such misspecification. The authors essentially remain mute on these critical issues. For if the parameter range is cast too wide, then the authors' objections to classical methods (based on p̃ ∈ [0, 1]) apply verbatim to their own methods, but now with regard to nonconformity with the actual (now smaller) measurement space. Consequences of prescribing too narrow a range remain to be studied. On the other hand, if the range supported by prior user knowledge is sufficiently narrow, then all statistical procedures become moot. Early in their monograph the authors appear to subscribe to the following point of view: "If statistics is an applied field and not a minor branch of mathematics, then more than ninety-nine percent of the published papers are useless exercises." Apparently, Binomial Distribution Handbook for Scientists and Engineers represents their efforts to be included in the other 1%. I must leave it to the experience of other users to judge how well this objective has been met.
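
For readers wanting the flavor of the book's central proposal in modern terms, the following sketch computes an exact (Clopper–Pearson style) binomial confidence interval and then conforms it to a stipulated measurement range; this is an illustration of the idea, not the authors' tabulated procedure:

```python
# Exact binomial interval conformed to a user-stipulated measurement
# range [p_lo, p_hi]; the range and inputs here are hypothetical.
from scipy import stats

def restricted_interval(x_s, n, beta=0.95, p_lo=0.1, p_hi=0.4):
    alpha = 1 - beta
    # Clopper-Pearson bounds from beta quantiles, with edge cases
    lower = stats.beta.ppf(alpha / 2, x_s, n - x_s + 1) if x_s > 0 else 0.0
    upper = stats.beta.ppf(1 - alpha / 2, x_s + 1, n - x_s) if x_s < n else 1.0
    # conform to the measurement space, as the handbook requires
    return max(lower, p_lo), min(upper, p_hi)

print(restricted_interval(x_s=7, n=20))
```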

Journal ArticleDOI
TL;DR: This paper presents a new wavelet-based image denoising method, which extends a "geometrical" Bayesian framework and combines three criteria for distinguishing supposedly useful coefficients from noise: coefficient magnitudes, their evolution across scales and spatial clustering of large coefficients near image edges.
Abstract: This paper presents a new wavelet-based image denoising method, which extends a "geometrical" Bayesian framework. The new method combines three criteria for distinguishing supposedly useful coefficients from noise: coefficient magnitudes, their evolution across scales and spatial clustering of large coefficients near image edges. These three criteria are combined in a Bayesian framework. The spatial clustering properties are expressed in a prior model. The statistical properties concerning coefficient magnitudes and their evolution across scales are expressed in a joint conditional model. The three main novelties with respect to related approaches are (1) the interscale-ratios of wavelet coefficients are statistically characterized and different local criteria for distinguishing useful coefficients from noise are evaluated, (2) a joint conditional model is introduced, and (3) a novel anisotropic Markov random field prior model is proposed. The results demonstrate an improved denoising performance over related earlier techniques.

Journal ArticleDOI
Philip H. S. Torr1
TL;DR: This paper explores ways of automating the model selection process with specific emphasis on the least squares problem of fitting manifolds to data points, illustrated with respect to epipolar geometry.
Abstract: Computer vision often involves estimating models from visual input. Sometimes it is possible to fit several different models or hypotheses to a set of data, and a decision must be made as to which is most appropriate. This paper explores ways of automating the model selection process with specific emphasis on the least squares problem of fitting manifolds (in particular algebraic varieties, e.g. lines, algebraic curves, planes etc.) to data points, illustrated with respect to epipolar geometry. The approach is Bayesian and the contribution is threefold: first, a new Bayesian description of the problem is laid out that supersedes the author's previous maximum likelihood formulations and reveals some hidden elements of the problem. Second, an algorithm, 'MAPSAC', is provided to obtain the robust MAP estimate of an arbitrary manifold. Third, a Bayesian model selection paradigm is proposed; the Bayesian formulation of the manifold fitting problem uncovers an elegant solution to this problem, for which a new method, 'GRIC', for approximating the posterior probability of each putative model is derived. This approximation bears some similarity to the penalized likelihoods used by AIC, BIC and MDL; however, it is far more accurate in situations involving large numbers of latent variables whose number increases with the data. This is demonstrated both empirically and theoretically.
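
For context, the following sketch shows the generic penalized-likelihood model comparison of the AIC/BIC family that GRIC refines; GRIC's actual penalty terms (robustified residuals plus terms for latent variables growing with the data) follow the paper and are not reproduced here:

```python
# Generic penalized-likelihood comparison (AIC/BIC) for nested
# polynomial fits; the data are synthetic and truly linear.
import numpy as np

rng = np.random.default_rng(5)
x = np.linspace(-1, 1, 100)
y = 2 * x + 0.05 * rng.normal(size=x.size)

for k in (1, 2, 3):                                  # polynomial degree
    coef = np.polyfit(x, y, k)
    rss = np.sum((y - np.polyval(coef, x)) ** 2)
    n, p = x.size, k + 1
    ll = -0.5 * n * np.log(rss / n)                  # profiled Gaussian log-lik (+ const)
    print(f"degree {k}: AIC = {-2*ll + 2*p:8.1f}   BIC = {-2*ll + p*np.log(n):8.1f}")
```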

Journal ArticleDOI
TL;DR: A Bayesian approach to tracking the direction-of-arrival (DOA) of multiple moving targets using a passive sensor array is presented, in which the posterior distribution is modeled and tracked with a collection of target states that can be viewed as samples from the posterior of interest.
Abstract: We present a Bayesian approach to tracking the direction-of-arrival (DOA) of multiple moving targets using a passive sensor array. The prior is a description of the dynamic behavior we expect for the targets which is modeled as constant velocity motion with a Gaussian disturbance acting on the target's heading direction. The likelihood function is arrived at by defining an uninformative prior for both the signals and noise variance and removing these parameters from the problem by marginalization. Advances in sequential Monte Carlo (SMC) techniques, specifically the particle filter algorithm, allow us to model and track the posterior distribution defined by the Bayesian model using a collection of target states that can be viewed as samples from the posterior of interest. We describe two versions of this algorithm and finally present results obtained using synthetic data.
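
A generic bootstrap particle filter conveys the SMC machinery the paper builds on; the toy one-dimensional constant-velocity model below is an illustrative stand-in for the array signal model:

```python
# Bootstrap particle filter sketch: propagate, weight by the
# observation likelihood, resample. Toy 1-D constant-velocity model.
import numpy as np

rng = np.random.default_rng(6)
F = np.array([[1.0, 1.0], [0.0, 1.0]])      # constant-velocity dynamics
n_p, sigma_obs = 2000, 0.5

x_true = np.array([0.0, 0.1])
particles = rng.normal([0.0, 0.1], [1.0, 0.05], size=(n_p, 2))
for t in range(50):
    x_true = F @ x_true + np.array([0.0, rng.normal(0, 0.02)])
    z = x_true[0] + rng.normal(0, sigma_obs)              # noisy observation
    particles = particles @ F.T                           # propagate
    particles[:, 1] += rng.normal(0, 0.02, n_p)           # process noise
    logw = -0.5 * ((z - particles[:, 0]) / sigma_obs) ** 2
    w = np.exp(logw - logw.max()); w /= w.sum()
    particles = particles[rng.choice(n_p, n_p, p=w)]      # multinomial resampling

print("true pos %.2f, filtered pos %.2f" % (x_true[0], particles[:, 0].mean()))
```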

Journal ArticleDOI
TL;DR: This paper demonstrates how a fully Bayesian method for random effects meta-analysis of binary outcome data, previously developed on the log-odds scale, can be extended to perform analyses on both the absolute risk and relative risk scales.
Abstract: When conducting a meta-analysis of clinical trials with binary outcomes, a normal approximation for the summary treatment effect measure in each trial is inappropriate in the common situation where some of the trials in the meta-analysis are small, or the observed risks are close to 0 or 1. This problem can be avoided by making direct use of the binomial distribution within trials. A fully Bayesian method has already been developed for random effects meta-analysis on the log-odds scale using the BUGS implementation of Gibbs sampling. In this paper we demonstrate how this method can be extended to perform analyses on both the absolute and relative risk scales. Within each approach we exemplify how trial-level covariates, including underlying risk, can be considered. Data from 46 trials of the effect of single-dose ibuprofen on post-operative pain are analysed and the results contrasted with those derived from classical and Bayesian summary statistic methods. The clinical interpretation of the odds ratio scale is not straightforward. The advantages and flexibility of a fully Bayesian approach to meta-analysis of binary outcome data, considered on an absolute risk or relative risk scale, are now available.
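
One practical consequence of working with posterior draws is that odds-ratio, relative-risk, and absolute-risk summaries are all just transformations of the same samples; a sketch with simulated stand-in draws:

```python
# Posterior draws of control- and treatment-group risks (however
# obtained, e.g. by MCMC) transformed to RR, risk difference, and OR.
import numpy as np

rng = np.random.default_rng(7)
p_ctrl = rng.beta(30, 70, size=10_000)     # stand-in posterior draws
p_trt = rng.beta(18, 82, size=10_000)

rr = p_trt / p_ctrl                                        # relative risk
rd = p_trt - p_ctrl                                        # risk difference
or_ = (p_trt / (1 - p_trt)) / (p_ctrl / (1 - p_ctrl))      # odds ratio
for name, d in [("RR", rr), ("RD", rd), ("OR", or_)]:
    lo, hi = np.percentile(d, [2.5, 97.5])
    print(f"{name}: median {np.median(d):.3f}, 95% CrI ({lo:.3f}, {hi:.3f})")
```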

Journal ArticleDOI
TL;DR: The results suggest that the SOWH test may accord overconfidence in the true topology when the null hypothesis is in fact correct, whereas the SH test is much more conservative, even under high substitution rates and branch length heterogeneity.
Abstract: Probabilistic tests of topology offer a powerful means of evaluating competing phylogenetic hypotheses. The performance of the nonparametric Shimodaira-Hasegawa (SH) test, the parametric Swofford-Olsen-Waddell-Hillis (SOWH) test, and Bayesian posterior probabilities were explored for five data sets for which all the phylogenetic relationships are known with a very high degree of certainty. These results are consistent with previous simulation studies that have indicated a tendency for the SOWH test to be prone to generating Type 1 errors because of model misspecification coupled with branch length heterogeneity. These results also suggest that the SOWH test may accord overconfidence in the true topology when the null hypothesis is in fact correct. In contrast, the SH test was observed to be much more conservative, even under high substitution rates and branch length heterogeneity. For some of those data sets where the SOWH test proved misleading, the Bayesian posterior probabilities were also misleading. The results of all tests were strongly influenced by the exact substitution model assumptions. Simple models, especially those that assume rate homogeneity among sites, had a higher Type 1 error rate and were more likely to generate misleading posterior probabilities. For some of these data sets, the commonly used substitution models appear to be inadequate for estimating appropriate levels of uncertainty with the SOWH test and Bayesian methods. Reasons for the differences in statistical power between the two maximum likelihood tests are discussed and are contrasted with the Bayesian approach.

Journal ArticleDOI
TL;DR: A stochastic frontier model with random coefficients is proposed to separate technical inefficiency from technological differences across firms, and free the frontier model from the restrictive assumption that all firms must share exactly the same technological possibilities.
Abstract: The paper proposes a stochastic frontier model with random coefficients to separate technical inefficiency from technological differences across firms, and free the frontier model from the restrictive assumption that all firms must share exactly the same technological possibilities. Inference procedures for the new model are developed based on Bayesian techniques, and computations are performed using Gibbs sampling with data augmentation to allow finite-sample inference for underlying parameters and latent efficiencies. An empirical example illustrates the procedure. Copyright © 2002 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: This article presents a Bayesian phylogenetic method that evaluates the adequacy of evolutionary models using posterior predictive distributions and, unlike the likelihood-ratio test and parametric bootstrap, accounts for uncertainty in the phylogeny and model parameters.
Abstract: Bayesian inference is becoming a common statistical approach to phylogenetic estimation because, among other reasons, it allows for rapid analysis of large data sets with complex evolutionary models. Conveniently, Bayesian phylogenetic methods use currently available stochastic models of sequence evolution. However, as with other model-based approaches, the results of Bayesian inference are conditional on the assumed model of evolution: inadequate models (models that poorly fit the data) may result in erroneous inferences. In this article, I present a Bayesian phylogenetic method that evaluates the adequacy of evolutionary models using posterior predictive distributions. By evaluating a model's posterior predictive performance, an adequate model can be selected for a Bayesian phylogenetic study. Although I present a single test statistic that assesses the overall (global) performance of a phylogenetic model, a variety of test statistics can be tailored to evaluate specific features (local performance) of evolutionary models to identify sources of failure. The method presented here, unlike the likelihood-ratio test and parametric bootstrap, accounts for uncertainty in the phylogeny and model parameters.
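
The logic of a posterior predictive assessment is easy to state in code; the sketch below uses a toy normal model and a single global test statistic, where the phylogenetic version would instead simulate sequence alignments under sampled trees and parameters:

```python
# Posterior predictive check: simulate replicate data sets from
# posterior draws and compare a test statistic T on replicates with
# T on the observed data.
import numpy as np

rng = np.random.default_rng(8)
y = rng.standard_t(df=3, size=200)                     # heavy-tailed "data"
T = lambda d: np.max(np.abs(d))                        # test statistic

# stand-in posterior for a (deliberately misspecified) normal model
mu_draws = rng.normal(y.mean(), y.std() / np.sqrt(y.size), 2000)
t_rep = np.array([T(rng.normal(mu, y.std(), y.size)) for mu in mu_draws])
p_value = np.mean(t_rep >= T(y))                       # posterior predictive p-value
print(f"T(obs) = {T(y):.2f}, posterior predictive p = {p_value:.3f}")  # small p flags inadequacy
```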

Journal ArticleDOI
TL;DR: A framework for interpreting Support Vector Machines as maximum a posteriori (MAP) solutions to inference problems with Gaussian Process priors is described, which allows Bayesian methods to be used for tackling two of the outstanding challenges in SVM classification: how to tune hyperparameters and how to obtain predictive class probabilities.
Abstract: I describe a framework for interpreting Support Vector Machines (SVMs) as maximum a posteriori (MAP) solutions to inference problems with Gaussian Process priors. This probabilistic interpretation can provide intuitive guidelines for choosing a ‘good’ SVM kernel. Beyond this, it allows Bayesian methods to be used for tackling two of the outstanding challenges in SVM classification: how to tune hyperparameters—the misclassification penalty C, and any parameters specifying the kernel—and how to obtain predictive class probabilities rather than the conventional deterministic class label predictions. Hyperparameters can be set by maximizing the evidence; I explain how the latter can be defined and properly normalized. Both analytical approximations and numerical methods (Monte Carlo chaining) for estimating the evidence are discussed. I also compare different methods of estimating class probabilities, ranging from simple evaluation at the MAP or at the posterior average to full averaging over the posterior. A simple toy application illustrates the various concepts and techniques.
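
The difference between plug-in and posterior-averaged class probabilities, one of the contrasts the paper draws, can be seen with a toy latent-Gaussian assumption (an illustration, not the paper's derivation):

```python
# Plug-in vs posterior-averaged class probability, assuming a Gaussian
# posterior over a latent function value (hypothetical numbers).
import numpy as np

sigmoid = lambda f: 1 / (1 + np.exp(-f))
f_map, f_sd = 1.0, 2.0                      # latent posterior mean and sd

rng = np.random.default_rng(9)
p_map = sigmoid(f_map)                                    # plug-in at the MAP
p_avg = sigmoid(rng.normal(f_map, f_sd, 100_000)).mean()  # full posterior average
print(f"plug-in: {p_map:.3f}, averaged: {p_avg:.3f}")     # averaging pulls toward 0.5
```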

Book
01 Jan 2002
TL;DR: This book covers the principles of the statistical matching process, traditional approaches to data fusion, alternative approaches to merging data sets, and an empirical evaluation of those alternatives.
Abstract: Introduction * Principles of the Statistical Matching Process * Traditional Approaches to Data Fusion * Alternative Approaches to Merge Data Sets * Empirical Evaluation of Alternative Approaches * Synopsis and Outlook

Journal ArticleDOI
TL;DR: In this article, the authors demonstrate that model selection is more easily performed using the deviance information criterion (DIC), which combines a Bayesian measure-of-fit with a measure of model complexity.
Abstract: Bayesian methods have been efficient in estimating parameters of stochastic volatility models for analyzing financial time series. Recent advances have made it possible to fit stochastic volatility models of increasing complexity, including covariates, leverage effects, jump components and heavy-tailed distributions. However, a formal model comparison via Bayes factors remains difficult. The main objective of this paper is to demonstrate that model selection is more easily performed using the deviance information criterion (DIC). It combines a Bayesian measure-of-fit with a measure of model complexity. We illustrate the performance of DIC in discriminating between various different stochastic volatility models using simulated data and daily returns data on the S&P100 index.

Patent
01 May 2002
TL;DR: A system and method of diagnosing diseases from biological data is disclosed in this article, where a system for automated disease diagnostics prediction can be generated using a database of clinical test data.
Abstract: A system and method of diagnosing diseases from biological data is disclosed. A system for automated disease diagnostics prediction can be generated using a database of clinical test data. The diagnostics prediction can also be used to develop screening tests to screen for one or more inapparent diseases. The prediction method can be implemented with Bayesian probability estimation techniques. The system and method permit clinical test data to be analyzed and mined for improved disease diagnosis.
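
The underlying probability estimation is just Bayes' rule applied to a prior prevalence and test characteristics; a minimal sketch with hypothetical sensitivity and specificity values:

```python
# Bayes' rule for diagnosis: update a disease prior with a test result.
# Sensitivity/specificity/prevalence values below are hypothetical.
def posterior_disease_prob(prior, sensitivity, specificity, positive):
    if positive:
        num = sensitivity * prior
        den = num + (1 - specificity) * (1 - prior)
    else:
        num = (1 - sensitivity) * prior
        den = num + specificity * (1 - prior)
    return num / den

p = 0.01                                        # screening prevalence
for test_positive in (True, False):
    print(test_positive,
          round(posterior_disease_prob(p, 0.95, 0.90, test_positive), 4))
```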

Journal ArticleDOI
TL;DR: In this paper, a comparison is made of probabilities of liquefaction calculated with two different probabilistic approaches, logistic regression and Bayesian mapping; the study shows that the Bayesian mapping approach is preferred over the logistic regression approach for estimating the site-specific probability of liquefaction, although both methods yield comparable probabilities.
Abstract: This paper presents an assessment of existing and new probabilistic methods for liquefaction potential evaluation. Emphasis is placed on comparison of probabilities of liquefaction calculated with two different approaches, logistic regression and Bayesian mapping. Logistic regression is a well-established statistical procedure, whereas Bayesian mapping is a relatively new application of the Bayes’ theorem to the evaluation of soil liquefaction. In the present study, simplified procedures for soil liquefaction evaluation, including the Seed–Idriss, Robertson–Wride, and Andrus–Stokoe methods, based on the standard penetration test, cone penetration test, and shear wave velocity measurement, respectively, are used as the basis for developing Bayesian mapping functions. The present study shows that the Bayesian mapping approach is preferred over the logistic regression approach for estimating the site-specific probability of liquefaction, although both methods yield comparable probabilities. The paper also co...
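
The idea of a Bayesian mapping function can be sketched as follows: class-conditional densities of a deterministic index (here a safety factor FS) are estimated from labeled case histories and combined through Bayes' theorem; all numbers below are synthetic stand-ins, not any of the cited simplified procedures:

```python
# Bayesian mapping sketch: map a safety factor FS to a probability of
# liquefaction via class-conditional densities from labeled cases.
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
fs_liq = rng.normal(0.85, 0.20, 300)      # FS at sites that liquefied (synthetic)
fs_non = rng.normal(1.35, 0.25, 300)      # FS at sites that did not

g_liq = stats.norm(fs_liq.mean(), fs_liq.std())
g_non = stats.norm(fs_non.mean(), fs_non.std())
prior_liq = 0.5                            # prior from the case-history mix

def p_liquefaction(fs):
    num = g_liq.pdf(fs) * prior_liq
    return num / (num + g_non.pdf(fs) * (1 - prior_liq))

for fs in (0.8, 1.0, 1.2, 1.5):
    print(f"FS = {fs:.1f}: P(liquefaction) = {p_liquefaction(fs):.2f}")
```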

Journal ArticleDOI
TL;DR: This work proposes an approach using cross-validation predictive densities to obtain expected utility estimates and the Bayesian bootstrap to obtain samples from their distributions, and discusses the probabilistic assumptions made and properties of two practical cross-validation methods, importance sampling and k-fold cross-validation.
Abstract: In this work, we discuss practical methods for the assessment, comparison, and selection of complex hierarchical Bayesian models. A natural way to assess the goodness of the model is to estimate its future predictive capability by estimating expected utilities. Instead of just making a point estimate, it is important to obtain the distribution of the expected utility estimate because it describes the uncertainty in the estimate. The distributions of the expected utility estimates can also be used to compare models, for example, by computing the probability of one model having a better expected utility than some other model. We propose an approach using cross-validation predictive densities to obtain expected utility estimates and Bayesian bootstrap to obtain samples from their distributions. We also discuss the probabilistic assumptions made and properties of two practical cross-validation methods, importance sampling and k-fold cross-validation. As illustrative examples, we use multilayer perceptron neural networks and gaussian processes with Markov chain Monte Carlo sampling in one toy problem and two challenging real-world problems.
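
The Bayesian bootstrap step is compact: per-observation cross-validation utilities are reweighted with Dirichlet(1, ..., 1) draws to get a distribution for the expected utility, from which model comparison probabilities follow; the utility values below are simulated stand-ins for real CV output:

```python
# Bayesian bootstrap over per-point CV log predictive densities (lpd):
# Dirichlet(1,...,1) weights give a distribution for the expected utility.
import numpy as np

rng = np.random.default_rng(11)
lpd_a = rng.normal(-1.00, 0.30, size=200)     # model A, per-point CV lpd (stand-in)
lpd_b = rng.normal(-1.05, 0.30, size=200)     # model B

w = rng.dirichlet(np.ones(lpd_a.size), size=10_000)   # Bayesian bootstrap weights
util_a, util_b = w @ lpd_a, w @ lpd_b
print("P(model A has higher expected utility) =", np.mean(util_a > util_b))
```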

Journal ArticleDOI
TL;DR: This paper proposes a Bayesian approach for finding and fitting parametric treed models, in particular focusing on Bayesian treed regression, and illustrates the potential of this approach by a cross-validation comparison of predictive performance with neural nets, MARS, and conventional trees on simulated and real data sets.
Abstract: When simple parametric models such as linear regression fail to adequately approximate a relationship across an entire set of data, an alternative may be to consider a partition of the data, and then use a separate simple model within each subset of the partition. Such an alternative is provided by a treed model which uses a binary tree to identify such a partition. However, treed models go further than conventional trees (e.g. CART, C4.5) by fitting models rather than a simple mean or proportion within each subset. In this paper, we propose a Bayesian approach for finding and fitting parametric treed models, in particular focusing on Bayesian treed regression. The potential of this approach is illustrated by a cross-validation comparison of predictive performance with neural nets, MARS, and conventional trees on simulated and real data sets.

Journal ArticleDOI
TL;DR: Some Bayesian methods are proposed to address the problem of fitting a signal modeled by a sequence of piecewise constant linear (in the parameters) regression models, for example autoregressive or Volterra models.
Abstract: We propose some Bayesian methods to address the problem of fitting a signal modeled by a sequence of piecewise constant linear (in the parameters) regression models, for example, autoregressive or Volterra models. A joint prior distribution is set up over the number of the changepoints/knots, their positions, and over the orders of the linear regression models within each segment if these are unknown. Hierarchical priors are developed and, as the resulting posterior probability distributions and Bayesian estimators do not admit closed-form analytical expressions, reversible jump Markov chain Monte Carlo (MCMC) methods are derived to estimate these quantities. Results are obtained for standard denoising and segmentation of speech data problems that have already been examined in the literature. These results demonstrate the performance of our methods.
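
A drastically simplified stand-in for the reversible-jump machinery, with one changepoint, piecewise-constant means and known noise, shows the kind of posterior being targeted; here the changepoint posterior can simply be enumerated (using a profile likelihood with segment means plugged in):

```python
# Single-changepoint posterior by direct enumeration (a simplified
# stand-in, not the paper's reversible-jump sampler).
import numpy as np

rng = np.random.default_rng(12)
y = np.concatenate([rng.normal(0.0, 1, 70), rng.normal(1.5, 1, 130)])
n = y.size

log_post = np.full(n, -np.inf)
for tau in range(5, n - 5):               # uniform prior over allowed positions
    left, right = y[:tau], y[tau:]
    # profile log-likelihood with segment means plugged in
    log_post[tau] = -0.5 * (np.sum((left - left.mean()) ** 2)
                            + np.sum((right - right.mean()) ** 2))
post = np.exp(log_post - log_post.max())
post /= post.sum()
print("posterior mode of changepoint:", int(np.argmax(post)))  # near 70
```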

Journal ArticleDOI
TL;DR: Bayesian methods have become widespread in the marketing literature; this paper reviews the essence of the Bayesian approach, explains why it is particularly useful for marketing problems, and notes its usefulness in situations in which there is limited information about a large number of units or where the information comes from different sources.
Abstract: Bayesian methods have become widespread in the marketing literature. We review the essence of the Bayesian approach and explain why it is particularly useful for marketing problems. While the appeal of the Bayesian approach has long been noted by researchers, recent developments in computational methods and expanded availability of detailed marketplace data has fueled the Bayesian growth in marketing. We emphasize the modularity and flexibility of modern Bayesian approaches. Finally, the usefulness of Bayesian methods in situations in which there is limited information about a large number of units or where the information comes from different sources is noted.
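
The 'many units, little data each' point can be made with a small partial-pooling sketch: noisy per-unit estimates are shrunk toward the population mean, improving accuracy overall (empirical-Bayes flavour, synthetic data):

```python
# Hierarchical shrinkage sketch: partially pool noisy per-unit means
# toward the population mean. All data are synthetic.
import numpy as np

rng = np.random.default_rng(13)
n_units, n_obs = 200, 5
true_effects = rng.normal(0.0, 1.0, n_units)          # e.g. unit-level effects
data = true_effects[:, None] + rng.normal(0, 2.0, (n_units, n_obs))

raw = data.mean(axis=1)                               # per-unit sample means
v_noise = 2.0**2 / n_obs                              # sampling variance of a mean
v_between = max(raw.var() - v_noise, 1e-6)            # moment estimate of between-unit variance
shrink = v_between / (v_between + v_noise)
pooled = shrink * raw + (1 - shrink) * raw.mean()     # shrunken estimates

print("RMSE raw   :", np.sqrt(np.mean((raw - true_effects) ** 2)).round(3))
print("RMSE pooled:", np.sqrt(np.mean((pooled - true_effects) ** 2)).round(3))
```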


Journal ArticleDOI
TL;DR: This work describes a Bayesian phylogenetic method which uses a Markov chain Monte Carlo algorithm to provide samples from the posterior distribution of tree topologies, and discusses some issues arising when using Bayesian techniques on RNA sequence data.
Abstract: We study the phylogeny of the placental mammals using molecular data from all mitochondrial tRNAs and rRNAs of 54 species. We use probabilistic substitution models specific to evolution in base paired regions of RNA. A number of these models have been implemented in a new phylogenetic inference software package for carrying out maximum likelihood and Bayesian phylogenetic inferences. We describe our Bayesian phylogenetic method, which uses a Markov chain Monte Carlo algorithm to provide samples from the posterior distribution of tree topologies. Our results show support for four primary mammalian clades, in agreement with recent studies of much larger data sets mainly comprising nuclear DNA. We discuss some issues arising when using Bayesian techniques on RNA sequence data.