
Showing papers on "Bayes' theorem published in 2001"


Journal ArticleDOI
TL;DR: A simple nonparametric empirical Bayes model is introduced, which is used to guide the efficient reduction of the data to a single summary statistic per gene, and also to make simultaneous inferences concerning which genes were affected by the radiation.
Abstract: Microarrays are a novel technology that facilitates the simultaneous measurement of thousands of gene expression levels. A typical microarray experiment can produce millions of data points, raising serious problems of data reduction, and simultaneous inference. We consider one such experiment in which oligonucleotide arrays were employed to assess the genetic effects of ionizing radiation on seven thousand human genes. A simple nonparametric empirical Bayes model is introduced, which is used to guide the efficient reduction of the data to a single summary statistic per gene, and also to make simultaneous inferences concerning which genes were affected by the radiation. Although our focus is on one specific experiment, the proposed methods can be applied quite generally. The empirical Bayes inferences are closely related to the frequentist false discovery rate (FDR) criterion.
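
A minimal sketch, not taken from the paper, of the frequentist FDR criterion that the abstract relates to the empirical Bayes inferences. The Benjamini-Hochberg step-up rule shown here and the simulated per-gene p-values are illustrative assumptions only; the paper's own procedure is a nonparametric empirical Bayes analysis.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.10):
    """Return a boolean mask of tests declared significant at FDR level q."""
    pvals = np.asarray(pvals, dtype=float)
    m = pvals.size
    order = np.argsort(pvals)
    ranked = pvals[order]
    # Find the largest k with p_(k) <= (k/m) * q and reject hypotheses 1..k.
    below = ranked <= (np.arange(1, m + 1) / m) * q
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[:k + 1]] = True
    return reject

# Hypothetical per-gene p-values from some summary statistic (not the paper's data).
rng = np.random.default_rng(0)
pvals = np.concatenate([rng.uniform(0, 1e-4, 50),    # genes affected by radiation
                        rng.uniform(0, 1, 6950)])    # unaffected genes
print("genes called significant:", int(benjamini_hochberg(pvals).sum()))
```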

1,868 citations


Journal ArticleDOI
TL;DR: A Bayesian probabilistic framework for microarray data analysis is developed that derives point estimates for both parameters and hyperparameters, and regularized expressions for the variance of each gene by combining the empirical variance with a local background variance associated with neighboring genes.
Abstract: Motivation: DNA microarrays are now capable of providing genome-wide patterns of gene expression across many different conditions. The first level of analysis of these patterns requires determining whether observed differences in expression are significant or not. Current methods are unsatisfactory due to the lack of a systematic framework that can accommodate noise, variability, and low replication often typical of microarray data. Results: We develop a Bayesian probabilistic framework for microarray data analysis. At the simplest level, we model log-expression values by independent normal distributions, parameterized by corresponding means and variances with hierarchical prior distributions. We derive point estimates for both parameters and hyperparameters, and regularized expressions for the variance of each gene by combining the empirical variance with a local background variance associated with neighboring genes. An additional hyperparameter, inversely related to the number of empirical observations, determines the strength of the background variance. Simulations show that these point estimates, combined with a t-test, provide a systematic inference approach that compares favorably with simple t-test or fold methods, and partly compensate for the lack of replication.
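
A hedged sketch of the kind of variance regularization the abstract describes: each gene's empirical variance is shrunk toward a background variance estimated from genes with similar expression levels, with a pseudo-count hyperparameter v0 controlling the strength of the background term. The specific weighting, the running-mean window, and the degrees of freedom used in the t-test are illustrative assumptions, not the paper's exact expressions.

```python
import numpy as np
from scipy import stats

def regularized_variance(emp_var, background_var, n_obs, v0=10.0):
    """Shrink each gene's empirical variance toward a background variance.

    v0 acts as a pseudo-count: the larger it is relative to n_obs, the more
    weight the background receives (an assumed weighting, for illustration).
    """
    return (v0 * background_var + (n_obs - 1) * emp_var) / (v0 + n_obs - 2)

# Hypothetical log-expression values: 3 replicates per gene in two conditions.
rng = np.random.default_rng(1)
n_genes, n_rep = 1000, 3
a = rng.normal(0.0, 0.3, size=(n_genes, n_rep))
b = rng.normal(0.1, 0.3, size=(n_genes, n_rep))

var_a, var_b = a.var(axis=1, ddof=1), b.var(axis=1, ddof=1)

# Background variance: crude running mean of empirical variances over genes
# ordered by mean expression level ("neighboring genes"); edge effects ignored.
order = np.argsort(a.mean(axis=1))
background = np.empty(n_genes)
background[order] = np.convolve(var_a[order], np.ones(101) / 101, mode="same")

reg_a = regularized_variance(var_a, background, n_rep)
reg_b = regularized_variance(var_b, background, n_rep)

# Regularized t-statistic per gene; df = 2*n_rep - 2 is a simplification.
t = (b.mean(axis=1) - a.mean(axis=1)) / np.sqrt(reg_a / n_rep + reg_b / n_rep)
p = 2 * stats.t.sf(np.abs(t), df=2 * n_rep - 2)
print("smallest p-values:", np.sort(p)[:3].round(4))
```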

1,763 citations


Book ChapterDOI
01 Jan 2001
TL;DR: Many real-world data analysis tasks involve estimating unknown quantities from some given observations, and all inference on the unknown quantities is based on the posterior distribution obtained from Bayes’ theorem.
Abstract: Many real-world data analysis tasks involve estimating unknown quantities from some given observations. In most of these applications, prior knowledge about the phenomenon being modelled is available. This knowledge allows us to formulate Bayesian models, that is prior distributions for the unknown quantities and likelihood functions relating these quantities to the observations. Within this setting, all inference on the unknown quantities is based on the posterior distribution obtained from Bayes’ theorem. Often, the observations arrive sequentially in time and one is interested in performing inference on-line. It is therefore necessary to update the posterior distribution as data become available. Examples include tracking an aircraft using radar measurements, estimating a digital communications signal using noisy measurements, or estimating the volatility of financial instruments using stock market data. Computational simplicity in the form of not having to store all the data might also be an additional motivating factor for sequential methods.
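
A minimal sketch of the recursive predict/update cycle this chapter builds on, for a hypothetical two-state hidden Markov model; each new observation updates the posterior via Bayes' theorem without reprocessing earlier data.

```python
import numpy as np

# Sequential Bayes on a hypothetical two-state hidden Markov model.  At each
# step:  predict  p(x_t | y_1:t-1) = sum_x' p(x_t | x') p(x' | y_1:t-1)
#        update   p(x_t | y_1:t)   is proportional to  p(y_t | x_t) p(x_t | y_1:t-1)
transition = np.array([[0.95, 0.05],        # rows: previous state, cols: next state
                       [0.10, 0.90]])

def likelihood(y, means=(0.0, 2.0), sigma=1.0):
    """Unnormalized Gaussian likelihood of y under each hidden state."""
    return np.exp(-0.5 * ((y - np.asarray(means)) / sigma) ** 2)

belief = np.array([0.5, 0.5])               # prior over the two states
for y in [0.1, 1.9, 2.2, 0.3]:              # observations arriving sequentially
    belief = transition.T @ belief          # predict one step ahead
    belief *= likelihood(y)                 # reweight by the likelihood (Bayes' theorem)
    belief /= belief.sum()                  # normalize to get the posterior
    print(f"y = {y:+.1f}   posterior = {belief.round(3)}")
```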

1,232 citations


Book
01 Jan 2001
TL;DR: This paperback edition, a reprint of the 2001 edition, is a graduate-level textbook that introduces Bayesian statistics and decision theory and is a worthy successor to DeGroot's and Berger's earlier texts.
Abstract: Decision-Theoretic Foundations.- From Prior Information to Prior Distributions.- Bayesian Point Estimation.- Tests and Confidence Regions.- Bayesian Calculations.- Model Choice.- Admissibility and Complete Classes.- Invariance, Haar Measures, and Equivariant Estimators.- Hierarchical and Empirical Bayes Extensions.- A Defense of the Bayesian Choice.

1,083 citations


Dissertation
01 Jan 2001
TL;DR: This thesis presents an approximation technique that can perform Bayesian inference faster and more accurately than previously possible, and is found to be convincingly better than rival approximation techniques: Monte Carlo, Laplace's method, and variational Bayes.
Abstract: One of the major obstacles to using Bayesian methods for pattern recognition has been their computational expense. This thesis presents an approximation technique that can perform Bayesian inference faster and more accurately than previously possible. This method, “Expectation Propagation,” unifies and generalizes two previous techniques: assumed-density filtering, an extension of the Kalman filter, and loopy belief propagation, an extension of belief propagation in Bayesian networks. The unification shows how both of these algorithms can be viewed as approximating the true posterior distribution with a simpler distribution, which is close in the sense of KL-divergence. Expectation Propagation exploits the best of both algorithms: the generality of assumed-density filtering and the accuracy of loopy belief propagation. Loopy belief propagation, because it propagates exact belief states, is useful for limited types of belief networks, such as purely discrete networks. Expectation Propagation approximates the belief states with expectations, such as means and variances, giving it much wider scope. Expectation Propagation also extends belief propagation in the opposite direction—propagating richer belief states which incorporate correlations between variables. This framework is demonstrated in a variety of statistical models using synthetic and real-world data. On Gaussian mixture problems, Expectation Propagation is found, for the same amount of computation, to be convincingly better than rival approximation techniques: Monte Carlo, Laplace's method, and variational Bayes. For pattern recognition, Expectation Propagation provides an algorithm for training Bayes Point Machine classifiers that is faster and more accurate than any previously known. The resulting classifiers outperform Support Vector Machines on several standard datasets, in addition to having a comparable training time. Expectation Propagation can also be used to choose an appropriate feature set for classification, via Bayesian model selection. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
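
The core operation that assumed-density filtering and Expectation Propagation apply repeatedly is projection onto a simpler family by minimizing KL divergence; for a Gaussian approximating family this reduces to matching means and variances. A minimal sketch of that single moment-matching step (not the thesis's full algorithm), on a hypothetical one-dimensional Gaussian mixture:

```python
import numpy as np

# Minimizing KL(p || q) over Gaussian q matches the mean and variance of p.
# Here p is a hypothetical two-component 1-D Gaussian mixture; this single
# moment-matching (projection) step is what ADF and EP repeat on each factor.
weights = np.array([0.7, 0.3])
means = np.array([-1.0, 2.0])
variances = np.array([0.5, 1.5])

q_mean = np.sum(weights * means)
q_var = np.sum(weights * (variances + means ** 2)) - q_mean ** 2   # E[x^2] - (E[x])^2

print(f"matched Gaussian approximation: mean = {q_mean:.3f}, variance = {q_var:.3f}")
```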

1,036 citations


Journal ArticleDOI
TL;DR: A Bayesian approach is used to draw inferences about disease prevalence and test properties while adjusting for the possibility of conditional dependence between tests, particularly in the common case where results from fewer than four tests are available.
Abstract: Many analyses of results from multiple diagnostic tests assume the tests are statistically independent conditional on the true disease status of the subject. This assumption may be violated in practice, especially in situations where none of the tests is a perfectly accurate gold standard. Classical inference for models accounting for the conditional dependence between tests requires that results from at least four different tests be used in order to obtain an identifiable solution, but it is not always feasible to have results from this many tests. We use a Bayesian approach to draw inferences about the disease prevalence and test properties while adjusting for the possibility of conditional dependence between tests, particularly when we have only two tests. We propose both fixed and random effects models. Since with fewer than four tests the problem is nonidentifiable, the posterior distributions are strongly dependent on the prior information about the test properties and the disease prevalence, even with large sample sizes. If the degree of correlation between the tests is known a priori with high precision, then our methods adjust for the dependence between the tests. Otherwise, our methods provide adjusted inferences that incorporate all of the uncertainty inherent in the problem, typically resulting in wider interval estimates. We illustrate our methods using data from a study on the prevalence of Strongyloides infection among Cambodian refugees to Canada.

434 citations


Journal ArticleDOI
TL;DR: The Bayesian methods discussed are illustrated by means of a meta-analysis examining the evidence relating to electronic fetal heart rate monitoring and perinatal mortality in which evidence is available from a variety of sources.
Abstract: This paper reviews the use of Bayesian methods in meta-analysis. Whilst there has been an explosion in the use of meta-analysis over the last few years, driven mainly by the move towards evidence-based healthcare, so too Bayesian methods are being used increasingly within medical statistics. Whilst in many meta-analysis settings the Bayesian models used mirror those previously adopted in a frequentist formulation, there are a number of specific advantages conferred by the Bayesian approach. These include: full allowance for all parameter uncertainty in the model, the ability to include other pertinent information that would otherwise be excluded, and the ability to extend the models to accommodate more complex, but frequently occurring, scenarios. The Bayesian methods discussed are illustrated by means of a meta-analysis examining the evidence relating to electronic fetal heart rate monitoring and perinatal mortality in which evidence is available from a variety of sources.

427 citations


Journal ArticleDOI
TL;DR: By combining sequential model selection procedures, the online VB method provides a fully online learning method with a model selection mechanism and was able to adapt the model structure to dynamic environments.
Abstract: The Bayesian framework provides a principled way of model selection. This framework estimates a probability distribution over an ensemble of models, and the prediction is done by averaging over the ensemble of models. Accordingly, the uncertainty of the models is taken into account, and complex models with more degrees of freedom are penalized. However, integration over model parameters is often intractable, and some approximation scheme is needed. Recently, a powerful approximation scheme, called the variational Bayes (VB) method, has been proposed. This approach defines the free energy for a trial probability distribution, which approximates a joint posterior probability distribution over model parameters and hidden variables. The exact maximization of the free energy gives the true posterior distribution. The VB method uses factorized trial distributions. The integration over model parameters can be done analytically, and an iterative expectation-maximization-like algorithm, whose convergence is guaranteed, is derived. In this article, we derive an online version of the VB algorithm and prove its convergence by showing that it is a stochastic approximation for finding the maximum of the free energy. By combining sequential model selection procedures, the online VB method provides a fully online learning method with a model selection mechanism. In preliminary experiments using synthetic data, the online VB method was able to adapt the model structure to dynamic environments.

415 citations


Proceedings ArticleDOI
01 Dec 2001
TL;DR: A new method is proposed for constructing genetic networks from gene expression data by using Bayesian networks; it uses nonparametric regression to capture nonlinear relationships between genes and derives a new criterion for choosing the network in general situations.
Abstract: We propose a new method for constructing genetic networks from gene expression data by using Bayesian networks. We use nonparametric regression to capture nonlinear relationships between genes and derive a new criterion for choosing the network in general situations. In a theoretical sense, our proposed theory and methodology include previous methods based on the Bayes approach. We applied the proposed method to the S. cerevisiae cell cycle data and showed the effectiveness of our method by comparison with previous methods.

332 citations


Book ChapterDOI
TL;DR: The results show that the proposed BN and Bayes multinet classifiers are competitive with (or superior to) the best known classifiers; and that the computational time for learning and using these classifiers is relatively small, arguing that BN-based classifiers deserve more attention in the data mining community.
Abstract: This paper investigates the methods for learning predictive classifiers based on Bayesian belief networks (BN) - primarily unrestricted Bayesian networks and Bayesian multi-nets. We present our algorithms for learning these classifiers, and discuss how these methods address the overfitting problem and provide a natural method for feature subset selection. Using a set of standard classification problems, we empirically evaluate the performance of various BN-based classifiers. The results show that the proposed BN and Bayes multinet classifiers are competitive with (or superior to) the best known classifiers, based on both BN and other formalisms; and that the computational time for learning and using these classifiers is relatively small. These results argue that BN-based classifiers deserve more attention in the data mining community.

312 citations


Book
01 Jan 2001
TL;DR: In this book, Clark Glymour provides an informal introduction to the basic assumptions, algorithms, and techniques of causal Bayes nets and graphical causal models in the context of psychological examples, and demonstrates their potential as a powerful tool for guiding experimental inquiry and for interpreting results in developmental psychology, cognitive neuropsychology, psychometrics, social psychology, and studies of adult judgment.
Abstract: In recent years, small groups of statisticians, computer scientists, and philosophers have developed an account of how partial causal knowledge can be used to compute the effect of actions and how causal relations can be learned, at least by computers. The representations used in the emerging theory are causal Bayes nets or graphical causal models. In his new book, Clark Glymour provides an informal introduction to the basic assumptions, algorithms, and techniques of causal Bayes nets and graphical causal models in the context of psychological examples. He demonstrates their potential as a powerful tool for guiding experimental inquiry and for interpreting results in developmental psychology, cognitive neuropsychology, psychometrics, social psychology, and studies of adult judgment. Using Bayes net techniques, Glymour suggests novel experiments to distinguish among theories of human causal learning and reanalyzes various experimental results that have been interpreted or misinterpreted -- without the benefit of Bayes nets and graphical causal models. The capstone illustration is an analysis of the methods used in Herrnstein and Murray's book The Bell Curve; Glymour argues that new, more reliable methods of data analysis, based on Bayes nets representations, would lead to very different conclusions from those advocated by Herrnstein and Murray.

Journal ArticleDOI
TL;DR: This paper proposes a wavelet-based denoising technique without any free parameters, and uses empirical Bayes estimation based on a Jeffreys' noninformative prior to produce a remarkably simple fixed nonlinear shrinkage/thresholding rule which performs better than other more computationally demanding methods.
Abstract: The sparseness and decorrelation properties of the discrete wavelet transform have been exploited to develop powerful denoising methods. However, most of these methods have free parameters which have to be adjusted or estimated. In this paper, we propose a wavelet-based denoising technique without any free parameters; it is, in this sense, a "universal" method. Our approach uses empirical Bayes estimation based on a Jeffreys' noninformative prior; it is a step toward objective Bayesian wavelet-based denoising. The result is a remarkably simple fixed nonlinear shrinkage/thresholding rule which performs better than other more computationally demanding methods.
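
A hedged sketch of parameter-free wavelet shrinkage in the spirit of the abstract, using PyWavelets. The specific gain rule (y^2 - 3*sigma^2)_+ / y and the MAD-based noise estimate are assumptions of the same family as, but not necessarily identical to, the paper's rule.

```python
import numpy as np
import pywt  # PyWavelets

def denoise(signal, wavelet="db8", level=4):
    """Parameter-free wavelet shrinkage (sketch; rule and noise estimate assumed)."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Noise level from the finest-scale detail coefficients (MAD estimator).
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    shrunk = [coeffs[0]]                              # keep approximation coefficients
    for d in coeffs[1:]:
        # Gain form of the rule (y^2 - 3*sigma^2)_+ / y, written to avoid 0/0.
        gain = np.maximum(1.0 - 3.0 * sigma ** 2 / np.maximum(d ** 2, 1e-300), 0.0)
        shrunk.append(d * gain)
    return pywt.waverec(shrunk, wavelet)[: len(signal)]

# Hypothetical noisy signal.
rng = np.random.default_rng(2)
t = np.linspace(0, 1, 1024)
clean = np.sin(2 * np.pi * 5 * t) * (t > 0.3)
noisy = clean + rng.normal(0, 0.3, t.size)
print("RMSE noisy   :", round(float(np.sqrt(np.mean((noisy - clean) ** 2))), 3))
print("RMSE denoised:", round(float(np.sqrt(np.mean((denoise(noisy) - clean) ** 2))), 3))
```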

Journal ArticleDOI
TL;DR: A general Bayesian approach for using a computer model or simulator of a complex system to forecast system outcomes is described, based on constructing beliefs derived from a combination of expert judgments and experiments on the computer model.
Abstract: Although computer models are often used for forecasting future outcomes of complex systems, the uncertainties in such forecasts are not usually treated formally. We describe a general Bayesian approach for using a computer model or simulator of a complex system to forecast system outcomes. The approach is based on constructing beliefs derived from a combination of expert judgments and experiments on the computer model. These beliefs, which are systematically updated as we make runs of the computer model, are used for either Bayesian or Bayes linear forecasting for the system. Issues of design and diagnostics are described in the context of forecasting. The methodology is applied to forecasting for an active hydrocarbon reservoir.

Journal ArticleDOI
TL;DR: Bayesian model averaging is proposed as a formal way of taking account of model uncertainty in case-control studies, and this yields an easily interpreted summary, the posterior probability that a variable is a risk factor, and is indicated to be reasonably well calibrated in the situations simulated.
Abstract: Covariate and confounder selection in case-control studies is often carried out using a statistical variable selection method, such as a two-step method or a stepwise method in logistic regression. Inference is then carried out conditionally on the selected model, but this ignores the model uncertainty implicit in the variable selection process, and so may underestimate uncertainty about relative risks. We report on a simulation study designed to be similar to actual case-control studies. This shows that p-values computed after variable selection can greatly overstate the strength of conclusions. For example, for our simulated case-control studies with 1000 subjects, of variables declared to be 'significant' with p-values between 0.01 and 0.05, only 49 per cent actually were risk factors when stepwise variable selection was used. We propose Bayesian model averaging as a formal way of taking account of model uncertainty in case-control studies. This yields an easily interpreted summary, the posterior probability that a variable is a risk factor, and our simulation study indicates this to be reasonably well calibrated in the situations simulated. The methods are applied and compared in the context of a case-control study of cervical cancer.
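
A hedged sketch of Bayesian model averaging over covariate subsets using the BIC approximation to posterior model probabilities; the posterior probability that a variable is a risk factor is the sum of the probabilities of the models that contain it. The data, variable names, and the use of plain BIC with equal prior model probabilities are illustrative assumptions, not the paper's implementation.

```python
from itertools import combinations

import numpy as np
import statsmodels.api as sm

def bma_inclusion_probs(y, X, names):
    """Posterior inclusion probabilities via the BIC approximation to BMA (sketch)."""
    n, p = X.shape
    bics, members = [], []
    for k in range(p + 1):
        for subset in combinations(range(p), k):
            design = sm.add_constant(X[:, list(subset)]) if subset else np.ones((n, 1))
            fit = sm.Logit(y, design).fit(disp=0)
            bics.append(-2 * fit.llf + design.shape[1] * np.log(n))
            members.append(set(subset))
    bics = np.array(bics)
    weights = np.exp(-0.5 * (bics - bics.min()))      # equal prior model probabilities
    weights /= weights.sum()
    return {names[j]: float(sum(w for w, m in zip(weights, members) if j in m))
            for j in range(p)}

# Hypothetical case-control-style data: x0 is a real risk factor, x1 and x2 are noise.
rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 3))
y = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 0.8 * X[:, 0]))))
print(bma_inclusion_probs(y, X, ["x0", "x1", "x2"]))
```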

Journal ArticleDOI
TL;DR: Building on Gigerenzer and Hoffrage's ecological framework, the authors wrote a computerized tutorial program to train people to construct frequency representations (representation training) rather than to insert probabilities into Bayes's rule (rule training).
Abstract: The authors present and test a new method of teaching Bayesian reasoning, something about which previous teaching studies reported little success. Based on G. Gigerenzer and U. Hoffrage's (1995) ecological framework, the authors wrote a computerized tutorial program to train people to construct frequency representations (representation training) rather than to insert probabilities into Bayes's rule (rule training). Bayesian computations are simpler to perform with natural frequencies than with probabilities, and there are evolutionary reasons for assuming that cognitive algorithms have been developed to deal with natural frequencies. In 2 studies, the authors compared representation training with rule training; the criteria were an immediate learning effect, transfer to new problems, and long-term temporal stability. Rule training was as good in transfer as representation training, but representation training had a higher immediate learning effect and greater temporal stability.
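
A small worked example, with hypothetical numbers, computing the same posterior two ways: by inserting probabilities into Bayes's rule and by counting natural frequencies in an imagined population.

```python
# Same Bayesian update done two ways (all numbers hypothetical).
base_rate = 0.01          # P(disease)
sensitivity = 0.80        # P(positive | disease)
false_positive = 0.096    # P(positive | no disease)

# 1) Probability format: insert the values into Bayes's rule.
posterior = (sensitivity * base_rate) / (
    sensitivity * base_rate + false_positive * (1 - base_rate))

# 2) Natural-frequency format: imagine 1000 people and just count.
population = 1000
sick = base_rate * population                                 # 10 people are sick
sick_and_positive = sensitivity * sick                        # 8 of them test positive
healthy_and_positive = false_positive * (population - sick)   # about 95 false alarms
posterior_freq = sick_and_positive / (sick_and_positive + healthy_and_positive)

print(round(posterior, 3), round(posterior_freq, 3))   # identical answers, about 0.078
```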

Journal ArticleDOI
TL;DR: A maximum likelihood method is presented, and then a more powerful Bayesian approach for estimating this sample partition, and a hierarchical clustering algorithm is applied to identify clusters of individuals whose assignment together is well supported by the posterior distribution.
Abstract: We present likelihood-based methods for assigning the individuals in a sample to source populations, on the basis of their genotypes at co-dominant marker loci. The source populations are assumed to be at Hardy-Weinberg and linkage equilibrium, but the allelic composition of these source populations and even the number of source populations represented in the sample are treated as uncertain. The parameter of interest is the partition of the set of sampled individuals, induced by the assignment of individuals to source populations. We present a maximum likelihood method, and then a more powerful Bayesian approach for estimating this sample partition. In general, it will not be feasible to evaluate the evidence supporting each possible partition of the sample. Furthermore, when the number of individuals in the sample is large, it may not even be feasible to evaluate the evidence supporting, individually, each of the most plausible partitions because there may be many individuals which are difficult to assign. To overcome these problems, we use low-dimensional marginals (the 'co-assignment probabilities') of the posterior distribution of the sample partition as measures of 'similarity', and then apply a hierarchical clustering algorithm to identify clusters of individuals whose assignment together is well supported by the posterior distribution. A binary tree provides a visual representation of how well the posterior distribution supports each cluster in the hierarchy. These methods are applicable to other problems where the parameter of interest is a partition of a set. Because the co-assignment probabilities are independent of the arbitrary labelling of source populations, we avoid the label-switching problem of previous Bayesian methods.
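
A hedged sketch of the post-processing step described: estimate pairwise co-assignment probabilities from posterior samples of the partition, turn them into a dissimilarity, and build the binary tree with SciPy's hierarchical clustering. The "MCMC draws" here are hard-coded stand-ins for the output of the Bayesian sampler.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Hypothetical posterior samples of the partition: each row is one draw giving a
# source-population label for each of 8 sampled individuals.
draws = np.array([
    [0, 0, 0, 1, 1, 1, 2, 2],
    [0, 0, 0, 1, 1, 1, 1, 2],
    [0, 0, 1, 1, 1, 1, 2, 2],
    [0, 0, 0, 2, 2, 2, 1, 1],   # labels are arbitrary; only co-assignment matters
])

n = draws.shape[1]
# Co-assignment probability: fraction of draws in which individuals i and j share a label.
coassign = np.zeros((n, n))
for labels in draws:
    coassign += (labels[:, None] == labels[None, :])
coassign /= draws.shape[0]

# Use 1 - co-assignment as a dissimilarity and build the binary tree.
dist = squareform(1.0 - coassign, checks=False)
tree = linkage(dist, method="average")
print(fcluster(tree, t=0.5, criterion="distance"))   # clusters well supported at 0.5
```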

Journal ArticleDOI
TL;DR: In this paper, the similarities and differences between classical and Bayesian methods are explored, and it is shown that they result in virtually equivalent conditional estimates of partworths for customers.
Abstract: An exciting development in modeling has been the ability to estimate reliable individual-level parameters for choice models. Individual partworths derived from these parameters have been very useful in segmentation, identifying extreme individuals, and in creating appropriate choice simulators. In marketing, hierarchical Bayes models have taken the lead in combining information about the aggregate distribution of tastes with the individual's choices to arrive at a conditional estimate of the individual's parameters. In economics, the same behavioral model has been derived from a classical rather than a Bayesian perspective. That is, instead of Gibbs sampling, the method of maximum simulated likelihood provides estimates of both the aggregate and the individual parameters. This paper explores the similarities and differences between classical and Bayesian methods and shows that they result in virtually equivalent conditional estimates of partworths for customers. Thus, the choice between Bayesian and classical estimation becomes one of implementation convenience and philosophical orientation, rather than pragmatic usefulness.

Journal ArticleDOI
TL;DR: It is concluded that both sensitivity analyses and MCRA should begin with the same type of prior specification effort as Bayesian analysis.
Abstract: Standard statistical methods understate the uncertainty one should attach to effect estimates obtained from observational data. Among the methods used to address this problem are sensitivity analysis, Monte Carlo risk analysis (MCRA), and Bayesian uncertainty assessment. Estimates from MCRAs have been presented as if they were valid frequentist or Bayesian results, but examples show that they need not be either in actual applications. It is concluded that both sensitivity analyses and MCRA should begin with the same type of prior specification effort as Bayesian analysis.
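
A hedged sketch of a Monte Carlo risk analysis for a single bias source, an unmeasured binary confounder, using the standard external-adjustment formula for a risk ratio. The observed estimate and the priors on the bias parameters are hypothetical, and conventional random error is deliberately left out; as the abstract notes, such output is not automatically a valid Bayesian or frequentist result without careful prior specification.

```python
import numpy as np

# Monte Carlo risk analysis for an unmeasured binary confounder (sketch).
# Standard external-adjustment formula for a risk ratio:
#   bias = [p1*(RRcd - 1) + 1] / [p0*(RRcd - 1) + 1],   RR_adj = RR_obs / bias,
# where p1, p0 are confounder prevalences among exposed/unexposed and RRcd is
# the confounder-disease risk ratio.  All numbers below are hypothetical.
rng = np.random.default_rng(4)
draws = 100_000
rr_observed = 1.7

rr_cd = np.exp(rng.normal(np.log(2.0), 0.3, draws))   # prior: RRcd centered near 2
p1 = rng.beta(6, 4, draws)                            # confounder prevalence, exposed
p0 = rng.beta(4, 6, draws)                            # confounder prevalence, unexposed

bias = (p1 * (rr_cd - 1) + 1) / (p0 * (rr_cd - 1) + 1)
rr_adjusted = rr_observed / bias

print("median adjusted RR:", np.round(np.median(rr_adjusted), 2))
print("95% simulation interval:", np.round(np.percentile(rr_adjusted, [2.5, 97.5]), 2))
```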

Journal ArticleDOI
TL;DR: Using an empirical Bayes approach, it is shown how the hyperparameters can be estimated in a way that is both computationally feasible and statistically valid.
Abstract: The wide applicability of Gibbs sampling has increased the use of more complex and multi-level hierarchical models. To use these models entails dealing with hyperparameters in the deeper levels of a hierarchy. There are three typical methods for dealing with these hyperparameters: specify them, estimate them, or use a 'flat' prior. Each of these strategies has its own associated problems. In this paper, using an empirical Bayes approach, we show how the hyperparameters can be estimated in a way that is both computationally feasible and statistically valid.
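
A minimal empirical Bayes sketch for the normal-normal hierarchy: the hyperparameters are estimated by maximizing the marginal likelihood and then plugged in, rather than being fixed a priori or given a flat prior. This is far simpler than the Gibbs-sampling settings the paper targets, but it shows the estimation idea; all data are simulated.

```python
import numpy as np
from scipy.optimize import minimize

# Empirical Bayes for the normal-normal hierarchy (sketch):
#   y_i | theta_i ~ N(theta_i, s_i^2),   theta_i ~ N(mu, tau^2).
# Marginally y_i ~ N(mu, tau^2 + s_i^2), so (mu, tau) can be estimated by
# maximizing the marginal likelihood and then plugged back in.
rng = np.random.default_rng(5)
true_theta = rng.normal(1.0, 0.8, size=30)
s = rng.uniform(0.3, 1.0, size=30)          # known sampling standard deviations
y = rng.normal(true_theta, s)

def neg_log_marginal(params):
    mu, log_tau = params
    v = np.exp(2 * log_tau) + s ** 2
    return 0.5 * np.sum(np.log(2 * np.pi * v) + (y - mu) ** 2 / v)

mu_hat, log_tau_hat = minimize(neg_log_marginal, x0=[0.0, 0.0]).x
tau2_hat = np.exp(2 * log_tau_hat)

# Plug-in posterior means shrink each y_i toward mu_hat.
shrink = s ** 2 / (s ** 2 + tau2_hat)
theta_hat = shrink * mu_hat + (1 - shrink) * y
print(f"mu_hat = {mu_hat:.2f}, tau_hat = {np.sqrt(tau2_hat):.2f}")
```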

Journal ArticleDOI
TL;DR: Both Bayesian and frequentist schools of inference are established, and now neither of them has operational difficulties, with the exception of some complex cases.
Abstract: Frequentist and Bayesian approaches to scientific inference in animal breeding are discussed. Routine methods in animal breeding (selection index, BLUP, ML, REML) are presented under the hypotheses of both schools of inference, and their properties are examined in both cases. The Bayesian approach is discussed in cases in which prior information is available, prior information is available under certain hypotheses, prior information is vague, and there is no prior information. Bayesian prediction of genetic values and genetic parameters are presented. Finally, the frequentist and Bayesian approaches are compared from a theoretical and a practical point of view. Some problems for which Bayesian methods can be particularly useful are discussed. Both Bayesian and frequentist schools of inference are established, and now neither of them has operational difficulties, with the exception of some complex cases. There is software available to analyze a large variety of problems from either point of view. The choice of one school or the other should be related to whether there are solutions in one school that the other does not offer, to how easily the problems are solved, and to how comfortable scientists feel with the way they convey their results.

Posted Content
TL;DR: This chapter provides an analytical framework to quantify the improvements in classification results due to combining and derives expressions that indicate how much the median, the maximum and in general the i-th order statistic can improve classifier performance.
Abstract: Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical framework to quantify the improvements in classification results due to combining. The results apply to both linear combiners and order statistics combiners. We first show that, to a first-order approximation, the error rate obtained over and above the Bayes error rate is directly proportional to the variance of the actual decision boundaries around the Bayes optimum boundary. Combining classifiers in output space reduces this variance, and hence reduces the 'added' error. If N unbiased classifiers are combined by simple averaging, the added error rate can be reduced by a factor of N if the individual errors in approximating the decision boundaries are uncorrelated. Expressions are then derived for linear combiners which are biased or correlated, and the effect of output correlations on ensemble performance is quantified. For order statistics based non-linear combiners, we derive expressions that indicate how much the median, the maximum and, in general, the i-th order statistic can improve classifier performance. The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space. Experimental results on several public domain data sets are provided to illustrate the benefits of combining and to support the analytical results.
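
A small simulation, with hypothetical distributions, of the first-order claim: averaging the decision boundaries of N unbiased classifiers with uncorrelated errors cuts the boundary variance by a factor of N, and the added error above the Bayes rate shrinks roughly in proportion.

```python
import numpy as np
from scipy.stats import norm

# Two 1-D Gaussian classes N(-1,1) and N(+1,1) with equal priors: the Bayes
# boundary is x = 0 and the Bayes error rate is Phi(-1), about 0.159.  Each
# hypothetical classifier places its boundary at 0 plus independent noise;
# simple averaging of N such boundaries reduces the boundary variance by N.
rng = np.random.default_rng(6)

def error_rate(boundary):
    """Error of the rule 'predict class +1 if x > boundary'."""
    return 0.5 * norm.sf(boundary + 1) + 0.5 * norm.cdf(boundary - 1)

bayes_error = error_rate(0.0)
trials, boundary_sd = 10_000, 0.5
for n_classifiers in (1, 2, 5, 10, 25):
    boundaries = rng.normal(0.0, boundary_sd, size=(trials, n_classifiers))
    combined = boundaries.mean(axis=1)                 # simple averaging
    added = error_rate(combined).mean() - bayes_error
    print(f"N = {n_classifiers:3d}   mean added error = {added:.4f}")
```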

Book
01 Jan 2001
TL;DR: The practical potential of empirical Bayes, Herbert Robbins' most influential contribution to statistical theory, is realized in the analysis of microarrays, a new biogenetic technology for the simultaneous measurement of thousands of gene expression levels.
Abstract: Empirical Bayes was Herbert Robbins' most influential contribution to statistical theory. It is also an idea of great practical potential. That potential is realized in the analysis of microarrays, a new biogenetic technology for the simultaneous measurement of thousands of gene expression levels.

Journal ArticleDOI
TL;DR: The authors show that when enough training data are present, excess hidden units do not substantially degrade the accuracy of Bayesian ANNs, and the minimum number of hidden units required to best model the optimal mapping function varies with the complexity of the data.
Abstract: It is well understood that the optimal classification decision variable is the likelihood ratio or any monotonic transformation of the likelihood ratio. An automated classifier which maps from an input space to one of the likelihood ratio family of decision variables is an optimal classifier or "ideal observer." Artificial neural networks (ANNs) are frequently used as classifiers for many problems. In the limit of large training sample sizes, an ANN approximates a mapping function which is a monotonic transformation of the likelihood ratio, i.e., it estimates an ideal observer decision variable. A principal disadvantage of conventional ANNs is the potential over-parameterization of the mapping function which results in a poor approximation of an optimal mapping function for smaller training samples. Recently, Bayesian methods have been applied to ANNs in order to regularize training to improve the robustness of the classifier. The goal of training a Bayesian ANN with finite sample sizes is, as with unlimited data, to approximate the ideal observer. The authors have evaluated the accuracy of Bayesian ANN models of ideal observer decision variables as a function of the number of hidden units used, the signal-to-noise ratio of the data and the number of features or dimensionality of the data. The authors show that when enough training data are present, excess hidden units do not substantially degrade the accuracy of Bayesian ANNs. However, the minimum number of hidden units required to best model the optimal mapping function varies with the complexity of the data.

MonographDOI
TL;DR: In this paper, the authors present a presentation enriched with examples drawn from all manner of applications, e.g., genetics, filtering, the Black-Scholes option-pricing formula, quantum probability and computing, and classical and modern statistical models.
Abstract: Statistics do not lie, nor is probability paradoxical. You just have to have the right intuition. In this lively look at both subjects, David Williams convinces mathematics students of the intrinsic interest of statistics and probability, and statistics students that the language of mathematics can bring real insight and clarity to their subject. He helps students build the intuition needed, in a presentation enriched with examples drawn from all manner of applications, e.g., genetics, filtering, the Black–Scholes option-pricing formula, quantum probability and computing, and classical and modern statistical models. Statistics chapters present both the Frequentist and Bayesian approaches, emphasising Confidence Intervals rather than Hypothesis Tests, and include Gibbs-sampling techniques for the practical implementation of Bayesian methods. A central chapter gives the theory of Linear Regression and ANOVA, and explains how MCMC methods allow greater flexibility in modelling. C or WinBUGS code is provided for computational examples and simulations. Many exercises are included; hints or solutions are often provided.

Journal ArticleDOI
TL;DR: Bayesian computations for this curve in the case where data on both costs and efficacy are available from a clinical trial are presented, leading to a more conclusive assessment of cost‐effectiveness.
Abstract: A key tool for assessing the relative cost-effectiveness of two treatments in health economics is the incremental C/E acceptability curve. We present Bayesian computations for this curve in the case where data on both costs and efficacy are available from a clinical trial. Analysis is given under various formulations of prior information. A case study is analysed in which reasonable prior information is shown to strengthen substantially the posterior inference, leading to a more conclusive assessment of cost-effectiveness. Calculations can be performed using readily available Bayesian software.
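
A hedged sketch of computing a cost-effectiveness acceptability curve from posterior draws of incremental efficacy and cost. The bivariate normal "posterior" below is a stand-in for the trial-based Bayesian analysis in the paper, and all numbers are hypothetical.

```python
import numpy as np

# Cost-effectiveness acceptability curve from posterior draws (sketch).
rng = np.random.default_rng(7)
mean = [0.05, 300.0]                          # incremental efficacy, incremental cost
cov = [[0.02 ** 2, 0.5 * 0.02 * 150],         # sd(dE)=0.02, sd(dC)=150, corr=0.5
       [0.5 * 0.02 * 150, 150.0 ** 2]]
delta_e, delta_c = rng.multivariate_normal(mean, cov, size=50_000).T

for lam in (2000, 5000, 10000, 20000):        # willingness to pay per unit of efficacy
    prob = np.mean(lam * delta_e - delta_c > 0)    # P(net benefit > 0 | data)
    print(f"lambda = {lam:6d}   P(cost-effective) = {prob:.3f}")
```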

Journal ArticleDOI
TL;DR: This paper introduces a Robust Bayes Classifier able to handle incomplete databases with no assumption about the pattern of missing data and provides two scoring methods to rank intervals and a decision theoretic approach to trade off the risk of an erroneous classification and the choice of not classifying unequivocally a case.

Journal ArticleDOI
TL;DR: The authors present an analysis of the choice of sample sizes for demonstrating cost-effectiveness of a new treatment or procedure, when data on both cost and efficacy will be collected in a clinical trial.
Abstract: The authors present an analysis of the choice of sample sizes for demonstrating cost-effectiveness of a new treatment or procedure, when data on both cost and efficacy will be collected in a clinical trial. The Bayesian approach to statistics is employed, as well as a novel Bayesian criterion that provides insight into the sample size problem and offers a very flexible formulation.

Journal ArticleDOI
01 Nov 2001 - Genetics
TL;DR: In this article, an approximate method for the analysis of quantitative trait loci based on model selection from multiple regression models with trait values regressed on marker genotypes, using a modification of the easily calculated Bayesian information criterion to estimate the posterior probability of models with various subsets of markers as variables.
Abstract: We describe an approximate method for the analysis of quantitative trait loci (QTL) based on model selection from multiple regression models with trait values regressed on marker genotypes, using a modification of the easily calculated Bayesian information criterion to estimate the posterior probability of models with various subsets of markers as variables. The BIC-delta criterion, with the parameter delta increasing the penalty for additional variables in a model, is further modified to incorporate prior information, and missing values are handled by multiple imputation. Marginal probabilities for model sizes are calculated, and the posterior probability of nonzero model size is interpreted as the posterior probability of existence of a QTL linked to one or more markers. The method is demonstrated on analysis of associations between wood density and markers on two linkage groups in Pinus radiata. Selection bias, which is the bias that results from using the same data to both select the variables in a model and estimate the coefficients, is shown to be a problem for commonly used non-Bayesian methods for QTL mapping, which do not average over alternative possible models that are consistent with the data.
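
A hedged sketch of the model-selection computation described: score trait-on-marker regression models with a BIC-type criterion whose per-variable penalty is inflated by a factor delta, convert the scores to approximate posterior model probabilities, and read off the posterior probability of a nonzero model size. The exact form of the paper's BIC-delta penalty, the prior handling, and the treatment of missing genotypes are all treated as assumptions here.

```python
from itertools import combinations

import numpy as np

def qtl_model_probs(y, markers, delta=2.0, max_size=2):
    """Approximate posterior model probabilities from a BIC-type score (sketch).

    Each model regresses the trait on a subset of marker genotype columns.
    score = n*log(RSS/n) + delta*k*log(n); prob is proportional to exp(-score/2).
    delta > 1 simply strengthens the penalty per extra marker in this sketch.
    """
    n, p = markers.shape
    subsets = [s for k in range(max_size + 1) for s in combinations(range(p), k)]
    scores = []
    for s in subsets:
        X = np.column_stack([np.ones(n)] + [markers[:, j] for j in s])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = np.sum((y - X @ beta) ** 2)
        scores.append(n * np.log(rss / n) + delta * len(s) * np.log(n))
    scores = np.array(scores)
    probs = np.exp(-0.5 * (scores - scores.min()))
    probs /= probs.sum()
    p_qtl = probs[[i for i, s in enumerate(subsets) if s]].sum()   # nonzero model size
    return p_qtl, dict(zip(subsets, np.round(probs, 3)))

# Hypothetical data: 200 trees, 5 markers coded 0/1, marker 2 linked to a QTL.
rng = np.random.default_rng(8)
markers = rng.integers(0, 2, size=(200, 5)).astype(float)
y = 10 + 0.8 * markers[:, 2] + rng.normal(0, 1, 200)
p_qtl, table = qtl_model_probs(y, markers)
print("P(at least one linked QTL):", round(float(p_qtl), 3))
```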

Journal ArticleDOI
TL;DR: A semiparametric cure rate model with a smoothing parameter that controls the degree of parametricity in the right tail of the survival distribution is proposed and it is shown that such a parameter is crucial for these kinds of models and can have an impact on the posterior estimates.
Abstract: We propose methods for Bayesian inference for a new class of semiparametric survival models with a cure fraction. Specifically, we propose a semiparametric cure rate model with a smoothing parameter that controls the degree of parametricity in the right tail of the survival distribution. We show that such a parameter is crucial for these kinds of models and can have an impact on the posterior estimates. Several novel properties of the proposed model are derived. In addition, we propose a class of improper noninformative priors based on this model and examine the properties of the implied posterior. Also, a class of informative priors based on historical data is proposed and its theoretical properties are investigated. A case study involving a melanoma clinical trial is discussed in detail to demonstrate the proposed methodology.

Journal ArticleDOI
TL;DR: Probability is a guide to life partly because it is a guide to causality, but the author argues, against work based on Bayes nets, that probability is not a very sure guide to causality.
Abstract: Probability is a guide to life partly because it is a guide to causality. Work over the last two decades using Bayes nets supposes that probability is a very sure guide to causality. I think not, and I shall argue that here. Almost all the objections I list are well-known. But I have come to see them in a different light by reflecting again on the original work in this area by Wolfgang Spohn and his recent defense of it in a paper titled "Bayesian Nets Are All There Is to Causality" [1].