
Showing papers on "Bayes' theorem published in 1995"


Journal ArticleDOI
TL;DR: By analyzing several thousand solutions to Bayesian problems, the authors found that when information was presented in frequency formats, statistically naive participants derived up to 50% of all inferences by Bayesian algorithms.
Abstract: Is the mind, by design, predisposed against performing Bayesian inference? Previous research on base rate neglect suggests that the mind lacks the appropriate cognitive algorithms. However, any claim against the existence of an algorithm, Bayesian or otherwise, is impossible to evaluate unless one specifies the information format in which it is designed to operate. The authors show that Bayesian algorithms are computationally simpler in frequency formats than in the probability formats used in previous research. Frequency formats correspond to the sequential way information is acquired in natural sampling, from animal foraging to neural networks. By analyzing several thousand solutions to Bayesian problems, the authors found that when information was presented in frequency formats, statistically naive participants derived up to 50% of all inferences by Bayesian algorithms. Non-Bayesian algorithms included simple versions of Fisherian and Neyman-Pearsonian inference. Is the mind, by design, predisposed against performing Bayesian inference? The classical probabilists of the Enlightenment, including Condorcet, Poisson, and Laplace, equated probability theory with the common sense of educated people, who were known then as "hommes éclairés." Laplace (1814/1951) declared that "the theory of probability is at bottom nothing more than good sense reduced to a calculus which evaluates that which good minds know by a sort of instinct, without being able to explain how with precision" (p. 196). The available mathematical tools, in particular the theorems of Bayes and Bernoulli, were seen as descriptions of actual human judgment (Daston, 1981, 1988). However, the years of political upheaval during the French Revolution prompted Laplace, unlike earlier writers such as Condorcet, to issue repeated disclaimers that probability theory, because of the interference of passion and desire, could not account for all relevant factors in human judgment. The Enlightenment view—that the laws of probability are the laws of the mind—moderated as it was through the French Revolution, had a profound influence on 19th- and 20th-century science. This view became the starting point for seminal contributions to mathematics, as when George Boole
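The computational contrast the authors describe can be illustrated with a small worked example; the numbers below are hypothetical, not the problems used in the study.

```python
# Illustration only: hypothetical numbers.

# Probability format: Bayes' theorem applied to single-event probabilities.
base_rate = 0.01        # P(disease)
sensitivity = 0.80      # P(positive | disease)
false_alarm = 0.096     # P(positive | no disease)

p_posterior = (sensitivity * base_rate) / (
    sensitivity * base_rate + false_alarm * (1 - base_rate)
)

# Frequency format: the same inference with natural frequencies, e.g.
# "of 1,000 people, 10 have the disease; 8 of them test positive,
#  as do 95 of the 990 who do not."  The answer reduces to a ratio of counts.
true_positives = 8
false_positives = 95
freq_posterior = true_positives / (true_positives + false_positives)

print(f"probability format: {p_posterior:.3f}")
print(f"frequency format:   {freq_posterior:.3f}")
```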

1,873 citations


Journal ArticleDOI
TL;DR: In this paper, the authors propose an iterative method to unfold experimental distributions in order to get the best estimates of the true ones; the weak point of the Bayes approach, namely the need for knowledge of the initial distribution, is overcome by an iterative procedure.
Abstract: Bayes' theorem offers a natural way to unfold experimental distributions in order to get the best estimates of the true ones. The weak point of the Bayes approach, namely the need for knowledge of the initial distribution, can be overcome by an iterative procedure. Since the method proposed here does not make use of continuous variables, but simply of cells in the spaces of the true and of the measured quantities, it can be applied to multidimensional problems.
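A minimal sketch of an iteration of this kind, assuming a known response matrix P(E_j | C_i) and omitting the uncertainty propagation of the full method; all names and numbers are ours.

```python
import numpy as np

def bayesian_unfold(response, observed, prior, n_iter=4):
    """Iterative unfolding via Bayes' theorem (sketch).

    response : (n_measured, n_true) array, response[j, i] ~ P(E_j | C_i)
    observed : (n_measured,) array of observed counts n(E_j)
    prior    : (n_true,) array, initial guess P_0(C_i)
    """
    prior = np.asarray(prior, dtype=float)
    for _ in range(n_iter):
        # Bayes' theorem cell by cell: P(C_i | E_j)
        joint = response * prior
        posterior = joint / joint.sum(axis=1, keepdims=True)
        # Efficiency: probability that cause i is observed at all
        efficiency = response.sum(axis=0)
        # Estimated counts in the true cells
        unfolded = (posterior * observed[:, None]).sum(axis=0) / efficiency
        # Feed the estimate back as the next prior
        prior = unfolded / unfolded.sum()
    return unfolded

# Toy usage: three true cells smeared into three measured cells.
R = np.array([[0.8, 0.1, 0.0],
              [0.2, 0.8, 0.2],
              [0.0, 0.1, 0.8]])
print(bayesian_unfold(R, observed=np.array([100.0, 150.0, 80.0]),
                      prior=np.ones(3) / 3))
```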

1,102 citations


Journal ArticleDOI
TL;DR: In this paper, a new variant of the partial Bayes factor, the fractional Bayes factor (FBF), is proposed to deal with weak prior information in model comparison.
Abstract: Bayesian comparison of models is achieved simply by calculation of posterior probabilities of the models themselves. However, there are difficulties with this approach when prior information about the parameters of the various models is weak. Partial Bayes factors offer a resolution of the problem by setting aside part of the data as a training sample. The training sample is used to obtain an initial informative posterior distribution of the parameters in each model. Model comparison is then based on a Bayes factor calculated from the remaining data. Properties of partial Bayes factors are discussed, particularly in the context of weak prior information, and they are found to have advantages over other proposed methods of model comparison. A new variant of the partial Bayes factor, the fractional Bayes factor, is advocated on grounds of consistency, simplicity, robustness and coherence.
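For reference, the fractional Bayes factor is commonly written as below; notation is ours, and the paper should be consulted for the precise definition and the choice of the fraction b.

```latex
% Fractional Bayes factor for models M_0, M_1 with likelihoods L_i(\theta_i)
% and priors \pi_i(\theta_i); a fraction b of the likelihood plays the role
% of the training sample.
\[
  B^{b}_{01} = \frac{q_0(b)}{q_1(b)},
  \qquad
  q_i(b) = \frac{\int \pi_i(\theta_i)\, L_i(\theta_i)\,\mathrm{d}\theta_i}
                {\int \pi_i(\theta_i)\, L_i(\theta_i)^{b}\,\mathrm{d}\theta_i}.
\]
```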

693 citations


Journal ArticleDOI
TL;DR: It is described how a full Bayesian analysis can deal with unresolved issues, such as the choice between fixed- and random-effects models, the choice of population distribution in a random- effects analysis, the treatment of small studies and extreme results, and incorporation of study-specific covariates.
Abstract: Current methods for meta-analysis still leave a number of unresolved issues, such as the choice between fixed- and random-effects models, the choice of population distribution in a random-effects analysis, the treatment of small studies and extreme results, and incorporation of study-specific covariates. We describe how a full Bayesian analysis can deal with these and other issues in a natural way, illustrated by a recent published example that displays a number of problems. Such analyses are now generally available using the BUGS implementation of Markov chain Monte Carlo numerical integration techniques. Appropriate proper prior distributions are derived, and sensitivity analysis to a variety of prior assumptions carried out. Current methods are briefly summarized and compared to the full Bayes analysis.
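The random-effects model at the core of such an analysis can be sketched as follows; the priors shown are placeholders, not the proper priors derived in the paper.

```latex
% Generic Bayesian random-effects meta-analysis for k study estimates y_i
% with (assumed known) standard errors s_i; priors are placeholders.
\begin{align*}
  y_i \mid \theta_i &\sim \mathrm{N}(\theta_i,\; s_i^2), \qquad i = 1,\dots,k, \\
  \theta_i \mid \mu, \tau &\sim \mathrm{N}(\mu,\; \tau^2), \\
  \mu &\sim p(\mu), \qquad \tau \sim p(\tau).
\end{align*}
```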

535 citations


Journal ArticleDOI
TL;DR: The method derives from observing that in general, a Bayes factor can be written as the product of a quantity called the Savage-Dickey density ratio and a correction factor; both terms are easily estimated from posterior simulation.
Abstract: We present a simple method for computing Bayes factors. The method derives from observing that in general, a Bayes factor can be written as the product of a quantity called the Savage-Dickey density ratio and a correction factor; both terms are easily estimated from posterior simulation. In some cases it is possible to do these computations without ever evaluating the likelihood.
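A crude numerical version of the density-ratio part only, assuming posterior draws are available for the tested parameter and that the priors satisfy the usual nesting condition; the correction factor discussed in the paper is omitted, and all names are ours.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

def savage_dickey_bf01(posterior_draws, theta0, prior_density_at_theta0):
    """Estimate BF_01 for the point null theta = theta0 by dividing a kernel
    density estimate of the posterior at theta0 by the prior density there."""
    posterior_at_theta0 = gaussian_kde(posterior_draws)(theta0)[0]
    return posterior_at_theta0 / prior_density_at_theta0

# Toy usage: pretend MCMC output for theta under the larger model,
# with prior theta ~ N(0, 1) and null value theta0 = 0.
draws = np.random.default_rng(0).normal(loc=0.3, scale=0.2, size=5000)
print(savage_dickey_bf01(draws, 0.0, norm(0, 1).pdf(0.0)))
```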

502 citations


Journal ArticleDOI
TL;DR: A Bayesian model is proposed in which both the area-specific intercept and trend are modelled as random effects and correlation between them is allowed for; it is an extension of a model originally proposed for disease mapping.
Abstract: The analysis of variation of risk for a given disease in space and time is a key issue in descriptive epidemiology. When the data are scarce, maximum likelihood estimates of the area-specific risk and of its linear time-trend can be seriously affected by random variation. In this paper, we propose a Bayesian model in which both area-specific intercept and trend are modelled as random effects and correlation between them is allowed for. This model is an extension of that originally proposed for disease mapping. It is illustrated by the analysis of the cumulative prevalence of insulin dependent diabetes mellitus as observed at the military examination of 18-year-old conscripts born in Sardinia during the period 1936-1971. Data concerning the genetic differentiation of the Sardinian population are used to interpret the results.
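One way to write a model of this general type is sketched below; the exact parameterization and priors used in the paper may differ.

```latex
% Space-time disease mapping with correlated area-specific intercept and trend;
% O_{it} and E_{it} are observed and expected cases in area i at time t.
\begin{align*}
  O_{it} &\sim \mathrm{Poisson}\!\big(E_{it}\,\exp(\alpha_i + \beta_i\, t)\big), \\
  (\alpha_i, \beta_i) &\sim \text{joint random-effects prior allowing } \operatorname{corr}(\alpha_i,\beta_i) \neq 0.
\end{align*}
```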

446 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose a procedure that finds a collection of decision rules that best explain the behavior of experimental subjects, and apply their procedure to data on probabilistic updating by subjects in four different universities.
Abstract: Economists and psychologists have recently been developing new theories of decision making under uncertainty that can accommodate the observed violations of standard statistical decision theoretic axioms by experimental subjects. We propose a procedure that finds a collection of decision rules that best explain the behavior of experimental subjects. The procedure is a combination of maximum likelihood estimation of the rules together with an implicit classification of subjects to the various rules and a penalty for having too many rules. We apply our procedure to data on probabilistic updating by subjects in four different universities. We get remarkably robust results showing that the most important rules used by the subjects (in order of importance) are Bayes's rule, a representativeness rule (ignoring the prior), and, to a lesser extent, conservatism (overweighting the prior).
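The three rules found most important can be contrasted on a toy updating problem; the numbers, and the simple damping used to caricature conservatism, are ours.

```python
# Toy contrast of the three updating rules, for a binary hypothesis H
# and one observed signal.  All numbers are illustrative.
prior = 0.3                    # p(H)
lik_H, lik_not_H = 0.8, 0.4    # p(signal | H), p(signal | not H)

# Bayes' rule: combine prior and likelihood.
bayes = lik_H * prior / (lik_H * prior + lik_not_H * (1 - prior))

# Representativeness: ignore the prior (treat it as 1/2).
representativeness = lik_H / (lik_H + lik_not_H)

# Conservatism: overweight the prior, caricatured here by damping the
# likelihood ratio before updating.
damped_lr = (lik_H / lik_not_H) ** 0.5
conservatism = damped_lr * prior / (damped_lr * prior + (1 - prior))

print(f"Bayes: {bayes:.3f}  representativeness: {representativeness:.3f}  "
      f"conservatism: {conservatism:.3f}")
```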

392 citations


Book
16 Nov 1995
TL;DR: This textbook covers the scientific method, displaying and summarizing data, designing experiments, probability and uncertainty, conditional probability and Bayes' rule, and models for proportions, means and regression.
Abstract: 1. Statistics and the Scientific Method 2. Displaying and Summarizing Data 3. Designing Experiments 4. Probability and Uncertainty 5. Conditional Probability and Bayes' Rule 6. Models for Proportions 7. Densities for Proportions 8. Comparing Two Proportions 9. Densities for Two Proportions 10. General Samples and Population Means 11. Densities for Means 12. Comparing Two or More Means 13. Data Transformations and Nonparametric Methods 14. Regression Analysis Answers to Selected Exercises

341 citations



Journal ArticleDOI
TL;DR: This work proposes an approach based on multiple imputation of the missing responses, using the approximate Bayesian bootstrap to draw ignorable repeated imputations from the posterior predictive distribution of the missing data, stratifying by a balancing score for the observed responses prior to withdrawal.
Abstract: Clinical trials of drug treatments for psychiatric disorders commonly employ the parallel groups, placebo-controlled, repeated measure randomized comparison. When patients stop adhering to their originally assigned treatment, investigators often abandon data collection. Thus, non-adherence produces a monotone pattern of unit-level missing data, disabling the analysis by intent-to-treat. We propose an approach based on multiple imputation of the missing responses, using the approximate Bayesian bootstrap to draw ignorable repeated imputations from the posterior predictive distribution of the missing data, stratifying by a balancing score for the observed responses prior to withdrawal. We apply the method and some variations to data from a large randomized trial of treatments for panic disorder, and compare the results to those obtained by the original analysis that used the standard (endpoint) method.
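A single-stratum sketch of the approximate Bayesian bootstrap step; the stratification by balancing score described above is omitted, and names and numbers are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def abb_impute(observed, n_missing, n_imputations=5):
    """Approximate Bayesian bootstrap, sketched for a single stratum."""
    observed = np.asarray(observed)
    imputations = []
    for _ in range(n_imputations):
        # Step 1: resample the observed values with replacement
        # (the Bayesian-bootstrap approximation).
        pool = rng.choice(observed, size=observed.size, replace=True)
        # Step 2: draw the missing values with replacement from that pool.
        imputations.append(rng.choice(pool, size=n_missing, replace=True))
    return imputations

# Toy usage: 8 observed responses, 3 missing, 5 completed data sets.
print(abb_impute([4.0, 5.5, 3.2, 6.1, 4.8, 5.0, 3.9, 5.7], n_missing=3))
```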

282 citations


Journal ArticleDOI
TL;DR: In this article, the authors investigated the sensitivity of the rate ratio estimates to the choice of the hyperprior distribution of the dispersion parameter via a simulation study and compared the performance of the FB approach to mapping disease risk to the conventional approach of mapping maximum likelihood (ML) estimates and p-values.
Abstract: In the fully Bayesian (FB) approach to disease mapping the choice of the hyperprior distribution of the dispersion parameter is a key issue. In this context we investigated the sensitivity of the rate ratio estimates to the choice of the hyperprior via a simulation study. We also compared the performance of the FB approach to mapping disease risk to the conventional approach of mapping maximum likelihood (ML) estimates and p-values. The study was modelled on the incidence data of insulin dependent diabetes mellitus (IDDM) as observed in the communes of Sardinia.

Journal ArticleDOI
TL;DR: A Bayesian approach for monitoring multiple outcomes in single-arm cancer trials, including bio-chemotherapy acute leukaemia trials, bone marrow transplantation trials, and an anti-infection trial is presented.
Abstract: We present a Bayesian approach for monitoring multiple outcomes in single-arm clinical trials. Each patient's response may include both adverse events and efficacy outcomes, possibly occurring at different study times. We use a Dirichlet-multinomial model to accommodate general discrete multivariate responses. We present Bayesian decision criteria and monitoring boundaries for early termination of studies with unacceptably high rates of adverse outcomes or with low rates of desirable outcomes. Each stopping rule is constructed either to maintain equivalence or to achieve a specified level of improvement of a particular event rate for the experimental treatment, compared with that of standard therapy. We avoid explicit specification of costs and a loss function. We evaluate the joint behaviour of the multiple decision rules using frequentist criteria. One chooses a design by considering several parameterizations under relevant fixed values of the multiple outcome probability vector. Applications include trials where response is the cross-product of multiple simultaneous binary outcomes, and hierarchical structures that reflect successive stages of treatment response, disease progression and survival. We illustrate the approach with a variety of single-arm cancer trials, including bio-chemotherapy acute leukaemia trials, bone marrow transplantation trials, and an anti-infection trial. The number of elementary patient outcomes in each of these trials varies from three to seven, with as many as four monitoring boundaries running simultaneously. We provide general guidelines for eliciting and parameterizing Dirichlet priors and for specifying design parameters.
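A toy version of one monitoring check of this general type, using Dirichlet-multinomial conjugacy; the prior, counts, and cut-offs are invented, not the paper's design parameters.

```python
from scipy.stats import beta

# Invented numbers, purely to show the Dirichlet-multinomial mechanics.
prior = {"response": 1.0, "toxicity": 1.0, "other": 1.0}   # Dirichlet parameters
counts = {"response": 9, "toxicity": 6, "other": 15}       # outcomes observed so far

# The marginal posterior of the toxicity probability is Beta(a, b):
a = prior["toxicity"] + counts["toxicity"]
b = sum(prior.values()) + sum(counts.values()) - a

p_too_toxic = beta.sf(0.25, a, b)   # Pr(toxicity rate > 0.25 | data)
print(f"Pr(toxicity rate > 0.25 | data) = {p_too_toxic:.2f}")
# A stopping boundary of the kind described above would halt the trial
# if this probability exceeded some pre-set level, e.g. 0.90.
```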

Proceedings ArticleDOI
20 Jun 1995
TL;DR: It is shown that existing techniques in early vision such as, snake/balloon models, region growing, and Bayes/MDL are addressing different aspects of the same problem and they can be unified within a common statistical framework which combines their advantages.
Abstract: We present a novel statistical and variational approach to image segmentation based on a new algorithm named region competition. This algorithm is derived by minimizing a generalized Bayes/MDL (Minimum Description Length) criterion using the variational principle. We show that existing techniques in early vision, such as snake/balloon models, region growing, and Bayes/MDL, are addressing different aspects of the same problem and can be unified within a common statistical framework which combines their advantages. We analyze how to optimize the precision of the resulting boundary location by studying the statistical properties of the region competition algorithm, and discuss good initial conditions for the algorithm. Our method is generalized to color and texture segmentation and is demonstrated on grey level images, color images and texture images.

Journal ArticleDOI
TL;DR: Bayes estimators are presented for the entropy and other functions of a discrete probability distribution when the data are a finite sample drawn from that distribution.
Abstract: This paper addresses the problem of estimating a function of a probability distribution from a finite set of samples of that distribution. A Bayesian analysis of this problem is presented, the optimal properties of the Bayes estimators are discussed, and as an example of the formalism, closed form expressions for the Bayes estimators for the moments of the Shannon entropy function are derived. Then numerical results are presented that compare the Bayes estimator to the frequency-counts estimator for the Shannon entropy. We also present the closed form estimators, all derived elsewhere, for the mutual information, $\chi^2$ covariance, and some other statistics.
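The paper derives closed forms; the same posterior-mean estimator can be approximated by Monte Carlo under a Dirichlet prior, which is what the sketch below does with toy counts (prior strength and data are ours).

```python
import numpy as np

rng = np.random.default_rng(1)

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

# Observed counts from an unknown discrete distribution (toy numbers).
counts = np.array([12, 3, 1, 0, 4])

# Frequency-counts (plug-in) estimator.
plugin = entropy(counts / counts.sum())

# Posterior-mean estimator under a symmetric Dirichlet(alpha) prior,
# approximated by Monte Carlo rather than the closed forms of the paper.
alpha = 1.0
draws = rng.dirichlet(counts + alpha, size=20000)
bayes = np.mean([entropy(p) for p in draws])

print(f"plug-in: {plugin:.3f} nats, Bayes (alpha={alpha}): {bayes:.3f} nats")
```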

Journal ArticleDOI
15 Mar 1995-JAMA
TL;DR: This analysis suggests that the clinical superiority of tissue-type plasminogen activator over streptokinase remains uncertain, and the usefulness of the Bayesian approach is demonstrated using the results of the recent GUSTO study of various thrombolytic strategies in acute myocardial infarction.
Abstract: Standard statistical analyses of randomized clinical trials fail to provide a direct assessment of which treatment is superior or the probability of a clinically meaningful difference. A Bayesian analysis permits the calculation of the probability that a treatment is superior based on the observed data and prior beliefs. The subjectivity of prior beliefs in the Bayesian approach is not a liability, but rather explicitly allows different opinions to be formally expressed and evaluated. The usefulness of this approach is demonstrated using the results of the recent GUSTO study of various thrombolytic strategies in acute myocardial infarction. This analysis suggests that the clinical superiority of tissue-type plasminogen activator over streptokinase remains uncertain.
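The basic calculation can be sketched as follows with flat Beta priors and invented event counts; these are not the GUSTO data nor the priors used in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 30-day mortality counts in two arms (invented numbers).
deaths_a, n_a = 65, 1000    # arm A
deaths_b, n_b = 73, 1000    # arm B

# Posterior draws for each mortality rate under Beta(1, 1) priors.
post_a = rng.beta(1 + deaths_a, 1 + n_a - deaths_a, size=100_000)
post_b = rng.beta(1 + deaths_b, 1 + n_b - deaths_b, size=100_000)

print("Pr(mortality_A < mortality_B | data):", np.mean(post_a < post_b))
print("Pr(absolute risk reduction > 1%):   ", np.mean(post_b - post_a > 0.01))
```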

Journal ArticleDOI
TL;DR: A variety of examples demonstrate that the proposed method can provide classification ability close to or superior to learning VQ while simultaneously providing superior compression performance.
Abstract: We describe a method of combining classification and compression into a single vector quantizer by incorporating a Bayes risk term into the distortion measure used in the quantizer design algorithm. Once trained, the quantizer can operate to minimize the Bayes risk weighted distortion measure if there is a model providing the required posterior probabilities, or it can operate in a suboptimal fashion by minimizing the squared error only. Comparisons are made with other vector quantizer based classifiers, including the independent design of quantization and minimum Bayes risk classification and Kohonen's LVQ. A variety of examples demonstrate that the proposed method can provide classification ability close to or superior to learning VQ while simultaneously providing superior compression performance.

Journal ArticleDOI
TL;DR: This article argued that the essence of "Bayesian rationality" is the assignment, correct manipulation, and proper updating of subjective event probabilities when evaluating and comparing uncertain prospects, regardless of whether attitudes toward risk satisfy the expected utility property.


Journal ArticleDOI
TL;DR: A Bayesian approach is taken for fitting the usual proportional hazards model in the case where the baseline hazard, the covariate link, and the covariate coefficients are all unknown.
Abstract: We consider the usual proportional hazards model in the case where the baseline hazard, the covariate link, and the covariate coefficients are all unknown. Both the baseline hazard and the covariate link are monotone functions and thus are characterized using a dense class of such functions which arises, upon transformation, as a mixture of Beta distribution functions. We take a Bayesian approach for fitting such a model. Since interest focuses more upon the likelihood, we consider vague prior specifications including Jeffreys's prior. Computations are carried out using sampling-based methods. Model criticism is also discussed. Finally, a data set studying survival of a sample of lung cancer patients is analyzed.

Journal ArticleDOI
17 Sep 1995
TL;DR: The method of complexity regularization is shown to automatically find a good balance between the approximation error and the estimation error; the error probability of the resulting classifier decreases as $O(\sqrt{\log n / n})$ to the achievable optimum, for large nonparametric classes of distributions, as the sample size n grows.
Abstract: In pattern recognition or, as it has also been called, concept learning, the value of a $\{0,1\}$-valued random variable Y is to be predicted based upon observing an $\mathbb{R}^d$-valued random variable X. We apply the method of complexity regularization to learn concepts from large concept classes. The method is shown to automatically find a good balance between the approximation error and the estimation error. In particular, the error probability of the obtained classifier is shown to decrease as $O(\sqrt{\log n / n})$ to the achievable optimum, for large nonparametric classes of distributions, as the sample size n grows. We also show that if the Bayes error probability is zero and the Bayes rule is in a known family of decision rules, the error probability is $O(\log n / n)$ for many large families, possibly with infinite VC dimension.

Journal ArticleDOI
TL;DR: Bayesian methods for the Jelinski and Moranda and the Littlewood and Verrall models in software reliability are studied and model selection based on the mean squared prediction error and the prequential likelihood of the conditional predictive ordinates is developed.
Abstract: Bayesian methods for the Jelinski and Moranda and the Littlewood and Verrall models in software reliability are studied. A Gibbs sampling approach is employed to compute the Bayes estimates. In addition, prediction of future failure times and future reliabilities is examined. Model selection based on the mean squared prediction error and the prequential likelihood of the conditional predictive ordinates is developed.
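For concreteness, the Jelinski-Moranda likelihood that such a Bayesian analysis is built on can be sketched as below; this is a sketch of the model, not of the Gibbs sampler, and the data and parameter values are invented.

```python
import numpy as np

def jelinski_moranda_loglik(x, N, phi):
    """Log-likelihood of the Jelinski-Moranda model for interfailure times x.

    After i-1 faults have been removed, the time to the next failure is
    exponential with rate phi * (N - i + 1), N being the initial number of faults.
    """
    x = np.asarray(x, dtype=float)
    i = np.arange(1, x.size + 1)
    rates = phi * (N - i + 1)
    if np.any(rates <= 0):
        return -np.inf              # fewer faults than observed failures
    return np.sum(np.log(rates) - rates * x)

# Toy interfailure times (hours, made up) and two candidate (N, phi) pairs.
times = [7.0, 11.0, 8.0, 25.0, 40.0, 70.0]
for N, phi in [(8, 0.01), (10, 0.005)]:
    print(N, phi, round(jelinski_moranda_loglik(times, N, phi), 2))
```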

Journal ArticleDOI
01 Jan 1995-Genetica
TL;DR: The Dirichlet distribution provides a convenient conjugate prior for Bayesian analyses involving multinomial proportions and can be employed to model the contributions from different ancestral populations in computing forensic match probabilities.
Abstract: The Dirichlet distribution provides a convenient conjugate prior for Bayesian analyses involving multinomial proportions. In particular, allele frequency estimation can be carried out with a Dirichlet prior. If data from several distinct populations are available, then the parameters characterizing the Dirichlet prior can be estimated by maximum likelihood and then used for allele frequency estimation in each of the separate populations. This empirical Bayes procedure tends to moderate extreme multinomial estimates based on sample proportions. The Dirichlet distribution can also be employed to model the contributions from different ancestral populations in computing forensic match probabilities. If the ancestral populations are in genetic equilibrium, then the product rule for computing match probabilities is valid conditional on the ancestral contributions to a typical person of the reference population. This fact facilitates computation of match probabilities and tight upper bounds to match probabilities.
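The moderation of extreme estimates mentioned above follows directly from the Dirichlet posterior mean; a toy illustration with invented counts and prior parameters:

```python
import numpy as np

# With prior Dirichlet(alpha) and observed allele counts n, the posterior mean
# is (n + alpha) / (N + sum(alpha)): a compromise between sample proportions
# and the prior.  Numbers below are illustrative only.
alpha = np.array([2.0, 2.0, 2.0, 2.0])   # e.g. fitted elsewhere by empirical Bayes
counts = np.array([18, 1, 1, 0])         # small sample with extreme proportions

raw = counts / counts.sum()
posterior_mean = (counts + alpha) / (counts.sum() + alpha.sum())

print("sample proportions:", np.round(raw, 3))
print("posterior means:   ", np.round(posterior_mean, 3))
```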

Book ChapterDOI
01 Jan 1995
TL;DR: In this paper, an overview of person parameter estimation in the Rasch model is given, including four types of estimators: the maximum likelihood, the Bayes modal, the weighted maximum likelihood and the expected a posteriori estimator.
Abstract: An overview is given of person parameter estimation in the Rasch model. In Section 4.2 some notation is introduced. Section 4.3 presents four types of estimators: the maximum likelihood, the Bayes modal, the weighted maximum likelihood, and the Bayes expected a posteriori estimator. In Section 4.4 a simulation study is presented in which properties of the estimators are evaluated. Section 4.5 covers randomized confidence intervals for person parameters. In Section 4.6 some sample statistics are mentioned that were computed using estimates of θ. Finally, a short discussion of the estimators is given in Section 4.7.
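A minimal sketch of one of the four estimator types listed, the expected a posteriori (EAP) estimator, under a standard normal prior and a simple quadrature grid (both our choices, not necessarily those of the chapter):

```python
import numpy as np
from scipy.stats import norm

def rasch_eap(responses, difficulties, n_quad=61):
    """EAP estimate of a person parameter under the Rasch model.

    P(correct | theta, b_j) = 1 / (1 + exp(-(theta - b_j))), with a standard
    normal prior on theta evaluated on a quadrature grid.
    """
    responses = np.asarray(responses)          # 0/1 item scores
    difficulties = np.asarray(difficulties)    # item difficulties b_j
    theta = np.linspace(-4, 4, n_quad)
    weights = norm.pdf(theta)
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - difficulties[None, :])))
    lik = np.prod(np.where(responses == 1, p, 1 - p), axis=1)
    posterior = lik * weights
    return np.sum(theta * posterior) / np.sum(posterior)

print(rasch_eap(responses=[1, 1, 0, 1, 0],
                difficulties=[-1.0, -0.5, 0.0, 0.5, 1.0]))
```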

Journal ArticleDOI
TL;DR: This paper argues that the requirement that a subjectively determined prior distribution, likelihood, and loss function be explicitly stated is a distinct Bayesian advantage.

Posted Content
TL;DR: In this article, the authors show that the Bayesian approach is the natural one for data analysis in the most general sense, and for assigning uncertainties to the results of physical measurements - while at the same time resolving philosophical aspects of the problems.
Abstract: Bayesian statistics is based on the subjective definition of probability as "degree of belief" and on Bayes' theorem, the basic tool for assigning probabilities to hypotheses combining a priori judgements and experimental information. This was the original point of view of Bayes, Bernoulli, Gauss, Laplace, etc., and contrasts with later "conventional" (pseudo-)definitions of probabilities, which implicitly presuppose the concept of probability. These notes show that the Bayesian approach is the natural one for data analysis in the most general sense, and for assigning uncertainties to the results of physical measurements, while at the same time resolving philosophical aspects of the problems. The approach, although little known and usually misunderstood among the High Energy Physics community, has become the standard way of reasoning in several fields of research and has recently been adopted by the international metrology organizations in their recommendations for assessing measurement uncertainty. These notes describe a general model for treating uncertainties originating from random and systematic errors in a consistent way and include examples of applications of the model in High Energy Physics, e.g. "confidence intervals" in different contexts, upper/lower limits, treatment of "systematic errors", hypothesis tests and unfolding.

Journal ArticleDOI
TL;DR: The Bayesian statistical approach to the design and analysis of research studies in the health sciences is reviewed including a study of disease progression in AIDS, a comparison of two therapies in a clinical trial, and a case-control study investigating the link between dietary factors and breast cancer.
Abstract: This article reviews the Bayesian statistical approach to the design and analysis of research studies in the health sciences. The central idea of the Bayesian method is the use of study data to update the state of knowledge about a quantity of interest. In study design, the Bayesian approach explicitly incorporates expressions for the loss resulting from an incorrect decision at the end of the study. The Bayesian method also provides a flexible framework for the monitoring of sequential clinical trials. We present several examples of Bayesian methods in practice including a study of disease progression in AIDS, a comparison of two therapies in a clinical trial, and a case-control study investigating the link between dietary factors and breast cancer.

Journal ArticleDOI
TL;DR: It is shown that various strategies that have been proposed for smoothing the table have close ties to other areas of statistical methodology, including shrinkage estimation, Bayes methods, penalized likelihood, spline estimation, and kernel density and regression estimation.

Journal ArticleDOI
TL;DR: The role of Bayesian inference networks for updating student models in intelligent tutoring systems (ITSs) and the interplay among inferential issues, the psychology of learning in the domain, and the instructional approach upon which the ITS is based are highlighted.
Abstract: Probability-based inference in complex networks of interdependent variables is an active topic in statistical research, spurred by such diverse applications as forecasting, pedigree analysis, troubleshooting, and medical diagnosis. This paper concerns the role of Bayesian inference networks for updating student models in intelligent tutoring systems (ITSs). Basic concepts of the approach are briefly reviewed, but the emphasis is on the considerations that arise when one attempts to operationalize the abstract framework of probability-based reasoning in a practical ITS context. The discussion revolves around HYDRIVE, an ITS for learning to troubleshoot an aircraft hydraulics system. HYDRIVE supports generalized claims about aspects of student proficiency through probability-based combination of rule-based evaluations of specific actions. The paper highlights the interplay among inferential issues, the psychology of learning in the domain, and the instructional approach upon which the ITS is based.

Journal ArticleDOI
TL;DR: Bayesian procedures for comparing proportions, especially suitable for accepting (or rejecting) the equivalence of two population proportions, are investigated and illustrated with trials of antithrombotic treatments administered to prevent further occurrences of thromboses.
Abstract: This paper investigates the Bayesian procedures for comparing proportions. These procedures are especially suitable for accepting (or rejecting) the equivalence of two population proportions. Furthermore, the Bayesian predictive probabilities provide a natural and flexible tool in monitoring trials, especially for choosing a sample size and for conducting interim analyses. These methods are illustrated with two examples where antithrombotic treatments are administered to prevent further occurrences of thromboses.

Journal ArticleDOI
TL;DR: This work discusses the design of loss functions with a local structure that depend only on a binary misclassification vector and calculates the Bayes estimate using Markov chain Monte Carlo and simulated annealing algorithms.
Abstract: Unlike the development of more accurate prior distributions for use in Bayesian imaging, the design of more sensible estimators through loss functions has been neglected in the literature. We discuss the design of loss functions with a local structure that depend only on a binary misclassification vector. The proposed approach is similar to modeling with a Markov random field. The Bayes estimate is calculated in a two-step algorithm using Markov chain Monte Carlo and simulated annealing algorithms. We present simulation experiments with the Ising model, where the observations are corrupted with Gaussian and flip noise.