
Showing papers on "Bayesian probability published in 1990"


Journal ArticleDOI
TL;DR: In this paper, three sampling-based approaches, namely stochastic substitution, the Gibbs sampler, and the sampling-importance-resampling algorithm, are compared and contrasted in relation to various joint probability structures frequently encountered in applications.
Abstract: Stochastic substitution, the Gibbs sampler, and the sampling-importance-resampling algorithm can be viewed as three alternative sampling- (or Monte Carlo-) based approaches to the calculation of numerical estimates of marginal probability distributions. The three approaches will be reviewed, compared, and contrasted in relation to various joint probability structures frequently encountered in applications. In particular, the relevance of the approaches to calculating Bayesian posterior densities for a variety of structured models will be discussed and illustrated.

6,294 citations
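
The following minimal sketch (not from the paper) illustrates the Gibbs-sampler idea described above: marginal quantities are estimated by alternately drawing from full conditional distributions. The bivariate normal target, the correlation value, and the sweep counts are illustrative assumptions.

```python
# Minimal Gibbs sampler sketch: estimate marginals of a bivariate normal
# with correlation rho by alternately sampling each full conditional.
# rho, the number of sweeps, and the burn-in are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8
n_sweeps, burn_in = 5000, 500

x, y = 0.0, 0.0
draws = []
for t in range(n_sweeps):
    # Full conditionals of a standard bivariate normal:
    # x | y ~ N(rho * y, 1 - rho^2), and symmetrically for y | x.
    x = rng.normal(rho * y, np.sqrt(1 - rho**2))
    y = rng.normal(rho * x, np.sqrt(1 - rho**2))
    if t >= burn_in:
        draws.append((x, y))

draws = np.array(draws)
# The retained draws approximate the joint; each column approximates a marginal.
print("marginal mean/sd of x:", draws[:, 0].mean(), draws[:, 0].std())
```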



Journal ArticleDOI
TL;DR: The use of the Gibbs sampler as a method for calculating Bayesian marginal posterior and predictive densities is reviewed and illustrated with a range of normal data models, including variance components, unordered and ordered means, hierarchical growth curves, and missing data in a crossover trial.
Abstract: The use of the Gibbs sampler as a method for calculating Bayesian marginal posterior and predictive densities is reviewed and illustrated with a range of normal data models, including variance components, unordered and ordered means, hierarchical growth curves, and missing data in a crossover trial. In all cases the approach is straightforward to specify distributionally and to implement computationally, with output readily adapted for required inference summaries.

1,020 citations
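
A hedged sketch of the kind of calculation reviewed above: Gibbs sampling for a normal model with unknown mean and variance, with the marginal posterior density of the mean estimated by averaging full-conditional densities over the draws. The simulated data and the prior hyperparameters are invented for illustration and do not come from the paper.

```python
# Sketch: Gibbs sampling for a normal model with unknown mean and variance,
# estimating the marginal posterior density of the mean by averaging
# full-conditional densities over the draws.
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(2.0, 1.5, size=40)          # toy data
n, ybar = len(y), y.mean()
mu0, tau0sq = 0.0, 100.0                   # vague normal prior on the mean
a0, b0 = 2.0, 2.0                          # inverse-gamma prior on the variance

grid = np.linspace(0.0, 4.0, 200)
dens = np.zeros_like(grid)
sig2 = y.var()
n_iter, burn = 4000, 500
for t in range(n_iter):
    # mu | sig2, y  ~  Normal(m, v)
    v = 1.0 / (1.0 / tau0sq + n / sig2)
    m = v * (mu0 / tau0sq + n * ybar / sig2)
    mu = rng.normal(m, np.sqrt(v))
    # sig2 | mu, y  ~  Inverse-Gamma(a0 + n/2, b0 + 0.5 * sum((y - mu)^2))
    sig2 = 1.0 / rng.gamma(a0 + n / 2, 1.0 / (b0 + 0.5 * np.sum((y - mu) ** 2)))
    if t >= burn:
        # Average the conditional density of mu over the retained draws.
        dens += np.exp(-(grid - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

dens /= (n_iter - burn)                    # estimated marginal posterior density of mu
print("posterior mode of mu ~", grid[np.argmax(dens)])
```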


Journal ArticleDOI
TL;DR: A general approach to hierarchical Bayes changepoint models is presented, including an application to changing regressions, changing Poisson processes and changing Markov chains, which avoids sophisticated analytic and numerical high dimensional integration procedures.
Abstract: A general approach to hierarchical Bayes changepoint models is presented. In particular, desired marginal posterior densities are obtained utilizing the Gibbs sampler, an iterative Monte Carlo method. This approach avoids sophisticated analytic and numerical high dimensional integration procedures. We include an application to changing regressions, changing Poisson processes and changing Markov chains. Within these contexts we handle several previously inaccessible problems.

585 citations
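
As a concrete illustration of the approach described above, here is a much-simplified Gibbs sampler for a single Poisson changepoint (the paper's hierarchical models are richer). The toy data, priors, and iteration counts are assumptions made for the sketch.

```python
# Simplified sketch of a Gibbs sampler for a single-changepoint Poisson model:
# y_t ~ Poisson(lam1) for t <= k and Poisson(lam2) for t > k, with Gamma priors
# on the rates and a uniform prior on k.
import numpy as np

rng = np.random.default_rng(2)
y = np.concatenate([rng.poisson(4.0, 30), rng.poisson(1.0, 40)])  # toy series
n = len(y)
a, b = 1.0, 1.0                     # Gamma(shape, rate) hyperparameters
k = n // 2
keep = []
for it in range(3000):
    # Rates given the changepoint (conjugate Gamma updates).
    lam1 = rng.gamma(a + y[:k].sum(), 1.0 / (b + k))
    lam2 = rng.gamma(a + y[k:].sum(), 1.0 / (b + n - k))
    # Changepoint given the rates: discrete full conditional over k = 1..n-1.
    ks = np.arange(1, n)
    cum = np.cumsum(y)
    loglik = (cum[ks - 1] * np.log(lam1) - ks * lam1
              + (cum[-1] - cum[ks - 1]) * np.log(lam2) - (n - ks) * lam2)
    p = np.exp(loglik - loglik.max())
    k = rng.choice(ks, p=p / p.sum())
    if it >= 500:
        keep.append(k)

print("posterior mean changepoint:", np.mean(keep))
```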


Journal ArticleDOI
TL;DR: In this article, the problem of finding Bayes estimators for cumulative hazard rates and related quantities, w.r.t. prior distributions that correspond to cumulative hazard rate processes with nonnegative independent increments was studied.
Abstract: Several authors have constructed nonparametric Bayes estimators for a cumulative distribution function based on (possibly right-censored) data. The prior distributions have, for example, been Dirichlet processes or, more generally, processes neutral to the right. The present article studies the related problem of finding Bayes estimators for cumulative hazard rates and related quantities, w.r.t. prior distributions that correspond to cumulative hazard rate processes with nonnegative independent increments. A particular class of prior processes, termed beta processes, is introduced and is shown to constitute a conjugate class. To arrive at these, a nonparametric time-discrete framework for survival data, which has some independent interest, is studied first. An important bonus of the approach based on cumulative hazards is that more complicated models for life history data than the simple life table situation can be treated, for example, time-inhomogeneous Markov chains. We find posterior distributions and derive Bayes estimators in such models and also present a semiparametric Bayesian analysis of the Cox regression model. The Bayes estimators are easy to interpret and easy to compute. In the limiting case of a vague prior the Bayes solution for a cumulative hazard is the Nelson-Aalen estimator and the Bayes solution for a survival probability is the Kaplan-Meier estimator.

515 citations
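
The abstract notes that, under a vague prior, the Bayes solutions reduce to the Nelson-Aalen and Kaplan-Meier estimators; the short sketch below computes both classical estimators on an invented right-censored data set, purely to make those limiting quantities concrete.

```python
# Nelson-Aalen cumulative hazard and Kaplan-Meier survival curve, the classical
# estimators identified in the abstract as vague-prior limits of the beta-process
# Bayes estimators. The small (time, event-indicator) data set is invented.
import numpy as np

times  = np.array([2.0, 3.0, 3.0, 5.0, 6.0, 7.0, 9.0, 9.0, 11.0, 12.0])
events = np.array([1,   1,   0,   1,   1,   0,   1,   1,   0,    1])  # 1 = death, 0 = censored

order = np.argsort(times)
times, events = times[order], events[order]

H, S = 0.0, 1.0
for t in np.unique(times[events == 1]):
    at_risk = np.sum(times >= t)                      # n_i: subjects still under observation
    deaths = np.sum((times == t) & (events == 1))     # d_i: deaths at time t
    H += deaths / at_risk                             # Nelson-Aalen increment d_i / n_i
    S *= 1.0 - deaths / at_risk                       # Kaplan-Meier factor (1 - d_i / n_i)
    print(f"t={t:4.1f}  cum. hazard={H:.3f}  survival={S:.3f}")
```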


Journal ArticleDOI
Eric A. Wan1
TL;DR: The relationship between minimizing a mean squared error and finding the optimal Bayesian classifier is reviewed and a number of confidence measures are proposed to evaluate the performance of the neural network classifier within a statistical framework.
Abstract: The relationship between minimizing a mean squared error and finding the optimal Bayesian classifier is reviewed. This provides a theoretical interpretation for the process by which neural networks are used in classification. A number of confidence measures are proposed to evaluate the performance of the neural network classifier within a statistical framework.

301 citations
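
A small illustrative sketch of the reviewed relationship (not the paper's experiments): a flexible model fitted to 0/1 class targets by least squares approximates the posterior probability P(class = 1 | x). A polynomial fit stands in for the neural network, and the Gaussian class densities are invented.

```python
# A sufficiently flexible model trained by minimizing mean squared error on 0/1
# class targets approximates the posterior probability P(class = 1 | x).
# A polynomial least-squares fit stands in for the neural network here.
import numpy as np

rng = np.random.default_rng(3)
n = 20000
labels = rng.integers(0, 2, n)                       # equal class priors
x = np.where(labels == 1, rng.normal(1.0, 1.0, n),   # class 1: N(+1, 1)
                          rng.normal(-1.0, 1.0, n))  # class 0: N(-1, 1)

coeffs = np.polyfit(x, labels, deg=5)                # minimize mean squared error

def true_posterior(t):                               # Bayes posterior for this mixture
    p1 = np.exp(-(t - 1.0) ** 2 / 2)
    p0 = np.exp(-(t + 1.0) ** 2 / 2)
    return p1 / (p1 + p0)

for t in [-1.0, -0.5, 0.0, 0.5, 1.0]:
    print(t, round(np.polyval(coeffs, t), 3), round(true_posterior(t), 3))
```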


Journal ArticleDOI
TL;DR: Certain fundamental results and attractive features of the proposed approach in the context of the random field theory are discussed, and a systematic spatial estimation scheme is presented that satisfies a variety of useful properties beyond those implied by the traditional stochastic estimation methods.
Abstract: The purpose of this paper is to stress the importance of a Bayesian/maximum-entropy view toward the spatial estimation problem. According to this view, the estimation equations emerge through a process that balances two requirements: High prior information about the spatial variability and high posterior probability about the estimated map. The first requirement uses a variety of sources of prior information and involves the maximization of an entropy function. The second requirement leads to the maximization of a so-called Bayes function. Certain fundamental results and attractive features of the proposed approach in the context of the random field theory are discussed, and a systematic spatial estimation scheme is presented. The latter satisfies a variety of useful properties beyond those implied by the traditional stochastic estimation methods.

291 citations
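
For reference, the generic maximum-entropy step that the "high prior information" requirement invokes can be written as follows; this is a textbook sketch under moment constraints, not the paper's full Bayesian/maximum-entropy formalism.

```latex
% Generic maximum-entropy step (prior stage only):
\[
  \max_{f}\; H[f] = -\int f(x)\,\ln f(x)\,dx
  \quad\text{subject to}\quad
  \int f(x)\,dx = 1,\qquad \int g_k(x)\,f(x)\,dx = \mu_k,\; k=1,\dots,K,
\]
\[
  \text{which yields}\qquad
  f(x) = \frac{1}{Z(\lambda)}\,\exp\!\Big(-\sum_{k=1}^{K}\lambda_k\,g_k(x)\Big),
  \qquad Z(\lambda)=\int \exp\!\Big(-\sum_k \lambda_k g_k(x)\Big)\,dx,
\]
where the multipliers $\lambda_k$ are chosen so that the moment constraints $\mu_k$
(the prior information) are satisfied.
```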


Journal ArticleDOI
TL;DR: A new inference method, Highest Confidence First (HCF) estimation, is used to infer a unique labeling from the a posteriori distribution that is consistent with both prior knowledge and evidence.
Abstract: Integrating disparate sources of information has been recognized as one of the keys to the success of general purpose vision systems. Image clues such as shading, texture, stereo disparities and image flows provide uncertain, local and incomplete information about the three-dimensional scene. Spatial a priori knowledge plays the role of filling in missing information and smoothing out noise. This thesis proposes a solution to the longstanding open problem of visual integration. It reports a framework, based on Bayesian probability theory, for computing an intermediate representation of the scene from disparate sources of information. The computation is formulated as a labeling problem. Local visual observations for each image entity are reported as label likelihoods. They are combined consistently and coherently on hierarchically structured label trees with a new, computationally simple procedure. The pooled label likelihoods are fused with the a priori spatial knowledge encoded as Markov Random Fields (MRF's). The a posteriori distribution of the labelings is thus derived in a Bayesian formalism. A new inference method, Highest Confidence First (HCF) estimation, is used to infer a unique labeling from the a posteriori distribution. Unlike previous inference methods based on the MRF formalism, HCF is computationally efficient and predictable while meeting the principles of graceful degradation and least commitment. The results of the inference process are consistent with both observable evidence and a priori knowledge. The effectiveness of the approach is demonstrated with experiments on two image analysis problems: intensity edge detection and surface reconstruction. For edge detection, likelihood outputs from a set of local edge operators are integrated with a priori knowledge represented as an MRF probability distribution. For surface reconstruction, intensity information is integrated with sparse depth measurements and a priori knowledge. Coupled MRF's provide a unified treatment of surface reconstruction and segmentation, and an extension of HCF implements a solution method. Experiments using real image and depth data yield robust results. The framework can also be generalized to higher-level vision problems, as well as to other domains.

285 citations


Journal ArticleDOI
TL;DR: In this paper, two probability languages, the Bayesian language and the language of belief functions, are compared in terms of both their semantics (i.e., the meaning of the scale) and their syntax.

257 citations


Book
12 Mar 1990
TL;DR: In this paper, the authors focus on the theory of reduction of a Bayesian experiment considered as a unique probability measure on a product space (parameter space x sample space) and comprehensively examine sufficiency, including its applications to identification and comparison of models, as well as ancillarity, with its application to exogeneity.
Abstract: This important reference/text focuses on the theory of reduction of a Bayesian experiment considered as a (unique) probability measure on a product space (parameter space x sample space). Treating the basic model and its essential properties, it comprehensively examines sufficiency, including its applications to identification and to comparison of models, as well as ancillarity, with its application to exogeneity.

220 citations


Book ChapterDOI
01 Jan 1990
TL;DR: A more complete discussion of why probabilities need not correspond to physical causal influences, or “propensities” affecting mass phenomena, is given.
Abstract: At the 1988 workshop we called attention to the “Mind Projection Fallacy” which is present in all fields that use probability. Here we give a more complete discussion showing why probabilities need not correspond to physical causal influences, or “propensities” affecting mass phenomena. Probability theory is far more useful if we recognize that probabilities express fundamentally logical inferences pertaining to individual cases. We note several examples of the difference this makes in real applications.

Journal ArticleDOI
TL;DR: Information-search data refuted the idea that subjects represented hypotheses as a Bayesian set, and three experiments were conducted to examine cognitive representations of hypothesis sets in the testing of multiple competing hypotheses.
Abstract: A well-documented phenomenon in opinion-revision literature is subjects' failure to revise probability estimates for an exhaustive set of mutually exclusive hypotheses in a complementary manner. However, prior research has not addressed the question of whether such behavior simply represents a misunderstanding of mathematical rules, or whether it is a consequence of a cognitive representation of hypotheses that is at odds with the Bayesian notion of a set relationship. Two alternatives to the Bayesian representation, a belief system (Shafer, 1976) and a system of independent hypotheses, were proposed, and three experiments were conducted to examine cognitive representations of hypothesis sets in the testing of multiple competing hypotheses. Subjects were given brief murder mysteries to solve and allowed to request various types of information about the suspects; after having received each new piece of information, subjects rated each suspect's probability of being the murderer. Presence and timing of suspect eliminations were varied in the first two experiments; the final experiment involved the varying of percentages of clues that referred to more than one suspect (for example, all of the female suspects). The noncomplementarity of opinion revisions remained a strong phenomenon in all conditions. Information-search data refuted the idea that subjects represented hypotheses as a Bayesian set; further study of the independent hypotheses theory and Shaferian belief functions as descriptive models is encouraged.

Journal ArticleDOI
TL;DR: The first five sections of the paper describe the Bayesian paradigm for statistics and its relationship with other attitudes towards inference and an attempt is made to appreciate how accurate formulae like the extension of the conversation, the product law and Bayes rule are in evaluating probabilities.
Abstract: The first five sections of the paper describe the Bayesian paradigm for statistics and its relationship with other attitudes towards inference. Section 1 outlines Wald's major contributions and explains how they omit the vital consideration of coherence. When this point is included the Bayesian view results, with the main difference that Waldean ideas require the concept of the sample space, whereas the Bayesian approach may dispense with it, using a probability distribution over parameter space instead. Section 2 relates statistical ideas to the problem of inference in science. Scientific inference is essentially the passage from observed, past data to unobserved, future data. The roles of models and theories in doing this are explored. The Bayesian view is that all this should be accomplished entirely within the calculus of probability and Section 3 justifies this choice by various axiom systems. The claim is made that this leads to a quite different paradigm from that of classical statistics and, in particular, problems in the latter paradigm cease to have importance within the other. Point estimation provides an illustration. Some counter-examples to the Bayesian view are discussed. It is important that statistical conclusions should be usable in making decisions. Section 4 explains how the Bayesian view achieves this practicality by introducing utilities and the principle of maximizing expected utility. Practitioners are often unhappy with the ideas of basing inferences on one number, probability, or action on another, an expectation, so these points are considered and the methods justified. Section 5 discusses why the Bayesian viewpoint has not achieved the success that its logic suggests. Points discussed include the relationship between the inferences and the practical situation, for example with multiple comparisons; and the lack of the need to confine attention to normality or the exponential family. Its extensive use by nonstatisticians is documented. The most important objection to the Bayesian view is that which rightly says that probabilities are hard to assess. Consequently Section 6 considers how this might be done and an attempt is made to appreciate how accurate formulae like the extension of the conversation, the product law and Bayes rule are in evaluating probabilities.
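
For reference, the three formulae named at the end of the abstract, in their standard forms for an exhaustive set of mutually exclusive events B_1, ..., B_n (the paper's own notation may differ):

```latex
\begin{align*}
  \text{extension of the conversation:}\quad & P(A) = \sum_{i=1}^{n} P(A \mid B_i)\,P(B_i),\\
  \text{product law:}\quad & P(A \cap B) = P(A \mid B)\,P(B),\\
  \text{Bayes rule:}\quad & P(B_i \mid A) = \frac{P(A \mid B_i)\,P(B_i)}{\sum_{j=1}^{n} P(A \mid B_j)\,P(B_j)}.
\end{align*}
```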

Journal ArticleDOI
TL;DR: Examples are given that demonstrate the ability of Bayesian probability theory to determine the best model of a process even when more complex models fit the data better.
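
One standard way to make this claim precise (a sketch, not necessarily the paper's notation) is through posterior odds and marginal likelihoods, where integrating over the prior penalizes superfluous parameters:

```latex
\[
  \frac{P(M_1 \mid D)}{P(M_2 \mid D)}
  = \frac{P(D \mid M_1)}{P(D \mid M_2)}\cdot\frac{P(M_1)}{P(M_2)},
  \qquad
  P(D \mid M_i) = \int P(D \mid \theta_i, M_i)\,P(\theta_i \mid M_i)\,d\theta_i .
\]
% A more complex model can achieve a higher maximum likelihood yet a lower
% evidence P(D | M_i), because the integral spreads its prior over a larger
% parameter space.
```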

Journal ArticleDOI
TL;DR: A collection of meta-analysis techniques based on Bayesian statistics for interpreting, adjusting, and combining evidence to estimate parameters and outcomes important to the assessment of health technologies are described.
Abstract: This article describes a collection of meta-analysis techniques based on Bayesian statistics for interpreting, adjusting, and combining evidence to estimate parameters and outcomes important to the assessment of health technologies. The result of an analysis by the Confidence Profile Method is a joint posterior probability distribution for the parameters of interest, from which marginal distributions for any particular parameter can be calculated. The method can be used to analyze problems involving a variety of types of outcomes, a variety of measures of effect, and a variety of experimental designs. This article presents the elements necessary for analysis, including prior distributions, likelihood functions, and specific models for experimental designs that include adjustment for biases.
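
A minimal, generic illustration of the sort of evidence combination described above, using normal approximations to pool two hypothetical study results with a prior; this is not the Confidence Profile Method itself, and the numbers are invented.

```python
# Generic normal-approximation pooling of two study results with a prior
# (an illustrative calculation only; the log odds ratios, standard errors,
# and prior are invented).
import math

prior_mean, prior_sd = 0.0, 2.0                 # vague prior on the treatment log odds ratio
studies = [(-0.40, 0.25), (-0.15, 0.30)]        # (estimated log OR, standard error)

precision = 1.0 / prior_sd ** 2
weighted_sum = prior_mean * precision
for est, se in studies:
    precision += 1.0 / se ** 2                  # precisions add under normality
    weighted_sum += est / se ** 2

post_mean = weighted_sum / precision
post_sd = math.sqrt(1.0 / precision)
prob_benefit = 0.5 * (1.0 + math.erf(-post_mean / (post_sd * math.sqrt(2.0))))
print(f"posterior log OR: {post_mean:.3f} +/- {post_sd:.3f}")
print(f"posterior P(log OR < 0): {prob_benefit:.3f}")
```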

Book ChapterDOI
TL;DR: This tutorial paper discusses the theoretical basis of quantified maximum entropy, as a technique for obtaining probabilistic estimates of images and other positive additive distributions from noisy and incomplete data.
Abstract: This tutorial paper discusses the theoretical basis of quantified maximum entropy, as a technique for obtaining probabilistic estimates of images and other positive additive distributions from noisy and incomplete data. The analysis is fully Bayesian, with estimates always being obtained as probability distributions from which appropriate error bars can be found. This supersedes earlier techniques, even those using maximum entropy, which aimed to produce a single optimal distribution.
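
In the notation usually used for quantified maximum entropy (which may differ in detail from the paper), the entropic prior and the resulting posterior can be written as:

```latex
% Entropic prior on a positive additive distribution f relative to a default
% model m, combined with a Gaussian likelihood: the result is a full posterior
% rather than a single optimal image.
\[
  S(f, m) = \sum_i \Big( f_i - m_i - f_i \ln \tfrac{f_i}{m_i} \Big),
  \qquad
  \Pr(f \mid D, \alpha, m) \;\propto\; \exp\!\Big( \alpha S(f, m) - \tfrac{1}{2}\chi^2(f; D) \Big),
\]
% from which error bars on any feature of f follow as posterior expectations
% and variances.
```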

Journal ArticleDOI
01 Apr 1990
TL;DR: In this article, the authors extended the Stoneman approach by incorporating dynamic factors into modelling a firm's decision on the adoption of a divisible technology, such as learning by using and expectations on future prices of the innovation.
Abstract: Dynamic factors play a major role in the diffusion process of new technologies. Yet, dynamic decision-making has not been explicitly incorporated into models that describe intra-firm diffusion of a divisible technology. The basic approach in analyzing such processes is attributed to Mansfield (1968). Stoneman (1981) has modified and extended the Mansfield approach by offering a model capable of explaining desirable characteristics of the diffusion process. In particular, the Stoneman model gives rise to a sigmoid diffusion curve under the assumption that the entrepreneur learns by using the new technology in a Bayesian manner. The work presented in this paper extends the Stoneman approach by incorporating dynamic factors into modelling a firm's decision on the adoption of a divisible technology. Two factors are specifically considered: learning by using; and expectations on future prices of the innovation. Rosenberg (1976, 1982) stressed the role of these two factors on the take-off and shape of diffusion processes. Ireland and Stoneman (1986) investigated the role of price expectations. Here we study the effects of these factors on intra-firm diffusion of a new technology, allowing the decision makers to be risk averse and to vary according to their learning ability and firm size. The diffusion process is obtained as a result of one dynamic optimization task. In this way, decision makers are credited with the ability to consider in their present decisions the effects they have on future events. For instance, it may be worthwhile adopting a technology today if, as a result of learning, it would improve future performance, although at present it inflicts losses. Likewise, it may be preferable to delay or accelerate adoption if the price of the innovation is expected to change. Most works on intra-firm diffusion of an innovation process investigate the above mentioned factors in a static framework. Examples, in addition to Stoneman (1981), include the works of Kislev and Shchori-Bachrach (1973), Feder (1980), and Just and Zilberman (1983). Producers are assumed to choose the usage level of a new technology according to a static criterion. Since conditions are changing over time, entrepreneurs continually update decisions and a diffusion process occurs. Initial steps toward incorporating dynamic factors are found in Tonks (1986).

Proceedings Article
01 Jan 1990
TL;DR: A Bayesian approach to the problem of autonomous manipulation in the presence of state uncertainty is described and an attempt is made to find a plan for optimizing expected performance by searching for plans that optimize the robot's expected throughput.
Abstract: A Bayesian approach to the problem of autonomous manipulation in the presence of state uncertainty is described. Uncertainty is modeled with a probability distribution on the state space. Each plan (sequence of actions) defines a mapping on the state space and hence a posterior probability distribution. An attempt is made to find a plan for optimizing expected performance. The Bayesian framework is applied to a grasping problem. A planar polygon whose initial orientation is described by a uniform distribution and a frictionless parallel-jaw gripper is assumed in order to plan automatically a sequence of open-loop squeezing operations to reduce orientational uncertainty and grasp the object. Although many different performance measures are possible depending on the application, the approach is illustrated by searching for plans that optimize the robot's expected throughput.

Journal ArticleDOI
TL;DR: Bayesian methods simplify the analysis of data from sequential clinical trials and avoid certain paradoxes of frequentist inference, and offer a natural setting for the synthesis of expert opinion in deciding policy matters.
Abstract: Attitudes of biostatisticians toward implementation of the Bayesian paradigm have changed during the past decade due to the increased availability of computational tools for realistic problems. Empirical Bayes' methods, already widely used in the analysis of longitudinal data, promise to improve cancer incidence maps by accounting for overdispersion and spatial correlation. Hierarchical Bayes' methods offer a natural framework in which to demonstrate the bioequivalence of pharmacologic compounds. Their use for quantitative risk assessment and carcinogenesis bioassay is more controversial, however, due to uncertainty regarding specification of informative priors. Bayesian methods simplify the analysis of data from sequential clinical trials and avoid certain paradoxes of frequentist inference. They offer a natural setting for the synthesis of expert opinion in deciding policy matters. Both frequentist and Bayes' methods have a place in biostatistical practice.

Journal ArticleDOI
TL;DR: The Confidence Profile Method is a new Bayesian method that can be used to assess technologies where the available evidence involves a variety of experimental designs, types of outcomes, and effect measures; a varieties of biases; combinations of biases and nested biases; uncertainty about biases; an underlying variability in the parameter of interest; indirect evidence; and technology families.
Abstract: The Confidence Profile Method is a new Bayesian method that can be used to assess technologies where the available evidence involves a variety of experimental designs, types of outcomes, and effect...

Journal ArticleDOI
TL;DR: In this paper, a variety of Bayesian consensus models with respect to their conformance or lack thereof to the unanimity principle and a more general compromise principle are examined for a large set of probability forecast data from meteorology.
Abstract: When two forecasters agree regarding the probability of an uncertain event, should a decision maker adopt that probability as his or her own? A decision maker who does so is said to act in accord with the unanimity principle. We examine a variety of Bayesian consensus models with respect to their conformance or lack thereof to the unanimity principle and a more general compromise principle. In an analysis of a large set of probability forecast data from meteorology, we show how well the various models, when fit to the data, reflect the empirical pattern of conformance to these principles.

Book ChapterDOI
01 Jun 1990
TL;DR: An iterative categorization algorithm has been developed which attempts to get optimal Bayesian estimates of the probabilities that objects will display various features and is efficient, works well in the case of large data bases, and replicates the full range of empirical literature in human classification.
Abstract: A rational analysis tries to predict the behavior of a cognitive system from the assumption it is optimized to the environment. An iterative categorization algorithm has been developed which attempts to get optimal Bayesian estimates of the probabilities that objects will display various features. A prior probability is estimated that an object comes from a category and combined with conditional probabilities of displaying features if the object comes from the category. Separate Bayesian treatments are offered for the cases of discrete and continuous dimensions. The resulting algorithm is efficient, works well in the case of large data bases, and replicates the full range of empirical literature in human categorization.
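
A much-simplified sketch of an incremental Bayesian categorizer in the spirit described above, for discrete features only: each object joins the existing category, or starts a new one, that maximizes prior times smoothed feature likelihoods. The coupling parameter, smoothing, and toy objects are assumptions of this sketch, not the paper's specification.

```python
# Much-simplified incremental Bayesian categorizer sketch (discrete features).
# Each object is assigned to the existing category, or to a new one, that
# maximizes prior x smoothed feature likelihoods.
from collections import defaultdict

c = 0.5                      # prior "coupling" probability that two objects share a category
V = 2                        # number of values each binary feature can take

objects = [(1, 1, 0), (1, 1, 1), (0, 0, 0), (0, 0, 1), (1, 0, 0), (0, 1, 1)]
categories = []              # each category: {"n": count, "counts": {(dim, value): count}}

for obj in objects:
    n_total = sum(cat["n"] for cat in categories)
    scores = []
    for cat in categories:   # score for joining an existing category
        prior = c * cat["n"] / ((1 - c) + c * n_total)
        lik = 1.0
        for d, v in enumerate(obj):
            lik *= (cat["counts"][(d, v)] + 1) / (cat["n"] + V)  # smoothed P(feature | category)
        scores.append(prior * lik)
    # score for starting a brand-new category (uniform feature predictions)
    scores.append(((1 - c) / ((1 - c) + c * n_total)) * (1 / V) ** len(obj))

    best = max(range(len(scores)), key=scores.__getitem__)
    if best == len(categories):
        categories.append({"n": 0, "counts": defaultdict(int)})
    categories[best]["n"] += 1
    for d, v in enumerate(obj):
        categories[best]["counts"][(d, v)] += 1

print("category sizes:", [cat["n"] for cat in categories])
```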

Journal ArticleDOI
TL;DR: In this article, a hierarchical Bayesian linear model is used to predict outstanding claims on a portfolio of general insurance policies via the chain ladder technique, which can be expressed in the form of a linear model.
Abstract: The subject of predicting outstanding claims on a portfolio of general insurance policies is approached via the theory of hierarchical Bayesian linear models. This is particularly appropriate since the chain ladder technique can be expressed in the form of a linear model. The statistical methods which are applied allow the practitioner to use different modelling assumptions from those implied by a classical formulation, and to arrive at forecasts which have a greater degree of inherent stability. The results can also be used for other linear models. By using a statistical structure, a sound approach to the chain ladder technique can be derived. The Bayesian results allow the input of collateral information in a formal manner. Empirical Bayes results are derived which can be interpreted as credibility estimates. The statistical assumptions which are made in the modelling procedure are clearly set out and can be tested by the practitioner. The results based on the statistical theory form one part of the reserving procedure, and should be followed by expert interpretation and analysis. An illustration of the use of Bayesian and empirical Bayes estimation methods is given.
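
For orientation, the classical deterministic chain ladder calculation that the abstract says can be recast as a linear model is sketched below; this is the classical technique, not the hierarchical Bayes version, and the cumulative claims triangle is invented.

```python
# Classical (deterministic) chain ladder on an invented cumulative claims triangle.
import numpy as np

# Rows: accident years; columns: development years; NaN = not yet observed.
C = np.array([
    [1001., 1855., 2423., 2988.],
    [1113., 2103., 2774., np.nan],
    [1265., 2433., np.nan, np.nan],
    [1490., np.nan, np.nan, np.nan],
])
n = C.shape[0]

# Volume-weighted development factors f_j = sum_i C[i, j+1] / sum_i C[i, j],
# using only rows where both columns are observed.
factors = []
for j in range(n - 1):
    rows = ~np.isnan(C[:, j + 1])
    factors.append(C[rows, j + 1].sum() / C[rows, j].sum())

# Fill in the lower-right triangle by repeated multiplication.
for i in range(n):
    for j in range(n - 1):
        if np.isnan(C[i, j + 1]):
            C[i, j + 1] = C[i, j] * factors[j]

print("development factors:", np.round(factors, 3))
print("projected ultimate claims per accident year:", np.round(C[:, -1], 0))
```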

Journal ArticleDOI
TL;DR: The author presents a Bayesian method of testing possibly non-nested restrictions in a multivariate linear model and, using store-level scanner data, compares it with classical methods.
Abstract: The author presents a Bayesian method of testing possibly non-nested restrictions in a multivariate linear model and, using store-level scanner data, compares it with classical methods. The Bayesia...

Journal ArticleDOI
01 May 1990
TL;DR: A Bayesian detection model is formulated for a distributed system of sensors, wherein each sensor provides the central processor with a detection probability rather than an observation vector or a detection decision.
Abstract: A Bayesian detection model is formulated for a distributed system of sensors, wherein each sensor provides the central processor with a detection probability rather than an observation vector or a detection decision. Sufficiency relations are developed for comparing alternative sensor systems in terms of their likelihood functions. The sufficiency relations, characteristic Bayes risks, and receiver operating characteristics provide equivalent criteria for establishing a dominance order of sensor systems. Parametric likelihood functions drawn from the beta family of densities are presented, and analytic solutions for the decision model and dominance conditions are derived. The theory is illustrated with numerical examples highlighting the behavior of the model and benefits of fusing the detection probabilities.
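
A hedged sketch of the kind of fusion rule described above: sensor-reported detection probabilities are treated as data with beta-family likelihoods under "target present" versus "target absent", and the central processor updates the prior odds. The beta parameters, prior, and reports are illustrative assumptions, not values from the paper.

```python
# Fusion of sensor-reported detection probabilities with beta-family likelihoods
# under H1 ("target present") vs H0 ("target absent").
from scipy.stats import beta

prior_present = 0.1                 # P(H1) before any sensor reports
a1, b1 = 5.0, 2.0                   # reported probabilities tend to be high under H1
a0, b0 = 2.0, 5.0                   # ... and low under H0

reports = [0.72, 0.55, 0.81]        # detection probabilities sent by three sensors

odds = prior_present / (1.0 - prior_present)
for q in reports:
    odds *= beta.pdf(q, a1, b1) / beta.pdf(q, a0, b0)   # multiply likelihood ratios

posterior_present = odds / (1.0 + odds)
print(f"P(target present | reports) = {posterior_present:.3f}")
```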

Book
01 Jul 1990
TL;DR: In this paper, the authors define a Bayesian approximation of a belief function and show that combining the Bayesian approximations of belief functions is computationally less demanding than combining the belief functions themselves.
Abstract: An often mentioned obstacle for the use of Dempster-Shafer theory for the handling of uncertainty in expert systems is the computational complexity of the theory. One cause of this complexity is the fact that in Dempster-Shafer theory the evidence is represented by a belief function which is induced by a basic probability assignment, i.e. a probability measure on the powerset of possible answers to a question, and not by a probability measure on the set of possible answers to a question, as in a Bayesian approach. In this paper, we define a Bayesian approximation of a belief function and show that combining the Bayesian approximations of belief functions is computationally less demanding than combining the belief functions themselves, while in many practical applications replacing the belief functions by their Bayesian approximations will not essentially affect the result.
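
One standard construction of such a Bayesian approximation (often credited to Voorbraak) spreads each focal set's mass over its elements and renormalizes; whether this matches the paper's exact definition is an assumption. A small sketch with an invented mass assignment:

```python
# One standard Bayesian approximation of a belief function: each focal set's
# mass is credited to every element it contains, then the result is renormalized.
def bayesian_approximation(m):
    """m: dict mapping frozenset focal elements to basic probability masses."""
    scores = {}
    for focal, mass in m.items():
        for x in focal:
            scores[x] = scores.get(x, 0.0) + mass     # sum of masses of sets containing x
    norm = sum(mass * len(focal) for focal, mass in m.items())
    return {x: s / norm for x, s in scores.items()}

# Example: evidence about a suspect among {a, b, c} (invented masses).
m = {frozenset({"a"}): 0.4, frozenset({"a", "b"}): 0.4, frozenset({"a", "b", "c"}): 0.2}
print(bayesian_approximation(m))
# Combining two such approximations needs only pointwise multiplication and
# renormalization, instead of Dempster's rule over all pairs of focal sets.
```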

Journal ArticleDOI
TL;DR: It is shown that in the maximum likelihood estimator (MLE) and FMAPE algorithms, the only correct choice of initial image for the iterative procedure in the absence of a priori knowledge about the image configuration is a uniform field.
Abstract: The development and tests of an iterative reconstruction algorithm for emission tomography based on Bayesian statistical concepts are described. The algorithm uses the entropy of the generated image as a prior distribution, can be accelerated by the choice of an exponent, and converges uniformly to feasible images by the choice of one adjustable parameter. A feasible image has been defined as one that is consistent with the initial data (i.e. it is an image that, if truly a source of radiation in a patient, could have generated the initial data by the Poisson process that governs radioactive disintegration). The fundamental ideas of Bayesian reconstruction are discussed, along with the use of an entropy prior with an adjustable contrast parameter, the use of likelihood with data increment parameters as conditional probability, and the development of the new fast maximum a posteriori with entropy (FMAPE) algorithm by the successive substitution method. It is shown that in the maximum likelihood estimator (MLE) and FMAPE algorithms, the only correct choice of initial image for the iterative procedure in the absence of a priori knowledge about the image configuration is a uniform field.
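
For context, the classical maximum-likelihood (MLEM) iteration referred to in the abstract is sketched below, started from a uniform image as the abstract recommends; this is not the FMAPE algorithm itself (no entropy prior), and the tiny system matrix and counts are invented.

```python
# Classical MLEM iteration for emission tomography, started from a uniform image.
# This is the MLE iteration only (no entropy prior); the system matrix and data
# are invented for illustration.
import numpy as np

rng = np.random.default_rng(4)
A = rng.uniform(0.0, 1.0, size=(12, 4))     # detection probabilities: bins x pixels
true_img = np.array([5.0, 1.0, 8.0, 2.0])
y = rng.poisson(A @ true_img)               # measured counts per detector bin

lam = np.full(4, y.sum() / A.sum())         # uniform starting image
sens = A.sum(axis=0)                        # sensitivity: sum_i A_ij
for _ in range(200):
    expected = A @ lam                      # forward projection
    lam = lam / sens * (A.T @ (y / expected))   # multiplicative MLEM update

print("true image:   ", true_img)
print("MLEM estimate:", np.round(lam, 2))
```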

Book ChapterDOI
01 Jan 1990
TL;DR: In this article, a probabilistic reasoning mechanism for influence diagrams using interval rather than point valued probabilities is presented, where lower bounds on probabilities are stored at each node, and the resulting bounds for the transformed diagram are shown to be optimal within the class of constraints on probability distributions that can be expressed exclusively as lower bound on the component probabilities of the diagram.
Abstract: We describe a mechanism for performing probabilistic reasoning in influence diagrams using interval rather than point valued probabilities. We derive the procedures for node removal (corresponding to conditional expectation) and arc reversal (corresponding to Bayesian conditioning) in influence diagrams where lower bounds on probabilities are stored at each node. The resulting bounds for the transformed diagram are shown to be optimal within the class of constraints on probability distributions that can be expressed exclusively as lower bounds on the component probabilities of the diagram. Sequences of these operations can be performed to answer probabilistic queries with indeterminacies in the input and for performing sensitivity analysis on an influence diagram. The storage requirements and computational complexity of this approach are comparable to those for point-valued probabilistic inference mechanisms, making the approach attractive for performing sensitivity analysis and where probability information is not available. Limited empirical data on an implementation of the methodology are provided.

Journal ArticleDOI
TL;DR: Sahlin shows that Ramsey also had a proof of the value of collecting evidence, years before the works of Good [1967], Savage [1954] and others; this is an important theorem since it tells us why we should continue to collect new evidence, i.e. continue to make new observations.
Abstract: Frank Ramsey's note 'Weight or the Value of Knowledge' is a gem. In the note Ramsey proves that when knowledge or information is free it pays in expectation to acquire it. Collecting evidence means that one is gambling with expectation but this is always preferable to not gambling, on the assumption that information is free (see Ramsey's figure below). Ramsey also shows how much the increase in weight is. For a Bayesian, but also for a non-Bayesian, this is an important theorem since it tells us why we should continue to collect new evidence, i.e. continue to make new observations. Several proofs of this theorem can be found in the literature (locus classicus is I. J. Good's article 'On the Principle of Total Evidence' [1967] and the discussion in L. J. Savage's [1954] The Foundations of Statistics; an early non-Bayesian discussion of the problem can be found in C. S. Peirce's 'Note on the theory of the economy of research' (1876) [1958]; a discussion of Ramsey's note can be found in B. Skyrms [1990] and N.-E. Sahlin [1990]). For those interested in the history of probability theory it is a well-known fact that Ramsey in his celebrated paper 'Truth and Probability' (1926) [1990] laid the foundations of the modern theory of subjective probability. He showed how people's beliefs and desires can be measured by use of a betting method and that if a number of intuitive principles of rational behaviour are accepted, a measure used to represent our 'degrees of belief' will satisfy the laws of probability. He was the first one to prove the so-called Dutch book theorem. However, developing the theory of subjective probability he also laid the foundations of modern utility theory and decision theory (i.e. twenty years before J. von Neumann and O. Morgenstern [1947] developed their utility theory in Theory of Games and Economic Behavior and almost thirty years before L. J. Savage [1954] developed his Bayesian decision theory in The Foundations of Statistics). Astoundingly enough, this note shows that Ramsey also had a proof of the value of collecting evidence, years before the works of Good [1967], Savage [1954] and others. I have not altered Ramsey's unorthodox notation for probabilities, which is taken from J. M. Keynes' A Treatise on Probability [1921] (the word 'weight' in the title of Ramsey's note alludes to Keynes' book). When Ramsey, for example, writes aK/h = p, this should be read as the probability of a and K given h equals p. But except for this detail the note reads easily. NILS-ERIC SAHLIN
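
A compact modern statement of the theorem under discussion, in Good's expected-utility form (given for reference; Ramsey's own notation, as noted above, differs):

```latex
% With actions a, utility U, and free evidence K,
\[
  \mathbb{E}_{K}\Big[\max_{a}\,\mathbb{E}\big[U(a)\mid K\big]\Big]
  \;\ge\;
  \max_{a}\,\mathbb{E}_{K}\Big[\mathbb{E}\big[U(a)\mid K\big]\Big]
  \;=\;
  \max_{a}\,\mathbb{E}\big[U(a)\big],
\]
% because the expectation of a maximum is never smaller than the maximum of an
% expectation. Hence, when the evidence costs nothing, acting after observing it
% is at least as good in expectation as acting without it, with equality only
% when the observation cannot change the optimal act.
```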

01 Jan 1990
TL;DR: Bayesian analysis shows that a small p-value may not provide credible evidence that an anomalous phenomenon exists, and an easily applied alternative method is described and applied to an example from the literature.
Abstract: Data from experiments that use random event generators are usually analyzed by classical (frequentist) statistical tests, which summarize the statistical significance of the test statistic as a p-value. However, classical statistical tests are frequently inappropriate to these data, and the resulting p-values can grossly overestimate the significance of the result. Bayesian analysis shows that a small p-value may not provide credible evidence that an anomalous phenomenon exists. An easily applied alternative methodology is described and applied to an example from the literature.
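
A hedged sketch of the article's point with an invented example: a result from a very large forced-choice experiment can have a small classical p-value while a Bayes factor with a diffuse alternative actually favours the null hypothesis.

```python
# Small p-value vs Bayes factor in a large binomial experiment.
# H0: hit rate = 1/2; H1 spreads the hit rate uniformly over (0, 1).
# The trial and hit counts below are hypothetical.
from scipy.stats import binom

n, k = 1_000_000, 501_250                   # trials and hits (invented)

# Classical two-sided p-value for H0: theta = 0.5.
p_value = 2 * binom.sf(k - 1, n, 0.5)

# Bayes factor B01 = P(k | H0) / P(k | H1); under the uniform prior,
# P(k | H1) = integral of Binomial(k; n, theta) d theta = 1 / (n + 1).
B01 = (n + 1) * binom.pmf(k, n, 0.5)

print(f"p-value ~ {p_value:.4f}")                     # roughly 0.01: 'significant'
print(f"Bayes factor in favour of H0 ~ {B01:.1f}")    # yet H0 is favoured ~35:1
```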