Showing papers on "Bayes' theorem published in 1991"


Book
01 Jan 1991
TL;DR: In this book, the authors introduce the rudiments of linear algebra and multivariate normal theory, Neyman-Pearson and Bayes detectors, and maximum likelihood and Bayes estimators.
Abstract: 1. Introduction. 2. Rudiments of Linear Algebra and Multivariate Normal Theory. 3. Sufficiency and MVUB Estimators. 4. Neyman-Pearson Detectors. 5. Bayes Detectors. 6. Maximum Likelihood Estimators. 7. Bayes Estimators. 8. Minimum Mean-Squared Error Estimators. 9. Least Squares. 10. Linear Prediction. 11. Modal Analysis.

1,670 citations


Journal ArticleDOI
TL;DR: In this article, a formal definition of perfect Bayesian equilibrium (PBE) for multi-period games with observed actions is introduced: the strategies must form a Bayesian equilibrium for each continuation game, given the specified beliefs, and beliefs are updated from period to period in accordance with Bayes' rule whenever possible.

541 citations


Journal ArticleDOI
TL;DR: A new empirical Bayes estimator, with parameters simply estimated by moments, is proposed and compared with iterative alternatives suggested by Clayton and Kaldor.
Abstract: Methods for estimating regional mortality and disease rates with a view to mapping disease are discussed. A new empirical Bayes estimator with parameters simply estimated by moments is proposed and compared with iterative alternatives suggested by Clayton and Kaldor. The author develops a local shrinkage estimator in which a crude disease rate is shrunk toward a local neighborhood rate. The estimators are compared using simulations and an empirical example based on infant mortality data for Auckland, New Zealand.

349 citations
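
As a rough illustration of the kind of moment-based empirical Bayes shrinkage described above, here is a minimal Python sketch with invented data; it shrinks crude area rates toward the pooled (global) rate, whereas the paper's local shrinkage estimator uses a neighbourhood rate as the target, and its exact moment formulas may differ.

```python
import numpy as np

# Toy data (hypothetical): observed deaths y_i and person-years n_i per area.
y = np.array([3, 0, 7, 2, 12, 5, 1, 9], dtype=float)
n = np.array([800, 350, 1200, 600, 2500, 1500, 400, 2100], dtype=float)

r = y / n                      # crude area-specific rates
m = y.sum() / n.sum()          # pooled rate, used as the prior mean estimate

# Method-of-moments estimate of the between-area variance of the true rates:
# weighted variance of crude rates minus the average Poisson sampling variance.
w = n / n.sum()
between = np.sum(w * (r - m) ** 2) - m / np.mean(n)
A = max(between, 0.0)          # truncate at zero if sampling noise dominates

# Shrink each crude rate toward the pooled rate; areas with small n_i shrink more.
shrink = A / (A + m / n)
eb_rate = m + shrink * (r - m)

for i, (ri, ei) in enumerate(zip(r, eb_rate)):
    print(f"area {i}: crude {ri:.5f} -> EB {ei:.5f}")
```

Areas with small person-year totals are pulled most strongly toward the pooled rate, which is the stabilising effect that disease mapping relies on.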


Journal ArticleDOI
TL;DR: In this paper, the output of human cognition is predicted from the assumption that it is an optimal response to the information processing demands of the environment, and a methodology called rational analysis is described for deriving predictions about cognitive phenomena using optimization assumptions.
Abstract: Can the output of human cognition be predicted from the assumption that it is an optimal response to the information-processing demands of the environment? A methodology called rational analysis is described for deriving predictions about cognitive phenomena using optimization assumptions. The predictions flow from the statistical structure of the environment and not the assumed structure of the mind. Bayesian inference is used, assuming that people start with a weak prior model of the world which they integrate with experience to develop stronger models of specific aspects of the world. Cognitive performance maximizes the difference between the expected gain and cost of mental effort. (1) Memory performance can be predicted on the assumption that retrieval seeks a maximal trade-off between the probability of finding the relevant memories and the effort required to do so; in (2) categorization performance there is a similar trade-off between accuracy in predicting object features and the cost of hypothesis formation; in (3) causal inference the trade-off is between accuracy in predicting future events and the cost of hypothesis formation; and in (4) problem solving it is between the probability of achieving goals and the cost of both external and mental problem-solving search. The implementation of these rational prescriptions in a neurally plausible architecture is also discussed.

335 citations


Journal ArticleDOI
TL;DR: This paper sets out a Bayesian representation of the model in the spirit of Kalbfleisch (1978) and discusses inference using Monte Carlo methods.
Abstract: Many analyses in epidemiological and prognostic studies and in studies of event history data require methods that allow for unobserved covariates or "frailties." Clayton and Cuzick (1985, Journal of the Royal Statistical Society, Series A 148, 82-117) proposed a generalization of the proportional hazards model that implemented such random effects, but the proof of the asymptotic properties of the method remains elusive, and practical experience suggests that the likelihoods may be markedly nonquadratic. This paper sets out a Bayesian representation of the model in the spirit of Kalbfleisch (1978, Journal of the Royal Statistical Society, Series B 40, 214-221) and discusses inference using Monte Carlo methods.

306 citations


Journal ArticleDOI
TL;DR: In this article, the authors derive Bayes estimators of the mean lifetime and the reliability function in the exponential life-testing model. The loss functions used are asymmetric, reflecting that, in most situations of interest, overestimating is more harmful than underestimating.

213 citations


Journal ArticleDOI
TL;DR: In situations in which an additional or even primary goal of analysis is to reach a set of decisions based on the data, Bayes and empirical-Bayes adjustments can provide a better basis for the decisions than conventional procedures.
Abstract: Rothman (Epidemiology 1990; 1:43–46) recommends against adjustments for multiple comparisons. Implicit in his recommendation, however, is an assumption that the sole objective of the data analysis is to report and scientifically interpret the data. We concur with his recommendation when this assumption holds.

141 citations


BookDOI
01 Jan 1991
TL;DR: This book discusses the theoretical nature of probability in the classroom, computers in probability education, and changes in probability, statistics, and their applications.
Abstract: 1: The Educational Perspective.- 1. Aims and Rationale.- 2. Views on Didactics.- Fischer's open mathematics.- Fischbein's interplay between intuitions and mathematics.- Freudenthal's didactical phenomenology.- Bauersfeld's subjective domains of experience.- 3. Basic Ideas of the Chapters.- A probabilistic perspective.- Empirical research in understanding probability.- Analysis of the probability curriculum.- The theoretical nature of probability in the classroom.- Computers in probability education.- Psychological research in probabilistic understanding.- 2: A Probabilistic Perspective.- 1. History and Philosophy.- Tardy conceptualization of probability.- The rule of 'favourable to possible'.- Expectation and frequentist applications.- Inference and the normal law.- Foundations and obstacles.- Axiomatization of probability.- Modern views on probability.- 2. The Mathematical Background.- Model-building.- Assigning probabilities.- Conditional probability.- Random variables and distributions.- Central theorems.- Standard situations.- 3. Paradoxes and Fallacies.- Chance assignment.- Expectation.- Independence and dependence.- Logical curiosities.- Concluding comments.- 3: Empirical Research in Understanding Probability.- 1. Research Framework.- Peculiarities of stochastics and its teaching.- Research in psychology and didactics.- 2. Sample Space and Symmetry View.- No.1: Tossing a counter.- No.2: Hat lottery.- 3. Frequentist Interpretation.- No.3: The six children.- No.4: Snowfall.- 4. Independence and Dependence.- No.5: Dependent urns.- No.6: Independent urns.- 5. Statistical Inference.- No.7: Coin tossing.- No.8: Drawing from a bag.- 6. Concluding Comments.- Empirical research.- Teaching consequences.- 4: Analysis of the Probability Curriculum.- 1. General Aims.- Objectives.- Ideas.- Skills.- Inclination to apply ideas and skills.- 2. General Curriculum Issues.- Aspects of the curriculum.- Curriculum sources.- Choice of orientation.- 3. Curriculum Issues in Probability.- Student readiness.- Different approaches to probability curriculum.- 4. Approaches to the Probability Curriculum.- What to look for?.- Research needs.- 5: The Theoretical Nature of Probability in the Classroom.- 1. Approaches towards Teaching.- Structural approaches.- 2. The Theoretical Nature of Stochastic Knowledge.- Approaches to teaching probability.- Theoretical nature of probability.- Objects, signs and concepts.- 3. Didactic Means to Respect the Theoretical Nature of Probability.- Interrelations between mathematics and exemplary applications.- Means of representation and activities.- 4. On the Didactic Organization of Teaching Processes.- The role of teachers.- The role of task systems.- 5. Discussion of an Exemplary Task.- Didactic framework of the task.- Classroom observations.- Implications for task systems.- 6: Computers in Probability Education.- 1. Computers and Current Practice in Probability Teaching.- Pedagogical problems and perspectives.- Changes in probability, statistics, and in their applications.- Changing technology and its influence on pedagogical ideas.- 2. Computers as Mathematical Utilities.- The birthday problem.- Exploring Bayes' formula.- Binomial probabilities.- Programming languages and other tools.- 3. Simulation as a Problem Solving Method.- Integrating simulation, algorithmics and programming.- Simulation as an alternative to solving problems analytically.- The potential of computer-aided simulation.- Software for simulation and modelling.- Computer generated random numbers.- 4. Simulation and Data Analysis for Providing an Empirical Background for Probability.- Making theoretical objects experiential.- Beginning with 'limited' technological equipment.- Laws of large numbers and frequentist interpretation.- Random sampling and sampling variation.- Structure in random sequences.- A simulation and modelling tool as companion of the curriculum.- Games and strategy.- 5. Visualization, Graphical Methods and Animation.- 6. Concluding Remarks.- Software/Bibliography.- 7: Psychological Research in Probabilistic Understanding.- 1. Traditional Research Paradigms.- Probability learning.- Bayesian revision.- Disjunctive and conjunctive probabilities.- Correlation.- 2. Current Research Paradigms.- Judgemental heuristics.- Structure and process models of thinking.- Probability calibration.- Event-related brain potential research.- Overview on research paradigms.- 3. Critical Dimensions of Educational Relevance.- The conception of the task.- The conception of the subject.- The conception of the subject-task relation.- 4. Developmental Approaches on the Acquisition of the Probability Concept.- The cognitive-developmental approach of Piaget and Inhelder.- Fischbein's learning-developmental approach.- Information processing approaches.- Semantic-conceptual and operative knowledge approach.- Discussion of the developmental approaches.- Looking Forward.

126 citations


Journal ArticleDOI

124 citations


Journal ArticleDOI
TL;DR: It is shown that approximations to the generalization error of the Bayes optimal algorithm can be achieved by learning algorithms that use a two-layer neural net to learn a perceptron.
Abstract: The generalization error of the Bayes optimal classification algorithm when learning a perceptron from noise-free random training examples is calculated exactly using methods of statistical mechanics. It is shown that if an assumption of replica symmetry is made, then, in the thermodynamic limit, the error of the Bayes optimal algorithm is less than the error of a canonical stochastic learning algorithm, by a factor approaching √2 as the ratio of the number of training examples to perceptron weights grows. In addition, it is shown that approximations to the generalization error of the Bayes optimal algorithm can be achieved by learning algorithms that use a two-layer neural net to learn a perceptron.

119 citations
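
The result above can be illustrated numerically. The sketch below, with toy dimensions and a rejection-sampling shortcut of my own, approximates the Bayes-optimal rule as a majority vote over weight vectors consistent with the training set (the version space) and compares it with a single consistent vector (a Gibbs-style learner); the paper's exact statistical-mechanics calculation is not attempted here.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, TEST = 6, 10, 5000            # input dimension, training size, test size (toy values)

teacher = rng.standard_normal(N)     # target perceptron
X = rng.standard_normal((P, N))      # noise-free random training examples
y = np.sign(X @ teacher)

# Rejection-sample weight vectors from the prior that are consistent with the
# training set (the "version space").
cand = rng.standard_normal((400_000, N))
consistent = cand[np.all(np.sign(cand @ X.T) == y, axis=1)]
print("version-space samples kept:", len(consistent))

Xt = rng.standard_normal((TEST, N))
yt = np.sign(Xt @ teacher)

# Approximate Bayes-optimal rule: majority vote over version-space samples.
votes = np.sign(Xt @ consistent.T)
bayes_pred = np.sign(votes.sum(axis=1))

# Gibbs-style rule: a single randomly chosen consistent vector.
gibbs_pred = np.sign(Xt @ consistent[0])

print("majority-vote error:", np.mean(bayes_pred != yt))
print("single-sample error:", np.mean(gibbs_pred != yt))
```

The majority vote plays the role of the "two-layer net" in the abstract: a committee of simple perceptrons whose combined output approximates the Bayes optimal decision.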


Journal ArticleDOI
TL;DR: In this article, a hierarchical Bayes (HB) approach is proposed for prediction in general mixed linear models; the results find application in small area estimation, and the model unifies and extends a number of existing models.
Abstract: This paper introduces a hierarchical Bayes (HB) approach for prediction in general mixed linear models. The results find application in small area estimation. Our model unifies and extends a number of models previously considered in this area. Computational formulas for obtaining the Bayes predictors and their standard errors are given in the general case. The methods are applied to two actual data sets. Also, in a special case, the HB predictors are shown to possess some interesting frequentist properties.

Journal ArticleDOI
TL;DR: The Bayesian approach to statistical modeling and analysis proceeds by representing all uncertainties in the form of probability distributions, as mentioned in this paper; learning from new data is accomplished by application of Bayes' Theorem, the latter providing a joint probability description of uncertainty for all model unknowns.
Abstract: The Bayesian (or integrated likelihood) approach to statistical modelling and analysis proceeds by representing all uncertainties in the form of probability distributions. Learning from new data is accomplished by application of Bayes's Theorem, the latter providing a joint probability description of uncertainty for all model unknowns. To pass from this joint probability distribution to a collection of marginal summary inferences for specified individual unknowns (or subsets of them) requires appropriate integration of the joint distribution. In all but simple stylized problems, these (typically high-dimensional) integrations will have to be performed numerically. This need for efficient simultaneous calculation of potentially many numerical integrals poses novel computational problems. Developments over the past decade are reviewed, including adaptive quadrature, adaptive Monte Carlo, and a variant of a Markov chain simulation procedure known as the Gibbs sampler.
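
As a concrete toy illustration of the Gibbs sampler mentioned at the end of the abstract (my own example, not taken from the paper), the sketch below alternates draws from the full conditionals of the mean and variance of a normal model; the retained draws approximate the joint posterior that would otherwise require numerical integration.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(2.0, 1.5, size=50)          # simulated data
n, ybar = len(y), y.mean()

# Priors (assumed for the sketch): mu ~ N(0, 10^2), 1/sigma^2 ~ Gamma(a0, b0)
mu0, tau0sq, a0, b0 = 0.0, 100.0, 0.01, 0.01

mu, sig2 = ybar, y.var()
mus, sig2s = [], []
for it in range(5000):
    # Full conditional of mu given sigma^2: normal
    prec = 1.0 / tau0sq + n / sig2
    mean = (mu0 / tau0sq + n * ybar / sig2) / prec
    mu = rng.normal(mean, np.sqrt(1.0 / prec))
    # Full conditional of sigma^2 given mu: inverse gamma
    a = a0 + n / 2.0
    b = b0 + 0.5 * np.sum((y - mu) ** 2)
    sig2 = 1.0 / rng.gamma(a, 1.0 / b)
    if it >= 1000:                          # discard burn-in
        mus.append(mu); sig2s.append(sig2)

print("posterior mean of mu     :", np.mean(mus))
print("posterior mean of sigma^2:", np.mean(sig2s))
```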

Journal ArticleDOI
TL;DR: In this article, Bayes, empirical Bayes, and Bayes empirical Bayes solutions are given to the problems of interval estimation, decision making, and point estimation of the population size N under model Mt, which allows the probability of capture to vary between sampling occasions.
Abstract: In multiple capture-recapture surveys, the probability of capture can vary between sampling occasions. The model accounting for this variation is known as Mt. Bayes, empirical Bayes, and Bayes empirical Bayes solutions are given to the problems of interval estimation, decision making, and point estimation of the population size N. When the number of sampling occasions is small to moderate and the number of recaptured units observed on each sampling occasion is moderate, estimates obtained from empirical Bayes and Bayes empirical Bayes methods compare closely to Bayesian methods using a reference prior distribution for the capture probabilities. However, when the number of sampling occasions is large and the number of recaptured units observed on each sampling occasion is small, inferences obtained using different reference priors can differ considerably.
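
A minimal grid-posterior sketch for model Mt is given below, under independent uniform priors on the occasion-specific capture probabilities and a flat prior on N over a finite grid; these priors, the data, and the grid are illustrative assumptions of mine, and the sketch does not reproduce the paper's empirical Bayes or Bayes empirical Bayes machinery. Integrating out the capture probabilities leaves a posterior over N proportional to the prior times N!/(N-M)! * prod_j (N-n_j)!/(N+1)!.

```python
import numpy as np
from scipy.special import gammaln

# Toy capture-recapture data (hypothetical): n_j animals caught on each of T occasions,
# M distinct animals seen in total.
n_j = np.array([30, 22, 27, 18, 25])
M = 61

# Grid of candidate population sizes and a flat prior over the grid (an assumption).
N_grid = np.arange(M, 401)

def log_marginal(N):
    # log of N!/(N-M)! * prod_j (N - n_j)! / (N + 1)!  (uniform priors on p_j integrated out)
    return (gammaln(N + 1) - gammaln(N - M + 1)
            + np.sum(gammaln(N - n_j + 1)) - len(n_j) * gammaln(N + 2))

logp = np.array([log_marginal(N) for N in N_grid])
post = np.exp(logp - logp.max())
post /= post.sum()

print("posterior mode of N:", N_grid[post.argmax()])
print("posterior mean of N:", float(np.sum(N_grid * post)))
```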

Journal ArticleDOI
TL;DR: In this article, it was shown that for a sufficiently large sample size, asymptotic equivalence of the network-generated rule to the theoretical Bayes optimal rule against the true distribution governing the occurrence of data follows from the law of large numbers.
Abstract: It is demonstrated both theoretically and experimentally that, under appropriate assumptions, a neural network pattern classifier implemented with a supervised learning algorithm generates the empirical Bayes rule that is optimal against the empirical distribution of the training sample. It is also shown that, for a sufficiently large sample size, asymptotic equivalence of the network-generated rule to the theoretical Bayes optimal rule against the true distribution governing the occurrence of data follows immediately from the law of large numbers. It is proposed that a Bayes statistical decision approach leads naturally to a probabilistic definition of the valid generalization which a neural network can be expected to generate from a finite training sample.
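
The point can be seen in a small simulation of my own: a one-layer logistic unit trained by supervised learning (gradient descent on cross-entropy) estimates the class-posterior probability, and thresholding that estimate nearly reproduces the Bayes rule of the generating distribution, here two unit-variance Gaussians whose exact Bayes boundary is x = 0.5.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two Gaussian classes in one dimension; the Bayes rule is a threshold at x = 0.5.
n = 4000
x0 = rng.normal(0.0, 1.0, n)      # class 0
x1 = rng.normal(1.0, 1.0, n)      # class 1
x = np.concatenate([x0, x1])
t = np.concatenate([np.zeros(n), np.ones(n)])

# One-layer "network": p(class 1 | x) = sigmoid(w*x + b), fit by gradient descent.
w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    w -= lr * np.mean((p - t) * x)
    b -= lr * np.mean(p - t)

print("learned decision threshold:", -b / w)   # should be close to the Bayes boundary 0.5

xt = rng.normal(0.5, 1.5, 20000)                       # test inputs
true_post = 1.0 / (1.0 + np.exp(-(xt - 0.5)))          # exact posterior for this setup
net_post = 1.0 / (1.0 + np.exp(-(w * xt + b)))
print("disagreement with Bayes rule:",
      np.mean((net_post > 0.5) != (true_post > 0.5)))
```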

Journal ArticleDOI
TL;DR: This paper outlines the empirical Bayes approach, including the development and comparison of approaches based on parametric and non-parametric priors, the importance of accounting for uncertainty in the estimated prior, the output and interpretation of fixed versus random effects approaches to estimating population values, the estimation of histograms, and key considerations in the use and interpretation of empirical Bayes methods.
Abstract: A compound sampling model, where a unit-specific parameter is sampled from a prior distribution and observations are then generated by a sampling distribution depending on the parameter, underlies a wide variety of biopharmaceutical data. For example, in a multi-centre clinical trial the true treatment effect varies from centre to centre. Observed treatment effects deviate from these true effects through sampling variation. Knowledge of the prior distribution allows use of Bayesian analysis to compute the posterior distribution of clinic-specific treatment effects (frequently summarized by the posterior mean and variance). More commonly, with the prior not completely specified, observed data can be used to estimate the prior, which is then used to produce the posterior distribution: an empirical Bayes (or variance component) analysis. In the empirical Bayes model the estimated prior mean gives the typical treatment effect and the estimated prior standard deviation indicates the heterogeneity of treatment effects. In both the Bayes and empirical Bayes approaches, estimated clinic effects are shrunk from the single-clinic estimates towards a common value. This shrinkage produces more efficient estimates. In addition, the compound model helps structure approaches to ranking and selection, provides adjustments for multiplicity, allows estimation of the histogram of clinic-specific effects, and structures incorporation of external information. This paper outlines the empirical Bayes approach. Coverage will include development and comparison of approaches based on parametric priors (for example, a Gaussian prior with unknown mean and variance) and non-parametric priors, discussion of the importance of accounting for uncertainty in the estimated prior, comparison of the output and interpretation of fixed and random effects approaches to estimating population values, estimating histograms, and identification of key considerations in the use and interpretation of empirical Bayes methods.
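
A minimal method-of-moments sketch of the normal-normal version of this compound model follows, with invented centre-level estimates; it is illustrative only, and (as the abstract warns) it ignores the uncertainty in the estimated prior.

```python
import numpy as np

# Hypothetical centre-specific treatment-effect estimates and their sampling variances.
est = np.array([0.40, -0.10, 0.25, 0.60, 0.05, 0.35])
var = np.array([0.04, 0.09, 0.02, 0.10, 0.03, 0.05])

# Method-of-moments estimates of the prior (Gaussian, unknown mean and variance).
w = 1.0 / var
prior_mean = np.sum(w * est) / np.sum(w)
prior_var = max(np.var(est, ddof=1) - np.mean(var), 0.0)

# Empirical Bayes posterior means: shrink each centre toward the prior mean.
shrink = prior_var / (prior_var + var)     # less shrinkage for precisely estimated centres
eb = prior_mean + shrink * (est - prior_mean)
eb_var = shrink * var                       # naive; ignores uncertainty in the estimated prior

for i in range(len(est)):
    print(f"centre {i}: raw {est[i]:+.2f} -> EB {eb[i]:+.2f} (shrinkage {1 - shrink[i]:.2f})")
```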

Journal ArticleDOI
TL;DR: The problem of distortionless encoding when the parameters of the probabilistic model of a source are unknown is considered from a statistical decision theory point of view and a class of predictive and nonpredictive codes is proposed that are optimal within this framework.
Abstract: The problem of distortionless encoding when the parameters of the probabilistic model of a source are unknown is considered from a statistical decision theory point of view. A class of predictive and nonpredictive codes is proposed that are optimal within this framework. Specifically, it is shown that the codeword length of the proposed predictive code coincides with that of the proposed nonpredictive code for any source sequence. A bound for the redundancy for universal coding is given in terms of the supremum of the Bayes risk. If this supremum exists, then there exists a minimax code whose mean code length approaches it in the proposed class of codes, and the minimax code is given by the Bayes solution relative to the prior distribution of the source parameters that maximizes the Bayes risk.
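
One concrete instance of a Bayes code in this framework is the mixture code for a Bernoulli source with unknown parameter. The sketch below (my own toy example, using a Beta(1/2, 1/2) prior in the Krichevsky-Trofimov style) shows the coincidence of predictive and nonpredictive codeword lengths claimed above, since the chain rule factorises the mixture probability into posterior-predictive terms.

```python
import numpy as np
from scipy.special import betaln

rng = np.random.default_rng(3)
x = (rng.random(200) < 0.3).astype(int)        # Bernoulli(0.3) source; parameter unknown to the coder
n, k = len(x), int(x.sum())
a = b = 0.5                                     # Beta(1/2, 1/2) prior (an assumption)

# Nonpredictive (mixture) codelength: -log2 of the marginal probability of the whole sequence.
mixture_len = -(betaln(k + a, n - k + b) - betaln(a, b)) / np.log(2)

# Predictive codelength: code each symbol with its posterior-predictive probability.
pred_len, heads, seen = 0.0, 0, 0
for xi in x:
    p1 = (heads + a) / (seen + a + b)           # P(next symbol = 1 | past)
    pred_len += -np.log2(p1 if xi == 1 else 1.0 - p1)
    heads += xi
    seen += 1

p_hat = k / n
ideal = n * (-(p_hat) * np.log2(p_hat) - (1 - p_hat) * np.log2(1 - p_hat))   # empirical entropy
print(f"mixture code : {mixture_len:.2f} bits")
print(f"predictive   : {pred_len:.2f} bits (same up to floating-point rounding)")
print(f"redundancy   : {mixture_len - ideal:.2f} bits (roughly 0.5*log2(n))")
```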

Journal ArticleDOI
TL;DR: Bayes' Theorem is applied to the problem of estimating from past data the probabilities that patients have certain diseases, given their symptoms, and the resulting computer diagnoses are compared with those given by the "Simple Bayes" method, by the method of classification trees, and with the preliminary and final diagnoses made by physicians.
Abstract: The paper describes an application of Bayes' Theorem to the problem of estimating from past data the probabilities that patients have certain diseases, given their symptoms. The data consist of hospital records of patients who suffered acute abdominal pain. For each patient the records showed a large number of symptoms and the final diagnosis, classified into one of nine diseases or diagnostic groups. Most current methods of computer diagnosis use the "Simple Bayes" model, in which the symptoms are assumed to be independent, but the present paper does not make this assumption. Those symptoms (or lack of symptoms) which are most relevant to the diagnosis of each disease are identified by a sequence of chi-squared tests. The computer diagnoses obtained as a result of the implementation of this approach are compared with those given by the "Simple Bayes" method, by the method of classification trees (CART), and also with the preliminary and final diagnoses made by physicians.
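
For contrast with the dependence-aware method described above, here is a minimal "Simple Bayes" (independence-assuming) diagnostic calculator on made-up symptom data; the paper's approach drops the independence assumption and selects relevant symptoms by chi-squared tests, neither of which this sketch attempts.

```python
import numpy as np

# Hypothetical training records: each row is (symptom_1, symptom_2, symptom_3, disease).
records = [
    (1, 0, 1, "appendicitis"), (1, 1, 1, "appendicitis"), (1, 0, 0, "appendicitis"),
    (0, 1, 0, "dyspepsia"),    (0, 1, 1, "dyspepsia"),    (0, 0, 0, "dyspepsia"),
    (1, 1, 0, "cholecystitis"), (0, 1, 1, "cholecystitis"),
]
diseases = sorted({r[-1] for r in records})
n_sym = 3

# Estimate P(disease) and P(symptom_j = 1 | disease) with add-one smoothing.
prior, cond = {}, {}
for d in diseases:
    rows = [r for r in records if r[-1] == d]
    prior[d] = len(rows) / len(records)
    cond[d] = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2) for j in range(n_sym)]

def simple_bayes(symptoms):
    """Posterior over diseases assuming symptoms are independent given the disease."""
    logp = {}
    for d in diseases:
        lp = np.log(prior[d])
        for j, s in enumerate(symptoms):
            p = cond[d][j]
            lp += np.log(p if s else 1.0 - p)
        logp[d] = lp
    z = np.logaddexp.reduce(list(logp.values()))
    return {d: float(np.exp(lp - z)) for d, lp in logp.items()}

print(simple_bayes((1, 0, 1)))   # posterior for a new patient's symptom pattern
```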

Journal ArticleDOI
TL;DR: A random parameter or hierarchical model approach to modeling the small-domain probabilities of the characteristic of interest and the probabilities of nonresponse is presented.
Abstract: A goal in many survey sampling problems is to estimate the probability that elements of the population within various small areas or domains have some characteristic or fall in some particular survey classification. The estimation problem is typically complicated by nonrandom nonresponse in that the probability that a unit responds to the survey may be related to the characteristic of interest. This article presents a random parameter or hierarchical model approach to modeling the small-domain probabilities of the characteristic of interest and the probabilities of nonresponse. The general model allows nonresponse probabilities to depend on a unit's survey classification. A special case of the model treats nonresponse as occurring at random. Empirical Bayes methods are used to obtain parameter estimates under the hierarchical models. The method is illustrated using data from the National Crime Survey.

Journal ArticleDOI
TL;DR: An iterative procedure is described as a generalization of Bayes' method of updating an a priori assignment over the power set of the frame of discernment using uncertain evidence.
Abstract: An iterative procedure is described as a generalization of Bayes' method of updating an a priori assignment over the power set of the frame of discernment using uncertain evidence. In the context of probability kinematics the law of commutativity holds and the convergence is well behaved; the probability assignment of each updating piece of evidence is retained. A general assignment method is also discussed for combining evidence without reference to any prior. The methods described here can be used in the field of Artificial Intelligence for common-sense reasoning and, more specifically, for treating uncertainty in Expert Systems. They are also relevant for nonmonotonic reasoning, abduction, and learning theory.
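
The single-step probability-kinematics update underlying such procedures can be illustrated with Jeffrey's rule on a small discrete frame (toy numbers of my own): uncertain evidence reassigns probability across a partition while preserving the conditional probabilities within each cell. The paper's iterative and general assignment methods are not reproduced here.

```python
import numpy as np

# Frame of discernment: four exhaustive, mutually exclusive hypotheses.
hyp = ["h1", "h2", "h3", "h4"]
prior = np.array([0.4, 0.3, 0.2, 0.1])

# Partition induced by the evidence: E = {h1, h2}, not-E = {h3, h4}.
E = np.array([True, True, False, False])

def jeffrey_update(p, part_mask, q_part):
    """Jeffrey's rule: new P(h) = q * P(h | E) for h in E, (1 - q) * P(h | not E) otherwise."""
    pE = p[part_mask].sum()
    return np.where(part_mask,
                    q_part * p / pE,
                    (1.0 - q_part) * p / (1.0 - pE))

# Uncertain evidence says the cell E now has probability 0.8 (it was 0.7 under the prior).
post = jeffrey_update(prior, E, 0.8)
print(dict(zip(hyp, np.round(post, 3))), "sums to", post.sum())

# A second uncertain-evidence update on the same partition simply resets P(E);
# within each cell the relative probabilities are preserved throughout.
post2 = jeffrey_update(post, E, 0.6)
print(dict(zip(hyp, np.round(post2, 3))))
```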

Book ChapterDOI
01 Jul 1991
TL;DR: This paper illustrates how these apparent difficulties can be overcome, in both parametric and nonparametric settings, by the Gibbs sampler approach to Bayesian computation.
Abstract: Survival models used in biomedical and reliability contexts typically involve data censoring, and may also involve constraints in the form of ordered parameters. In addition, inferential interest often focuses on non-linear functions of natural model parameters. From a Bayesian statistical analysis perspective, these features combine to create difficult computational problems by seeming to require (multi-dimensional) numerical integrals over awkwardly defined regions. This paper illustrates how these apparent difficulties can be overcome, in both parametric and nonparametric settings, by the Gibbs sampler approach to Bayesian computation.
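
A minimal data-augmentation sketch in the spirit of the chapter, though with a toy censored normal model of my own rather than the survival models treated there: censored observations are imputed from their truncated conditional distribution, then the parameters are drawn as if the data were complete, and the two steps are alternated.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated data, right-censored at c.
true_mu, true_sd, c = 5.0, 1.0, 5.5
t = rng.normal(true_mu, true_sd, 80)
obs = np.minimum(t, c)
cens = t > c                      # True where only "t > c" is known

def draw_truncated_normal(mu, sd, low, size):
    """Rejection sampler for N(mu, sd) truncated to (low, inf); fine when low is not far in the tail."""
    out = np.empty(0)
    while out.size < size:
        cand = rng.normal(mu, sd, 4 * size)
        out = np.concatenate([out, cand[cand > low]])
    return out[:size]

mu, sig2 = obs.mean(), obs.var()
keep_mu = []
for it in range(4000):
    # 1. Impute censored values from the truncated normal given the current (mu, sigma^2).
    y = obs.copy()
    y[cens] = draw_truncated_normal(mu, np.sqrt(sig2), c, int(cens.sum()))
    # 2. Draw (sigma^2, mu) from their complete-data posterior (reference prior 1/sigma^2 assumed).
    n = len(y)
    sig2 = np.sum((y - y.mean()) ** 2) / rng.chisquare(n - 1)
    mu = rng.normal(y.mean(), np.sqrt(sig2 / n))
    if it >= 500:
        keep_mu.append(mu)

print("posterior mean of mu        :", np.mean(keep_mu))
print("naive mean ignoring censoring:", obs.mean())
```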

Patent
03 Jan 1991
TL;DR: An expert system organized according to Bayes' theorem is described; it includes a diagnostic module that generates a diagnosis in the form of probability distributions, and a knowledge base that represents the trajectories of the possible states of the system together with the likelihood that the various trajectories of observed evidence would be present if the system were in each state.
Abstract: An expert system is organized according to Bayes' theorem. The system includes a diagnostic module that generates a diagnosis in the form of probability distributions. The diagnostic module also is responsive to evidence in the form of discretized time trajectories in the value space of the observable variables. The diagnostic module is also responsive to data from a knowledge base that represents the trajectories of the possible states of the system and the associated values of the likelihood that the various trajectories of observed evidence would be present if the system were in that state.
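
A toy sketch of this organization, with invented states, trajectories, and numbers: the knowledge base stores the likelihood of each discretized evidence trajectory under each candidate state, and the diagnostic module combines these with a prior via Bayes' theorem to return a probability distribution over states.

```python
import numpy as np

states = ["healthy", "sensor drift", "valve fault"]
prior = np.array([0.90, 0.07, 0.03])

# Knowledge base: P(observed trajectory | state) for each discretized trajectory.
trajectories = ["flat", "slow rise", "oscillating"]
likelihood = np.array([
    # flat  slow-rise  oscillating
    [0.80,  0.15,      0.05],   # healthy
    [0.20,  0.70,      0.10],   # sensor drift
    [0.10,  0.30,      0.60],   # valve fault
])

def diagnose(observed_trajectory):
    """Posterior over states by Bayes' theorem, returned as a probability distribution."""
    j = trajectories.index(observed_trajectory)
    joint = prior * likelihood[:, j]
    return dict(zip(states, joint / joint.sum()))

print(diagnose("slow rise"))
```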

ReportDOI
01 Jun 1991
TL;DR: This technical report consists of three short papers on Monte Carlo Markov chain inference; for model determination the authors advocate the standard Bayesian procedure that uses Bayes factors and point out that this can be implemented quite easily using sampling-based methods.
Abstract: This technical report consists of three short papers on Monte Carlo Markov chain inference. The first paper, How many iterations in the Gibbs sampler?, proposes an easily implemented method for determining the total number of iterations required to estimate probabilities and quantiles of the posterior distribution, and also the number of initial iterations that should be discarded to allow for burn-in. The second paper discusses model determination via predictive distributions. The paper advocates the standard Bayesian procedure that uses Bayes factors, and points out that this can be implemented quite easily using sampling-based methods. The third paper discusses issues in spatial statistics that arise when sampling-based methods are used. Several issues in the Bayesian image restoration approach are discussed: the modeling of spatial dependence, allowing for model uncertainty, the improper posterior distributions that arise in hierarchical Bayes modeling, and the modeling of local dependence between counts when it cannot be assumed that the observations are independent given the true rates.
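
The Bayes-factor recommendation in the second paper can be made concrete with a small analytic example of my own (binomial data, a point null against a Beta-prior alternative); in conjugate cases like this the marginal likelihoods are available in closed form, and in non-conjugate cases they are exactly what the sampling-based methods estimate.

```python
import numpy as np
from scipy.special import betaln, comb

# Data: k successes in n trials.
n, k = 40, 28

# Model 0: theta = 0.5 exactly.  Model 1: theta ~ Beta(1, 1).
log_m0 = np.log(comb(n, k)) + k * np.log(0.5) + (n - k) * np.log(0.5)
log_m1 = np.log(comb(n, k)) + betaln(k + 1, n - k + 1) - betaln(1, 1)

bf10 = np.exp(log_m1 - log_m0)
print(f"Bayes factor B10 = {bf10:.2f} (evidence for the alternative over theta = 0.5)")
```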

04 Jan 1991
TL;DR: The Gibbs sampler approach to Bayesian calculation (Gelfand and Smith, 1990) avoids the need for, typically multidimensional, numerical integrations over awkwardly defined regions in constrained parameter and truncated data problems.
Abstract: Bayesian analysis of constrained parameter and truncated data problems is complicated by the seeming need for, typically multidimensional, numerical integrations over awkwardly defined regions. This paper illustrates how the Gibbs sampler approach to Bayesian calculation (Gelfand and Smith, 1990) avoids these difficulties and leads to straightforwardly implemented procedures, even for apparently very complicated model forms.

Journal ArticleDOI
TL;DR: In this article, a sequential Bayesian parameter estimation technique is proposed to ensure that a minimum of experimental thermodynamic and phase diagram data suffices for the calculation of excess functions which are comparable in accuracy and precision with quantities obtained by conventional methods.
Abstract: A sequential Bayesian parameter estimation technique ensures that a minimum of experimental thermodynamic and phase diagram data suffices for the calculation of excess functions which are comparable in accuracy and precision with quantities obtained by conventional methods. In addition, the new algorithm provides a feedback between data evaluation and experimental design. Typical applications of the proposed method are illustrated by several examples.
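
A generic sketch of sequential Bayesian parameter estimation for a linear model is given below; the conjugate one-observation-at-a-time update is standard, but the model, prior, and data are illustrative assumptions and not the paper's thermodynamic formulation, in which the current posterior also feeds back into experimental design.

```python
import numpy as np

rng = np.random.default_rng(5)

# Generic linear model y = x . beta + noise; beta and the noise level are illustrative.
true_beta = np.array([1.5, -0.7])
noise_sd = 0.2

# Prior: beta ~ N(m, C)
m = np.zeros(2)
C = np.eye(2) * 10.0

def update(m, C, x, y, s2):
    """One conjugate Bayesian update of N(m, C) with observation y = x.beta + N(0, s2)."""
    Cx = C @ x
    k = Cx / (x @ Cx + s2)            # gain vector
    m_new = m + k * (y - x @ m)
    C_new = C - np.outer(k, Cx)
    return m_new, C_new

for step in range(25):
    x = rng.uniform(-1, 1, 2)                          # next "experimental design" point
    y = x @ true_beta + rng.normal(0, noise_sd)        # new measurement
    m, C = update(m, C, x, y, noise_sd ** 2)

print("posterior mean:", m)
print("posterior sd  :", np.sqrt(np.diag(C)))
```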

Journal ArticleDOI
TL;DR: In this paper, the authors propose and evaluate models which, if acceptable, permit Bayesian and frequentist model-based predictive inference for the desired finite population parameters, using both hierarchical Bayesian and frequentist mixed linear models and emphasizing the use of transformed random variables.
Abstract: The Patterns of Care Studies were conducted to determine the quality of care received by cancer patients whose primary treatment modality is radiation therapy. In this article, we propose and evaluate models which, if acceptable, permit Bayesian and frequentist model-based predictive inference for the desired finite population parameters. Using both hierarchical Bayesian and frequentist mixed linear models, we describe methodology for making the desired inferences, emphasizing the use of transformed random variables. Finally, we compare the frequentist, Bayes, and empirical Bayes approaches using data from one of the surveys. All three methods produce essentially the same value for the (finite population) mean. The standard empirical Bayes and frequentist measures of variability are very much smaller than those derived from the Bayesian approach, the latter reflecting uncertainty about the values of the scale parameters in the model.

Journal ArticleDOI
TL;DR: This paper proposes that meta-analysis more accurately estimates the true Bayesian posterior probability than other methods of data pooling.

Journal ArticleDOI
01 Dec 1991
TL;DR: A generalization of Bayes' theorem to the case of fuzzy data is described which contains Bayes' theorem for precise data as a special case and allows the information in fuzzy data to be used in a coherent way.
Abstract: There are some ideas concerning a generalization of Bayes' theorem to the situation of fuzzy data; some of them are given in the references [1], [5], and [7]. But the proposed methods are not generalizations in the sense of the probability content of Bayes' theorem for precise data. In the present paper a generalization of Bayes' theorem to the case of fuzzy data is described which contains Bayes' theorem for precise data as a special case and allows the information in fuzzy data to be used in a coherent way. Moreover, a generalization of the concept of HPD-regions is explained which makes it possible to model and analyze the situation of fuzzy data. Also, a generalization of the concept of predictive distributions is given in order to calculate predictive densities based on fuzzy sample information.

Journal ArticleDOI
TL;DR: In this article, orthogonal polynomials are employed to estimate the prior distribution of the parameter of natural exponential families with quadratic variance functions in an approach which combines Bayesian and nonparametric empirical Bayesian methods.
Abstract: Certain orthogonal polynomials are employed to estimate the prior distribution of the parameter of natural exponential families with quadratic variance functions in an approach which combines Bayesian and nonparametric empirical Bayesian methods. These estimates are based on samples from the marginal distribution rather than the conditional distribution.

Journal ArticleDOI
TL;DR: In this paper, a smoothing method based on recursively "roughening" a smooth prior towards the discrete non-parametric estimate is proposed, which is shown to improve empirical Bayes performance.

Journal ArticleDOI
TL;DR: The decision-theoretic/Bayesian approach is shown to provide a formal justification for the sample sizes often used in practice and to identify the conditions under which such sample sizes are clearly inappropriate.