
Showing papers on "Bayes' theorem" published in 1993


Journal ArticleDOI
TL;DR: A general predictive density is presented which includes all proposed Bayesian approaches the authors are aware of, and Laplace approximations allow convenient assessment and comparison of the asymptotic behavior of these approaches.
Abstract: Model determination is a fundamental data analytic task. Here we consider the problem of choosing amongst a finite (without loss of generality we assume two) set of models. After briefly reviewing classical and Bayesian model choice strategies we present a general predictive density which includes all proposed Bayesian approaches we are aware of. Using Laplace approximations we can conveniently assess and compare asymptotic behavior of these approaches. Concern regarding the accuracy of these approximations for small to moderate sample sizes encourages the use of Monte Carlo techniques to carry out exact calculations. A data set fit with nested nonlinear models enables comparison between proposals and between exact and asymptotic values.
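The Laplace approximation mentioned here can be sketched numerically. Below is a minimal, illustrative implementation (the function name log_marginal_laplace and the use of scipy's BFGS optimizer are assumptions for this sketch, not the authors' code): it approximates the log marginal likelihood of one model, and the difference of two such values approximates the log Bayes factor used for model choice.

import numpy as np
from scipy.optimize import minimize

def log_marginal_laplace(neg_log_post, theta0):
    """Laplace approximation to the log marginal likelihood log p(y | model).

    neg_log_post(theta) must return -[log p(y | theta) + log p(theta)].
    """
    res = minimize(neg_log_post, theta0, method="BFGS")
    theta_hat, d = res.x, res.x.size
    # numerical Hessian of neg_log_post at its minimum, by central differences
    eps = 1e-5
    I = np.eye(d)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            ei, ej = eps * I[i], eps * I[j]
            H[i, j] = (neg_log_post(theta_hat + ei + ej)
                       - neg_log_post(theta_hat + ei - ej)
                       - neg_log_post(theta_hat - ei + ej)
                       + neg_log_post(theta_hat - ei - ej)) / (4 * eps ** 2)
    _, logdetH = np.linalg.slogdet(H)
    # log p(y|theta_hat) + log p(theta_hat) + (d/2) log 2*pi - (1/2) log |H|
    return -res.fun + 0.5 * d * np.log(2 * np.pi) - 0.5 * logdetH

# Comparing two models: the difference of their approximate log marginals is
# an approximate log Bayes factor.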

1,233 citations



Journal ArticleDOI
TL;DR: Bayes’ theorem is generalized within the transferable belief model framework; the disjunctive rule of combination (DRC) and the generalized Bayesian theorem (GBT), together with their uses for belief propagation in directed belief networks, are analysed.

692 citations


Journal ArticleDOI
TL;DR: It is argued that the problem of plan recognition (inferring an agent's plan from observations) is largely a problem of inference under conditions of uncertainty, and an approach to plan recognition based on Bayesian probability theory is presented.

483 citations



Book ChapterDOI
09 Jul 1993
TL;DR: In this paper, the noisy OR-gate is generalized to multivalued variables, an algorithm is developed to compute probabilities in time proportional to the number of parents, and the learning model is applied to this gate.
Abstract: Spiegelhalter and Lauritzen [15] studied sequential learning in Bayesian networks and proposed three models for the representation of conditional probabilities. A fourth model, shown here, assumes that the parameter distribution is given by a product of Gaussian functions and updates them from the λ and π messages of evidence propagation. We also generalize the noisy OR-gate for multivalued variables, develop the algorithm to compute probability in time proportional to the number of parents (even in networks with loops) and apply the learning model to this gate.
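A minimal sketch of the standard binary noisy OR-gate, whose probability is computable in time proportional to the number of parents; the multivalued generalization developed in the paper follows the same product structure but is not reproduced here. Names and numbers are illustrative.

def noisy_or(parent_states, link_probs, leak=0.0):
    """P(child = 1 | parents) under a binary noisy OR-gate.

    parent_states: list of 0/1 parent values
    link_probs[i]: probability that parent i alone activates the child
    leak: probability the child is on when all parents are off
    Runs in time proportional to the number of parents.
    """
    p_inhibited = 1.0 - leak
    for x, q in zip(parent_states, link_probs):
        if x == 1:
            p_inhibited *= (1.0 - q)
    return 1.0 - p_inhibited

# example: two active causes with link probabilities 0.8 and 0.6
print(noisy_or([1, 1, 0], [0.8, 0.6, 0.9]))  # 1 - 0.2*0.4 = 0.92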

252 citations


Journal ArticleDOI
Donald A. Berry
TL;DR: This paper describes a Bayesian approach to the design and analysis of clinical trials and compares it with the frequentist approach, noting that the role of randomization is an important difference.
Abstract: This paper describes a Bayesian approach to the design and analysis of clinical trials, and compares it with the frequentist approach. Both approaches address learning under uncertainty. But they are different in a variety of ways. The Bayesian approach is more flexible. For example, accumulating data from a clinical trial can be used to update Bayesian measures, independent of the design of the trial. Frequentist measures are tied to the design, and interim analyses must be planned for frequentist measures to have meaning. Its flexibility makes the Bayesian approach ideal for analysing data from clinical trials. In carrying out a Bayesian analysis for inferring treatment effect, information from the clinical trial and other sources can be combined and used explicitly in drawing conclusions. Bayesians and frequentists address making decisions very differently. For example, when choosing or modifying the design of a clinical trial, Bayesians use all available information, including that which comes from the trial itself. The ability to calculate predictive probabilities for future observations is a distinct advantage of the Bayesian approach to designing clinical trials and other decisions. An important difference between Bayesian and frequentist thinking is the role of randomization.
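As an illustration of the kind of interim updating and predictive calculation described, here is a hedged sketch using a Beta-Binomial model; the prior, the counts and the response threshold are invented for the example and are not from the paper.

from scipy.stats import beta, betabinom

# Beta(a, b) prior on the response rate of a treatment (illustrative values)
a, b = 1, 1
successes, failures = 12, 8            # interim trial data
a_post, b_post = a + successes, b + failures

# posterior summaries can be updated at any time, independent of the design
print("posterior mean:", a_post / (a_post + b_post))
print("95% credible interval:", beta.ppf([0.025, 0.975], a_post, b_post))

# predictive probability of at least 15 responses among the next 20 patients
n_future = 20
pred = 1 - betabinom.cdf(14, n_future, a_post, b_post)
print("predictive P(>=15 of next 20 respond):", pred)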

173 citations


Journal ArticleDOI
TL;DR: A practical framework is presented for the Bayesian analysis of Gaussian and Student-t regression models with autocorrelated errors using Gibbs sampling; it is shown that the proposed approach can readily deal with high-order autoregressive processes without requiring an importance sampling function or other tuning constants.

164 citations


Book ChapterDOI
01 Jan 1993
TL;DR: This chapter presents a decades-old, extremely powerful classification algorithm that can be cast in the form of a neural network; its performance is generally excellent and asymptotically Bayes optimal.
Abstract: This chapter presents a decades-old, extremely powerful classification algorithm that can be cast in the form of a neural network. Learning speed for this model is fast to instantaneous. However, classification time may be slow, and memory requirements are large. Performance is generally excellent and is asymptotically Bayes optimal.

138 citations


Journal ArticleDOI
TL;DR: In this paper, a simple formula is given that produces an approximate likelihood function L_x(θ) for θ, with all nuisance parameters eliminated, based on any system of approximate confidence intervals.
Abstract: Recently there has been considerable progress on setting good approximate confidence intervals for a single parameter θ in a multi-parameter family. Here we use these frequentist results as a convenient device for making Bayes, empirical Bayes and likelihood inferences about θ. A simple formula is given that produces an approximate likelihood function L_x(θ) for θ, with all nuisance parameters eliminated, based on any system of approximate confidence intervals. The statistician can then modify L_x(θ) with Bayes or empirical Bayes information for θ, without worrying about nuisance parameters.

132 citations


Journal ArticleDOI
TL;DR: Bayesian analysis of rat brain data was used to demonstrate the shape of the probability density function from data sets of different quality, and Bayesian analysis performed substantially better than NLLS under conditions of relatively low signal‐to‐noise ratio.
Abstract: Traditionally, the method of nonlinear least squares (NLLS) analysis has been used to estimate the parameters obtained from exponential decay data. In this study, we evaluated the use of Bayesian probability theory to analyze such data; specifically, that resulting from intravoxel incoherent motion NMR experiments. Analysis was done both on simulated data to which different amounts of Gaussian noise had been added and on actual data derived from rat brain. On simulated data, Bayesian analysis performed substantially better than NLLS under conditions of relatively low signal-to-noise ratio. Bayesian probability theory also offers the advantages of: a) not requiring initial parameter estimates and hence not being susceptible to errors due to incorrect starting values and b) providing a much better representation of the uncertainty in the parameter estimates in the form of the probability density function. Bayesian analysis of rat brain data was used to demonstrate the shape of the probability density function from data sets of different quality.
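A minimal sketch of the contrast described: a grid-based Bayesian posterior for a single-exponential decay rate from noisy data next to a nonlinear least-squares fit. This is not the authors' intravoxel incoherent motion model; the simulated data, grid and noise level are illustrative.

import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 30)
true_rate, sigma = 1.3, 0.15
y = np.exp(-true_rate * t) + rng.normal(0.0, sigma, t.size)   # simulated noisy decay data

# nonlinear least squares: needs a starting value and can be misled by it
popt, _ = curve_fit(lambda t, k: np.exp(-k * t), t, y, p0=[1.0])
nlls_rate = popt[0]

# Bayesian posterior over the decay rate on a grid (flat prior, known noise sigma)
rates = np.linspace(0.1, 3.0, 500)
log_like = np.array([-0.5 * np.sum((y - np.exp(-k * t)) ** 2) / sigma ** 2 for k in rates])
post = np.exp(log_like - log_like.max())
post /= post.sum()                                  # normalized over the grid

print("NLLS estimate:", nlls_rate)
print("posterior mean:", np.sum(rates * post))
print("posterior std:", np.sqrt(np.sum(rates ** 2 * post) - np.sum(rates * post) ** 2))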

Journal Article
TL;DR: Families of probability distributions which arise naturally as parameter likelihoods in conjugate prior distributions for exponential families are identified and described, and their relevance to computational issues in Bayes hierarchical models is noted.
Abstract: Families of probability distributions which arise naturally as parameter likelihoods in conjugate prior distributions for exponential families are identified, described and their relevance to computational issues in Bayes hierarchical models noted.
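A minimal example of the conjugate updating that such hierarchical computations rest on, using the Gamma-Poisson pair; the prior parameters and counts are illustrative, and the specific families studied in the paper are not reproduced here.

from scipy.stats import gamma

# Gamma(shape, rate) prior on a Poisson rate; conjugacy makes the update closed-form
shape, rate = 2.0, 1.0
counts = [3, 5, 4, 6, 2]                 # observed Poisson counts (illustrative)

shape_post = shape + sum(counts)
rate_post = rate + len(counts)

print("posterior mean:", shape_post / rate_post)
print("95% interval:", gamma.ppf([0.025, 0.975], shape_post, scale=1 / rate_post))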

Book ChapterDOI
01 Jan 1993
TL;DR: It is suggested here that Bayesian statistical inference can help answer these questions by allowing us to ‘read the neural code’ not only in the time domain but also across a population of neurons.
Abstract: What does the response of a neuron, or of a group of neurons, mean? What does it say about the stimulus? How distributed and efficient is the encoding of information in population responses? It is suggested here that Bayesian statistical inference can help answer these questions by allowing us to ‘read the neural code’ not only in the time domain [2, 5] but also across a population of neurons. Based on repeated recordings of neural responses to a known set of stimuli, we can estimate the conditional probability distribution of the responses given the stimulus, P(response|stimulus). The behaviourally relevant distribution, i.e. the conditional probability distribution of the stimuli given an observed response from a cell or a group of cells, P(stimulus|response), can be derived using the Bayes rule. This distribution contains all the information present in the response about the stimulus, and gives an upper limit and a useful comparison to the performance of further neural processing stages receiving input from these neurons. As the notion of an ‘ideal observer’ makes the definition of psychophysical efficiency possible [1], this ‘ideal homunculus’ (looking at the neural response instead of the stimulus) can be used to test the efficiency of neural representation. The Bayes rule is: P(s|r) = P(r|s)P(s)/P(r) = P(r|s)P(s) / Σ_{s′∈S} P(r|s′)P(s′), where in this case s stands for stimulus, r for response, and S is the set of possible stimuli.
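A small sketch of the decoding step described above: estimate P(response|stimulus) from repeated recordings, then invert it with Bayes' rule to obtain the 'ideal homunculus' posterior P(stimulus|response). The response table and prior below are invented for illustration.

import numpy as np

# rows: stimuli s in S, columns: discretized responses r
# P(r | s) as estimated from repeated recordings (illustrative numbers)
p_r_given_s = np.array([[0.70, 0.20, 0.10],
                        [0.25, 0.50, 0.25],
                        [0.05, 0.25, 0.70]])
p_s = np.array([0.5, 0.3, 0.2])          # prior over stimuli

def posterior_over_stimuli(r):
    """P(s | r) = P(r | s) P(s) / sum over s' of P(r | s') P(s')."""
    joint = p_r_given_s[:, r] * p_s
    return joint / joint.sum()

# what the 'ideal homunculus' infers from observing response r = 2
print(posterior_over_stimuli(2))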

Book ChapterDOI
01 Jan 1993
TL;DR: Under a Bayesian framework both hard and soft data can be encoded as local prior (also called pre-posterior) probability distributions, which provide models of uncertainty prevailing at unsampled locations.
Abstract: Under a Bayesian framework both hard and soft data can be encoded as local prior (also called pre-posterior) probability distributions. These local prior distributions are then updated into posterior distributions using the nearby hard and soft data. The posterior distributions provide models of uncertainty prevailing at unsampled locations.

Journal ArticleDOI
TL;DR: An alternative approach using mixture models to identify population heterogeneity and map construction within an empirical Bayes framework is described and a map is presented for hepatitis B data from Berlin in 1989.
Abstract: The analysis and recognition of disease clustering in space and its representation on a map is one of the oldest problems in epidemiology. Some traditional methods of constructing such a map are presented. An alternative approach using mixture models to identify population heterogeneity and map construction within an empirical Bayes framework is described. For hepatitis B data from Berlin in 1989, a map is presented and the different methods are evaluated using a parametric bootstrap approach.


Journal ArticleDOI
TL;DR: This article gives a brief introduction to Bayesian methods and contrasts them with classical hypothesis testing, showing that the quantification of prior beliefs is a common and necessary part of the interpretation of clinical information, whether from a laboratory test or published clinical trial.

Journal ArticleDOI
TL;DR: In this paper, a method is given for constructing a Bayesian interval estimate such that the coverage probability of the interval is approximately equal to the posterior probability of the interval.
Abstract: Let Y_1, ..., Y_n denote independent observations each distributed according to a distribution depending on a scalar parameter θ; suppose that we are interested in constructing an interval estimate for θ. One approach is to use Bayesian inference. For a given prior density, we can construct an interval such that the posterior probability that θ lies in the interval is some specified value. In this paper, a method is given for constructing a Bayesian interval estimate such that the coverage probability of the interval is approximately equal to the posterior probability of the interval.

Journal ArticleDOI
TL;DR: In this paper, an estimating function-based approach to component estimation in the two-stage generalized linear model with univariate random effects and a vector of fixed effects is proposed, which is especially valuable in the longitudinal setting where the response variable is discrete and the number of repeated observations on each unit is small.
Abstract: This article develops an estimating function-based approach to component estimation in the two-stage generalized linear model with univariate random effects and a vector of fixed effects. The novelty and unifying feature of the method is the use of estimating functions in the estimation of both the random effects and their variance. Two separate estimating procedures based on the method are proposed that differ in the intensity of numerical computation required. The estimating function approach is especially valuable in the longitudinal setting where the response variable is discrete and the number of repeated observations on each unit is small. Other key features of this empirical Bayes technique are that it uses all available data, it yields familiar forms for the estimators as special cases, and it is less computationally intensive than other methods designed to address the same problem. An application to the estimation of trends in acquired immune deficiency syndrome (AIDS) incidence across risk group...


Book ChapterDOI
05 Apr 1993
TL;DR: Various ways of estimating probabilities, mainly within the Bayesian framework, are discussed and their relevance and application to machine learning is given, and their relative performance empirically evaluated.
Abstract: Various ways of estimating probabilities, mainly within the Bayesian framework, are discussed. Their relevance and application to machine learning is given, and their relative performance empirically evaluated. A method of accounting for noisy data is given and also applied. The reliability of estimates is measured by a significance measure, which is also empirically tested. We briefly discuss the use of likelihood ratio as a significance measure.
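Two standard probability estimates often used in this machine-learning setting are the Laplace correction and the m-estimate; the sketch below shows both, though the exact estimators evaluated in the paper may differ. Function names and the example counts are illustrative.

def laplace_estimate(successes, total, num_classes=2):
    """Laplace-corrected probability estimate: (n_c + 1) / (N + k)."""
    return (successes + 1) / (total + num_classes)

def m_estimate(successes, total, prior, m=2.0):
    """m-estimate: (n_c + m * prior) / (N + m)."""
    return (successes + m * prior) / (total + m)

# a rule covering 3 positive examples out of 4, with a 0.25 base rate of positives
print(laplace_estimate(3, 4))        # 0.666...
print(m_estimate(3, 4, prior=0.25))  # 0.583...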

Journal ArticleDOI
TL;DR: This work focuses on two types of prior that deserve consideration, the first of which is the non-informative prior giving standardized likelihood distributions as post-trial probability distributions and the second which has a spike of probability mass at the point of no treatment effect.
Abstract: Many clinicians wrongly interpret p-values as probabilities that treatment has an adverse effect and confidence intervals as probability intervals. Such inferences can be validly drawn from Bayesian analyses of trial results. These analyses use the data to update the prior (or pre-trial) beliefs to give posterior (or post-trial) beliefs about the magnitude of a treatment effect. However, for these methods to gain acceptance in the medical literature, understanding between statisticians and clinicians of the issues involved in choosing appropriate prior distributions for trial reporting needs to be reached. I focus on two types of prior that deserve consideration. The first is the non-informative prior giving standardized likelihood distributions as post-trial probability distributions. Their use is unlikely to be controversial among statisticians whilst being intuitively appealing to clinicians. The second type of prior has a spike of probability mass at the point of no treatment effect. Varying the magnitude of the spike illustrates the sensitivity of the conclusions drawn to the degree of prior scepticism in a treatment effect. With both, graphical displays provide clinical readers with the opportunity to explore the results more fully. An example of how a clinical trial might be reported in the medical literature using these methods is given.
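A minimal sketch of the second type of prior described: a spike of probability mass at 'no treatment effect' mixed with a normal prior, updated by a normal likelihood for the estimated effect. The function name and all numbers are illustrative, not from a real trial.

import numpy as np
from scipy.stats import norm

def posterior_spike_mass(effect_hat, se, spike_prob, slab_sd):
    """Posterior probability of 'no treatment effect' under a spike-and-slab prior.

    Prior: P(delta = 0) = spike_prob; otherwise delta ~ Normal(0, slab_sd^2).
    Likelihood: effect_hat ~ Normal(delta, se^2).
    """
    m0 = norm.pdf(effect_hat, loc=0.0, scale=se)                           # marginal under the spike
    m1 = norm.pdf(effect_hat, loc=0.0, scale=np.sqrt(se**2 + slab_sd**2))  # marginal under the slab
    return spike_prob * m0 / (spike_prob * m0 + (1 - spike_prob) * m1)

# vary the size of the spike to show sensitivity to the degree of prior scepticism
for p in (0.25, 0.5, 0.75):
    print(p, posterior_spike_mass(effect_hat=0.4, se=0.2, spike_prob=p, slab_sd=0.5))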

Journal ArticleDOI
TL;DR: In this paper, the authors adopt an objective Bayesian approach and show that the Bayes intervals with a certain conditional prior density on the parameter of interest are confidence intervals as well, having nearly the correct frequency of coverage.
Abstract: Given a random sample from a distribution with density function that depends on an unknown parameter θ = (θ_1, ..., θ_p), we are concerned with the problem of setting confidence intervals for a particular component of it, say θ_1, treating the remaining components as a nuisance parameter. Adopting an objective Bayesian approach, we show that the Bayes intervals with a certain conditional prior density on the parameter of interest, θ_1, are confidence intervals as well, having nearly the correct frequency of coverage. The frequentist performance of the proposed intervals is tested in a simulation study for gamma mean and shape parameters.

Journal ArticleDOI
TL;DR: In this article, a fully-Bayes approach is presented for analyzing product reliability during the development phase, where the product goes through a series of test/modification stages, where each product test yields attribute (pass-fail) data, and failure types are classified as fixable or nonfixable.
Abstract: A fully Bayes approach is presented for analyzing product reliability during the development phase. Based on a Bayes version of the Barlow-Scheuer reliability-growth model, it is assumed that the product goes through a series of test/modification stages, where each product test yields attribute (pass-fail) data, and failure types are classified as fixable or nonfixable. Relevant information on both the failure probabilities and the reliability-growth process is used to motivate the prior joint distribution for the probability of each failure type over the specified range of testing. Results at a particular test-stage can be used to update the knowledge about the probability of each failure type (and thus product reliability) at the current test-stage as well as at subsequent test-stages, and at the end of the development phase. A relative ease of incorporation of prior information and a tractability of the posterior analysis are accomplished by using a Dirichlet distribution as the prior distribution for a transformation of the failure probabilities.
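The basic conjugate step behind such an analysis can be sketched with a Dirichlet prior over the probabilities of 'pass', 'fixable failure' and 'nonfixable failure' at one test stage; note that the paper actually places the prior on a transformation of the failure probabilities, so this is only an illustrative simplification with invented counts.

import numpy as np

# Dirichlet prior over (pass, fixable failure, nonfixable failure)
alpha_prior = np.array([4.0, 2.0, 1.0])       # illustrative prior pseudo-counts
counts = np.array([18, 5, 2])                 # outcomes observed at this test stage

alpha_post = alpha_prior + counts             # conjugate Dirichlet-multinomial update
post_mean = alpha_post / alpha_post.sum()

print("posterior mean probabilities:", post_mean)
print("posterior mean reliability, P(pass):", post_mean[0])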

Journal ArticleDOI
TL;DR: Large deviations techniques are used to show that in Bayes testing the equivalence of absolutely optimal and best identical-quantizer systems is not limited to error exponents, but extends to the actual Bayes error probabilities up to a multiplicative constant.
Abstract: The performance of a parallel distributed detection system is investigated as the number of sensors tends to infinity. It is assumed that the i.i.d. sensor data are quantized locally into m-ary messages and transmitted to the fusion center for binary hypothesis testing. The boundedness of the second moment of the postquantization log-likelihood ratio is examined in relation to the asymptotic error exponent. It is found that, when that second moment is unbounded, the Neyman-Pearson error exponent can become a function of the test level, whereas the Bayes error exponent remains, as previously conjectured by J.N. Tsitsiklis (1986), unaffected. Large deviations techniques are also used to show that in Bayes testing the equivalence of absolutely optimal and best identical-quantizer systems is not limited to error exponents, but extends to the actual Bayes error probabilities up to a multiplicative constant.

Journal ArticleDOI
TL;DR: An algorithm to compute the posterior probability based on visual inspection of structural components by incorporating fuzzy-set theory into Bayes’ theorem is presented and showed that the fuzzy-Bayesian approach is a viable enhancement to the safety assessment of existing structures.
Abstract: Classical reliability has significantly enhanced the ability of engineers to assess the safety of constructed projects. It has been shown that Bayes’ theorem is an effective tool in updating prior probabilities when the value of a random parameter is known. However, the preceding theorem usually fails to adequately address the uncertainties of the subjective parameters (such as “the connections are good”) that are associated with structural evaluation. It has been demonstrated that these parameters can be significant to the overall safety assessment. With the introduction of fuzzy-set theory, it is possible to quantify the qualitative evaluation and incorporate it into the safety assessment. This paper presents an algorithm to compute the posterior probability based on visual inspection of structural components by incorporating fuzzy-set theory into Bayes’ theorem. The results based on two examples—a reinforced concrete beam and a structural frame—showed that the fuzzy-Bayesian approach is a viable enhancement to the safety assessment of existing structures.

01 Jan 1993
TL;DR: It is proved that the self-composition of the Bayes error probability converges to zero if and only if the noise probability is less than a critical value expressed in terms of the numbers of parity-checks.
Abstract: Convergence of an algorithm for reconstructing the initial state of a linear feedback shift register from its noisy output sequence, based on a bitwise Bayesian iterative error-correction procedure and parity-checks of different weights, is analyzed. It is proved that the self-composition of the Bayes error probability converges to zero if and only if the noise probability is less than a critical value expressed in terms of the numbers of parity-checks. An alternative approach to the critical noise estimation based on the residual error-rate after each iterative revision is also discussed.

Proceedings ArticleDOI
17 Jan 1993
TL;DR: It is argued that labeled samples are exponentially more valuable than unlabeled samples, and the exponent is identified as the Bhattacharyya distance.
Abstract: We attempt to discover the role and relative value of labeled and unlabeled samples in reducing the probability of error of the classification of a sample based on the previous observation of labeled and unlabeled data. We assume that the underlying densities belong to a regular family that generates identifiable mixtures. The unlabeled observations, under the above conditions, carry information about the statistical model and therefore can be effectively used to construct a decision rule. When the training set contains an infinite number of unlabeled samples, the first labeled observation reduces the probability of error to within a factor of two of the Bayes risk. Moreover, subsequent labeled samples yield exponential convergence of the probability of classification error to the Bayes risk. We argue that labeled samples are exponentially more valuable than unlabeled samples and identify the exponent as the Bhattacharyya distance.
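The exponent referred to is the Bhattacharyya distance; here is a minimal sketch computing it for two univariate Gaussian class-conditional densities (the parameters are illustrative, not from the paper).

import numpy as np

def bhattacharyya_gaussian(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two univariate Gaussians."""
    return (0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
            + 0.5 * np.log((var1 + var2) / (2.0 * np.sqrt(var1 * var2))))

# two class-conditional densities; a larger distance means faster exponential
# convergence of the classification error toward the Bayes risk
print(bhattacharyya_gaussian(0.0, 1.0, 2.0, 1.5))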

Journal ArticleDOI
TL;DR: In this article, the reliability function of the extreme value distribution was obtained by using two Bayes approximation procedures: Lindley (1980) and Tierney and Kadane (1986), and these estimates were compared to maximum likelihood estimates (MLE) based on a Monte Carlo simulation study.
Abstract: The authors obtain Bayes estimates of the reliability function of the extreme value distribution by using two Bayes approximation procedures: Lindley (1980), and Tierney and Kadane (1986). These estimates were compared to maximum-likelihood estimates (MLE) based on a Monte Carlo simulation study. Jeffreys invariant prior was used in the comparison for both Bayes procedures. The MLE are superior to either of the Bayes estimates, except for small values of t. The simpler Lindley Bayes procedure gives estimates with smaller root-mean-square error than estimates obtained by the Tierney and Kadane procedure except for large values of t. From a practical standpoint, the ML method is easiest to use and more accurate for the extreme value distribution than the two Bayes approximation procedures. Both Bayes procedures seem to perform equally well. However, the Lindley method is easier to use with little loss of accuracy.

Journal ArticleDOI
TL;DR: In robust Bayesian analysis, a prior is assumed to belong to a family instead of being specified exactly as discussed by the authors, and the multiplicity of priors naturally leads to a collection of Bayes actions (estimates), and these often form a convex set (an interval in the case of a real parameter).
Abstract: In robust Bayesian analysis, a prior is assumed to belong to a family instead of being specified exactly. The multiplicity of priors naturally leads to a collection of Bayes actions (estimates), and these often form a convex set (an interval in the case of a real parameter). It is clearly essential to be able to recommend one action from this set to the user. We address the following problem: if we systematically choose one action for each X thereby constructing a decision rule, is it going to be Bayes? Is it Bayes with respect to a prior in the original prior family? Even if it is not genuine Bayes, is it admissible?
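A small sketch of the object studied here: the collection of Bayes estimates produced as the prior ranges over a family. The binomial data and the Beta prior family below are invented for illustration; the paper's prior families and loss are not reproduced.

import numpy as np

# data: x successes in n Bernoulli trials
x, n = 7, 10

# prior family: Beta(a, b) with prior mean a/(a+b) anywhere in [0.3, 0.7]
# and prior 'sample size' a + b fixed at 4 (an illustrative class of priors)
prior_sample_size = 4.0
prior_means = np.linspace(0.3, 0.7, 41)
posterior_means = [(prior_sample_size * m + x) / (prior_sample_size + n)
                   for m in prior_means]

# the set of Bayes actions induced by the prior family (an interval here)
print("set of posterior means: [%.3f, %.3f]"
      % (min(posterior_means), max(posterior_means)))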