Showing papers on "Posterior probability published in 1987"


Journal ArticleDOI
TL;DR: If data augmentation can be used in the calculation of the maximum likelihood estimate, then in the same cases one ought to be able to use it in the computation of the posterior distribution of parameters of interest.
Abstract: The idea of data augmentation arises naturally in missing value problems, as exemplified by the standard ways of filling in missing cells in balanced two-way tables. Thus data augmentation refers to a scheme of augmenting the observed data so as to make it easier to analyze. This device is used to great advantage by the EM algorithm (Dempster, Laird, and Rubin 1977) in solving maximum likelihood problems. In situations where the likelihood cannot be approximated closely by the normal likelihood, maximum likelihood estimates and the associated standard errors cannot be relied upon to make valid inferential statements. From the Bayesian point of view, one must now calculate the posterior distribution of parameters of interest. If data augmentation can be used in the calculation of the maximum likelihood estimate, then in the same cases one ought to be able to use it in the computation of the posterior distribution. It is the purpose of this article to explain how this can be done. The basic idea ...

4,020 citations
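
A minimal sketch of the two-step data-augmentation idea on a toy problem of my own choosing (right-censored normal data with known variance and a flat prior on the mean), not the paper's general algorithm: an imputation step fills in the censored values from their conditional distribution given the current parameter draw, and a posterior step draws the parameter from the complete-data posterior.

```python
# Hypothetical toy example of the data-augmentation idea: right-censored
# normal observations with known variance 1 and a flat prior on the mean.
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(0)
mu_true, c, n = 1.0, 1.5, 50
y = rng.normal(mu_true, 1.0, n)
observed = y[y < c]                 # values below the censoring point are seen
n_cens = n - observed.size          # the rest are only known to exceed c

mu = observed.mean()                # crude starting value
draws = []
for _ in range(2000):
    # Imputation step: fill in censored values from N(mu, 1) truncated to (c, inf).
    a = (c - mu) / 1.0
    imputed = truncnorm.rvs(a, np.inf, loc=mu, scale=1.0, size=n_cens, random_state=rng)
    full = np.concatenate([observed, imputed])
    # Posterior step: with a flat prior, mu | full data ~ N(mean(full), 1/n).
    mu = rng.normal(full.mean(), 1.0 / np.sqrt(n))
    draws.append(mu)

print("posterior mean of mu ~", np.mean(draws[500:]))
```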


Journal ArticleDOI
TL;DR: In this paper, testing of precise (point or small interval) hypotheses is reviewed, with special emphasis placed on exploring the dramatic conflict between conditional measures (Bayes factors and posterior probabilities) and the classical P-value (or observed significance level).
Abstract: Testing of precise (point or small interval) hypotheses is reviewed, with special emphasis placed on exploring the dramatic conflict between conditional measures (Bayes factors and posterior probabilities) and the classical P-value (or observed significance level). This conflict is highlighted by finding lower bounds on the conditional measures over wide classes of priors in normal and binomial situations; these lower bounds are much larger than the P-value, and this leads to the recommendation of several alternatives to P-values. Results are also given concerning the validity of approximating an interval null by a point null. The overall discussion features critical examination of issues such as the possibility of objective testing and the possibility of testing from confidence sets.

641 citations
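
A small numeric illustration of the conflict (my own sketch under the usual normal point-null setup, not code from the paper): for a two-sided z-test of H0: θ = θ0, the Bayes factor in favour of H0 is bounded below over all priors on the alternative by exp(-z²/2), so with prior mass 1/2 on H0 the posterior probability of H0 stays far above the P-value.

```python
# Hypothetical illustration of the point-null conflict: the lower bound on the
# Bayes factor over all priors on the alternative is exp(-z^2 / 2) for a
# two-sided normal test, so P(H0 | data) stays much larger than the p-value.
import numpy as np
from scipy.stats import norm

pi0 = 0.5                                   # prior probability on H0
for p in (0.10, 0.05, 0.01, 0.001):
    z = norm.isf(p / 2)                     # two-sided p-value -> |z|
    bf_lower = np.exp(-z**2 / 2)            # inf over all priors of B(H0 vs H1)
    post_lower = 1.0 / (1.0 + (1 - pi0) / pi0 / bf_lower)
    print(f"p = {p:6.3f}   z = {z:5.2f}   lower bound on P(H0|x) = {post_lower:.3f}")
```

With p = 0.05 this lower bound is about 0.128, an order of magnitude above the P-value.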


Journal ArticleDOI
TL;DR: For the one-sided hypothesis testing problem, it is shown in this article that for many classes of prior distributions the infimum of the Bayesian posterior probability of H0 is equal to the p value, while for other classes the infimum is less than the p value.
Abstract: For the one-sided hypothesis testing problem it is shown that it is possible to reconcile Bayesian evidence against H0, expressed in terms of the posterior probability that H0 is true, with frequentist evidence against H0, expressed in terms of the p value. In fact, for many classes of prior distributions it is shown that the infimum of the Bayesian posterior probability of H0 is equal to the p value; in other cases the infimum is less than the p value. The results are in contrast to recent work of Berger and Sellke (1987) in the two-sided (point null) case, where it was found that the p value is much smaller than the Bayesian infimum. Some comments on the point null problem are also given.

390 citations
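
A quick check of the headline claim in the normal case (my own sketch, not the paper's general result): for X ~ N(θ, 1) and H0: θ ≤ 0, a uniform prior gives θ | x ~ N(x, 1), so the posterior probability of H0 is Φ(-x), which coincides exactly with the one-sided p-value.

```python
# Sketch (not from the paper): one-sided normal testing, X ~ N(theta, 1), H0: theta <= 0.
# Under a uniform prior, theta | x ~ N(x, 1), so P(H0 | x) = Phi(-x) = one-sided p-value.
from scipy.stats import norm

for x in (0.5, 1.0, 1.645, 2.0, 2.5):
    p_value = norm.sf(x)            # P(X >= x | theta = 0)
    posterior_H0 = norm.cdf(-x)     # P(theta <= 0 | x) under the uniform prior
    print(f"x = {x:5.3f}   p-value = {p_value:.4f}   P(H0|x) = {posterior_H0:.4f}")
```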


Journal ArticleDOI
TL;DR: Two methods for combining the information contents from multiple sources of remote-sensing image data and spatial data in general are described, including a probabilistic scheme that employs a global membership function that is derived from all available data sources and an evidential calculus based upon Dempster's orthogonal sum combination rule.
Abstract: Two methods for combining the information contents from multiple sources of remote-sensing image data and spatial data in general are described. One is a probabilistic scheme that employs a global membership function (similar to a joint posterior probability) that is derived from all available data sources. The other is an evidential calculus based upon Dempster's orthogonal sum combination rule. A feature of both methods is that uncertainty regarding data analysis can be incorporated into the process. Both schemes are evaluated in terms of their general applicability and certain equivalences are noted. Moreover, both are shown to perform well on mixed multispectral data.

230 citations
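
A minimal sketch of Dempster's orthogonal sum rule for two mass functions over the same frame (the class labels and masses below are invented), not the authors' full multisource scheme:

```python
# Minimal sketch of Dempster's orthogonal sum for two bodies of evidence
# over the frame {"water", "forest", "urban"}; focal elements are frozensets.
# Illustrative only, not the multisource scheme of the paper.
from itertools import product

def dempster_combine(m1, m2):
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    return {k: v / (1.0 - conflict) for k, v in combined.items()}, conflict

frame = frozenset({"water", "forest", "urban"})
m_spectral = {frozenset({"forest"}): 0.6, frozenset({"forest", "urban"}): 0.3, frame: 0.1}
m_radar    = {frozenset({"urban"}): 0.5, frozenset({"forest"}): 0.3, frame: 0.2}

m, k = dempster_combine(m_spectral, m_radar)
print("conflict:", round(k, 3))
for focal, mass in sorted(m.items(), key=lambda kv: -kv[1]):
    print(set(focal), round(mass, 3))
```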


Journal ArticleDOI
TL;DR: In this article, the authors apply the bootstrap to situations where the prior must be estimated from the data (empirical Bayes methods), and show that the posterior distribution is Gaussian with mean B μ + (1-B)y and variance σ 2(1- B), where B = σ2/(σ2 + τ2).
Abstract: Consider the model with data generated by the following two-stage process. First, a parameter θ is sampled from a prior distribution G, and then an observation is sampled from the conditional distribution f(y | θ). If the prior distribution is known, then the Bayes estimate under squared error loss is the posterior expectation of θ conditional on the data y. For example, if G is Gaussian with mean μ and variance τ2 and f(y | θ) is Gaussian with mean θ and variance σ2, then the posterior distribution is Gaussian with mean Bμ + (1 – B)y and variance σ2(1 – B), where B = σ2/(σ2 + τ2). Inferences about θ are based on this distribution. We study the application of the bootstrap to situations where the prior must be estimated from the data (empirical Bayes methods). For this model, we observe data Y = [Y1, …, YK]T, each independent Yk following the compound model described previously. As first shown by James and Stein (1961), setting each θk equal to its estimated posterior mean, pr...

221 citations
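
A small sketch of the empirical-Bayes shrinkage described above, under assumptions of my own (known σ², Gaussian G, B estimated from the marginal variance of the Yk); the paper's parametric-bootstrap assessment of the extra uncertainty from estimating the prior is omitted.

```python
# Gaussian compound model: theta_k ~ N(mu, tau^2), Y_k | theta_k ~ N(theta_k, sigma^2).
# Empirical-Bayes shrinkage replaces mu and B = sigma^2/(sigma^2 + tau^2) by estimates
# from the marginal distribution of the Y_k.  (Illustrative assumptions only.)
import numpy as np

rng = np.random.default_rng(1)
K, sigma2, tau2, mu = 40, 1.0, 2.0, 5.0
theta = rng.normal(mu, np.sqrt(tau2), K)
y = rng.normal(theta, np.sqrt(sigma2))

mu_hat = y.mean()
marginal_var = y.var(ddof=1)                    # estimates sigma^2 + tau^2
B_hat = min(1.0, sigma2 / marginal_var)         # estimated shrinkage factor
theta_hat = B_hat * mu_hat + (1 - B_hat) * y    # estimated posterior means

print("B_hat =", round(B_hat, 3))
print("MSE(MLE)       =", round(np.mean((y - theta) ** 2), 3))
print("MSE(shrinkage) =", round(np.mean((theta_hat - theta) ** 2), 3))
```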


Journal ArticleDOI
TL;DR: In this paper, a set of observed minima is viewed as a sample from a generalized multinomial distribution whose cells correspond to the local optima of the objective function; the resulting posterior distribution of the number of local optima is used to construct sequential Bayesian stopping rules which find the optimal trade-off between reliability and computational effort.
Abstract: By far the most efficient methods for global optimization are based on starting a local optimization routine from an appropriate subset of uniformly distributed starting points. As the number of local optima is frequently unknown in advance, deciding when to stop the sequence of sampling and searching is a crucial problem. By viewing a set of observed minima as a sample from a generalized multinomial distribution whose cells correspond to the local optima of the objective function, we obtain the posterior distribution of the number of local optima and of the relative size of their regions of attraction. This information is used to construct sequential Bayesian stopping rules which find the optimal trade-off between reliability and computational effort.

118 citations
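
A hypothetical multistart loop illustrating the kind of Bayesian stopping rule described above. It uses the standard posterior estimate of the total number of local optima after N searches that found w distinct ones, w(N - 1)/(N - w - 2), under a uniform prior, and a crude "stop when the estimate barely exceeds w" rule; the paper's optimal trade-off rules are more refined, and the test function and tolerances are my own assumptions.

```python
# Sketch of a multistart loop with a simple Bayesian stopping criterion.
import numpy as np
from scipy.optimize import minimize

def f(x):                                   # a multimodal test function (invented)
    x = np.atleast_1d(x)[0]
    return np.sin(3 * x) + 0.1 * x ** 2

rng = np.random.default_rng(2)
found, N = [], 0
while True:
    N += 1
    x0 = rng.uniform(-5, 5)
    xs = minimize(f, x0).x[0]
    if not any(abs(xs - m) < 1e-3 for m in found):
        found.append(xs)                    # a new distinct local minimum
    w = len(found)
    if N > w + 2:
        expected_total = w * (N - 1) / (N - w - 2)   # posterior estimate of #optima
        if expected_total < w + 0.5:        # posterior says few optima remain unseen
            break

print(f"stopped after {N} searches; {w} local minima found:", np.round(sorted(found), 3))
```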


Journal ArticleDOI
TL;DR: In this article, the authors show that the alpha level can be equated with the conditional prior probability of a Type I error, that it provides an upper bound for the overall prior probability, and that it does not estimate the conditional posterior probability, which is the probability of having made a Type I error in situations in which the null hypothesis is rejected.
Abstract: A statistical test leads to a Type I error whenever it leads to the rejection of a null hypothesis that is in fact true. The probability of making a Type I error can be characterized in the following three ways: the conditional prior probability (the probability of making a Type I error whenever a true null hypothesis is tested), the overall prior probability (the probability of making a Type I error across all experiments), and the conditional posterior probability (the probability of having made a Type I error in situations in which the null hypothesis is rejected). In this article, we show (a) that the alpha level can be equated with the first of these and (b) that it provides an upper bound for the second but (c) that it does not provide an estimate of the third, although it is commonly assumed to do so. We trace the source of this erroneous assumption first to statistical texts used by psychologists, which are generally ambiguous about which of the three interpretations is intended at any point in their discussions of Type I errors and which typically confound the conditional prior and posterior probabilities. Underlying this, however, is a more general fallacy in reasoning about probabilities, and we suggest that this may be the result of erroneous inferences about probabilistic conditional statements. Finally, we consider the possibility of estimating the (posterior) probability of a Type I error in situations in which the null hypothesis is rejected and, hence, the proportion of statistically significant results that may be Type I errors. © 1987 American Psychological Association.

117 citations
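
A small worked computation (the numbers are invented) of the third quantity via Bayes' rule, showing why it need not equal alpha: the probability that a rejection is a Type I error depends on the proportion of true nulls tested and on the power of the tests.

```python
# Illustrative calculation (invented numbers): P(Type I error | rejection) is not alpha;
# by Bayes' rule it also depends on the base rate of true nulls and on power.
alpha, power = 0.05, 0.8

for prop_true_nulls in (0.2, 0.5, 0.8):
    p_reject = alpha * prop_true_nulls + power * (1 - prop_true_nulls)
    p_type1_given_reject = alpha * prop_true_nulls / p_reject
    print(f"true nulls = {prop_true_nulls:.0%}:  "
          f"P(Type I error | rejection) = {p_type1_given_reject:.3f}")
```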


Journal ArticleDOI
TL;DR: Bayesian methods are applied to the results of a preliminary study to obtain a posterior distribution representing the state of knowledge about the parameters of interest, which provides the probability that the experimental regimen is superior to the standard by any particular amount.

62 citations


Journal Article
TL;DR: In this article, the posterior probability for a general comparative parameter is formulated as a finite sum of the beta-binomial type, so that the analysis is not limited to the odds ratio but can be parameterized in terms of a difference or a ratio of two proportions as well, and the analysis is extended to concern the non-null values of the three usual parameters of association.
Abstract: Altham (1969) derived a relation between the cumulative posterior probability for association and the exact p-value in a 2 x 2 table. But she found that, in general, the exact posterior distribution of the chosen measure of association (the odds ratio) was not easy to deal with. This paper covers generalizations of the Bayesian analysis in two directions. First, the posterior probability is formulated for a general comparative parameter, which implies that the analysis is not limited in application to problems involving the odds ratio but can be parameterized in terms of a difference and a ratio of two proportions as well. Second, the formal analysis is extended to concern the non-null values of the three usual parameters of association. Under the model of a general beta (or a particular rectangular) prior distribution, the parameter-specific posterior functions are expressible as finite sums of the beta-binomial type. The posterior distributions are immediately intelligible and provide a uniform basis for the Bayesian analogues of interval estimation, point estimation and significance testing.

61 citations
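
A minimal sketch of a Bayesian 2 x 2-table analysis under independent Beta(1, 1) priors (Monte Carlo rather than the paper's closed-form beta-binomial sums; the counts are invented): posterior statements about the difference, ratio, and odds ratio all come from the same simulated posterior.

```python
# Sketch: Bayesian 2x2 table with independent Beta(1,1) priors on the two proportions.
import numpy as np

rng = np.random.default_rng(3)
x1, n1 = 18, 40          # hypothetical successes / trials in group 1
x2, n2 = 9, 40           # hypothetical successes / trials in group 2

p1 = rng.beta(1 + x1, 1 + n1 - x1, 100_000)
p2 = rng.beta(1 + x2, 1 + n2 - x2, 100_000)

diff, ratio = p1 - p2, p1 / p2
odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))

print("P(difference > 0  | data) =", np.mean(diff > 0).round(3))
print("P(ratio     > 1.5 | data) =", np.mean(ratio > 1.5).round(3))
print("95% interval for odds ratio:", np.quantile(odds_ratio, [0.025, 0.975]).round(2))
```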


Journal ArticleDOI
TL;DR: In this article, it was argued that there is no need to depart from traditional understandings of probability in this problem, and that separate pieces of evidence about the individual issues can be combined to modify the probability of the case as a whole.
Abstract: If, in a civil suit, the plaintiff has to establish several distinct issues in order to succeed, is it necessary for the probability of their conjunction to exceed one half, or is it sufficient to establish each component issue with probability exceeding one half? This paper analyses the way in which separate pieces of evidence about the individual issues can be combined to modify the probability of the case as a whole. Contrary to a conclusion of Jonathan Cohen, it is argued that there is no need to depart from traditional understandings of probability in this problem.

59 citations



Journal ArticleDOI
TL;DR: In this article, the posterior distributions are calculated using a non-informative prior distribution that is uniform on the intraclass correlation, and a simulation study for the estimation of the ratio of the variance components is also presented, together with a study of the sampling properties of highest posterior density regions for this ratio.
Abstract: The estimation of variance components in the one-way random model with unequal sample sizes is studied. A simulation study is presented indicating that modes of posterior distributions have good sampling properties compared with other estimators. The posterior distributions are calculated using a noninformative prior distribution that is uniform on the intraclass correlation. A simulation study for the estimation of the ratio of the variance components is also presented, together with a study of the sampling properties of highest posterior density regions for this ratio. Bayesian estimators appear to be viable competitors to the many classical alternatives in a sampling framework.
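
A small grid sketch of the kind of posterior involved (not the paper's computation): for the unbalanced one-way random model y_ij = μ + a_i + e_ij, put a uniform prior on the intraclass correlation ρ = τ²/(σ² + τ²), a flat prior on log total variance, and (a simplification of my own) fix μ at the grand mean, then read off the posterior mode of the variance ratio.

```python
# Grid posterior sketch for the unbalanced one-way random model (illustrative only).
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(4)
sizes, mu, sigma2, tau2 = [3, 5, 8, 2, 6], 10.0, 1.0, 0.5
groups = [mu + rng.normal(0, np.sqrt(tau2)) + rng.normal(0, np.sqrt(sigma2), n) for n in sizes]

grand_mean = np.mean(np.concatenate(groups))
rhos = np.linspace(0.001, 0.999, 40)
vs = np.exp(np.linspace(-2, 2, 40))                 # total variance sigma^2 + tau^2

log_post = np.zeros((rhos.size, vs.size))
for i, rho in enumerate(rhos):
    for j, v in enumerate(vs):
        ll = 0.0
        for y in groups:
            n = y.size
            cov = v * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))
            ll += multivariate_normal.logpdf(y, mean=np.full(n, grand_mean), cov=cov)
        log_post[i, j] = ll                         # flat priors add only a constant here

i, j = np.unravel_index(np.argmax(log_post), log_post.shape)
rho_mode = rhos[i]
print("posterior mode: rho =", round(rho_mode, 2),
      " variance ratio tau^2/sigma^2 =", round(rho_mode / (1 - rho_mode), 2))
```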

Journal ArticleDOI
TL;DR: A selection of methods for contextual classification of multispectral scanner data is presented and compared using computer-generated data on different scenes, and an attempt is made to characterize what kind of errors each particular method makes.
Abstract: Various methods for contextual classification of multispectral scanner data have been developed during the last 15 years, aiming at increased accuracy in classified images. The methods have for the most part been of four main types: 1) neighborhood-based classification based on stochastic models for the classes over the scene and for the vectors given the classes; 2) simultaneous classification of all pixels, using, e.g., Markov random-field models; 3) relaxation methods that iteratively modify posterior probabilities using information from an increasing neighborhood; and 4) methods using ordinary noncontextual rules based on transformed data. In the present paper a selection of these methods is presented and compared using computer-generated data on different scenes. Spatial autocorrelation is present in the data. Error rates are compared, and an attempt is made to characterize what kind of errors each particular method makes.
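
A toy sketch of the relaxation idea (type 3 in the list above), not any of the paper's specific methods: each pixel's class probabilities are repeatedly re-weighted by the average support from its four neighbours under an invented compatibility matrix that favours identical labels.

```python
# Toy relaxation-style update on a 2-class image; data and compatibilities are invented.
import numpy as np

rng = np.random.default_rng(5)
H = W = 20
truth = (np.arange(W) > W // 2).astype(int) * np.ones((H, 1), dtype=int)  # two vertical regions
noisy_post = np.stack([0.6 - 0.2 * truth, 0.4 + 0.2 * truth], axis=-1)    # shaky per-pixel posteriors
noisy_post += rng.uniform(-0.1, 0.1, noisy_post.shape)
noisy_post = np.clip(noisy_post, 0.05, 0.95)
noisy_post /= noisy_post.sum(-1, keepdims=True)

compat = np.array([[0.9, 0.1],
                   [0.1, 0.9]])                       # neighbours tend to share a class

p = noisy_post.copy()
for _ in range(10):
    support = np.zeros_like(p)
    for shift in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # 4-neighbourhood (wraps at the border)
        support += np.roll(p, shift, axis=(0, 1)) @ compat
    p = p * support / 4.0
    p /= p.sum(-1, keepdims=True)

labels = p.argmax(-1)
print("pixels agreeing with the true regions:", (labels == truth).mean().round(3))
```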

Journal ArticleDOI
TL;DR: In this article, Herriges et al. used a set of prior beliefs (the engineering approach), transforming them into a posterior distribution that describes appliance usage patterns and reflects the evidence provided by both approaches.
Abstract: Load forecasting models employed in the electric utility industry have become increasingly dependent upon information about the electricity used by individual appliances (i.e., end uses). Currently, information on appliance usage is obtained from two fundamentally different sources: (1) engineering estimates and (2) conditional demand estimates. Bayesian analysis provides the means by which these two sources can be formally combined. Observed usage data (via the conditional demand approach) are used to modify a set of prior beliefs (the engineering approach), transforming them into a posterior distribution that describes appliance usage patterns and reflects the evidence provided by both approaches. Coauthors are Joseph A. Herriges, Kenneth E. Train, and Robert J. Windle. Copyright 1987 by MIT Press.
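
A minimal numeric sketch of the combination with invented numbers and assumed normal forms (not the paper's model): the engineering estimate acts as a normal prior on an appliance's annual usage, the conditional demand estimate supplies a normal likelihood, and the posterior pools the two by precision weighting.

```python
# Invented numbers: engineering prior + conditional demand likelihood -> posterior.
prior_mean, prior_sd = 1200.0, 300.0       # engineering estimate (kWh) and its uncertainty
cd_estimate, cd_sd = 900.0, 200.0          # conditional demand estimate and its standard error

prior_prec, cd_prec = prior_sd ** -2, cd_sd ** -2
post_prec = prior_prec + cd_prec
post_mean = (prior_prec * prior_mean + cd_prec * cd_estimate) / post_prec
post_sd = post_prec ** -0.5

print(f"posterior usage: {post_mean:.0f} kWh  (sd {post_sd:.0f})")
```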

Book ChapterDOI
01 Jan 1987
TL;DR: Two models are given for the extraction of boundaries in digital images, one for discriminating textures and the other for discriminating objects, where a Markov random field is constructed as a prior distribution over intensities and labels.
Abstract: Two models are given for the extraction of boundaries in digital images, one for discriminating textures and the other for discriminating objects. In both cases a Markov random field is constructed as a prior distribution over intensities (observed) and labels (unobserved); the labels are either the texture types or boundary indicators. The posterior distribution, i.e., the conditional distribution over the labels given the intensities, is then analyzed by a Monte-Carlo algorithm called stochastic relaxation. The final labeling corresponds to a local maximum of the posterior likelihood.
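
A compact sketch of stochastic relaxation (Gibbs sampling) for a binary label field with an Ising-type prior and a Gaussian intensity model; all parameters and data are invented, and the paper's texture/boundary models are much richer.

```python
# Gibbs-sampling sketch: binary labels with an Ising prior, Gaussian intensities.
import numpy as np

rng = np.random.default_rng(6)
H = W = 32
truth = np.zeros((H, W), int)
truth[:, W // 2:] = 1                                    # two regions separated by a boundary
intensity = rng.normal(truth.astype(float), 0.8)         # noisy observed intensities

beta, means, sd = 1.5, (0.0, 1.0), 0.8                   # coupling, class means, noise sd
labels = (intensity > 0.5).astype(int)                   # initial labeling

for sweep in range(30):
    for i in range(H):
        for j in range(W):
            nbrs = [labels[(i + di) % H, (j + dj) % W]
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))]
            log_p = []
            for lab in (0, 1):
                agree = sum(1 for nb in nbrs if nb == lab)
                loglik = -0.5 * ((intensity[i, j] - means[lab]) / sd) ** 2
                log_p.append(beta * agree + loglik)      # conditional log posterior (up to a constant)
            p1 = 1.0 / (1.0 + np.exp(log_p[0] - log_p[1]))
            labels[i, j] = rng.random() < p1             # resample the label from its conditional

print("agreement with true labels:", (labels == truth).mean().round(3))
```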

01 Jan 1987
TL;DR: This paper presents a combination Bayesian/Item Response Theory procedure for pooling performance on a particular objective with information about an examinee’s overall test performance in order to produce more stable objective scores.
Abstract: This paper presents a combination Bayesian/Item Response Theory procedure for pooling performance on a particular objective with information about an examinee’s overall test performance in order to produce more stable objective scores. The procedure, including the calculation of a posterior distribution, is described. A split-half cross validation study finds that a credibility interval based on the posterior distribution is sufficiently accurate to be useful for scoring reports for teachers.
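
A small sketch of the pooling idea with invented numbers (the paper additionally uses an IRT estimate of overall ability): overall test performance sets a Beta prior for the objective, the objective's few items give the binomial likelihood, and a credibility interval is read from the Beta posterior.

```python
# Sketch: pool an objective's few items with overall test performance via a Beta prior.
from scipy.stats import beta

overall_prop, prior_weight = 0.75, 8          # overall proportion correct, prior "pseudo-items"
a0, b0 = overall_prop * prior_weight, (1 - overall_prop) * prior_weight

correct, items = 3, 6                         # performance on the specific objective
a, b = a0 + correct, b0 + items - correct

lo, hi = beta.ppf([0.05, 0.95], a, b)
print(f"posterior mean = {a / (a + b):.2f}, 90% credibility interval = ({lo:.2f}, {hi:.2f})")
```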

Journal ArticleDOI
TL;DR: In this article, the probability distribution of equilibrium outcomes is assumed to be a continuous but unknown function of agents' forecasts, and the main result is that with probability one the forecasts converge to the set of fixed points of the unknown mapping.

Proceedings Article
13 Jul 1987
TL;DR: This paper shows that regularization is an example of Bayesian modeling, and that using the regularization energy function for the surface interpolation problem results in a prior model that is fractal (self-affine over a range of scales).
Abstract: Many of the processing tasks arising in early vision involve the solution of ill-posed inverse problems. Two techniques that are often used to solve these inverse problems are regularization and Bayesian modeling. Regularization is used to find a solution that both fits the data and is also sufficiently smooth. Bayesian modeling uses a statistical prior model of the field being estimated to determine an optimal solution. One convenient way of specifying the prior model is to associate an energy function with each possible solution, and to use a Boltzmann distribution to relate the solution energy to its probability. This paper shows that regularization is an example of Bayesian modeling, and that using the regularization energy function for the surface interpolation problem results in a prior model that is fractal (self-affine over a range of scales). We derive an algorithm for generating typical (fractal) estimates from the posterior distribution. We also show how this algorithm can be used to estimate the uncertainty associated with a regularized solution, and how this uncertainty can be used at later stages of processing.
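
A 1-D analogue of the regularization-as-Bayes point (my own construction, not the paper's surface model): a quadratic smoothness energy plus a quadratic data-fit energy defines a Gaussian posterior over the interpolated curve, so the regularized solution is the posterior mean and posterior samples visualize its uncertainty.

```python
# 1-D interpolation: quadratic energies -> Gaussian posterior; sample to get uncertainty.
import numpy as np

rng = np.random.default_rng(7)
n, sigma2, lam = 100, 0.05, 50.0
obs_idx = rng.choice(n, 12, replace=False)                   # sparse, noisy observations
true = np.sin(np.linspace(0, 2 * np.pi, n))
data = true[obs_idx] + rng.normal(0, np.sqrt(sigma2), obs_idx.size)

D = np.diff(np.eye(n), axis=0)                               # first-difference (membrane) operator
S = np.zeros(n); S[obs_idx] = 1.0
A = lam * D.T @ D + np.diag(S / sigma2)                      # posterior precision matrix
b = np.zeros(n); b[obs_idx] = data / sigma2
mean = np.linalg.solve(A, b)                                 # regularized solution = posterior mean

L = np.linalg.cholesky(A)                                    # A = L L^T, so cov = A^{-1}
samples = np.array([mean + np.linalg.solve(L.T, rng.standard_normal(n)) for _ in range(200)])
print("pointwise posterior sd at the domain centre:", samples[:, n // 2].std().round(3))
```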

Journal ArticleDOI
TL;DR: In this article, the authors extended some of the work presented in Redner and Walker [1984] on the maximum likelihood estimate of parameters in a mixture model to a Bayesian modal estimate.
Abstract: This paper extends some of the work presented in Redner and Walker [1984] on the maximum likelihood estimate of parameters in a mixture model to a Bayesian modal estimate. The problem of determining the mode of the joint posterior distribution is discussed. Necessary conditions are given for a choice of parameters to be the mode and a numerical scheme based on the EM algorithm is presented. Some theoretical remarks on the resulting iterative scheme and simulation results are also given.
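
A compact sketch of an EM iteration that targets the posterior mode rather than the MLE, for a 1-D two-component Gaussian mixture with known unit variances; the priors (Dirichlet on the weights, normal on the means) are assumptions of mine, not the paper's general scheme.

```python
# MAP-EM sketch for a two-component 1-D Gaussian mixture with known variances.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(8)
x = np.concatenate([rng.normal(-2, 1, 150), rng.normal(2, 1, 50)])

w, mu = np.array([0.5, 0.5]), np.array([-1.0, 1.0])       # initial values
alpha, prior_var = 2.0, 10.0 ** 2                         # Dirichlet(2,2) and N(0, 10^2) priors

for _ in range(100):
    # E-step: responsibilities (posterior component probabilities for each point).
    dens = w * norm.pdf(x[:, None], mu, 1.0)
    r = dens / dens.sum(axis=1, keepdims=True)
    nk = r.sum(axis=0)
    # M-step with priors: the Dirichlet prior adds (alpha - 1) pseudo-counts to each weight,
    # and the normal prior shrinks each mean slightly toward 0.
    w = (nk + alpha - 1) / (nk + alpha - 1).sum()
    mu = (r * x[:, None]).sum(axis=0) / (nk + 1.0 / prior_var)

print("MAP weights:", w.round(3), " MAP means:", mu.round(3))
```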

Proceedings Article
10 Jul 1987
TL;DR: In this paper, the authors examined the use of stochastic simulation of Bayesian belief networks as a method for computing the probabilities of values of variables and found that this algorithm, in certain networks, leads to much slower than expected convergence to the true posterior probability.
Abstract: This paper examines the use of stochastic simulation of Bayesian belief networks as a method for computing the probabilities of values of variables. Specifically, it examines the use of a scheme described by Henrion, called logic sampling, and an extension to that scheme described by Pearl. The scheme devised by Pearl allows us to "clamp" any number of variables to given values and to conduct stochastic simulation on the resulting network. We have found that this algorithm, in certain networks, leads to much slower than expected convergence to the true posterior probability. This behavior is a result of the tendency for local areas in the graph to become fixed through many stochastic iterations. The length of this non-convergence can be made arbitrarily long by strengthening the dependency between two nodes. This paper describes the use of graph modification. By modifying a belief network through the use of pruning, arc reversal, and node reduction, it may be possible to convert the network to a form that is computationally more efficient for stochastic simulation.
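
A tiny sketch of logic sampling (the Henrion scheme discussed above) on an invented three-node chain A -> B -> C: samples are drawn forward from the priors and conditional tables, and samples disagreeing with the evidence are discarded, which is exactly why convergence is slow when the evidence is unlikely.

```python
# Logic sampling on a toy chain network A -> B -> C with invented CPTs; evidence C = True.
import random

random.seed(0)
p_a = 0.3
p_b_given_a = {True: 0.9, False: 0.2}
p_c_given_b = {True: 0.7, False: 0.05}

kept, a_true = 0, 0
for _ in range(100_000):
    a = random.random() < p_a
    b = random.random() < p_b_given_a[a]
    c = random.random() < p_c_given_b[b]
    if c:                         # keep only samples consistent with the evidence
        kept += 1
        a_true += a

print(f"P(A | C=True) ~ {a_true / kept:.3f}   (acceptance rate {kept / 1e5:.2%})")
```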

Journal ArticleDOI
William A. Nazaret1
TL;DR: In this paper, a Bayesian method is proposed to estimate the expected proportions in a three-way contingency table, appropriate when prior knowledge about the main effects and the first- and second-order interaction effects can be described by a particular kind of exchangeability assumption.
Abstract: This paper presents a Bayesian method to estimate the expected proportions in a three-way contingency table appropriate when prior knowledge about the main, first- and second-order interaction effects can be described by a particular kind of exchangeability assumption. The proposed Bayes estimates are calculated by finding those values of the effects which maximize the resulting posterior distribution and can be used to explore the possibility that a nonsaturated submodel, such as independence or conditional independence, fits the data. This extends the work of Leonard (1975) for two-dimensional tables. We discuss numerical strategies to solve the estimating equations and point out how an incorrect choice of values for the 'indifference' case, as made by previous authors, can have serious effects on the convergence of the algorithms. The method is exemplified by a survey on skin cancer and data on voting transitions in British elections.

Journal ArticleDOI
TL;DR: In this article, a multiplicative seasonal forecasting model for cumulative events is presented in which, conditional on end-of-season totals being given and seasonal shape being known, events occurring within the season are multinomially distributed.
Abstract: A multiplicative seasonal forecasting model for cumulative events is presented in which, conditional on end-of-season totals being given and seasonal shape being known, events occurring within the season are multinomially distributed. The model uses the information contained in the arrival of new events to obtain a posterior distribution for end-of-season totals. Bayesian forecasts are obtained recursively in two stages: first, by predicting the expected number and variance of event counts in future intervals within the remaining season, and then by predicting revised means and variances for end-of-season totals based on the most recent forecast error.
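
A minimal discretized sketch of the idea (my own, not the paper's recursive formulas): given the seasonal shape, the events observed by time t are a binomial thinning of the unknown end-of-season total, so a discrete prior over totals updates by a binomial likelihood.

```python
# Discrete-prior sketch: prior on the end-of-season total N, binomial likelihood for
# the events seen so far given the fraction of the seasonal shape already elapsed.
import numpy as np
from scipy.stats import binom, poisson

totals = np.arange(0, 400)
prior = poisson.pmf(totals, mu=150)        # assumed prior on the end-of-season total
frac_elapsed = 0.4                         # fraction of the season's shape already covered
observed = 80                              # events counted so far

likelihood = binom.pmf(observed, totals, frac_elapsed)
posterior = prior * likelihood
posterior /= posterior.sum()

post_mean = (totals * posterior).sum()
lo, hi = totals[np.searchsorted(posterior.cumsum(), [0.05, 0.95])]
print(f"posterior mean end-of-season total = {post_mean:.0f}, 90% interval = ({lo}, {hi})")
```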

ReportDOI
01 Jan 1987
TL;DR: In this paper, a Bayesian approach is used to assign a probability distribution to the dependent variable through a specification of prior distributions for the unknown parameters in the regression model, and the appropriate posterior probabilities are derived for each submodel.
Abstract: This paper is concerned with the selection of subsets of "predictor" variables in a linear regression model for the prediction of a "dependent" variable. We take a Bayesian approach and assign a probability distribution to the dependent variable through a specification of prior distributions for the unknown parameters in the regression model. The appropriate posterior probabilities are derived for each submodel and methods are proposed for evaluating the family of prior distributions. Examples are given that show the application of the Bayesian methodology. 23 refs., 3 figs.
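
A compact sketch of posterior probabilities over all predictor subsets, using a uniform prior over subsets and a BIC-style approximation to each submodel's marginal likelihood; the report specifies proper priors on the coefficients, so this approximation and the simulated data are assumptions of mine.

```python
# Posterior model probabilities over subsets of 4 predictors (BIC-style approximation).
import numpy as np
from itertools import combinations

rng = np.random.default_rng(9)
n, names = 80, ["x1", "x2", "x3", "x4"]
X = rng.normal(size=(n, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(0, 1, n)      # only x1 and x3 matter

def bic(cols):
    Z = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
    resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return n * np.log(resid @ resid / n) + Z.shape[1] * np.log(n)

subsets = [cols for r in range(5) for cols in combinations(range(4), r)]
scores = np.array([-0.5 * bic(s) for s in subsets])          # approximate log marginal likelihoods
probs = np.exp(scores - scores.max()); probs /= probs.sum()

for s, p in sorted(zip(subsets, probs), key=lambda t: -t[1])[:3]:
    print([names[c] for c in s] or ["(intercept only)"], round(p, 3))
```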

Journal ArticleDOI
TL;DR: In this article, a chi-square type statistic is devised to test the homogeneity of several Paretian Laws and its correctness is assessed numerically by simulating its quantiles and comparing them with the exact quantiles of a Chi-square variate.
Abstract: A chi-square type statistic is devised to test the homogeneity of several Paretian Laws. It is assessed numerically by simulating its quantiles and comparing them with the exact quantiles of a chi-square variate. When the two parameters of the distribution are unknown, a prior with finite probability measure is considered and the Pearson system of curves is used to approximate the posterior distribution of parameters.

Journal ArticleDOI
TL;DR: This note focuses on continuous-time ARMA processes observed in white noise, and a maximum a-posteriori (MAP) estimator is defined for the trajectory of the parameters' random process, which enables the MAP estimation of randomly slowly varying parameters.
Abstract: Recently, an iterative algorithm has been presented for estimating the parameters of partially observed continuous-time processes [1]. In this note we concentrate on continuous-time ARMA processes observed in white noise. A maximum a-posteriori (MAP) estimator is defined for the trajectory of the parameters' random process. This approach enables the MAP estimation of randomly slowly varying parameters, and extends the conventional treatment of time-invariant parameters. The iterative algorithm derived for the MAP estimation, increases the posterior probability of the parameters in each iteration, and converges to a stationary point of the posterior probability functional. Each iteration involves a standard linear smoother followed by a finite-dimensional linear system, and thus is easily implemented.

Journal ArticleDOI
TL;DR: This work considers the problem of assigning a realization to one of several autoregressive sources that share a common known order and unknown error variance and develops an informal Bayesian inference based on the marginal posterior distribution of the classification vector.
Abstract: We consider the problem of assigning a realization to one of several autoregressive sources that share a common known order and unknown error variance. The approach is to use an informal Bayesian inference based on the marginal posterior distribution of the classification vector. A realization is assigned to the autoregressive process with the largest posterior probability, and an example demonstrates that the classification technique behaves in a reasonable way. A generalization is developed.
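
A minimal sketch of assigning a realization among known AR(1) sources by conditional Gaussian likelihood with equal prior probabilities; the paper treats the error variance as unknown and works with the marginal posterior, whereas here it is assumed known for brevity.

```python
# Classify one realization among candidate AR(1) sources by posterior probability.
import numpy as np

rng = np.random.default_rng(10)
phis, sigma = [0.2, 0.6, 0.9], 1.0                 # candidate AR(1) coefficients, known noise sd

x = np.zeros(300)                                  # simulate one realization from the middle source
for t in range(1, x.size):
    x[t] = 0.6 * x[t - 1] + rng.normal(0, sigma)

def log_cond_lik(x, phi, sigma):
    resid = x[1:] - phi * x[:-1]                   # conditional (given x[0]) Gaussian likelihood
    return -0.5 * np.sum(resid ** 2) / sigma ** 2 - (x.size - 1) * np.log(sigma)

logs = np.array([log_cond_lik(x, phi, sigma) for phi in phis])
post = np.exp(logs - logs.max()); post /= post.sum()
for phi, p in zip(phis, post):
    print(f"phi = {phi}: posterior probability = {p:.3f}")
```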

Posted Content
TL;DR: In this paper, Herriges et al. used the conditional demand approach to modify a set of prior beliefs (the engineering approach), transforming them into a posterior distribution that describes appliance usage patterns and reflects the evidence provided by both approaches.
Abstract: Load forecasting models employed in the electric utility industry have become increasingly dependent upon information about the electricity used by individual appliances (i.e., end uses). Currently, information on appliance usage is obtained from two fundamentally different sources: (1) engineering estimates and (2) conditional demand estimates. Bayesian analysis provides the means by which these two sources can be formally combined. Observed usage data (via the conditional demand approach) are used to modify a set of prior beliefs (the engineering approach), transforming them into a posterior distribution that describes appliance usage patterns and reflects the evidence provided by both approaches. Coauthors are Joseph A. Herriges, Kenneth E. Train, and Robert J. Windle.

Journal ArticleDOI
TL;DR: It is shown how the choice of a good metric in nearest neighbour estimates of posterior probability can lead to improved average conditional error rate estimates.

Journal ArticleDOI
TL;DR: In this article, the authors present an introduction to automated methods of classification and assignment, with particular reference to their use in the analysis of soil data, including types of variable describing a soil sample; measures of dissimilarity; clustering criteria and algorithms; representation of data as points in a low-dimensional space; assessment of classifications; incorporation into a classification of spatial relationships between soil samples; assignment of objects to the population with maximum posterior probability; assignment procedures for data described by variables of mixed type; kernel density estimation.
Abstract: The paper presents an introduction to automated methods of classification and assignment, with particular reference to their use in the analysis of soil data. Material covered includes: types of variable describing a soil sample; measures of dissimilarity; clustering criteria and algorithms; representation of data as points in a low-dimensional space; assessment of classifications; incorporation into a classification of spatial relationships between soil samples; assignment of objects to the population with maximum posterior probability; assignment procedures for data described by variables of mixed type; kernel density estimation; assignment to spatially-located populations.

Journal ArticleDOI
TL;DR: In this paper, a distance-method estimator for the process distribution is developed and shown to be superior to the commonly used method-of-moments estimator, and several updating procedures including the posterior distribution, the exponential smoothing method, the moving window method and the all periods method are proposed and compared by a simulation study.
Abstract: The process distribution of a manufacturing process which reflects past experience with the quality levels of outgoing lots is an indispensible input of the Bayesian quality audit systems. A distance-method estimator for the process distribution is developed and shown to be superior to the commonly used method-of-moments estimator. In a continuous manufacturing process, the process distribution needs to be updated after each new lot is inspected. Several updating procedures including the posterior distribution, the exponential smoothing method, the moving window method and the all periods method are proposed and compared by a simulation study.