Topic

# Posterior probability

About: Posterior probability is a(n) research topic. Over the lifetime, 13731 publication(s) have been published within this topic receiving 475016 citation(s). The topic is also known as: posterior probability distribution & posterior distribution.

...read more

##### Papers

More filters

••

TL;DR: The analogy between images and statistical mechanics systems is made and the analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations, creating a highly parallel ``relaxation'' algorithm for MAP estimation.

...read more

Abstract: We make an analogy between images and statistical mechanics systems. Pixel gray levels and the presence and orientation of edges are viewed as states of atoms or molecules in a lattice-like physical system. The assignment of an energy function in the physical system determines its Gibbs distribution. Because of the Gibbs distribution, Markov random field (MRF) equivalence, this assignment also determines an MRF image model. The energy function is a more convenient and natural mechanism for embodying picture attributes than are the local characteristics of the MRF. For a range of degradation mechanisms, including blurring, nonlinear deformations, and multiplicative or additive noise, the posterior distribution is an MRF with a structure akin to the image model. By the analogy, the posterior distribution defines another (imaginary) physical system. Gradual temperature reduction in the physical system isolates low energy states (``annealing''), or what is the same thing, the most probable states under the Gibbs distribution. The analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations. The result is a highly parallel ``relaxation'' algorithm for MAP estimation. We establish convergence properties of the algorithm and we experiment with some simple pictures, for which good restorations are obtained at low signal-to-noise ratios.

...read more

18,328 citations

••

David Spiegelhalter

^{1}, Nicola G. Best^{2}, Bradley P. Carlin^{3}, Angelika van der Linde^{4}•Institutions (4)Abstract: Summary. We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. Using an information theoretic argument we derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest. In general pD approximately corresponds to the trace of the product of Fisher's information and the posterior covariance, which in normal models is the trace of the ‘hat’ matrix projecting observations onto fitted values. Its properties in exponential families are explored. The posterior mean deviance is suggested as a Bayesian measure of fit or adequacy, and the contributions of individual observations to the fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. Adding pD to the posterior mean deviance gives a deviance information criterion for comparing models, which is related to other information criteria and has an approximate decision theoretic justification. The procedure is illustrated in some examples, and comparisons are drawn with alternative Bayesian and classical proposals. Throughout it is emphasized that the quantities required are trivial to compute in a Markov chain Monte Carlo analysis.

...read more

10,825 citations

••

TL;DR: It is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution.

...read more

Abstract: In the first part of this paper, a regular recurrent neural network (RNN) is extended to a bidirectional recurrent neural network (BRNN). The BRNN can be trained without the limitation of using input information just up to a preset future frame. This is accomplished by training it simultaneously in positive and negative time direction. Structure and training procedure of the proposed network are explained. In regression and classification experiments on artificial data, the proposed structure gives better results than other approaches. For real data, classification experiments for phonemes from the TIMIT database show the same tendency. In the second part of this paper, it is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution. For this part, experiments on real data are reported.

...read more

5,216 citations

••

TL;DR: There is a comprehensive introduction to the applied models of probability that stresses intuition, and both professionals, researchers, and the interested reader will agree that this is the most solid and widely used book for probability theory.

...read more

Abstract: The Seventh Edition of the successful Introduction to Probability Models introduces elementary probability theory and stochastic processes. This book is particularly well-suited to those applying probability theory to the study of phenomena in engineering, management science, the physical and social sciences, and operations research. Skillfully organized, Introduction to Probability Models covers all essential topics. Sheldon Ross, a talented and prolific textbook author, distinguishes this book by his effort to develop in students an intuitive, and therefore lasting, grasp of probability theory. Ross' classic and best-selling text has been carefully and substantially revised. The Seventh Edition includes many new examples and exercises, with the majority of the new exercises being of the easier type. Also, the book introduces stochastic processes, stressing applications, in an easily understood manner. There is a comprehensive introduction to the applied models of probability that stresses intuition. Both professionals, researchers, and the interested reader will agree that this is the most solid and widely used book for probability theory. Features: * Provides a detailed coverage of the Markov Chain Monte Carlo methods and Markov Chain covertimes * Gives a thorough presentation of k-record values and the surprising Ignatov's * theorem * Includes examples relating to: "Random walks to circles," "The matching rounds problem," "The best prize problem," and many more * Contains a comprehensive appendix with the answers to approximately 100 exercises from throughout the text * Accompanied by a complete instructor's solutions manual with step-by-step solutions to all exercises New to this edition: * Includes many new and easier examples and exercises * Offers new material on utilizing probabilistic method in combinatorial optimization problems * Includes new material on suspended animation reliability models * Contains new material on random algorithms and cycles of random permutations

...read more

4,938 citations

••

Abstract: We describe a Bayesian approach for learning Bayesian networks from a combination of prior knowledge and statistical data. First and foremost, we develop a methodology for assessing informative priors needed for learning. Our approach is derived from a set of assumptions made previously as well as the assumption of likelihood equivalence, which says that data should not help to discriminate network structures that represent the same assertions of conditional independence. We show that likelihood equivalence when combined with previously made assumptions implies that the user's priors for network parameters can be encoded in a single Bayesian network for the next case to be seen—a prior network—and a single measure of confidence for that network. Second, using these priors, we show how to compute the relative posterior probabilities of network structures given data. Third, we describe search methods for identifying network structures with high posterior probabilities. We describe polynomial algorithms for finding the highest-scoring network structures in the special case where every node has at most k e 1 parent. For the general case (k > 1), which is NP-hard, we review heuristic search algorithms including local search, iterative local search, and simulated annealing. Finally, we describe a methodology for evaluating Bayesian-network learning algorithms, and apply this approach to a comparison of various approaches.

...read more

3,960 citations