
Showing papers on "Bayesian probability published in 2015"


Proceedings Article
25 Jan 2015
TL;DR: A new non-parametric calibration method called Bayesian Binning into Quantiles (BBQ) is presented which addresses key limitations of existing calibration methods and can be readily combined with many existing classification algorithms.
Abstract: Learning probabilistic predictive models that are well calibrated is critical for many prediction and decision-making tasks in artificial intelligence. In this paper we present a new non-parametric calibration method called Bayesian Binning into Quantiles (BBQ) which addresses key limitations of existing calibration methods. The method post-processes the output of a binary classification algorithm; thus, it can be readily combined with many existing classification algorithms. The method is computationally tractable and empirically accurate, as evidenced by the set of experiments reported here on both real and simulated datasets.

887 citations
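The binning idea at the core of BBQ is easy to illustrate. The R sketch below implements a single equal-frequency (quantile) binning with a Beta(1,1) smoothing prior on each bin's event rate; this is a hedged simplification, since BBQ itself scores many candidate binnings by their Bayesian marginal likelihood and averages over them, and all function and variable names here are invented for illustration.

```r
# Single equal-frequency binning with Beta(1,1) smoothing -- the building
# block that BBQ averages over; the full method scores many binnings by
# Bayesian marginal likelihood. Names are invented for illustration.
calibrate_quantile_bins <- function(scores, labels, n_bins = 10) {
  edges <- quantile(scores, probs = seq(0, 1, length.out = n_bins + 1))
  edges[1] <- -Inf
  edges[n_bins + 1] <- Inf
  bin <- cut(scores, breaks = edges, labels = FALSE)
  # Smoothed event rate per bin: (successes + 1) / (count + 2)
  rate <- tapply(labels, bin, function(y) (sum(y) + 1) / (length(y) + 2))
  function(new_scores) {
    as.numeric(rate[cut(new_scores, breaks = edges, labels = FALSE)])
  }
}

# Usage: miscalibrated (overconfident) scores get pulled back toward
# the event frequencies actually observed in each bin.
set.seed(1)
p_true <- runif(500)
labels <- rbinom(500, 1, p_true)
scores <- plogis(3 * qlogis(p_true))   # simulated overconfident classifier
cal    <- calibrate_quantile_bins(scores, labels)
cal(c(0.1, 0.5, 0.9))
```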


Posted Content
TL;DR: This work presents an efficient Bayesian CNN, offering better robustness to over-fitting on small data than traditional approaches, and approximates the model's intractable posterior with Bernoulli variational distributions.
Abstract: Convolutional neural networks (CNNs) work well on large datasets. But labelled data is hard to collect, and in some applications larger amounts of data are not available. The problem then is how to use CNNs with small data -- as CNNs overfit quickly. We present an efficient Bayesian CNN, offering better robustness to over-fitting on small data than traditional approaches. We achieve this by placing a probability distribution over the CNN's kernels. We approximate our model's intractable posterior with Bernoulli variational distributions, requiring no additional model parameters. On the theoretical side, we cast dropout network training as approximate inference in Bayesian neural networks. This allows us to implement our model using existing tools in deep learning with no increase in time complexity, while highlighting a negative result in the field. We show a considerable improvement in classification accuracy compared to standard techniques and improve on published state-of-the-art results for CIFAR-10.

669 citations
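The paper's link between dropout and approximate Bayesian inference boils down to a simple recipe: keep the Bernoulli dropout masks switched on at test time and average many stochastic forward passes. A minimal base-R sketch of that recipe follows, using a tiny made-up two-layer network rather than a real CNN; the weights, sizes, and names are invented for illustration.

```r
# Keep Bernoulli dropout active at test time and average many stochastic
# forward passes (Monte Carlo dropout). Tiny made-up two-layer network;
# all weights and sizes are invented for illustration.
set.seed(42)
W1 <- matrix(rnorm(4 * 16, sd = 0.5), 4, 16); b1 <- rnorm(16, sd = 0.1)
W2 <- matrix(rnorm(16, sd = 0.5), 16, 1);     b2 <- 0

forward_dropout <- function(x, p_drop = 0.5) {
  h <- pmax(x %*% W1 + rep(b1, each = nrow(x)), 0)          # ReLU layer
  keep <- matrix(rbinom(length(h), 1, 1 - p_drop), nrow(h)) # Bernoulli mask
  drop(plogis((h * keep / (1 - p_drop)) %*% W2 + b2))       # sigmoid output
}

x <- matrix(rnorm(4), 1, 4)
draws <- replicate(200, forward_dropout(x))  # 200 stochastic passes
c(pred_mean = mean(draws), pred_sd = sd(draws))
```

The spread of the 200 passes serves as a rough predictive uncertainty for the input, which is what the Bernoulli variational interpretation buys over a single deterministic forward pass.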


Journal ArticleDOI
TL;DR: This guide serves as a reference for ecologists using Bayesian methods, so that they can better understand their options and can make an informed choice that is best aligned with their goals for inference.
Abstract: The steady upward trend in the use of model selection and Bayesian methods in ecological research has made it clear that both approaches to inference are important for modern analysis of models and data. However, in teaching Bayesian methods and in working with our research colleagues, we have noticed a general dissatisfaction with the available literature on Bayesian model selection and multimodel inference. Students and researchers new to Bayesian methods quickly find that the published advice on model selection is often preferential in its treatment of options for analysis, frequently advocating one particular method above others. The recent appearance of many articles and textbooks on Bayesian modeling has provided welcome background on relevant approaches to model selection in the Bayesian framework, but most of these are either very narrowly focused in scope or inaccessible to ecologists. Moreover, the methodological details of Bayesian model selection approaches are spread thinly throughout the literature, appearing in journals from many different fields. Our aim with this guide is to condense the large body of literature on Bayesian approaches to model selection and multimodel inference and present it specifically for quantitative ecologists as neutrally as possible. We also bring to light a few important and fundamental concepts relating directly to model selection that seem to have gone unnoticed in the ecological literature. Throughout, we provide only a minimal discussion of philosophy, preferring instead to examine the breadth of approaches as well as their practical advantages and disadvantages. This guide serves as a reference for ecologists using Bayesian methods, so that they can better understand their options and can make an informed choice that is best aligned with their goals for inference.

662 citations


Posted Content
TL;DR: Probabilistic Backpropagation (PBP) as discussed by the authors uses a forward propagation of probabilities through the network and then does a backward computation of gradients to estimate the posterior variance on the network weights.
Abstract: Large multilayer neural networks trained with backpropagation have recently achieved state-of-the-art results in a wide range of problems. However, using backprop for neural net learning still has some disadvantages, e.g., having to tune a large number of hyperparameters to the data, lack of calibrated probabilistic predictions, and a tendency to overfit the training data. In principle, the Bayesian approach to learning neural networks does not have these problems. However, existing Bayesian techniques lack scalability to large dataset and network sizes. In this work we present a novel scalable method for learning Bayesian neural networks, called probabilistic backpropagation (PBP). Similar to classical backpropagation, PBP works by computing a forward propagation of probabilities through the network and then doing a backward computation of gradients. A series of experiments on ten real-world datasets show that PBP is significantly faster than other techniques, while offering competitive predictive abilities. Our experiments also show that PBP provides accurate estimates of the posterior variance on the network weights.

614 citations
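The "forward propagation of probabilities" in PBP can be illustrated for the easiest case, a single linear unit: given Gaussian beliefs over the inputs and the weights (assumed independent), the output mean and variance have a closed form. The R sketch below shows just that step; PBP proper also propagates through nonlinearities and performs the backward gradient pass, which are omitted here, and the numbers are invented.

```r
# Propagate a Gaussian belief through one linear unit with Gaussian
# weights, keeping only means and variances (inputs and weights assumed
# independent). This is the easiest step of PBP's forward pass; the
# nonlinearities and the backward gradient pass are omitted.
propagate_linear <- function(m_in, v_in, m_w, v_w) {
  m_out <- sum(m_w * m_in)
  # Var(w_i x_i) = v_w v_x + v_w m_x^2 + m_w^2 v_x for independent terms
  v_out <- sum(v_w * v_in + v_w * m_in^2 + m_w^2 * v_in)
  c(mean = m_out, var = v_out)
}

propagate_linear(m_in = c(1, -2),   v_in = c(0.1, 0.1),
                 m_w  = c(0.5, 0.3), v_w = c(0.05, 0.05))
```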


Journal ArticleDOI
Kay H. Brodersen, Fabian Gallusser, Jim Koehler, Nicolas Remy, Steven L. Scott
TL;DR: This paper proposes to infer causal impact on the basis of a diffusion-regression state-space model that predicts the counterfactual market response that would have occurred had no intervention taken place.
Abstract: An important problem in econometrics and marketing is to infer the causal impact that a designed market intervention has exerted on an outcome metric over time. This paper proposes to infer causal impact on the basis of a diffusion-regression state-space model that predicts the counterfactual market response in a synthetic control that would have occurred had no intervention taken place. In contrast to classical difference-in-differences schemes, state-space models make it possible to (i) infer the temporal evolution of attributable impact, (ii) incorporate empirical priors on the parameters in a fully Bayesian treatment, and (iii) flexibly accommodate multiple sources of variation, including local trends, seasonality and the time-varying influence of contemporaneous covariates. Using a Markov chain Monte Carlo algorithm for posterior inference, we illustrate the statistical properties of our approach on simulated data. We then demonstrate its practical utility by estimating the causal effect of an online advertising campaign on search-related site visits. We discuss the strengths and limitations of state-space models in enabling causal attribution in those settings where a randomised experiment is unavailable. The CausalImpact R package provides an implementation of our approach.

607 citations
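Since the abstract names the CausalImpact R package, a short usage sketch may help. It follows the package's documented calling pattern on simulated data, with the response in the first column and a control covariate in the second; the series and the injected effect are invented for illustration.

```r
# Calling pattern of the CausalImpact package on simulated data: column 1
# is the response, later columns are control covariates; the intervention
# effect (+10 after time 70) is injected by hand for illustration.
library(CausalImpact)

set.seed(1)
x <- 100 + arima.sim(model = list(ar = 0.9), n = 100)
y <- 1.2 * x + rnorm(100)
y[71:100] <- y[71:100] + 10
data <- cbind(y, x)

pre.period  <- c(1, 70)    # observations before the intervention
post.period <- c(71, 100)  # observations after the intervention

impact <- CausalImpact(data, pre.period, post.period)
summary(impact)            # posterior estimate of the causal effect
plot(impact)               # pointwise and cumulative impact plots
```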


Journal ArticleDOI
TL;DR: In this paper, a generative model called Bayesian rule lists (BRL) is proposed to predict the risk of stroke in patients with atrial fibrillation, which can be used to produce highly accurate and interpretable medical scoring systems.
Abstract: We aim to produce predictive models that are not only accurate, but are also interpretable to human experts. Our models are decision lists, which consist of a series of if...then... statements (e.g., if high blood pressure, then stroke) that discretize a high-dimensional, multivariate feature space into a series of simple, readily interpretable decision statements. We introduce a generative model called Bayesian Rule Lists that yields a posterior distribution over possible decision lists. It employs a novel prior structure to encourage sparsity. Our experiments show that Bayesian Rule Lists has predictive accuracy on par with the current top algorithms for prediction in machine learning. Our method is motivated by recent developments in personalized medicine, and can be used to produce highly accurate and interpretable medical scoring systems. We demonstrate this by producing an alternative to the CHADS$_2$ score, actively used in clinical practice for estimating the risk of stroke in patients that have atrial fibrillation. Our model is as interpretable as CHADS$_2$, but more accurate.

532 citations
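A decision list of the kind BRL produces is simply an ordered set of if-then rules evaluated until the first match. The R sketch below shows how such a list is applied to a patient record; the conditions and risk numbers are invented for illustration, since BRL's contribution is learning the list (and a posterior distribution over lists) from data.

```r
# An ordered if-then decision list applied with "first match wins".
# The conditions and risk numbers are invented; BRL's contribution is
# learning the list (and a posterior over lists) from data.
rule_list <- list(
  list(cond = function(p) p$hemiplegia == 1,   risk = 0.58),
  list(cond = function(p) p$age >= 75,         risk = 0.29),
  list(cond = function(p) p$hypertension == 1, risk = 0.12),
  list(cond = function(p) TRUE,                risk = 0.05)   # default rule
)

predict_risk <- function(patient, rules) {
  for (r in rules) if (r$cond(patient)) return(r$risk)
}

predict_risk(list(age = 80, hemiplegia = 0, hypertension = 1), rule_list)
# 0.29: the age rule fires before the hypertension rule is reached
```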


Proceedings Article
06 Jul 2015
TL;DR: This work presents a novel scalable method for learning Bayesian neural networks, called probabilistic backpropagation (PBP), which works by computing a forward propagation of probabilities through the network and then doing a backward computation of gradients.
Abstract: Large multilayer neural networks trained with backpropagation have recently achieved state-of-the-art results in a wide range of problems. However, using backprop for neural net learning still has some disadvantages, e.g., having to tune a large number of hyperparameters to the data, lack of calibrated probabilistic predictions, and a tendency to overfit the training data. In principle, the Bayesian approach to learning neural networks does not have these problems. However, existing Bayesian techniques lack scalability to large dataset and network sizes. In this work we present a novel scalable method for learning Bayesian neural networks, called probabilistic backpropagation (PBP). Similar to classical backpropagation, PBP works by computing a forward propagation of probabilities through the network and then doing a backward computation of gradients. A series of experiments on ten real-world datasets show that PBP is significantly faster than other techniques, while offering competitive predictive abilities. Our experiments also show that PBP provides accurate estimates of the posterior variance on the network weights.

392 citations


Book
08 Oct 2015
TL;DR: This book provides a comprehensive overview of Bayesian evolutionary analysis with BEAST 2, covering the models used in phylogenetic analysis, how to choose parameterisations and priors, and how to diagnose Bayesian MCMC when things go wrong.
Abstract: What are the models used in phylogenetic analysis and what exactly is involved in Bayesian evolutionary analysis using Markov chain Monte Carlo (MCMC) methods? How can you choose and apply these models, which parameterisations and priors make sense, and how can you diagnose Bayesian MCMC when things go wrong? These are just a few of the questions answered in this comprehensive overview of Bayesian approaches to phylogenetics. This practical guide: addresses the theoretical aspects of the field; advises on how to prepare and perform phylogenetic analysis; helps with interpreting analyses and visualisation of phylogenies; describes the software architecture; and helps with developing BEAST 2.2 extensions to allow these models to be extended further. With an accompanying website providing example files and tutorials (http://beast2.org/), this one-stop reference to applying the latest phylogenetic models in BEAST 2 will provide essential guidance for all users – from those using phylogenetic tools, to computational biologists and Bayesian statisticians.

390 citations


MonographDOI
01 May 2015
TL;DR: Adopting a general dynamical systems approach, this book is an ideal introduction for graduate students in applied mathematics, computer science, engineering, geoscience and other emerging application areas.
Abstract: In this book the authors describe the principles and methods behind probabilistic forecasting and Bayesian data assimilation. Instead of focusing on particular application areas, the authors adopt a general dynamical systems approach, with a profusion of low-dimensional, discrete-time numerical examples designed to build intuition about the subject. Part I explains the mathematical framework of ensemble-based probabilistic forecasting and uncertainty quantification. Part II is devoted to Bayesian filtering algorithms, from classical data assimilation algorithms such as the Kalman filter, variational techniques, and sequential Monte Carlo methods, through to more recent developments such as the ensemble Kalman filter and ensemble transform filters. The McKean approach to sequential filtering in combination with coupling of measures serves as a unifying mathematical framework throughout Part II. Assuming only some basic familiarity with probability, this book is an ideal introduction for graduate students in applied mathematics, computer science, engineering, geoscience and other emerging application areas.

353 citations


01 Mar 2015
TL;DR: BOLT-LMM is presented, which requires only a small number of O(MN) iterations and increases power by modeling more realistic, non-infinitesimal genetic architectures via a Bayesian mixture prior on marker effect sizes.
Abstract: Linear mixed models are a powerful statistical tool for identifying genetic associations and avoiding confounding. However, existing methods are computationally intractable in large cohorts, and may not optimize power. All existing methods require time cost O(MN²) (where N = #samples and M = #SNPs) and implicitly assume an infinitesimal genetic architecture in which effect sizes are normally distributed, which can limit power. Here, we present a far more efficient mixed model association method, BOLT-LMM, which requires only a small number of O(MN) iterations and increases power by modeling more realistic, non-infinitesimal genetic architectures via a Bayesian mixture prior on marker effect sizes. We applied BOLT-LMM to nine quantitative traits in 23,294 samples from the Women’s Genome Health Study (WGHS) and observed significant increases in power, consistent with simulations. Theory and simulations show that the boost in power increases with cohort size, making BOLT-LMM appealing for GWAS in large cohorts.
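The "non-infinitesimal" architecture referred to above is captured by a two-component Gaussian mixture prior on per-marker effect sizes. The R sketch below draws effects from such a prior to show how it differs from the infinitesimal (single Gaussian) assumption; the mixture settings are illustrative, not BOLT-LMM's fitted values.

```r
# Draw marker effects from a two-component Gaussian mixture: a common
# small-variance "polygenic" component plus a rare large-effect component.
# Settings are illustrative, not BOLT-LMM's fitted values.
set.seed(7)
M        <- 1e5     # number of markers
p_large  <- 0.01    # fraction of markers with large effects
sigma2_1 <- 1e-4    # variance of the polygenic component
sigma2_2 <- 1e-2    # variance of the large-effect component

z    <- rbinom(M, 1, p_large)
beta <- rnorm(M, mean = 0, sd = sqrt(ifelse(z == 1, sigma2_2, sigma2_1)))

# An infinitesimal model would use a single N(0, sigma2) for all markers;
# the mixture's heavy tail is what lets a few SNPs carry large effects.
quantile(abs(beta), c(0.50, 0.99, 0.999))
```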

Journal ArticleDOI
TL;DR: Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex and unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition.
Abstract: To form a veridical percept of the environment, the brain needs to integrate sensory signals from a common source but segregate those from independent sources. Thus, perception inherently relies on solving the "causal inference problem." Behaviorally, humans solve this problem optimally as predicted by Bayesian Causal Inference; yet, the underlying neural mechanisms are unexplored. Combining psychophysics, Bayesian modeling, functional magnetic resonance imaging (fMRI), and multivariate decoding in an audiovisual spatial localization task, we demonstrate that Bayesian Causal Inference is performed by a hierarchy of multisensory processes in the human brain. At the bottom of the hierarchy, in auditory and visual areas, location is represented on the basis that the two signals are generated by independent sources (= segregation). At the next stage, in posterior intraparietal sulcus, location is estimated under the assumption that the two signals are from a common source (= forced fusion). Only at the top of the hierarchy, in anterior intraparietal sulcus, the uncertainty about the causal structure of the world is taken into account and sensory signals are combined as predicted by Bayesian Causal Inference. Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex. It unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition. Our results demonstrate how the brain combines information in the face of uncertainty about the underlying causal structure of the world.
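For readers who want the computation itself, here is a worked single-trial sketch in R of the Bayesian Causal Inference model the paper tests (in the style of Koerding et al.'s audiovisual model): the posterior probability of a common cause is computed from the two cues, and the final location estimate averages the forced-fusion and segregation estimates by that probability. All numerical settings are invented.

```r
# One audiovisual trial under Bayesian Causal Inference: compare "common
# cause" vs "independent causes", then average the two location estimates
# by the posterior over causal structures. All settings are invented.
xA <- 4;  xV <- 6        # auditory and visual cues (degrees)
sA2 <- 4; sV2 <- 1       # cue noise variances
sP2 <- 100               # variance of the spatial prior, centred at 0
p_common <- 0.5          # prior probability of a common cause

# Marginal likelihood of the cues under each structure (source integrated
# out). Under a common cause, the cue difference and the precision-weighted
# mean are independent Gaussians, so the joint density factorizes.
wmean   <- (xA / sA2 + xV / sV2) / (1 / sA2 + 1 / sV2)
like_c1 <- dnorm(xA - xV, 0, sqrt(sA2 + sV2)) *
           dnorm(wmean, 0, sqrt(1 / (1 / sA2 + 1 / sV2) + sP2))
like_c2 <- dnorm(xA, 0, sqrt(sA2 + sP2)) * dnorm(xV, 0, sqrt(sV2 + sP2))

post_c1 <- like_c1 * p_common /
           (like_c1 * p_common + like_c2 * (1 - p_common))

fused <- (xA / sA2 + xV / sV2) / (1 / sA2 + 1 / sV2 + 1 / sP2) # forced fusion
segA  <- (xA / sA2)            / (1 / sA2 + 1 / sP2)           # segregation
estA  <- post_c1 * fused + (1 - post_c1) * segA                # model averaging

c(posterior_common = post_c1, auditory_estimate = estA)
```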

Journal ArticleDOI
TL;DR: In this paper, the posterior distribution of a high-dimensional linear regression under sparsity constraints is characterized and employed in the construction and study of credible sets for uncertainty quantification, where the prior is a mixture of point masses at zero and continuous distributions.
Abstract: We study full Bayesian procedures for high-dimensional linear regression under sparsity constraints. The prior is a mixture of point masses at zero and continuous distributions. Under compatibility conditions on the design matrix, the posterior distribution is shown to contract at the optimal rate for recovery of the unknown sparse vector, and to give optimal prediction of the response vector. It is also shown to select the correct sparse model, or at least the coefficients that are significantly different from zero. The asymptotic shape of the posterior distribution is characterized and employed in the construction and study of credible sets for uncertainty quantification.

Journal ArticleDOI
07 Oct 2015-Neuron
TL;DR: This work explores how a definition of confidence as Bayesian probability can unify these viewpoints, which entails that there are distinct forms in which confidence is represented and used in the brain, including distributional confidence, pertaining to neural representations of probability distributions, and summary confidence, referring to scalar summaries of those distributions.

Journal ArticleDOI
TL;DR: Two methods for implementing Bayesian meta-analysis are presented, using numerical integration and importance sampling techniques, and a novel set of predictive distributions is derived for the degree of heterogeneity expected in 80 settings, depending on the outcomes assessed and comparisons made.
Abstract: Numerous meta-analyses in healthcare research combine results from only a small number of studies, for which the variance representing between-study heterogeneity is estimated imprecisely. A Bayesian approach to estimation allows external evidence on the expected magnitude of heterogeneity to be incorporated. The aim of this paper is to provide tools that improve the accessibility of Bayesian meta-analysis. We present two methods for implementing Bayesian meta-analysis, using numerical integration and importance sampling techniques. Based on 14,886 binary outcome meta-analyses in the Cochrane Database of Systematic Reviews, we derive a novel set of predictive distributions for the degree of heterogeneity expected in 80 settings depending on the outcomes assessed and comparisons made. These can be used as prior distributions for heterogeneity in future meta-analyses. The two methods are implemented in R, for which code is provided. Both methods produce equivalent results to standard but more complex Markov chain Monte Carlo approaches. The priors are derived as log-normal distributions for the between-study variance, applicable to meta-analyses of binary outcomes on the log odds-ratio scale. The methods are applied to two example meta-analyses, incorporating the relevant predictive distributions as prior distributions for between-study heterogeneity. We have provided resources to facilitate Bayesian meta-analysis, in a form accessible to applied researchers, which allow relevant prior information on the degree of heterogeneity to be incorporated.
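The paper's first method, numerical integration, is straightforward to sketch. The R code below computes the joint posterior of the pooled log odds ratio and the between-study variance on a grid, with a log-normal prior on the heterogeneity variance standing in for one of the paper's predictive distributions; the data and prior parameters are invented for illustration.

```r
# Grid-based numerical integration for a random-effects meta-analysis on
# the log odds-ratio scale: flat prior on the pooled effect mu, log-normal
# prior on the between-study variance tau^2. Data and prior parameters
# are invented stand-ins for the paper's predictive distributions.
y <- c(-0.51, -0.20, -0.82)   # study log odds ratios
s <- c(0.31, 0.24, 0.42)      # their standard errors

log_post <- function(mu, tau2) {
  sum(dnorm(y, mu, sqrt(s^2 + tau2), log = TRUE)) +
    dlnorm(tau2, meanlog = -2.5, sdlog = 1.7, log = TRUE)  # prior on tau^2
}

mu_grid   <- seq(-2.5, 1.5, length.out = 200)
tau2_grid <- seq(0.001, 3, length.out = 200)
lp   <- outer(mu_grid, tau2_grid, Vectorize(log_post))
post <- exp(lp - max(lp))
marg_mu <- rowSums(post) / sum(post)   # marginal posterior of mu on the grid
c(post_mean = sum(mu_grid * marg_mu),
  post_sd   = sqrt(sum(mu_grid^2 * marg_mu) - sum(mu_grid * marg_mu)^2))
```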

Journal ArticleDOI
TL;DR: T-REx as discussed by the authors is a line-by-line radiative transfer fully-Bayesian retrieval framework for exoplanetary atmospheres, which includes the optimised use of molecular line-lists from the ExoMol project.
Abstract: Spectroscopy of exoplanetary atmospheres has become a well established method for the characterisation of extrasolar planets. We here present a novel inverse retrieval code for exoplanetary atmospheres. T-REx (Tau Retrieval for Exoplanets) is a line-by-line radiative transfer fully Bayesian retrieval framework. T-REx includes the following features: 1) the optimised use of molecular line-lists from the ExoMol project; 2) an unbiased atmospheric composition prior selection, through custom built pattern recognition software; 3) the use of two independent algorithms to fully sample the Bayesian likelihood space: nested sampling as well as a more classical Markov Chain Monte Carlo approach; 4) iterative Bayesian parameter and model selection using the full Bayesian Evidence as well as the Savage-Dickey Ratio for nested models, and 5) the ability to fully map very large parameter spaces through optimal code parallelisation and scalability to cluster computing. In this publication we outline the T-REx framework and demonstrate, using a theoretical hot-Jupiter transmission spectrum, the parameter retrieval and model selection. We investigate the impact of Signal-to-Noise and spectral resolution on the retrievability of individual model parameters, both in terms of error bars on the temperature and molecular mixing ratios as well as its effect on the model's global Bayesian evidence.

Book
01 Jan 2015
TL;DR: This book provides the theoretical background in an easy-to-understand approach, encouraging readers to examine the processes that generated their data, and presents problems and solutions that are most often applicable to other data and questions, making it an invaluable resource for analyzing a variety of data types.
Abstract: Bayesian Data Analysis in Ecology Using Linear Models with R, BUGS, and STAN examines the Bayesian and frequentist methods of conducting data analyses. The book provides the theoretical background in an easy-to-understand approach, encouraging readers to examine the processes that generated their data. Including discussions of model selection, model checking, and multi-model inference, the book also uses effect plots that allow a natural interpretation of data. Bayesian Data Analysis in Ecology Using Linear Models with R, BUGS, and STAN introduces Bayesian software, using R for the simple models, and flexible Bayesian software (BUGS and Stan) for the more complicated ones. Guiding the reader from easy toward more complex (real) data analyses in a step-by-step manner, the book presents problems and solutions, including all R code, that are most often applicable to other data and questions, making it an invaluable resource for analyzing a variety of data types. It introduces Bayesian data analysis, allowing users to obtain uncertainty measurements easily for any derived parameter of interest; is written in a step-by-step approach that allows for eased understanding by non-statisticians; and includes a companion website containing R code to help users conduct Bayesian data analyses on their own data. All example data as well as additional functions are provided in the R package blmeco.

Book
18 Nov 2015
TL;DR: Bayesian methods for reinforcement learning have been widely investigated, yielding principled methods for incorporating prior information into inference algorithms as discussed by the authors, and the major incentives for incorporating Bayesian reasoning in RL are: (1) it provides an elegant approach to action selection (exploration/exploitation) as a function of the uncertainty in learning; and (2) it provides a machinery to incorporate prior knowledge into the algorithms.
Abstract: Bayesian methods for machine learning have been widely investigated, yielding principled methods for incorporating prior information into inference algorithms. In this survey, we provide an in-depth review of the role of Bayesian methods for the reinforcement learning (RL) paradigm. The major incentives for incorporating Bayesian reasoning in RL are: (1) it provides an elegant approach to action selection (exploration/exploitation) as a function of the uncertainty in learning; and (2) it provides a machinery to incorporate prior knowledge into the algorithms. We first discuss models and methods for Bayesian inference in the simple single-step bandit model. We then review the extensive recent literature on Bayesian methods for model-based RL, where prior information can be expressed on the parameters of the Markov model. We also present Bayesian methods for model-free RL, where priors are expressed over the value function or policy class. The objective of the paper is to provide a comprehensive survey on Bayesian RL algorithms and their theoretical and empirical properties.

Journal ArticleDOI
01 Jul 2015-Genetics
TL;DR: This study shows that CAVIAR and BIMBAM are actually approximately equivalent to each other, and develops a fine-mapping method using marginal test statistics in the Bayesian framework, called CAVIAR Bayes factor (CAVIARBF).
Abstract: Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf.
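CAVIARBF models sets of causal SNPs jointly through the SNP correlation matrix, but its single-SNP building block, a Bayes factor computed from a marginal test statistic, is compact enough to sketch. The R function below implements a Wakefield-style approximate Bayes factor under a N(0, W) effect-size prior; the prior variance W = 0.04 is a conventional illustrative choice, not a value taken from the paper.

```r
# Wakefield-style approximate Bayes factor for a single SNP, computed from
# the marginal z statistic and the standard error of the effect estimate,
# assuming an N(0, W) prior on the effect size. W = 0.04 is a conventional
# illustrative choice, not a value from the paper.
approx_bf <- function(z, se, W = 0.04) {
  V <- se^2   # sampling variance of the effect estimate
  # BF of association vs no association: N(z*se; 0, V+W) / N(z*se; 0, V)
  sqrt(V / (V + W)) * exp(0.5 * z^2 * W / (V + W))
}

approx_bf(z = 4.5, se = 0.1)   # strong evidence for association
approx_bf(z = 1.0, se = 0.1)   # BF near 1: essentially uninformative
```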

Journal ArticleDOI
TL;DR: In this paper, the authors compare several widely used Bayesian model selection methods in practical model selection problems, highlight their differences and give recommendations about the preferred approaches for variable subset selection for regression and classification.
Abstract: The goal of this paper is to compare several widely used Bayesian model selection methods in practical model selection problems, highlight their differences and give recommendations about the preferred approaches. We focus on the variable subset selection for regression and classification and perform several numerical experiments using both simulated and real world data. The results show that the optimization of a utility estimate such as the cross-validation (CV) score is liable to finding overfitted models due to relatively high variance in the utility estimates when the data is scarce. This can also lead to substantial selection induced bias and optimism in the performance evaluation for the selected model. From a predictive viewpoint, best results are obtained by accounting for model uncertainty by forming the full encompassing model, such as the Bayesian model averaging solution over the candidate models. If the encompassing model is too complex, it can be robustly simplified by the projection method, in which the information of the full model is projected onto the submodels. This approach is substantially less prone to overfitting than selection based on CV-score. Overall, the projection method appears to outperform also the maximum a posteriori model and the selection of the most probable variables. The study also demonstrates that the model selection can greatly benefit from using cross-validation outside the searching process both for guiding the model size selection and assessing the predictive performance of the finally selected model.

Journal ArticleDOI
TL;DR: It is argued that the use of informative priors should always be reported together with a sensitivity analysis to show that two issues often encountered during analysis of small samples, power and biased parameters, can be solved by including prior information into Bayesian analysis.
Abstract: Background: The analysis of small data sets in longitudinal studies can lead to power issues and often suffers from biased parameter values. These issues can be solved by using Bayesian estimation in conjunction with informative prior distributions. By means of a simulation study and an empirical example concerning posttraumatic stress symptoms (PTSS) following mechanical ventilation in burn survivors, we demonstrate the advantages and potential pitfalls of using Bayesian estimation. Methods: First, we show how to specify prior distributions, and by means of a sensitivity analysis we demonstrate how to check the exact influence of prior (mis)specification. Thereafter, we show by means of a simulation the situations in which the Bayesian approach outperforms the default maximum likelihood approach. Finally, we re-analyze empirical data on burn survivors which provided preliminary evidence of an aversive influence of a period of mechanical ventilation on the course of PTSS following burns. Results: Not surprisingly, maximum likelihood estimation showed insufficient coverage as well as power with very small samples. Only when Bayesian analysis was used in conjunction with informative priors did power increase to acceptable levels. As expected, we showed that the smaller the sample size, the more the results rely on the prior specification. Conclusion: We show that two issues often encountered during analysis of small samples, power and biased parameters, can be solved by including prior information in Bayesian analysis. We argue that the use of informative priors should always be reported together with a sensitivity analysis. Keywords: Bayesian estimation; maximum likelihood; prior specification; power; repeated measures analyses; small samples; burn survivors; mechanical ventilation; PTSS. Citation: European Journal of Psychotraumatology 2015, 6: 25216 - http://dx.doi.org/10.3402/ejpt.v6.25216
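The core mechanism behind these results can be shown with conjugate normal-normal updating: the smaller the sample, the more weight the posterior puts on the prior mean, which is exactly why prior (mis)specification must be checked with a sensitivity analysis. A toy R illustration follows; all numbers are invented.

```r
# Conjugate normal-normal updating: the posterior mean is a precision-
# weighted compromise between the sample mean and the prior mean, so with
# small n the prior dominates. All numbers are invented.
posterior_mean <- function(ybar, n, sigma2, prior_mean, prior_var) {
  w <- (n / sigma2) / (n / sigma2 + 1 / prior_var)  # weight given to the data
  w * ybar + (1 - w) * prior_mean
}

ybar <- 2.0; sigma2 <- 4
for (n in c(5, 50, 500)) {
  cat(sprintf("n = %3d: posterior mean = %.2f\n",
              n, posterior_mean(ybar, n, sigma2, prior_mean = 0, prior_var = 1)))
}
# Small n leaves the estimate near the prior mean (0); large n recovers
# the sample mean (2.0) -- hence the need for a sensitivity analysis.
```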

Journal ArticleDOI
TL;DR: This article addresses the problem of inferring multiple undirected networks in situations where some of the networks may be unrelated, while others share common features, and proposes a Bayesian approach to inference on multiple Gaussian graphical models.
Abstract: In this article, we propose a Bayesian approach to inference on multiple Gaussian graphical models. Specifically, we address the problem of inferring multiple undirected networks in situations where some of the networks may be unrelated, while others share common features. We link the estimation of the graph structures via a Markov random field (MRF) prior, which encourages common edges. We learn which sample groups have a shared graph structure by placing a spike-and-slab prior on the parameters that measure network relatedness. This approach allows us to share information between sample groups, when appropriate, as well as to obtain a measure of relative network similarity across groups. Our modeling framework incorporates relevant prior knowledge through an edge-specific informative prior and can encourage similarity to an established network. Through simulations, we demonstrate the utility of our method in summarizing relative network similarity and compare its performance against related methods. We ...

Journal ArticleDOI
TL;DR: An introduction to Bayesian models of graphs, matrices, and other data that can be modeled by random structures is provided and results in probability theory are described that generalize de Finetti’s theorem to such data.
Abstract: The natural habitat of most Bayesian methods is data represented by exchangeable sequences of observations, for which de Finetti’s theorem provides the theoretical foundation. Dirichlet process clustering, Gaussian process regression, and many other parametric and nonparametric Bayesian models fall within the remit of this framework; many problems arising in modern data analysis do not. This article provides an introduction to Bayesian models of graphs, matrices, and other data that can be modeled by random structures. We describe results in probability theory that generalize de Finetti’s theorem to such data and discuss their relevance to nonparametric Bayesian modeling. With the basic ideas in place, we survey example models available in the literature; applications of such models include collaborative filtering, link prediction, and graph and network analysis. We also highlight connections to recent developments in graph theory and probability, and sketch the more general mathematical foundation of Bayesian methods for other types of data beyond sequences and arrays.

Journal ArticleDOI
TL;DR: The hypothesis that Bayesian Gaussian process logistic regression (GP-LR) models can be effective at performing patient stratification is supported: the implemented model achieves 75% accuracy disambiguating healthy subjects from subjects with amnesic mild cognitive impairment and 97%.

Journal ArticleDOI
TL;DR: This article describes blavaan, an R package for estimating Bayesian structural equation models (SEMs) via JAGS and for summarizing the results, along with a novel parameter expansion approach for estimating specific types of models with residual covariances, which facilitates estimation of these models in JAGS.
Abstract: This article describes blavaan, an R package for estimating Bayesian structural equation models (SEMs) via JAGS and for summarizing the results. It also describes a novel parameter expansion approach for estimating specific types of models with residual covariances, which facilitates estimation of these models in JAGS. The methodology and software are intended to provide users with a general means of estimating Bayesian SEMs, both classical and novel, in a straightforward fashion. Users can estimate Bayesian versions of classical SEMs with lavaan syntax, they can obtain state-of-the-art Bayesian fit measures associated with the models, and they can export JAGS code to modify the SEMs as desired. These features and more are illustrated by example, and the parameter expansion approach is explained in detail.
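A hedged usage sketch of blavaan, following the calling pattern the article describes: a classical SEM written in lavaan syntax and estimated by MCMC via JAGS, the backend the article covers. The example uses the PoliticalDemocracy data set shipped with lavaan; run time is nontrivial because sampling happens in JAGS.

```r
# A classical SEM in lavaan syntax, estimated in a Bayesian fashion with
# blavaan (MCMC via JAGS). Uses the PoliticalDemocracy data from lavaan.
library(blavaan)

model <- '
  # measurement model
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  # structural regression
  dem60 ~ ind60
'

fit <- bsem(model, data = PoliticalDemocracy)
summary(fit)   # posterior summaries plus Bayesian fit measures
```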

Journal ArticleDOI
TL;DR: It is shown that the strength of the restraints should scale linearly with the number of replicas to ensure convergence to the optimal Bayesian result in the limit of infinitely many replicas.
Abstract: We describe different Bayesian ensemble refinement methods, examine their interrelation, and discuss their practical application. With ensemble refinement, the properties of dynamic and partially disordered (bio)molecular structures can be characterized by integrating a wide range of experimental data, including measurements of ensemble-averaged observables. We start from a Bayesian formulation in which the posterior is a functional that ranks different configuration space distributions. By maximizing this posterior, we derive an optimal Bayesian ensemble distribution. For discrete configurations, this optimal distribution is identical to that obtained by the maximum entropy "ensemble refinement of SAXS" (EROS) formulation. Bayesian replica ensemble refinement enhances the sampling of relevant configurations by imposing restraints on averages of observables in coupled replica molecular dynamics simulations. We show that the strength of the restraints should scale linearly with the number of replicas to ensure convergence to the optimal Bayesian result in the limit of infinitely many replicas. In the "Bayesian inference of ensembles" (BioEn) method, we combine the replica and EROS approaches to accelerate the convergence. An adaptive algorithm can be used to sample directly from the optimal ensemble, without replicas. We discuss the incorporation of single-molecule measurements and dynamic observables such as relaxation parameters. The theoretical analysis of different Bayesian ensemble refinement approaches provides a basis for practical applications and a starting point for further investigations.

Proceedings Article
07 Dec 2015
TL;DR: This work describes a method for "distilling" a Monte Carlo approximation to the posterior predictive density into a more compact form, namely a single deep neural network.
Abstract: We consider the problem of Bayesian parameter estimation for deep neural networks, which is important in problem settings where we may have little data, and/or where we need accurate posterior predictive densities p(y|x, D), e.g., for applications involving bandits or active learning. One simple approach to this is to use online Monte Carlo methods, such as SGLD (stochastic gradient Langevin dynamics). Unfortunately, such a method needs to store many copies of the parameters (which wastes memory), and needs to make predictions using many versions of the model (which wastes time). We describe a method for "distilling" a Monte Carlo approximation to the posterior predictive density into a more compact form, namely a single deep neural network. We compare to two very recent approaches to Bayesian neural networks, namely an approach based on expectation propagation [HLA15] and an approach based on variational Bayes [BCKW15]. Our method performs better than both of these, is much simpler to implement, and uses less computation at test time.

Journal ArticleDOI
TL;DR: This paper shows how to fit a number of spatial models with R-INLA, including its interaction with other R packages for data analysis, and describes a novel method to extend the number of latent models available for the model parameters.
Abstract: The integrated nested Laplace approximation (INLA) provides an interesting way of approximating the posterior marginals of a wide range of Bayesian hierarchical models. This approximation is based on conducting a Laplace approximation of certain functions, and numerical integration is extensively used to integrate some of the model parameters out. The R-INLA package offers an interface to INLA, providing a suitable framework for data analysis. Although the INLA methodology can deal with a large number of models, only the most relevant have been implemented within R-INLA. However, many other important models are not available for R-INLA yet. In this paper we first show how to fit a number of spatial models with R-INLA, including its interaction with other R packages for data analysis. Secondly, we describe a novel method to extend the number of latent models available for the model parameters. Our approach is based on conditioning on one or several model parameters and fitting these conditioned models with R-INLA. Then these models are combined using Bayesian model averaging to provide the final approximations to the posterior marginals of the model. Finally, we show some examples of the application of this technique in spatial statistics. It is worth noting that our approach can be extended to a number of other fields, and not only spatial statistics.
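To make the R-INLA workflow concrete, here is a minimal fitting sketch on simulated data (note that the package is distributed from its own repository rather than CRAN). Spatial latent effects of the kind discussed in the paper would enter the formula through f() terms; the toy model below uses fixed effects only, and the data are invented.

```r
# Minimal R-INLA fit on simulated data. The package comes from its own
# repository (see www.r-inla.org), not CRAN; spatial latent models would
# be added to the formula via f() terms.
library(INLA)

set.seed(1)
df <- data.frame(x = rnorm(100))
df$y <- 1 + 2 * df$x + rnorm(100, sd = 0.5)

fit <- inla(y ~ 1 + x, family = "gaussian", data = df)
summary(fit)   # posterior marginals of intercept, slope, and precision
```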