
Showing papers on "Bayesian inference published in 2016"


Proceedings Article
19 Jun 2016
TL;DR: A new theoretical framework is developed casting dropout training in deep neural networks (NNs) as approximate Bayesian inference in deep Gaussian processes, which mitigates the problem of representing uncertainty in deep learning without sacrificing either computational complexity or test accuracy.
Abstract: Deep learning tools have gained tremendous attention in applied machine learning. However such tools for regression and classification do not capture model uncertainty. In comparison, Bayesian models offer a mathematically grounded framework to reason about model uncertainty, but usually come with a prohibitive computational cost. In this paper we develop a new theoretical framework casting dropout training in deep neural networks (NNs) as approximate Bayesian inference in deep Gaussian processes. A direct result of this theory gives us tools to model uncertainty with dropout NNs - extracting information from existing models that has been thrown away so far. This mitigates the problem of representing uncertainty in deep learning without sacrificing either computational complexity or test accuracy. We perform an extensive study of the properties of dropout's uncertainty. Various network architectures and nonlinearities are assessed on tasks of regression and classification, using MNIST as an example. We show a considerable improvement in predictive log-likelihood and RMSE compared to existing state-of-the-art methods, and finish by using dropout's uncertainty in deep reinforcement learning.

3,472 citations
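
As a rough illustration of the idea in the abstract above, the sketch below (not the authors' code) draws repeated stochastic forward passes through a network with dropout left on at test time and reads the spread of the outputs as a measure of predictive uncertainty. The architecture, dropout rate, and PyTorch usage are illustrative assumptions, and the model here is untrained.

```python
# Hedged sketch of Monte Carlo dropout for predictive uncertainty (illustrative
# architecture; the network is untrained).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(1, 64), nn.ReLU(), nn.Dropout(p=0.1),
    nn.Linear(64, 64), nn.ReLU(), nn.Dropout(p=0.1),
    nn.Linear(64, 1),
)

def mc_dropout_predict(model, x, n_samples=100):
    """Average n_samples stochastic forward passes with dropout left active."""
    model.train()  # keep dropout switched on at test time
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)  # predictive mean and spread

x_test = torch.linspace(-3, 3, 50).unsqueeze(1)
mean, std = mc_dropout_predict(model, x_test)
```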


Proceedings Article
05 Dec 2016
TL;DR: The authors apply this variational inference based dropout technique in LSTM and GRU models, assessing it on language modelling and sentiment analysis tasks, and to the best of their knowledge improve on the single model state-of-the-art in language modelling with the Penn Treebank (73.4 test perplexity).
Abstract: Recurrent neural networks (RNNs) stand at the forefront of many recent developments in deep learning. Yet a major difficulty with these models is their tendency to overfit, with dropout shown to fail when applied to recurrent layers. Recent results at the intersection of Bayesian modelling and deep learning offer a Bayesian interpretation of common deep learning techniques such as dropout. This grounding of dropout in approximate Bayesian inference suggests an extension of the theoretical results, offering insights into the use of dropout with RNN models. We apply this new variational inference based dropout technique in LSTM and GRU models, assessing it on language modelling and sentiment analysis tasks. The new approach outperforms existing techniques, and to the best of our knowledge improves on the single model state-of-the-art in language modelling with the Penn Treebank (73.4 test perplexity). This extends our arsenal of variational tools in deep learning.

1,557 citations
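
A minimal sketch of the kind of recurrent dropout this abstract describes, under the assumption that the key ingredient is reusing the same dropout masks at every timestep of a sequence rather than resampling them per step. The module below is illustrative, not the authors' implementation; names and sizes are assumptions.

```python
# Hedged sketch: "variational" dropout for an LSTM, with one input mask and one
# hidden-state mask sampled per sequence and reused at every timestep.
import torch
import torch.nn as nn

class VariationalLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, p=0.25):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.hidden_size = hidden_size
        self.p = p

    def forward(self, x):                      # x: (seq_len, batch, input_size)
        seq_len, batch, _ = x.shape
        h = x.new_zeros(batch, self.hidden_size)
        c = x.new_zeros(batch, self.hidden_size)
        keep = 1.0 - self.p
        # Sample the masks once per sequence, not once per timestep.
        mask_x = x.new_empty(batch, x.shape[2]).bernoulli_(keep) / keep
        mask_h = x.new_empty(batch, self.hidden_size).bernoulli_(keep) / keep
        outputs = []
        for t in range(seq_len):
            h, c = self.cell(x[t] * mask_x, (h * mask_h, c))
            outputs.append(h)
        return torch.stack(outputs)

model = VariationalLSTM(input_size=10, hidden_size=32)
out = model(torch.randn(20, 4, 10))            # (seq_len=20, batch=4, hidden=32)
```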


Posted Content
Qiang Liu, Dilin Wang
TL;DR: This paper proposes a general-purpose variational inference algorithm that forms a natural counterpart of gradient descent for optimization; it iteratively transports a set of particles to match the target distribution by applying a form of functional gradient descent that minimizes the KL divergence.
Abstract: We propose a general purpose variational inference algorithm that forms a natural counterpart of gradient descent for optimization. Our method iteratively transports a set of particles to match the target distribution, by applying a form of functional gradient descent that minimizes the KL divergence. Empirical studies are performed on various real world models and datasets, on which our method is competitive with existing state-of-the-art methods. The derivation of our method is based on a new theoretical result that connects the derivative of KL divergence under smooth transforms with Stein's identity and a recently proposed kernelized Stein discrepancy, which is of independent interest.

581 citations
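
The particle update described in the abstract can be sketched as follows, assuming an RBF kernel with a fixed bandwidth and step size (illustrative choices, not the authors' settings). The target here is a standard 2-D Gaussian, so its score is simply -x.

```python
# Hedged sketch of Stein variational gradient descent on a 2-D Gaussian target.
import numpy as np

def rbf_kernel(x, h):
    # x: (n, d); returns the kernel matrix and the kernel-gradient term.
    diffs = x[:, None, :] - x[None, :, :]           # (n, n, d)
    sq = np.sum(diffs**2, axis=-1)                   # (n, n)
    K = np.exp(-sq / h)
    grad_K = -2.0 / h * diffs * K[:, :, None]        # d/dx_j k(x_j, x_i)
    return K, grad_K

def svgd_step(x, grad_logp, h=1.0, eps=0.1):
    n = x.shape[0]
    K, grad_K = rbf_kernel(x, h)
    # phi(x_i) = (1/n) sum_j [ k(x_j, x_i) grad log p(x_j) + grad_{x_j} k(x_j, x_i) ]
    phi = (K @ grad_logp(x) + grad_K.sum(axis=0)) / n
    return x + eps * phi

grad_logp = lambda x: -x                             # standard Gaussian score
particles = np.random.randn(100, 2) * 3 + 5          # start far from the target
for _ in range(500):
    particles = svgd_step(particles, grad_logp)
print(particles.mean(axis=0), particles.std(axis=0))  # should approach 0 and 1
```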


Journal ArticleDOI
TL;DR: A class of highly scalable nearest-neighbor Gaussian process (NNGP) models is developed to provide fully model-based inference for large geostatistical datasets, and it is established that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices.
Abstract: Spatial process models for analyzing geostatistical data entail computations that become prohibitive as the number of spatial locations becomes large. This article develops a class of highly scalable nearest-neighbor Gaussian process (NNGP) models to provide fully model-based inference for large geostatistical datasets. We establish that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices. We embed the NNGP as a sparsity-inducing prior within a rich hierarchical modeling framework and outline how computationally efficient Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or decomposing large matrices. The number of floating point operations (flops) per iteration of this algorithm is linear in the number of spatial locations, thereby rendering substantial scalability. We illustrate the computational and inferential benefits of the NNGP over competing methods using simulation studies and also analyze fores...

543 citations


Journal ArticleDOI
TL;DR: The basic theory of Markov chain Monte Carlo (MCMC) simulation is reviewed and a MATLAB toolbox of the DiffeRential Evolution Adaptive Metropolis (DREAM) algorithm developed by Vrugt et al. is introduced, used for Bayesian inference in fields ranging from physics, chemistry and engineering, to ecology, hydrology, and geophysics.
Abstract: Bayesian inference has found widespread application and use in science and engineering to reconcile Earth system models with data, including prediction in space (interpolation), prediction in time (forecasting), assimilation of observations and deterministic/stochastic model output, and inference of the model parameters. Bayes theorem states that the posterior probability, p ( H | Y ˜ ) of a hypothesis, H is proportional to the product of the prior probability, p(H) of this hypothesis and the likelihood, L ( H | Y ˜ ) of the same hypothesis given the new observations, Y ˜ , or p ( H | Y ˜ ) ∝ p ( H ) L ( H | Y ˜ ) . In science and engineering, H often constitutes some numerical model, ℱ(x) which summarizes, in algebraic and differential equations, state variables and fluxes, all knowledge of the system of interest, and the unknown parameter values, x are subject to inference using the data Y ˜ . Unfortunately, for complex system models the posterior distribution is often high dimensional and analytically intractable, and sampling methods are required to approximate the target. In this paper I review the basic theory of Markov chain Monte Carlo (MCMC) simulation and introduce a MATLAB toolbox of the DiffeRential Evolution Adaptive Metropolis (DREAM) algorithm developed by Vrugt et al. (2008a, 2009a) and used for Bayesian inference in fields ranging from physics, chemistry and engineering, to ecology, hydrology, and geophysics. This MATLAB toolbox provides scientists and engineers with an arsenal of options and utilities to solve posterior sampling problems involving (among others) bimodality, high-dimensionality, summary statistics, bounded parameter spaces, dynamic simulation models, formal/informal likelihood functions (GLUE), diagnostic model evaluation, data assimilation, Bayesian model averaging, distributed computation, and informative/noninformative prior distributions. The DREAM toolbox supports parallel computing and includes tools for convergence analysis of the sampled chain trajectories and post-processing of the results. Seven different case studies illustrate the main capabilities and functionalities of the MATLAB toolbox.

521 citations
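
For readers unfamiliar with MCMC, a minimal random-walk Metropolis sampler is sketched below. It is far simpler than the DREAM algorithm the toolbox implements, but it shows how posterior samples are drawn when only an unnormalized log-posterior is available; the example posterior (Gaussian likelihood with a flat prior on the mean) is an assumption for illustration.

```python
# Hedged sketch: plain random-walk Metropolis sampling of p(x) known up to a constant.
import numpy as np

def metropolis(log_post, x0, n_steps=5000, step=0.5, rng=None):
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    chain, logp = [x.copy()], log_post(x)
    for _ in range(n_steps):
        prop = x + step * rng.standard_normal(x.shape)
        logp_prop = log_post(prop)
        if np.log(rng.uniform()) < logp_prop - logp:   # Metropolis accept/reject
            x, logp = prop, logp_prop
        chain.append(x.copy())
    return np.array(chain)

# Toy posterior: Gaussian likelihood with unit variance and a flat prior on the mean.
data = np.array([1.2, 0.8, 1.1, 0.9, 1.3])
log_post = lambda mu: -0.5 * np.sum((data - mu) ** 2)
samples = metropolis(log_post, x0=[0.0])
print(samples[1000:].mean(), samples[1000:].std())     # posterior mean and spread
```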


Journal ArticleDOI
TL;DR: RevBayes is a new open-source software package based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models; it outperforms competing software for several standard analyses while requiring users to explicitly specify each part of the model and analysis.
Abstract: Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com. [Bayesian inference; Graphical models; MCMC; statistical phylogenetics]

505 citations


Journal ArticleDOI
TL;DR: The robustness of Bayesian model reduction to violations of the Laplace assumption in dynamic causal modelling is illustrated and how its recursive application can facilitate both classical and Bayesian inference about group differences is considered.

441 citations


Journal ArticleDOI
TL;DR: This work has shown that optimal behaviour is quintessentially belief-based, and that habits are learned by observing one's own goal-directed behaviour and selected online during active inference.

373 citations


Journal ArticleDOI
TL;DR: It is argued that a valid update of a prior belief distribution to a posterior can be made for parameters which are connected to observations through a loss function rather than the traditional likelihood function, which is recovered as a special case.
Abstract: We propose a framework for general Bayesian inference. We argue that a valid update of a prior belief distribution to a posterior can be made for parameters which are connected to observations through a loss function rather than the traditional likelihood function, which is recovered as a special case. Modern application areas make it increasingly challenging for Bayesians to attempt to model the true data-generating mechanism. For instance, when the object of interest is low dimensional, such as a mean or median, it is cumbersome to have to achieve this via a complete model for the whole data distribution. More importantly, there are settings where the parameter of interest does not directly index a family of density functions and thus the Bayesian approach to learning about such parameters is currently regarded as problematic. Our framework uses loss functions to connect information in the data to functionals of interest. The updating of beliefs then follows from a decision theoretic approach involving cumulative loss functions. Importantly, the procedure coincides with Bayesian updating when a true likelihood is known yet provides coherent subjective inference in much more general settings. Connections to other inference frameworks are highlighted.

359 citations
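
A schematic, hedged form of the loss-based update described above; the calibration weight w shown here is an assumed ingredient, commonly used with such generalized posteriors rather than stated in this listing.

```latex
% Loss-based belief update: prior times an exponentiated cumulative loss.
\[
  \pi\!\left(\theta \mid y_{1:n}\right) \;\propto\;
  \pi(\theta)\,\exp\!\Big(-\,w \sum_{i=1}^{n} \ell(\theta, y_i)\Big).
\]
% Taking \ell(\theta, y) = -\log f(y \mid \theta) and w = 1 recovers the usual
% Bayesian posterior, matching the "special case" statement in the abstract.
```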


Book ChapterDOI
TL;DR: A survey of recent decision-theoretic literature involving beliefs that cannot be quantified by a Bayesian prior is given in this paper; the historical, philosophical, and axiomatic foundations of the Bayesian model are discussed, as well as several recently proposed alternative models.
Abstract: This is a survey of some of the recent decision-theoretic literature involving beliefs that cannot be quantified by a Bayesian prior. We discuss historical, philosophical, and axiomatic foundations of the Bayesian model, as well as of several alternative models recently proposed. The definition and comparison of ambiguity aversion and the updating of non-Bayesian beliefs are briefly discussed. Finally, several applications are mentioned to illustrate the way that ambiguity (or “Knightian uncertainty”) can change the way we think about economic problems.

345 citations


Book
21 Jul 2016
TL;DR: This book takes an exhilarating journey through the revolution in data analysis following the introduction of electronic computation in the 1950s, with speculation on the future direction of statistics and data science.
Abstract: The twenty-first century has seen a breathtaking expansion of statistical methodology, both in scope and in influence. 'Big data', 'data science', and 'machine learning' have become familiar terms in the news, as statistical methods are brought to bear upon the enormous data sets of modern science and commerce. How did we get here? And where are we going? This book takes us on an exhilarating journey through the revolution in data analysis following the introduction of electronic computation in the 1950s. Beginning with classical inferential theories - Bayesian, frequentist, Fisherian - individual chapters take up a series of influential topics: survival analysis, logistic regression, empirical Bayes, the jackknife and bootstrap, random forests, neural networks, Markov chain Monte Carlo, inference after model selection, and dozens more. The distinctly modern approach integrates methodology and algorithms with statistical inference. The book ends with speculation on the future direction of statistics and data science.

Journal ArticleDOI
TL;DR: The connection between inference and statistical physics is currently witnessing an impressive renaissance and the current state-of-the-art is reviewed, with a pedagogical focus on the Ising model which, formulated as an inference problem, is called the planted spin glass.
Abstract: Many questions of fundamental interest in today's science can be formulated as inference problems: some partial, or noisy, observations are performed over a set of variables and the goal is to recover, or infer, the values of the variables based on the indirect information contained in the measurements. For such problems, the central scientific questions are: Under what conditions is the information contained in the measurements sufficient for a satisfactory inference to be possible? What are the most efficient algorithms for this task? A growing body of work has shown that often we can understand and locate these fundamental barriers by thinking of them as phase transitions in the sense of statistical physics. Moreover, it turned out that we can use the gained physical insight to develop new promising algorithms. The connection between inference and statistical physics is currently witnessing an impressive renaissance and we review here the current state-of-the-art, with a pedagogical focus on the Ising ...

Proceedings Article
Qiang Liu, Dilin Wang
01 Jan 2016
TL;DR: A general-purpose variational inference algorithm that forms a natural counterpart of gradient descent for optimization, iteratively transporting a set of particles to match the target distribution by applying a form of functional gradient descent that minimizes the KL divergence.
Abstract: We propose a general purpose variational inference algorithm that forms a natural counterpart of gradient descent for optimization. Our method iteratively transports a set of particles to match the target distribution, by applying a form of functional gradient descent that minimizes the KL divergence. Empirical studies are performed on various real world models and datasets, on which our method is competitive with existing state-of-the-art methods. The derivation of our method is based on a new theoretical result that connects the derivative of KL divergence under smooth transforms with Stein’s identity and a recently proposed kernelized Stein discrepancy, which is of independent interest.

Journal ArticleDOI
TL;DR: This study proposes a fault-relevant variable selection and Bayesian inference-based distributed method for efficient fault detection and isolation, which reduces redundancy and complexity, explores numerous local behaviors, and provides accurate description of faults, thus improving monitoring performance significantly.
Abstract: Multivariate statistical process monitoring involves dimension reduction and latent feature extraction in large-scale processes and typically incorporates all measured variables. However, involving variables without beneficial information may degrade monitoring performance. This study analyzes the effect of variable selection on principal component analysis (PCA) monitoring performance. Then, it proposes a fault-relevant variable selection and Bayesian inference-based distributed method for efficient fault detection and isolation. First, the optimal subset of variables is identified for each fault using an optimization algorithm. Second, a sub-PCA model is established in each subset. Finally, the monitoring results of all of the subsets are combined through Bayesian inference. The proposed method reduces redundancy and complexity, explores numerous local behaviors, and provides accurate description of faults, thus improving monitoring performance significantly. Case studies on a numerical example, the Tennessee Eastman benchmark process, and an industrial-scale plant demonstrate the efficiency.
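
A hedged sketch of the general monitoring pattern described above, not the authors' exact algorithm: fit a PCA model per variable subset, compute Hotelling T² statistics, convert them to fault probabilities using the heuristic likelihoods common in Bayesian-inference-based monitoring, and combine the subsets. The data, subsets, likelihood forms, and the final combination rule are all illustrative assumptions.

```python
# Hedged sketch: distributed PCA monitoring with a Bayesian combination step.
import numpy as np
from sklearn.decomposition import PCA

def t2_statistic(pca, X):
    """Hotelling T^2 of samples X in the retained principal subspace."""
    scores = pca.transform(X)
    return np.sum(scores**2 / pca.explained_variance_, axis=1)

def fault_probability(t2, limit, prior_fault=0.01):
    # Heuristic conditional likelihoods anchored on the control limit.
    p_normal = np.exp(-t2 / limit)
    p_fault = np.exp(-limit / np.maximum(t2, 1e-12))
    num = p_fault * prior_fault
    return num / (num + p_normal * (1 - prior_fault))

rng = np.random.default_rng(0)
X_train = rng.standard_normal((500, 6))              # normal-operation data
subsets = [slice(0, 3), slice(3, 6)]                  # two variable subsets
models = [PCA(n_components=2).fit(X_train[:, s]) for s in subsets]
limits = [np.quantile(t2_statistic(m, X_train[:, s]), 0.99)
          for m, s in zip(models, subsets)]

X_new = rng.standard_normal((10, 6)) + 2.0            # shifted "faulty" samples
probs = [fault_probability(t2_statistic(m, X_new[:, s]), lim)
         for m, s, lim in zip(models, subsets, limits)]
combined = np.mean(probs, axis=0)                     # simple combination rule
print(combined)
```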

Journal ArticleDOI
TL;DR: This work proposes a novel approach based on a machine learning tool named random forests (RF) to conduct selection among the highly complex models covered by ABC algorithms, modifying the way Bayesian model selection is both understood and operated.
Abstract: Approximate Bayesian computation (ABC) methods provide an elaborate approach to Bayesian inference on complex models, including model choice. Both theoretical arguments and simulation experiments indicate, however, that model posterior probabilities may be poorly evaluated by standard ABC techniques.Results: We propose a novel approach based on a machine learning tool named random forests (RF) to conduct selection among the highly complex models covered by ABC algorithms. We thus modify the way Bayesian model selection is both understood and operated, in that we rephrase the inferential goal as a classification problem, first predicting the model that best fits the data with RF and postponing the approximation of the posterior probability of the selected model for a second stage also relying on RF. Compared with earlier implementations of ABC model choice, the ABC RF approach offers several potential improvements: (i) it often has a larger discriminative power among the competing models, (ii) it is more robust against the number and choice of statistics summarizing the data, (iii) the computing effort is drastically reduced (with a gain in computation efficiency of at least 50) and (iv) it includes an approximation of the posterior probability of the selected model. The call to RF will undoubtedly extend the range of size of datasets and complexity of models that ABC can handle. We illustrate the power of this novel methodology by analyzing controlled experiments as well as genuine population genetics datasets.Availability and implementation: The proposed methodology is implemented in the R package abcrf available on the CRAN.
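
A toy sketch of the ABC random-forest idea, using scikit-learn rather than the abcrf R package named in the abstract: simulate summary statistics under each candidate model, train a classifier on the resulting reference table, and read off the predicted model for the observed summaries. The simulators and summary statistics below are invented for illustration.

```python
# Hedged sketch of random-forest model choice on an ABC reference table.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

def simulate(model, n=50):
    """Toy simulators for two competing models; returns summary statistics."""
    if model == 0:
        x = rng.normal(0.0, 1.0, n)       # model 0: Gaussian
    else:
        x = rng.laplace(0.0, 1.0, n)      # model 1: Laplace
    return [x.mean(), x.std(), np.mean(np.abs(x)), np.mean(x**4)]

# Reference table: simulations from both models with their labels.
labels = rng.integers(0, 2, size=5000)
table = np.array([simulate(m) for m in labels])

clf = RandomForestClassifier(n_estimators=500, random_state=0).fit(table, labels)

observed = np.array([simulate(1)])        # pretend these summaries are the data
print("predicted model:", clf.predict(observed)[0])
print("vote proportions:", clf.predict_proba(observed)[0])
```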

Journal ArticleDOI
TL;DR: It is shown that the use of the beta-binomial model makes it possible to determine accurate credible intervals even in data which exhibit substantial overdispersion, and Bayesian inference methods are used for estimating the posterior distribution of the parameters of the psychometric function.

Journal ArticleDOI
TL;DR: The practical advantages of Bayesian inference are demonstrated through two concrete examples, which show how Bayesian analyses can be more informative, more elegant, and more flexible than the orthodox methodology that remains dominant within the field of psychology.
Abstract: The practical advantages of Bayesian inference are demonstrated here through two concrete examples. In the first example, we wish to learn about a criminal’s IQ: a problem of parameter estimation. In the second example, we wish to quantify and track support in favor of the null hypothesis that Adam Sandler movies are profitable regardless of their quality: a problem of hypothesis testing. The Bayesian approach unifies both problems within a coherent predictive framework, in which parameters and models that predict the data successfully receive a boost in plausibility, whereas parameters and models that predict poorly suffer a decline. Our examples demonstrate how Bayesian analyses can be more informative, more elegant, and more flexible than the orthodox methodology that remains dominant within the field of psychology.
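
A tiny worked example in the spirit of the first (parameter-estimation) problem, assuming a conjugate normal-normal model; the prior, noise level, and test scores are illustrative numbers, not the article's.

```python
# Hedged toy calculation: conjugate normal-normal updating for an IQ estimate.
import numpy as np

prior_mean, prior_var = 100.0, 15.0**2        # population prior on IQ
noise_var = 10.0**2                           # assumed test measurement noise
scores = np.array([73.0, 67.0, 79.0])         # illustrative observed test scores

post_var = 1.0 / (1.0 / prior_var + len(scores) / noise_var)
post_mean = post_var * (prior_mean / prior_var + scores.sum() / noise_var)
sd = np.sqrt(post_var)
print(f"posterior: {post_mean:.1f} +/- {sd:.1f}")
print(f"95% credible interval: [{post_mean - 1.96*sd:.1f}, {post_mean + 1.96*sd:.1f}]")
```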

Journal ArticleDOI
TL;DR: Approximate Bayesian computation refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible.
Abstract: Bayesian inference plays an important role in phylogenetics, evolutionary biology, and in many other branches of science. It provides a principled framework for dealing with uncertainty and quantifying how it changes in the light of new evidence. For many complex models and inference problems, however, only approximate quantitative answers are obtainable. Approximate Bayesian computation (ABC) refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible. We explain here the fundamentals of ABC, review the classical algorithms, and highlight recent developments. [ABC; approximate Bayesian computation; Bayesian inference; likelihood-free inference; phylogenetics; simulator-based models; stochastic simulation models; tree-based models.]
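
The classical rejection-ABC algorithm reviewed here can be written in a few lines; the Gaussian toy model, summary statistic, and tolerance below are assumptions chosen for illustration.

```python
# Hedged sketch of rejection ABC: keep prior draws whose simulated summary
# statistic lands close to the observed one.
import numpy as np

rng = np.random.default_rng(0)
observed = rng.normal(2.0, 1.0, size=100)
s_obs = observed.mean()                              # summary statistic of the data

def abc_rejection(n_draws=100_000, epsilon=0.05):
    theta = rng.uniform(-10, 10, n_draws)            # draws from the prior
    sims = rng.normal(theta[:, None], 1.0, (n_draws, 100))
    dist = np.abs(sims.mean(axis=1) - s_obs)         # discrepancy to the data
    return theta[dist < epsilon]                     # accepted parameters

posterior = abc_rejection()
print(posterior.mean(), posterior.std())             # approximate posterior of theta
```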

Journal ArticleDOI
TL;DR: This article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals, as well as Bayesian approaches to meta-analysis, randomized controlled trials, and planning.
Abstract: In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty, on the other hand. Among frequentists in psychology a shift of emphasis from hypothesis testing to estimation has been dubbed “the New Statistics” (Cumming, 2014). A second conceptual distinction is between frequentist methods and Bayesian methods. Our main goal in this article is to explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods. The article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals. The article also describes Bayesian approaches to meta-analysis, random control trials, and planning (e.g., power analysis).

Journal ArticleDOI
TL;DR: This paper presents a simple, easily implemented algorithm for dynamically adapting the temperature configuration of a sampler while sampling; the algorithm adjusts the temperature spacing to achieve a uniform rate of exchanges between chains at neighbouring temperatures.
Abstract: Modern problems in astronomical Bayesian inference require efficient methods for sampling from complex, high-dimensional, often multi-modal probability distributions. Most popular methods, such as Markov chain Monte Carlo sampling, perform poorly on strongly multi-modal probability distributions, rarely jumping between modes or settling on just one mode without finding others. Parallel tempering addresses this problem by sampling simultaneously with separate Markov chains from tempered versions of the target distribution with reduced contrast levels. Gaps between modes can be traversed at higher temperatures, while individual modes can be efficiently explored at lower temperatures. In this paper, we investigate how one might choose the ladder of temperatures to achieve more efficient sampling, as measured by the autocorrelation time of the sampler. In particular, we present a simple, easily-implemented algorithm for dynamically adapting the temperature configuration of a sampler while sampling. This algorithm dynamically adjusts the temperature spacing to achieve a uniform rate of exchanges between chains at neighbouring temperatures. We compare the algorithm to conventional geometric temperature configurations on a number of test distributions and on an astrophysical inference problem, reporting efficiency gains by a factor of 1.2-2.5 over a well-chosen geometric temperature configuration and by a factor of 1.5-5 over a poorly chosen configuration. On all of these problems a sampler using the dynamical adaptations to achieve uniform acceptance ratios between neighbouring chains outperforms one that does not.
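
A hedged sketch of parallel tempering with a crude ladder adaptation, not the authors' algorithm: neighbouring chains exchange states, swap rates are tracked, and the log-temperature spacings are nudged toward equal swap rates. The bimodal target, ladder size, and adaptation constants are illustrative assumptions.

```python
# Hedged sketch: parallel tempering on a bimodal target with crude ladder adaptation.
import numpy as np

rng = np.random.default_rng(0)
log_target = lambda x: np.logaddexp(-0.5 * (x - 4)**2, -0.5 * (x + 4)**2)  # two modes

n_chains, n_steps = 6, 20_000
betas = np.geomspace(1.0, 0.05, n_chains)          # inverse temperatures (cold -> hot)
x = rng.standard_normal(n_chains)
swap_acc = np.zeros(n_chains - 1)
cold_chain = []

for step in range(1, n_steps + 1):
    # Within-temperature random-walk Metropolis updates on p(x)^beta.
    prop = x + rng.standard_normal(n_chains)
    accept = np.log(rng.uniform(size=n_chains)) < betas * (log_target(prop) - log_target(x))
    x = np.where(accept, prop, x)
    # Swap states between neighbouring temperatures.
    for i in range(n_chains - 1):
        log_r = (betas[i] - betas[i + 1]) * (log_target(x[i + 1]) - log_target(x[i]))
        if np.log(rng.uniform()) < log_r:
            x[i], x[i + 1] = x[i + 1], x[i]
            swap_acc[i] += 1
    cold_chain.append(x[0])
    # Crude adaptation: widen spacings with above-average swap rates, shrink the rest.
    if step % 1000 == 0:
        rates = swap_acc / 1000
        spacing = np.diff(np.log(betas)) * (1 + 0.3 * (rates - rates.mean()))
        betas = np.exp(np.log(betas[0]) + np.concatenate(([0.0], np.cumsum(spacing))))
        swap_acc[:] = 0

cold = np.array(cold_chain[5000:])
print("fraction of cold-chain samples in the right-hand mode:", np.mean(cold > 0))
```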

Journal ArticleDOI
TL;DR: Fitting a Bayes net model to the data indicated that, under a Bayesian framework, free-market support is a significant driver of beliefs about climate change and of trust in climate scientists, and that active distrust of climate scientists among a small number of U.S. conservatives drives contrary updating in response to consensus information.
Abstract: Belief polarization is said to occur when two people respond to the same evidence by updating their beliefs in opposite directions. This response is considered to be “irrational” because it involves contrary updating, a form of belief updating that appears to violate normatively optimal responding, as for example dictated by Bayes' theorem. In light of much evidence that people are capable of normatively optimal behavior, belief polarization presents a puzzling exception. We show that Bayesian networks, or Bayes nets, can simulate rational belief updating. When fit to experimental data, Bayes nets can help identify the factors that contribute to polarization. We present a study into belief updating concerning the reality of climate change in response to information about the scientific consensus on anthropogenic global warming (AGW). The study used representative samples of Australian and U.S. participants. Among Australians, consensus information partially neutralized the influence of worldview, with free-market supporters showing a greater increase in acceptance of human-caused global warming relative to free-market opponents. In contrast, while consensus information overall had a positive effect on perceived consensus among U.S. participants, there was a reduction in perceived consensus and acceptance of human-caused global warming for strong supporters of unregulated free markets. Fitting a Bayes net model to the data indicated that under a Bayesian framework, free-market support is a significant driver of beliefs about climate change and trust in climate scientists. Further, active distrust of climate scientists among a small number of U.S. conservatives drives contrary updating in response to consensus information among this particular group.
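
A toy Bayesian calculation (with invented numbers, not the study's fitted Bayes net) showing how the trust mechanism described above can produce contrary updating without any departure from Bayes' rule:

```python
# Hedged toy example: the same consensus message raises belief for a trusting
# receiver and lowers it for a distrusting one, purely via Bayes' rule.
def posterior_agw(prior_agw, p_msg_given_agw, p_msg_given_not_agw):
    num = p_msg_given_agw * prior_agw
    return num / (num + p_msg_given_not_agw * (1 - prior_agw))

# Trusting receiver: the consensus message is more likely if AGW is real.
print(posterior_agw(0.5, p_msg_given_agw=0.9, p_msg_given_not_agw=0.2))  # ~0.82
# Distrusting receiver: the message is read as evidence of bias, so it is
# judged more likely when AGW is not real; belief goes down.
print(posterior_agw(0.5, p_msg_given_agw=0.3, p_msg_given_not_agw=0.6))  # ~0.33
```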

Journal ArticleDOI
TL;DR: This work introduces a family of Markov chain Monte Carlo samplers that can adapt to the particular structure of a posterior distribution over functions, and that may be useful for a large class of high-dimensional problems where the target probability measure has a density with respect to a Gaussian reference measure.

Book ChapterDOI
TL;DR: This chapter provides an overview of solution and estimation techniques for dynamic stochastic general equilibrium models, covering the foundations of numerical approximation techniques as well as statistical inference.
Abstract: This chapter provides an overview of solution and estimation techniques for dynamic stochastic general equilibrium models. We cover the foundations of numerical approximation techniques as well as statistical inference and survey the latest developments in the field.

Journal ArticleDOI
TL;DR: Metainference is a Bayesian inference method that models a finite sample of the distribution of models using a replica approach, in the spirit of replica-averaged modeling based on the maximum entropy principle, and that can deal with errors in experimental measurements and with measurements averaged over multiple states.
Abstract: Modeling a complex system is almost invariably a challenging task. The incorporation of experimental observations can be used to improve the quality of a model and thus to obtain better predictions about the behavior of the corresponding system. This approach, however, is affected by a variety of different errors, especially when a system simultaneously populates an ensemble of different states and experimental data are measured as averages over such states. To address this problem, we present a Bayesian inference method, called “metainference,” that is able to deal with errors in experimental measurements and with experimental measurements averaged over multiple states. To achieve this goal, metainference models a finite sample of the distribution of models using a replica approach, in the spirit of the replica-averaging modeling based on the maximum entropy principle. To illustrate the method, we present its application to a heterogeneous model system and to the determination of an ensemble of structures corresponding to the thermal fluctuations of a protein molecule. Metainference thus provides an approach to modeling complex systems with heterogeneous components and interconverting between different states by taking into account all possible sources of errors.

Proceedings ArticleDOI
19 Jun 2016
TL;DR: A new approximate Bayesian learning scheme is developed that enables DGPs to be applied to a range of medium to large scale regression problems for the first time and is almost always better than state-of-the-art deterministic and sampling-based approximate inference methods for Bayesian neural networks.
Abstract: Deep Gaussian processes (DGPs) are multilayer hierarchical generalisations of Gaussian processes (GPs) and are formally equivalent to neural networks with multiple, infinitely wide hidden layers. DGPs are nonparametric probabilistic models and as such are arguably more flexible, have a greater capacity to generalise, and provide better calibrated uncertainty estimates than alternative deep models. This paper develops a new approximate Bayesian learning scheme that enables DGPs to be applied to a range of medium to large scale regression problems for the first time. The new method uses an approximate Expectation Propagation procedure and a novel and efficient extension of the probabilistic backpropagation algorithm for learning. We evaluate the new method for non-linear regression on eleven real-world datasets, showing that it always outperforms GP regression and is almost always better than state-of-the-art deterministic and sampling-based approximate inference methods for Bayesian neural networks. As a by-product, this work provides a comprehensive analysis of six approximate Bayesian methods for training neural networks.

Journal ArticleDOI
TL;DR: The software implements a model that explains allelic peak heights on a continuous scale in order to carry out weight-of-evidence calculations for profiles which could be from a mixture of contributors, and it is the first freely available open-source continuous model to be reported in the literature.
Abstract: We have released a software named EuroForMix to analyze STR DNA profiles in a user-friendly graphical user interface. The software implements a model to explain the allelic peak height on a continuous scale in order to carry out weight-of-evidence calculations for profiles which could be from a mixture of contributors. Through a properly parameterized model we are able to do inference on mixture proportions, the peak height properties, stutter proportion and degradation. In addition, EuroForMix includes models for allele drop-out, allele drop-in and sub-population structure. EuroForMix supports two inference approaches for likelihood ratio calculations. The first approach uses maximum likelihood estimation of the unknown parameters. The second approach is Bayesian based which requires prior distributions to be specified for the parameters involved. The user may specify any number of known and unknown contributors in the model, however we find that there is a practical computing time limit which restricts the model to a maximum of four unknown contributors. EuroForMix is the first freely open source, continuous model (accommodating peak height, stutter, drop-in, drop-out, population substructure and degradation), to be reported in the literature. It therefore serves an important purpose to act as an unrestricted platform to compare different solutions that are available. The implementation of the continuous model used in the software showed close to identical results to the R-package DNAmixtures, which requires a HUGIN Expert license to be used. An additional feature in EuroForMix is the ability for the user to adapt the Bayesian inference framework by incorporating their own prior information.

Journal ArticleDOI
21 Dec 2016-Neuron
TL;DR: It is shown that imperfections in inference alone cause a dominant fraction of suboptimal choices, and that two-thirds of this suboptimality appears to derive from the limited precision of neural computations implementing inference rather than from systematic deviations from Bayes-optimal inference.

Journal ArticleDOI
TL;DR: An integrated Bayesian approach to maneuver-based trajectory prediction and criticality assessment that is not limited to specific driving situations is described and it is shown how parametric free space maps can advantageously be utilized for this purpose.
Abstract: This paper describes an integrated Bayesian approach to maneuver-based trajectory prediction and criticality assessment that is not limited to specific driving situations. First, a distribution of high-level driving maneuvers is inferred for each vehicle in the traffic scene via Bayesian inference. For this purpose, the domain is modeled in a Bayesian network with both causal and diagnostic evidences and an additional trash maneuver class, which allows the detection of irrational driving behavior and the seamless application from highly structured to nonstructured environments. Subsequently, maneuver-based probabilistic trajectory prediction models are employed to predict each vehicle's configuration forward in time. Random elements in the designed models consider the uncertainty within the future driving maneuver execution of human drivers. Finally, the criticality time metric time-to-critical-collision-probability (TTCCP) is introduced and estimated via Monte Carlo simulations. The TTCCP is a generalization of the time-to-collision (TTC) in arbitrary uncertain multiobject driving environments and valid for longer prediction horizons. All uncertain predictions of all maneuvers of every vehicle are taken into account. Additionally, the criticality assessment considers arbitrarily shaped static environments, and it is shown how parametric free space (PFS) maps can advantageously be utilized for this purpose.
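
A strongly simplified sketch of the Monte Carlo criticality estimate described above: sample uncertain future trajectories for two vehicles, estimate the collision probability over time, and report the first time it exceeds a threshold. The motion model, noise levels, and threshold are assumptions; the paper's maneuver-based prediction models are not reproduced here.

```python
# Hedged sketch of a TTCCP-style Monte Carlo estimate for two vehicles.
import numpy as np

rng = np.random.default_rng(0)
n_samples, horizon, dt = 5000, 40, 0.1
threshold, radius = 0.1, 2.0                      # probability threshold, collision radius [m]

def sample_trajectories(pos, vel, accel_sigma):
    """Constant velocity plus random acceleration noise; shape (n, T, 2)."""
    acc = rng.normal(0.0, accel_sigma, (n_samples, horizon, 2))
    vels = vel + np.cumsum(acc * dt, axis=1)
    return pos + np.cumsum(vels * dt, axis=1)

ego = sample_trajectories(np.array([0.0, 0.0]), np.array([10.0, 0.0]), 0.5)
other = sample_trajectories(np.array([40.0, 1.0]), np.array([-8.0, 0.0]), 0.8)

dist = np.linalg.norm(ego - other, axis=2)        # (n_samples, horizon)
collided = np.cumsum(dist < radius, axis=1) > 0   # has a collision happened by time t?
p_collision = collided.mean(axis=0)               # collision probability over time

exceeds = np.flatnonzero(p_collision > threshold)
ttccp = exceeds[0] * dt if exceeds.size else np.inf
print(f"TTCCP ~ {ttccp:.1f} s")
```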

Journal ArticleDOI
TL;DR: This article proposes a strategy that combines probabilistic modeling of the discrepancy between simulated and observed data with Bayesian optimization, accelerating likelihood-free inference by reducing the number of required simulations by several orders of magnitude.
Abstract: Our paper deals with inferring simulator-based statistical models given some observed data. A simulator-based model is a parametrized mechanism which specifies how data are generated. It is thus also referred to as generative model. We assume that only a finite number of parameters are of interest and allow the generative process to be very general; it may be a noisy nonlinear dynamical system with an unrestricted number of hidden variables. This weak assumption is useful for devising realistic models but it renders statistical inference very difficult. The main challenge is the intractability of the likelihood function. Several likelihood-free inference methods have been proposed which share the basic idea of identifying the parameters by finding values for which the discrepancy between simulated and observed data is small. A major obstacle to using these methods is their computational cost. The cost is largely due to the need to repeatedly simulate data sets and the lack of knowledge about how the parameters affect the discrepancy. We propose a strategy which combines probabilistic modeling of the discrepancy with optimization to facilitate likelihood-free inference. The strategy is implemented using Bayesian optimization and is shown to accelerate the inference through a reduction in the number of required simulations by several orders of magnitude.
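
A hedged sketch of the core idea (not the authors' implementation): fit a Gaussian-process surrogate to the discrepancy between simulated and observed summaries and choose the next simulation by minimizing a lower-confidence-bound acquisition. The toy simulator, kernel, and acquisition rule are illustrative assumptions.

```python
# Hedged sketch: Bayesian optimization of a simulation discrepancy for
# likelihood-free inference on a toy Gaussian simulator.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)
observed = rng.normal(2.0, 1.0, 200)
s_obs = observed.mean()

def discrepancy(theta):
    sim = rng.normal(theta, 1.0, 200)              # one (noisy) simulation
    return abs(sim.mean() - s_obs)

thetas = list(rng.uniform(-10, 10, 5))             # initial random simulations
discs = [discrepancy(t) for t in thetas]
grid = np.linspace(-10, 10, 400).reshape(-1, 1)

for _ in range(20):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(np.reshape(thetas, (-1, 1)), discs)
    mean, std = gp.predict(grid, return_std=True)
    acq = mean - 1.96 * std                        # explore where discrepancy may be low
    theta_next = float(grid[np.argmin(acq)])
    thetas.append(theta_next)
    discs.append(discrepancy(theta_next))

# The region where the predicted discrepancy is small approximates the
# high-posterior region for theta (the true value here is 2.0).
print("best theta so far:", thetas[int(np.argmin(discs))])
```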

Journal ArticleDOI
TL;DR: This paper describes an active inference scheme for visual searches and the perceptual synthesis entailed by scene construction, and highlights the link between approximate Bayesian inference under mean field assumptions and functional segregation in the visual cortex.
Abstract: This paper describes an active inference scheme for visual searches and the perceptual synthesis entailed by scene construction. Active inference assumes that perception and action minimize variational free energy, where actions are selected to minimize the free energy expected in the future. This assumption generalizes risk-sensitive control and expected utility theory to include epistemic value; namely, the value (or salience) of information inherent in resolving uncertainty about the causes of ambiguous cues or outcomes. Here, we apply active inference to saccadic searches of a visual scene. We consider the (difficult) problem of categorizing a scene, based on the spatial relationship among visual objects where, crucially, visual cues are sampled myopically through a sequence of saccadic eye movements. This means that evidence for competing hypotheses about the scene has to be accumulated sequentially, calling upon both prediction (planning) and postdiction (memory). Our aim is to highlight some simple but fundamental aspects of the requisite functional anatomy; namely, the link between approximate Bayesian inference under mean field assumptions and functional segregation in the visual cortex. This link rests upon the (neurobiologically plausible) process theory that accompanies the normative formulation of active inference for Markov decision processes. In future work, we hope to use this scheme to model empirical saccadic searches and identify the prior beliefs that underwrite intersubject variability in the way people forage for information in visual scenes (e.g., in schizophrenia).