
Showing papers on "Bayes' theorem published in 2018"


Journal ArticleDOI
TL;DR: The software package Tracer is presented for visualizing and analyzing the MCMC trace files generated through Bayesian phylogenetic inference; it provides kernel density estimation, multivariate visualization, demographic trajectory reconstruction, conditional posterior distribution summary, and more.
Abstract: Bayesian inference of phylogeny using Markov chain Monte Carlo (MCMC) plays a central role in understanding evolutionary history from molecular sequence data. Visualizing and analyzing the MCMC-generated samples from the posterior distribution is a key step in any non-trivial Bayesian inference. We present the software package Tracer (version 1.7) for visualizing and analyzing the MCMC trace files generated through Bayesian phylogenetic inference. Tracer provides kernel density estimation, multivariate visualization, demographic trajectory reconstruction, conditional posterior distribution summary, and more. Tracer is open-source and available at http://beast.community/tracer.
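Tracer itself is a GUI application, but one of the diagnostics it reports, the effective sample size (ESS) of a trace, can be sketched in a few lines. The following is a minimal illustration using a simple summed-autocorrelation estimator, not Tracer's actual implementation; the AR(1) chain is a made-up example.

```python
import numpy as np

def effective_sample_size(trace):
    """Rough ESS estimate for a 1-D MCMC trace via summed autocorrelations.

    Illustrative sketch only; not the estimator used inside Tracer.
    """
    x = np.asarray(trace, dtype=float)
    n = len(x)
    x = x - x.mean()
    # Autocorrelation via FFT (zero-padded so the circular ACF equals the linear one).
    f = np.fft.rfft(x, n=2 * n)
    acf = np.fft.irfft(f * np.conjugate(f))[:n]
    acf /= acf[0]
    # Sum lags until the autocorrelation first drops below zero.
    tau = 1.0
    for k in range(1, n):
        if acf[k] < 0:
            break
        tau += 2.0 * acf[k]
    return n / tau

# Example: a strongly autocorrelated AR(1) chain has far fewer effective samples.
rng = np.random.default_rng(0)
chain = np.zeros(10_000)
for t in range(1, len(chain)):
    chain[t] = 0.95 * chain[t - 1] + rng.normal()
print(f"nominal n = {len(chain)}, ESS ~ {effective_sample_size(chain):.0f}")
```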

5,492 citations


Journal ArticleDOI
TL;DR: This work explores Bayes Factor Design Analysis (BFDA) as a useful tool to design studies for maximum efficiency and informativeness and demonstrates how the properties of each design can be evaluated using Monte Carlo simulations.
Abstract: A sizeable literature exists on the use of frequentist power analysis in the null-hypothesis significance testing (NHST) paradigm to facilitate the design of informative experiments. In contrast, there is almost no literature that discusses the design of experiments when Bayes factors (BFs) are used as a measure of evidence. Here we explore Bayes Factor Design Analysis (BFDA) as a useful tool to design studies for maximum efficiency and informativeness. We elaborate on three possible BF designs, (a) a fixed-n design, (b) an open-ended Sequential Bayes Factor (SBF) design, where researchers can test after each participant and can stop data collection whenever there is strong evidence for either [Formula: see text] or [Formula: see text], and (c) a modified SBF design that defines a maximal sample size where data collection is stopped regardless of the current state of evidence. We demonstrate how the properties of each design (i.e., expected strength of evidence, expected sample size, expected probability of misleading evidence, expected probability of weak evidence) can be evaluated using Monte Carlo simulations and equip researchers with the necessary information to compute their own Bayesian design analyses.
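The Monte Carlo evaluation of a Sequential Bayes Factor design described above can be sketched as follows. For simplicity the example uses a one-sample z-test with a point alternative H1: delta = 0.5 against H0: delta = 0 and known unit variance, so the Bayes factor reduces to a likelihood ratio; the thresholds, effect size, and maximum sample size are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy import stats

def sbf_simulation(true_delta, delta1=0.5, bf_bound=10.0, n_min=10, n_max=500,
                   n_sims=2000, seed=1):
    """Simulate a Sequential Bayes Factor design for a one-sample z-test.

    H0: mu = 0 vs. a point alternative H1: mu = delta1, with known unit SD,
    so BF10 is the likelihood ratio evaluated at the running sample mean.
    """
    rng = np.random.default_rng(seed)
    stops_at, decisions = [], []
    for _ in range(n_sims):
        data = rng.normal(true_delta, 1.0, size=n_max)
        for n in range(n_min, n_max + 1):
            xbar, se = data[:n].mean(), 1.0 / np.sqrt(n)
            bf10 = stats.norm.pdf(xbar, delta1, se) / stats.norm.pdf(xbar, 0.0, se)
            if bf10 >= bf_bound or bf10 <= 1.0 / bf_bound or n == n_max:
                stops_at.append(n)
                decisions.append("H1" if bf10 >= bf_bound
                                 else "H0" if bf10 <= 1.0 / bf_bound else "undecided")
                break
    decisions = np.array(decisions)
    print(f"true delta = {true_delta}: mean stopping n = {np.mean(stops_at):.1f}, "
          f"P(support H1) = {np.mean(decisions == 'H1'):.2f}, "
          f"P(support H0) = {np.mean(decisions == 'H0'):.2f}")

sbf_simulation(true_delta=0.5)   # expected sample size and strength of evidence under H1
sbf_simulation(true_delta=0.0)   # probability of misleading evidence under H0
```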

359 citations


Journal ArticleDOI
TL;DR: It is shown on simulated data that the fully Bayes penalty mimics oracle performance, providing a viable alternative to cross-validation; theory is developed for the separable and nonseparable variants of the penalty.
Abstract: Despite the wide adoption of spike-and-slab methodology for Bayesian variable selection, its potential for penalized likelihood estimation has largely been overlooked. In this paper, we bridge this gap by cross-fertilizing these two paradigms with the Spike-and-Slab LASSO procedure for variable selection and parameter estimation in linear regression. We introduce a new class of self-adaptive penalty functions that arise from a fully Bayes spike-and-slab formulation, ultimately moving beyond the separable penalty framework. A virtue of these non-separable penalties is their ability to borrow strength across coordinates, adapt to ensemble sparsity information and exert multiplicity adjustment. The Spike-and-Slab LASSO procedure harvests efficient coordinate-wise implementations with a path-following scheme for dynamic posterior exploration. We show on simulated data that the fully Bayes penalty mimics oracle performance, providing a viable alternative to cross-validation. We develop theory for the separable and nonseparable variants of the penalty.

299 citations


Journal ArticleDOI
TL;DR: This article provides an applied introduction to Bayesian inference with Bayes factors using JASP, which offers a straightforward means of performing reproducible Bayesian hypothesis tests in a graphical "point and click" environment that will be familiar to researchers conversant with other graphical statistical packages, such as SPSS.
Abstract: Despite its popularity as an inferential framework, classical null hypothesis significance testing (NHST) has several restrictions. Bayesian analysis can be used to complement NHST; however, this approach has been underutilized, largely due to a dearth of accessible software options. JASP is a recently developed open-source statistical package that facilitates both Bayesian and NHST analysis using a graphical interface. This article provides an applied introduction to Bayesian inference with Bayes factors using JASP. We use JASP to compare and contrast Bayesian alternatives for several common classical null hypothesis significance tests: correlations, frequency distributions, t-tests, ANCOVAs, and ANOVAs. These examples are also used to illustrate the strengths and limitations of both NHST and Bayesian hypothesis testing. A comparison of NHST and Bayesian inferential frameworks demonstrates that Bayes factors can complement p-values by providing additional information for hypothesis testing. Namely, Bayes factors can quantify relative evidence for both alternative and null hypotheses. Moreover, the magnitude of this evidence can be presented as an easy-to-interpret odds ratio. While Bayesian analysis is by no means a new method, this type of statistical inference has been largely inaccessible for most psychiatry researchers. JASP provides a straightforward means of performing reproducible Bayesian hypothesis tests using a graphical “point and click” environment that will be familiar to researchers conversant with other graphical statistical packages, such as SPSS.
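JASP is a point-and-click package, but the logic of one of the simplest Bayes factor tests it offers, a binomial test against a point null, can be sketched directly with the Savage-Dickey density ratio. The prior choice (uniform Beta(1, 1)) and the data below are illustrative assumptions, not an attempt to reproduce JASP's defaults exactly.

```python
from scipy import stats

def binomial_bf01(successes, n, theta0=0.5, a=1.0, b=1.0):
    """BF01 for H0: theta = theta0 via the Savage-Dickey density ratio.

    With a Beta(a, b) prior the posterior is Beta(a + k, b + n - k), and
    BF01 = posterior density at theta0 / prior density at theta0.
    """
    posterior = stats.beta(a + successes, b + n - successes)
    prior = stats.beta(a, b)
    return posterior.pdf(theta0) / prior.pdf(theta0)

# Hypothetical data: 60 successes out of 100 trials.
bf01 = binomial_bf01(60, 100)
print(f"BF01 = {bf01:.3f}, BF10 = {1 / bf01:.3f}")
# Two-sided p-value for comparison with the NHST result.
print(f"p = {stats.binomtest(60, 100, 0.5).pvalue:.3f}")
```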

258 citations


Journal ArticleDOI
TL;DR: It is argued that, where the two approaches disagree, the appropriate conclusions match the Bayesian inferences rather than those based on significance testing; for example, a high-powered non-significant result can be consistent with no evidence for H0 over H1 worth mentioning, which a Bayes factor can show.
Abstract: Inference using significance testing and Bayes factors is compared and contrasted in five case studies based on real research. The first study illustrates that the methods will often agree, both in motivating researchers to conclude that H1 is supported better than H0, and the other way round, that H0 is better supported than H1. The next four, however, show that the methods will also often disagree. In these cases, the aim of the paper will be to motivate the sensible evidential conclusion, and then see which approach matches those intuitions. Specifically, it is shown that a high-powered non-significant result is consistent with no evidence for H0 over H1 worth mentioning, which a Bayes factor can show, and, conversely, that a low-powered non-significant result is consistent with substantial evidence for H0 over H1, again indicated by Bayesian analyses. The fourth study illustrates that a high-powered significant result may not amount to any evidence for H1 over H0, matching the Bayesian conclusion. Finally, the fifth study illustrates that different theories can be evidentially supported to different degrees by the same data; a fact that P-values cannot reflect but Bayes factors can. It is argued that appropriate conclusions match the Bayesian inferences, but not those based on significance testing, where they disagree.

251 citations


Journal ArticleDOI
TL;DR: The authors review Bayesian calibrations of p-values into Bayes factors, considering two-sided significance tests for a point null hypothesis in detail, and show that the relationship between p-values and minimum Bayes factors also depends on the sample size and on the dimension of the parameter of interest.
Abstract: The p-value quantifies the discrepancy between the data and a null hypothesis of interest, usually the assumption of no difference or no effect. A Bayesian approach allows the calibration of p-values by transforming them to direct measures of the evidence against the null hypothesis, so-called Bayes factors. We review the available literature in this area and consider two-sided significance tests for a point null hypothesis in more detail. We distinguish simple from local alternative hypotheses and contrast traditional Bayes factors based on the data with Bayes factors based on p-values or test statistics. A well-known finding is that the minimum Bayes factor, the smallest possible Bayes factor within a certain class of alternative hypotheses, provides less evidence against the null hypothesis than the corresponding p-value might suggest. It is less known that the relationship between p-values and minimum Bayes factors also depends on the sample size and on the dimension of the parameter of interest. We i...
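Two widely cited calibrations of a two-sided p-value into a minimum Bayes factor can be sketched as follows: the -e*p*log(p) lower bound for local alternatives (Sellke, Bayarri, and Berger) and exp(-z^2/2) for simple alternatives centered on the observed effect. This is a small illustration of the kind of calibration the review discusses, not a reproduction of its analyses.

```python
import numpy as np
from scipy import stats

def min_bf_local(p):
    """Lower bound on BF01 over local alternatives, valid for p < 1/e."""
    p = np.asarray(p, dtype=float)
    return np.where(p < 1 / np.e, -np.e * p * np.log(p), 1.0)

def min_bf_simple(p):
    """Minimum BF01 over simple alternatives: exp(-z^2/2), z the two-sided quantile."""
    z = stats.norm.isf(np.asarray(p, dtype=float) / 2)
    return np.exp(-z ** 2 / 2)

for p in (0.05, 0.01, 0.005):
    print(f"p = {p:>5}: min BF01 (local) = {min_bf_local(p):.3f}, "
          f"min BF01 (simple) = {min_bf_simple(p):.3f}")
```

Even the most optimistic calibration shows that p = 0.05 corresponds to fairly modest evidence against the null, which is the central point of this literature.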

182 citations


Posted Content
TL;DR: This work reformulates the model-agnostic meta-learning algorithm (MAML) of Finn et al. (2017) as a method for probabilistic inference in a hierarchical Bayesian model and proposes an improvement to the MAML algorithm that makes use of techniques from approximate inference and curvature estimation.
Abstract: Meta-learning allows an intelligent agent to leverage prior learning episodes as a basis for quickly improving performance on a novel task. Bayesian hierarchical modeling provides a theoretical framework for formalizing meta-learning as inference for a set of parameters that are shared across tasks. Here, we reformulate the model-agnostic meta-learning algorithm (MAML) of Finn et al. (2017) as a method for probabilistic inference in a hierarchical Bayesian model. In contrast to prior methods for meta-learning via hierarchical Bayes, MAML is naturally applicable to complex function approximators through its use of a scalable gradient descent procedure for posterior inference. Furthermore, the identification of MAML as hierarchical Bayes provides a way to understand the algorithm's operation as a meta-learning procedure, as well as an opportunity to make use of computational strategies for efficient inference. We use this opportunity to propose an improvement to the MAML algorithm that makes use of techniques from approximate inference and curvature estimation.
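The inner/outer structure of MAML, which the paper reinterprets as hierarchical Bayes with the meta-parameters playing the role of a shared prior, can be sketched on a toy regression problem. The code below is a first-order simplification (it ignores second-order terms in the meta-gradient) on hypothetical sine-wave tasks with a linear-in-features model; learning rates, feature map, and task distribution are all illustrative choices, not those of Finn et al.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """A toy regression task: y = a * sin(x + phase), random amplitude and phase."""
    a, phase = rng.uniform(0.5, 2.0), rng.uniform(0, np.pi)
    def draw(n):
        x = rng.uniform(-3, 3, size=n)
        return x, a * np.sin(x + phase)
    return draw

def features(x):
    # Fixed sinusoidal feature map so the per-task model stays linear in w.
    return np.column_stack([np.ones_like(x), np.sin(x), np.cos(x),
                            np.sin(2 * x), np.cos(2 * x)])

def loss_and_grad(w, x, y):
    phi = features(x)
    err = phi @ w - y
    return np.mean(err ** 2), 2 * phi.T @ err / len(y)

# First-order MAML: the meta-update averages gradients evaluated at the adapted
# (post-inner-step) parameters; w_meta acts like a prior mean shared across tasks.
w_meta = np.zeros(5)
inner_lr, outer_lr, k_shot = 0.05, 0.01, 10
for step in range(2000):
    meta_grad = np.zeros_like(w_meta)
    for _ in range(4):                      # tasks per meta-batch
        draw = sample_task()
        x_tr, y_tr = draw(k_shot)           # support set
        x_te, y_te = draw(k_shot)           # query set
        _, g = loss_and_grad(w_meta, x_tr, y_tr)
        w_task = w_meta - inner_lr * g      # one inner adaptation step
        _, g_outer = loss_and_grad(w_task, x_te, y_te)
        meta_grad += g_outer / 4
    w_meta -= outer_lr * meta_grad

# After meta-training, a single gradient step on 10 points adapts to a new task.
draw = sample_task()
x_tr, y_tr = draw(k_shot)
_, g = loss_and_grad(w_meta, x_tr, y_tr)
w_adapted = w_meta - inner_lr * g
x_te, y_te = draw(100)
print("post-adaptation test MSE:", loss_and_grad(w_adapted, x_te, y_te)[0])
```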

150 citations


Journal ArticleDOI
TL;DR: The fundamental tenets of Bayesian inference, which derive from two basic laws of probability theory, are introduced; the interpretation of probabilities, discrete and continuous versions of Bayes’ rule, parameter estimation, and model comparison are covered.
Abstract: We introduce the fundamental tenets of Bayesian inference, which derive from two basic laws of probability theory. We cover the interpretation of probabilities, discrete and continuous versions of Bayes’ rule, parameter estimation, and model comparison. Using seven worked examples, we illustrate these principles and set up some of the technical background for the rest of this special issue of Psychonomic Bulletin & Review. Supplemental material is available via https://osf.io/wskex/.
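A discrete application of Bayes' rule of the kind covered in such tutorials can be worked through in a few lines; the base rate and test characteristics below are hypothetical numbers chosen only for illustration.

```python
# Bayes' rule for a discrete hypothesis: P(H | +) = P(+ | H) P(H) / P(+).
prevalence = 0.02          # P(H): hypothetical base rate of a condition
sensitivity = 0.95         # P(+ | H)
specificity = 0.90         # P(- | not H)

# Law of total probability gives the marginal probability of a positive result.
p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
posterior = sensitivity * prevalence / p_pos
print(f"P(condition | positive test) = {posterior:.3f}")   # ~ 0.162
```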

134 citations


Proceedings Article
15 Feb 2018
TL;DR: In this article, the authors reformulate the MAML algorithm as a method for probabilistic inference in a hierarchical Bayesian model, and propose an improvement to the algorithm that makes use of techniques from approximate inference and curvature estimation.
Abstract: Meta-learning allows an intelligent agent to leverage prior learning episodes as a basis for quickly improving performance on a novel task. Bayesian hierarchical modeling provides a theoretical framework for formalizing meta-learning as inference for a set of parameters that are shared across tasks. Here, we reformulate the model-agnostic meta-learning algorithm (MAML) of Finn et al. (2017) as a method for probabilistic inference in a hierarchical Bayesian model. In contrast to prior methods for meta-learning via hierarchical Bayes, MAML is naturally applicable to complex function approximators through its use of a scalable gradient descent procedure for posterior inference. Furthermore, the identification of MAML as hierarchical Bayes provides a way to understand the algorithm's operation as a meta-learning procedure, as well as an opportunity to make use of computational strategies for efficient inference. We use this opportunity to propose an improvement to the MAML algorithm that makes use of techniques from approximate inference and curvature estimation.

116 citations


Journal ArticleDOI
TL;DR: A narrative review of the body of computational research addressing neuropsychological syndromes, focusing on work that employs Bayesian frameworks to understand the link between biology and computation that is at the heart of neuropsychology.
Abstract: Computational theories of brain function have become very influential in neuroscience. They have facilitated the growth of formal approaches to disease, particularly in psychiatric research. In this paper, we provide a narrative review of the body of computational research addressing neuropsychological syndromes, and focus on those that employ Bayesian frameworks. Bayesian approaches to understanding brain function formulate perception and action as inferential processes. These inferences combine 'prior' beliefs with a generative (predictive) model to explain the causes of sensations. Under this view, neuropsychological deficits can be thought of as false inferences that arise due to aberrant prior beliefs (that are poor fits to the real world). This draws upon the notion of a Bayes optimal pathology - optimal inference with suboptimal priors - and provides a means for computational phenotyping. In principle, any given neuropsychological disorder could be characterized by the set of prior beliefs that would make a patient's behavior appear Bayes optimal. We start with an overview of some key theoretical constructs and use these to motivate a form of computational neuropsychology that relates anatomical structures in the brain to the computations they perform. Throughout, we draw upon computational accounts of neuropsychological syndromes. These are selected to emphasize the key features of a Bayesian approach, and the possible types of pathological prior that may be present. They range from visual neglect through hallucinations to autism. Through these illustrative examples, we review the use of Bayesian approaches to understand the link between biology and computation that is at the heart of neuropsychology.

111 citations


Proceedings Article
08 Dec 2018
TL;DR: This work shows that acquisition functions estimated via Monte Carlo integration are consistently amenable to gradient-based optimization and identifies a common family of acquisition functions, including EI and UCB, whose characteristics not only facilitate but justify use of greedy approaches for their maximization.
Abstract: Bayesian optimization is a sample-efficient approach to global optimization that relies on theoretically motivated value heuristics (acquisition functions) to guide its search process. Fully maximizing acquisition functions produces the Bayes' decision rule, but this ideal is difficult to achieve since these functions are frequently non-trivial to optimize. This statement is especially true when evaluating queries in parallel, where acquisition functions are routinely non-convex, high-dimensional, and intractable. We first show that acquisition functions estimated via Monte Carlo integration are consistently amenable to gradient-based optimization. Subsequently, we identify a common family of acquisition functions, including EI and UCB, whose characteristics not only facilitate but justify use of greedy approaches for their maximization.
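The Monte Carlo acquisition functions discussed here can be estimated by reparameterized sampling from the Gaussian-process posterior over a batch of candidate points. Below is a minimal numpy sketch of a parallel expected improvement (q-EI) estimator for minimization; the posterior mean and covariance are placeholders standing in for a fitted GP, and in practice gradients with respect to the candidates would be obtained by automatic differentiation of exactly this estimator.

```python
import numpy as np

def monte_carlo_qei(mean, cov, best_f, n_samples=4096, seed=0):
    """Monte Carlo estimate of parallel expected improvement (minimization).

    mean, cov: posterior mean (q,) and covariance (q, q) of the GP at a batch
    of q candidate points; best_f: incumbent best observed value.
    Samples use the reparameterization f = mean + L @ eps, which is what makes
    the estimator amenable to gradient-based optimization.
    """
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(cov + 1e-9 * np.eye(len(mean)))
    eps = rng.standard_normal((n_samples, len(mean)))
    f = mean + eps @ L.T                        # joint posterior samples, shape (n, q)
    improvement = np.maximum(best_f - f, 0.0)   # improvement of each point in the batch
    return improvement.max(axis=1).mean()       # best improvement within the batch

# Hypothetical posterior at a batch of q = 2 candidates.
mean = np.array([0.2, 0.1])
cov = np.array([[0.30, 0.05],
                [0.05, 0.20]])
print(f"qEI ~ {monte_carlo_qei(mean, cov, best_f=0.0):.4f}")
```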

Journal ArticleDOI
TL;DR: Divide-and-conquer based methods for Bayesian inference provide a general approach for tractable posterior inference when the sample size is large; these methods divide the data into smaller subsets.
Abstract: Divide-and-conquer based methods for Bayesian inference provide a general approach for tractable posterior inference when the sample size is large. These methods divide the data into smaller subset...

Journal ArticleDOI
TL;DR: An efficient and robust computational framework is developed to perform Bayesian model comparison of causal inference strategies, incorporating a number of alternative assumptions about the observers; it is used to investigate whether human observers’ performance in an explicit cause attribution task and an implicit heading discrimination task can be modeled as a causal inference process.
Abstract: The precision of multisensory perception improves when cues arising from the same cause are integrated, such as visual and vestibular heading cues for an observer moving through a stationary environment. In order to determine how the cues should be processed, the brain must infer the causal relationship underlying the multisensory cues. In heading perception, however, it is unclear whether observers follow the Bayesian strategy, a simpler non-Bayesian heuristic, or even perform causal inference at all. We developed an efficient and robust computational framework to perform Bayesian model comparison of causal inference strategies, which incorporates a number of alternative assumptions about the observers. With this framework, we investigated whether human observers’ performance in an explicit cause attribution and an implicit heading discrimination task can be modeled as a causal inference process. In the explicit causal inference task, all subjects accounted for cue disparity when reporting judgments of common cause, although not necessarily all in a Bayesian fashion. By contrast, but in agreement with previous findings, data from the heading discrimination task alone could not rule out that several of the same observers were adopting a forced-fusion strategy, whereby cues are integrated regardless of disparity. Only when we combined evidence from both tasks were we able to rule out forced-fusion in the heading discrimination task. Crucially, findings were robust across a number of variants of models and analyses. Our results demonstrate that our proposed computational framework allows researchers to ask complex questions within a rigorous Bayesian framework that accounts for parameter and model uncertainty.
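The basic causal-inference computation being compared in such models, the posterior probability that two noisy cues share a common cause, can be sketched numerically in the style of standard Bayesian causal-inference models of cue combination. All parameter values below (noise levels, prior width, prior probability of a common cause) are arbitrary illustrative choices, not estimates from this paper.

```python
import numpy as np
from scipy import stats

def p_common_cause(x_vis, x_vest, sigma_vis=1.0, sigma_vest=2.0,
                   sigma_prior=10.0, p_common=0.5):
    """Posterior probability that two noisy heading cues share a common cause.

    Marginalizes the latent heading(s) on a grid: under C = 1 a single source
    generates both cues, under C = 2 each cue has its own independent source.
    """
    s = np.linspace(-60, 60, 4001)                      # grid over latent heading (deg)
    ds = s[1] - s[0]
    prior_s = stats.norm.pdf(s, 0, sigma_prior)
    lik_vis = stats.norm.pdf(x_vis, s, sigma_vis)
    lik_vest = stats.norm.pdf(x_vest, s, sigma_vest)
    like_c1 = np.sum(lik_vis * lik_vest * prior_s) * ds                       # common cause
    like_c2 = (np.sum(lik_vis * prior_s) * ds) * (np.sum(lik_vest * prior_s) * ds)  # separate
    post_c1 = like_c1 * p_common
    post_c2 = like_c2 * (1 - p_common)
    return post_c1 / (post_c1 + post_c2)

print(f"small disparity: P(C=1 | cues) = {p_common_cause(2.0, 3.0):.3f}")
print(f"large disparity: P(C=1 | cues) = {p_common_cause(2.0, 20.0):.3f}")
```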

Journal ArticleDOI
TL;DR: A methodology is presented wherein the full uncertainty associated with probability model form and parameter estimation is retained and efficiently propagated; a complete probabilistic description of both aleatory and epistemic uncertainty is achieved with several orders of magnitude reduction in computational cost.

Journal ArticleDOI
26 Feb 2018 - PLOS ONE
TL;DR: The authors report on how the specification of a “vague” normally distributed prior on coefficients in Bayesian occupancy models can unintentionally influence parameter estimation and compromise inference about species-habitat relationships.
Abstract: Understanding patterns of species occurrence and the processes underlying these patterns is fundamental to the study of ecology. One of the more commonly used approaches to investigate species occurrence patterns is occupancy modeling, which can account for imperfect detection of a species during surveys. In recent years, there has been a proliferation of Bayesian modeling in ecology, which includes fitting Bayesian occupancy models. The Bayesian framework is appealing to ecologists for many reasons, including the ability to incorporate prior information through the specification of prior distributions on parameters. While ecologists almost exclusively intend to choose priors so that they are "uninformative" or "vague", such priors can easily be unintentionally highly informative. Here we report on how the specification of a "vague" normally distributed (i.e., Gaussian) prior on coefficients in Bayesian occupancy models can unintentionally influence parameter estimation. Using both simulated data and empirical examples, we illustrate how this issue likely compromises inference about species-habitat relationships. While the extent to which these informative priors influence inference depends on the data set, researchers fitting Bayesian occupancy models should conduct sensitivity analyses to ensure intended inference, or employ less commonly used priors that are less informative (e.g., logistic or t prior distributions). We provide suggestions for addressing this issue in occupancy studies, and an online tool for exploring this issue under different contexts.
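The core issue, that a prior which looks flat on the logit scale is far from flat on the probability scale, can be seen directly by simulation. The prior standard deviations below are typical "vague" choices used for illustration, not values taken from the paper.

```python
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(0)
for sd in (1.6, 5.0, 10.0, 31.6):                   # 31.6 corresponds to a Normal(0, 1000) variance
    beta0 = rng.normal(0.0, sd, size=100_000)        # "vague" prior on the logit-scale intercept
    psi = expit(beta0)                               # implied prior on occupancy probability
    near_bounds = np.mean((psi < 0.01) | (psi > 0.99))
    print(f"sd = {sd:5.1f}: P(occupancy probability within 0.01 of 0 or 1) = {near_bounds:.2f}")
```

With a large prior standard deviation on the logit scale, most of the implied prior mass piles up near occupancy probabilities of 0 and 1, which is exactly the unintended informativeness the paper warns about.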

Proceedings Article
26 Feb 2018
TL;DR: The authors showed that a Gaussian prior mean chosen via stochastic gradient Langevin dynamics (SGLD) leads to a valid PAC-Bayes bound due to control of the 2-Wasserstein distance to a differentially private stationary distribution.
Abstract: The Probably Approximately Correct (PAC) Bayes framework (McAllester, 1999) can incorporate knowledge about the learning algorithm and (data) distribution through the use of distribution-dependent priors, yielding tighter generalization bounds on data-dependent posteriors. Using this flexibility, however, is difficult, especially when the data distribution is presumed to be unknown. We show how a differentially private data-dependent prior yields a valid PAC-Bayes bound, and then show how non-private mechanisms for choosing priors can also yield generalization bounds. As an application of this result, we show that a Gaussian prior mean chosen via stochastic gradient Langevin dynamics (SGLD; Welling and Teh, 2011) leads to a valid PAC-Bayes bound due to control of the 2-Wasserstein distance to a differentially private stationary distribution. We study our data-dependent bounds empirically, and show that they can be nonvacuous even when other distribution-dependent bounds are vacuous.

Journal ArticleDOI
TL;DR: The behavior of Bayesian model selection when the compared models are misspecified is characterized and it is demonstrated that when the models are nearly equally wrong, the method exhibits unpleasant polarized behaviors, supporting one model with high confidence while rejecting others.
Abstract: The Bayesian method is noted to produce spuriously high posterior probabilities for phylogenetic trees in analysis of large datasets, but the precise reasons for this overconfidence are unknown. In general, the performance of Bayesian selection of misspecified models is poorly understood, even though this is of great scientific interest since models are never true in real data analysis. Here we characterize the asymptotic behavior of Bayesian model selection and show that when the competing models are equally wrong, Bayesian model selection exhibits surprising and polarized behaviors in large datasets, supporting one model with full force while rejecting the others. If one model is slightly less wrong than the other, the less wrong model will eventually win when the amount of data increases, but the method may become overconfident before it becomes reliable. We suggest that this extreme behavior may be a major factor for the spuriously high posterior probabilities for evolutionary trees. The philosophical implications of our results to the application of Bayesian model selection to evaluate opposing scientific hypotheses are yet to be explored, as are the behaviors of non-Bayesian methods in similar situations.

Journal ArticleDOI
TL;DR: It is found that subjects do take sensory uncertainty into account when reporting confidence, suggesting that brain areas involved in reporting confidence can access low-level representations of sensory uncertainty, a prerequisite of Bayesian inference.
Abstract: Humans can meaningfully report their confidence in a perceptual or cognitive decision. It is widely believed that these reports reflect the Bayesian probability that the decision is correct, but this hypothesis has not been rigorously tested against non-Bayesian alternatives. We use two perceptual categorization tasks in which Bayesian confidence reporting requires subjects to take sensory uncertainty into account in a specific way. We find that subjects do take sensory uncertainty into account when reporting confidence, suggesting that brain areas involved in reporting confidence can access low-level representations of sensory uncertainty, a prerequisite of Bayesian inference. However, behavior is not fully consistent with the Bayesian hypothesis and is better described by simple heuristic models that use uncertainty in a non-Bayesian way. Both conclusions are robust to changes in the uncertainty manipulation, task, response modality, model comparison metric, and additional flexibility in the Bayesian model. Our results suggest that adhering to a rational account of confidence behavior may require incorporating implementational constraints.

Posted Content
TL;DR: In this article, two innovations are proposed: a deterministic method to approximate moments in neural networks, which eliminates gradient variance, and a hierarchical prior for parameters with a novel empirical Bayes procedure for automatically selecting prior variances; the resulting method is highly efficient and robust.
Abstract: Bayesian neural networks (BNNs) hold great promise as a flexible and principled solution to deal with uncertainty when learning from finite data. Among approaches to realize probabilistic inference in deep neural networks, variational Bayes (VB) is theoretically grounded, generally applicable, and computationally efficient. With wide recognition of potential advantages, why is it that variational Bayes has seen very limited practical use for BNNs in real applications? We argue that variational inference in neural networks is fragile: successful implementations require careful initialization and tuning of prior variances, as well as controlling the variance of Monte Carlo gradient estimates. We provide two innovations that aim to turn VB into a robust inference tool for Bayesian neural networks: first, we introduce a novel deterministic method to approximate moments in neural networks, eliminating gradient variance; second, we introduce a hierarchical prior for parameters and a novel Empirical Bayes procedure for automatically selecting prior variances. Combining these two innovations, the resulting method is highly efficient and robust. On the application of heteroscedastic regression we demonstrate good predictive performance over alternative approaches.

Journal ArticleDOI
TL;DR: In this article, the authors develop alternatives to Markov chain Monte Carlo implementations of Bayesian synthetic likelihoods with reduced computational overheads, using stochastic gradient variational inference methods for posterior approximation in the synthetic likelihood context.
Abstract: Synthetic likelihood is an attractive approach to likelihood-free inference when an approximately Gaussian summary statistic for the data, informative for inference about the parameters, is available. The synthetic likelihood method derives an approximate likelihood function from a plug-in normal density estimate for the summary statistic, with plug-in mean and covariance matrix obtained by Monte Carlo simulation from the model. In this article, we develop alternatives to Markov chain Monte Carlo implementations of Bayesian synthetic likelihoods with reduced computational overheads. Our approach uses stochastic gradient variational inference methods for posterior approximation in the synthetic likelihood context, employing unbiased estimates of the log likelihood. We compare the new method with a related likelihood-free variational inference technique in the literature, while at the same time improving the implementation of that approach in a number of ways. These new algorithms are feasible to implement in situations which are challenging for conventional approximate Bayesian computation methods, in terms of the dimensionality of the parameter and summary statistic.
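The synthetic-likelihood evaluation on which these (MCMC or variational) schemes are built can be sketched in a few lines: simulate summary statistics from the model at a parameter value, fit a Gaussian by plug-in mean and covariance, and score the observed summary under it. The toy model, summaries, and observed values below are made up for illustration.

```python
import numpy as np
from scipy import stats

def synthetic_loglik(theta, observed_summary, simulate, n_sims=200, seed=0):
    """Gaussian synthetic log-likelihood of an observed summary statistic."""
    rng = np.random.default_rng(seed)
    sims = np.array([simulate(theta, rng) for _ in range(n_sims)])
    mu = sims.mean(axis=0)
    cov = np.cov(sims, rowvar=False) + 1e-8 * np.eye(sims.shape[1])
    return stats.multivariate_normal.logpdf(observed_summary, mu, cov)

# Toy model: data are N(theta, 1); summaries are the sample mean and variance.
def simulate(theta, rng, n=100):
    y = rng.normal(theta, 1.0, size=n)
    return np.array([y.mean(), y.var()])

observed = np.array([0.9, 1.1])    # hypothetical observed summaries
for theta in (0.0, 0.5, 1.0):
    print(f"theta = {theta}: synthetic log-likelihood = "
          f"{synthetic_loglik(theta, observed, simulate):.2f}")
```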

Posted Content
TL;DR: In this paper, the authors identify a common family of acquisition functions, including EI and UCB, whose properties not only facilitate but justify use of greedy approaches for their maximization.
Abstract: Bayesian optimization is a sample-efficient approach to global optimization that relies on theoretically motivated value heuristics (acquisition functions) to guide its search process. Fully maximizing acquisition functions produces the Bayes' decision rule, but this ideal is difficult to achieve since these functions are frequently non-trivial to optimize. This statement is especially true when evaluating queries in parallel, where acquisition functions are routinely non-convex, high-dimensional, and intractable. We first show that acquisition functions estimated via Monte Carlo integration are consistently amenable to gradient-based optimization. Subsequently, we identify a common family of acquisition functions, including EI and UCB, whose properties not only facilitate but justify use of greedy approaches for their maximization.

Journal ArticleDOI
TL;DR: A framework for coalescent-based phylogenetic and phylodynamic inference is presented that enables highly flexible modeling of demographic and epidemiological processes; a flexible markup language is described for translating parametric demographic or epidemiological models into a structured coalescent model, enabling simultaneous estimation of demographic or epidemiological parameters and time-scaled phylogenies.
Abstract: Population genetic modeling can enhance Bayesian phylogenetic inference by providing a realistic prior on the distribution of branch lengths and times of common ancestry. The parameters of a population genetic model may also have intrinsic importance, and simultaneous estimation of a phylogeny and model parameters has enabled phylodynamic inference of population growth rates, reproduction numbers, and effective population size through time. Phylodynamic inference based on pathogen genetic sequence data has emerged as useful supplement to epidemic surveillance, however commonly-used mechanistic models that are typically fitted to non-genetic surveillance data are rarely fitted to pathogen genetic data due to a dearth of software tools, and the theory required to conduct such inference has been developed only recently. We present a framework for coalescent-based phylogenetic and phylodynamic inference which enables highly-flexible modeling of demographic and epidemiological processes. This approach builds upon previous structured coalescent approaches and includes enhancements for computational speed, accuracy, and stability. A flexible markup language is described for translating parametric demographic or epidemiological models into a structured coalescent model enabling simultaneous estimation of demographic or epidemiological parameters and time-scaled phylogenies. We demonstrate the utility of these approaches by fitting compartmental epidemiological models to Ebola virus and Influenza A virus sequence data, demonstrating how important features of these epidemics, such as the reproduction number and epidemic curves, can be gleaned from genetic data. These approaches are provided as an open-source package PhyDyn for the BEAST2 phylogenetics platform.

Journal ArticleDOI
TL;DR: Mean-field variational Bayes (MFVB) is an approximate Bayesian posterior inference technique that is increasingly popular due to its fast runtimes on large-scale data sets.
Abstract: Mean-field Variational Bayes (MFVB) is an approximate Bayesian posterior inference technique that is increasingly popular due to its fast runtimes on large-scale data sets. However, even when MFVB ...

Proceedings ArticleDOI
15 Aug 2018
TL;DR: The main contribution shows that the mean-squared error (MSE) of ML-VAMP can be exactly predicted in a certain large system limit and matches the Bayes optimal value recently postulated by Reeves when certain fixed point equations have unique solutions.
Abstract: Deep generative networks provide a powerful tool for modeling complex data in a wide range of applications. In inverse problems that use these networks as generative priors on data, one must often perform inference of the inputs of the networks from the outputs. Inference is also required for sampling during stochastic training of these generative models. This paper considers inference in a deep stochastic neural network where the parameters (e.g., weights, biases and activation functions) are known and the problem is to estimate the values of the input and hidden units from the output. A novel and computationally tractable inference method called Multi-Layer Vector Approximate Message Passing (ML-VAMP) is presented. Our main contribution shows that the mean-squared error (MSE) of ML-VAMP can be exactly predicted in a certain large system limit. In addition, the MSE achieved by ML-VAMP matches the Bayes optimal value recently postulated by Reeves when certain fixed point equations have unique solutions.

Journal ArticleDOI
TL;DR: This paper explores the extension of Bayesian networks (BN) with interval probabilities to the modelling of maritime accidents, which allows for the quantification of epistemic uncertainty.

Journal ArticleDOI
TL;DR: The main finding is that the rankings produced by the NB and GFMNB-2 models for hotspot identification are often quite different, which was especially noticeable with the Texas dataset.
Abstract: The empirical Bayes (EB) method is commonly used by transportation safety analysts for conducting different types of safety analyses, such as before–after studies and hotspot analyses. To date, mos...
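The EB adjustment at the heart of such analyses combines the crash frequency predicted by a safety performance function with the count observed at the site, using the standard negative-binomial shrinkage weight (Hauer's formula). The numbers below are made up for illustration.

```python
def empirical_bayes_estimate(observed, predicted, phi):
    """EB-adjusted expected crash count at a site.

    predicted: safety performance function (e.g., negative binomial model) prediction,
    phi: NB inverse-dispersion parameter, so that Var = mu + mu**2 / phi.
    The weight w shrinks the observed count toward the model prediction.
    """
    w = 1.0 / (1.0 + predicted / phi)
    return w * predicted + (1.0 - w) * observed

# Hypothetical site: the SPF predicts 4.0 crashes/yr but 9 crashes were observed.
print(f"EB estimate = {empirical_bayes_estimate(observed=9, predicted=4.0, phi=2.0):.2f}")
```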

Journal ArticleDOI
TL;DR: This work proposes a class of models for nonmonotone missing data mechanisms that spans the MAR model while allowing the underlying full data law to remain unrestricted, and introduces an unconstrained maximum likelihood estimator for the missing data probabilities that is easily implemented using existing software.
Abstract: The development of coherent missing data models to account for nonmonotone missing at random (MAR) data by inverse probability weighting (IPW) remains to date largely unresolved. As a consequence, IPW has essentially been restricted for use only in monotone MAR settings. We propose a class of models for nonmonotone missing data mechanisms that spans the MAR model, while allowing the underlying full data law to remain unrestricted. For parametric specifications within the proposed class, we introduce an unconstrained maximum likelihood estimator for estimating the missing data probabilities which is easily implemented using existing software. To circumvent potential convergence issues with this procedure, we also introduce a constrained Bayesian approach to estimate the missing data process which is guaranteed to yield inferences that respect all model restrictions. The efficiency of standard IPW estimation is improved by incorporating information from incomplete cases through an augmented estimati...

Posted Content
TL;DR: Bayesian model reduction is reviewed, together with structure learning and hierarchical or empirical Bayes, which can be regarded as a metaphor for neurobiological processes like abductive reasoning.
Abstract: This paper reviews recent developments in statistical structure learning; namely, Bayesian model reduction. Bayesian model reduction is a method for rapidly computing the evidence and parameters of probabilistic models that differ only in their priors. In the setting of variational Bayes this has an analytical solution, which finesses the problem of scoring large model spaces in model comparison or structure learning. In this technical note, we review Bayesian model reduction and provide the relevant equations for several discrete and continuous probability distributions. We provide worked examples in the context of multivariate linear regression, Gaussian mixture models and dynamical systems (dynamic causal modelling). These examples are accompanied by the Matlab scripts necessary to reproduce the results. Finally, we briefly review recent applications in the fields of neuroimaging and neuroscience. Specifically, we consider structure learning and hierarchical or empirical Bayes that can be regarded as a metaphor for neurobiological processes like abductive reasoning.
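The identity that makes Bayesian model reduction work, namely that the evidence ratio between a reduced-prior model and the full model equals the posterior expectation (under the full model) of the ratio of reduced to full priors, can be checked numerically on a conjugate Gaussian example. The model and all numbers below are arbitrary illustrative choices; the Matlab implementations referenced in the paper use the analytic Gaussian form of this identity rather than Monte Carlo.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# 1-D conjugate example: y_i ~ N(theta, sigma^2), full prior theta ~ N(0, tau_full^2),
# reduced prior theta ~ N(0, tau_red^2) (a nearly "switched off" parameter).
sigma, tau_full, tau_red, n = 1.0, 2.0, 0.1, 20
y = rng.normal(0.3, sigma, size=n)

def log_evidence(tau):
    """Exact log marginal likelihood for the Gaussian-Gaussian model."""
    ybar = y.mean()
    lm = stats.norm.logpdf(ybar, 0.0, np.sqrt(tau**2 + sigma**2 / n))
    resid = np.sum((y - ybar) ** 2)
    const = (-(n - 1) / 2 * np.log(2 * np.pi * sigma**2)
             - resid / (2 * sigma**2) - 0.5 * np.log(n))
    return lm + const

exact_log_ratio = log_evidence(tau_red) - log_evidence(tau_full)

# Bayesian model reduction: score the reduced prior using only the FULL posterior.
post_var = 1.0 / (n / sigma**2 + 1.0 / tau_full**2)
post_mean = post_var * (y.sum() / sigma**2)
theta = rng.normal(post_mean, np.sqrt(post_var), size=200_000)
ratio = stats.norm.pdf(theta, 0, tau_red) / stats.norm.pdf(theta, 0, tau_full)
bmr_log_ratio = np.log(ratio.mean())

print(f"exact log evidence ratio:            {exact_log_ratio:.3f}")
print(f"BMR (posterior reweighting) estimate: {bmr_log_ratio:.3f}")
```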

Journal ArticleDOI
TL;DR: This work proposes a Bayesian generative classifier based on Gaussian mixture models to assign proteins probabilistically to sub-cellular niches, so that proteins have a probability distribution over sub-cellular locations, with Bayesian computation performed using the expectation-maximisation (EM) algorithm as well as Markov-chain Monte-Carlo (MCMC).
Abstract: Analysis of the spatial sub-cellular distribution of proteins is of vital importance to fully understand context specific protein function. Some proteins can be found with a single location within a cell, but up to half of proteins may reside in multiple locations, can dynamically re-localise, or reside within an unknown functional compartment. These considerations lead to uncertainty in associating a protein to a single location. Currently, mass spectrometry (MS) based spatial proteomics relies on supervised machine learning algorithms to assign proteins to sub-cellular locations based on common gradient profiles. However, such methods fail to quantify uncertainty associated with sub-cellular class assignment. Here we reformulate the framework on which we perform statistical analysis. We propose a Bayesian generative classifier based on Gaussian mixture models to assign proteins probabilistically to sub-cellular niches, thus proteins have a probability distribution over sub-cellular locations, with Bayesian computation performed using the expectation-maximisation (EM) algorithm, as well as Markov-chain Monte-Carlo (MCMC). Our methodology allows proteome-wide uncertainty quantification, thus adding a further layer to the analysis of spatial proteomics. Our framework is flexible, allowing many different systems to be analysed and reveals new modelling opportunities for spatial proteomics. We find our methods perform competitively with current state-of-the art machine learning methods, whilst simultaneously providing more information. We highlight several examples where classification based on the support vector machine is unable to make any conclusions, while uncertainty quantification using our approach provides biologically intriguing results. To our knowledge this is the first Bayesian model of MS-based spatial proteomics data.
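The idea of a generative classifier that returns a probability distribution over locations, rather than a hard assignment, can be sketched with a simple supervised Gaussian class-conditional model. This is a much-simplified stand-in for the paper's semi-supervised mixture model with full Bayesian computation; the "organelle" profiles below are synthetic.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-in for protein profiles: 3 "organelle" classes measured in 4 fractions.
means = np.array([[0.6, 0.2, 0.1, 0.1],
                  [0.1, 0.5, 0.3, 0.1],
                  [0.1, 0.1, 0.3, 0.5]])
X = np.vstack([rng.normal(m, 0.08, size=(100, 4)) for m in means])
y = np.repeat([0, 1, 2], 100)

# Fit class-conditional Gaussians (a plain supervised generative classifier).
classes = np.unique(y)
priors = np.array([np.mean(y == k) for k in classes])
fits = [stats.multivariate_normal(X[y == k].mean(axis=0),
                                  np.cov(X[y == k], rowvar=False) + 1e-6 * np.eye(4))
        for k in classes]

def class_posteriors(x):
    """Posterior probability of each class for a new profile x (Bayes' rule)."""
    log_joint = np.array([f.logpdf(x) + np.log(p) for f, p in zip(fits, priors)])
    log_joint -= log_joint.max()                  # stabilize before exponentiating
    post = np.exp(log_joint)
    return post / post.sum()

# A profile lying between classes 1 and 2 receives an uncertain, spread-out assignment.
print(np.round(class_posteriors(np.array([0.1, 0.3, 0.3, 0.3])), 3))
```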

Posted Content
TL;DR: A novel uncertainty estimation method for classification tasks with Bayesian convolutional neural networks and variational inference is introduced; by normalizing the output of a Softplus function in the final layer, aleatoric and epistemic uncertainty are estimated in a coherent manner.
Abstract: We introduce a novel uncertainty estimation for classification tasks for Bayesian convolutional neural networks with variational inference. By normalizing the output of a Softplus function in the final layer, we estimate aleatoric and epistemic uncertainty in a coherent manner. The intractable posterior probability distributions over weights are inferred by Bayes by Backprop. Firstly, we demonstrate how this reliable variational inference method can serve as a fundamental construct for various network architectures. On multiple datasets in supervised learning settings (MNIST, CIFAR-10, CIFAR-100), this variational inference method achieves performances equivalent to frequentist inference in identical architectures, while the two desiderata, a measure for uncertainty and regularization are incorporated naturally. Secondly, we examine how our proposed measure for aleatoric and epistemic uncertainties is derived and validate it on the aforementioned datasets.
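Given class-probability vectors from repeated stochastic forward passes (weight samples from the variational posterior), one covariance-based decomposition used in this line of work splits predictive uncertainty into an aleatoric and an epistemic term. The sketch below applies that decomposition to softplus-normalized outputs as described in the abstract; the logits and the number of samples are hypothetical, and the decomposition shown is a common convention rather than necessarily the paper's exact estimator.

```python
import numpy as np

def softplus_normalize(logits):
    """Map network outputs to class probabilities by normalizing a softplus."""
    s = np.log1p(np.exp(logits))
    return s / s.sum(axis=-1, keepdims=True)

def uncertainty_decomposition(prob_samples):
    """Split predictive uncertainty into aleatoric and epistemic parts.

    prob_samples: (T, K) class-probability vectors from T stochastic forward
    passes. Returns the traces of the two covariance terms: the mean of
    diag(p) - p p^T (aleatoric) and the covariance of p across samples (epistemic).
    """
    p_bar = prob_samples.mean(axis=0)
    aleatoric = np.mean([np.diag(p) - np.outer(p, p) for p in prob_samples], axis=0)
    epistemic = np.mean([np.outer(p - p_bar, p - p_bar) for p in prob_samples], axis=0)
    return np.trace(aleatoric), np.trace(epistemic)

# Hypothetical logits from T = 50 sampled weight configurations for one input.
rng = np.random.default_rng(0)
logits = rng.normal([2.0, 0.5, 0.3], 0.6, size=(50, 3))
probs = softplus_normalize(logits)
alea, epi = uncertainty_decomposition(probs)
print(f"aleatoric ~ {alea:.3f}, epistemic ~ {epi:.3f}")
```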