
Showing papers on "Bayes' theorem published in 2019"


Journal ArticleDOI
TL;DR: PRS-CS is presented, a polygenic prediction method that infers posterior effect sizes of single nucleotide polymorphisms (SNPs) using genome-wide association summary statistics and an external linkage disequilibrium (LD) reference panel.
Abstract: Polygenic risk scores (PRS) have shown promise in predicting human complex traits and diseases. Here, we present PRS-CS, a polygenic prediction method that infers posterior effect sizes of single nucleotide polymorphisms (SNPs) using genome-wide association summary statistics and an external linkage disequilibrium (LD) reference panel. PRS-CS utilizes a high-dimensional Bayesian regression framework, and is distinct from previous work by placing a continuous shrinkage (CS) prior on SNP effect sizes, which is robust to varying genetic architectures, provides substantial computational advantages, and enables multivariate modeling of local LD patterns. Simulation studies using data from the UK Biobank show that PRS-CS outperforms existing methods across a wide range of genetic architectures, especially when the training sample size is large. We apply PRS-CS to predict six common complex diseases and six quantitative traits in the Partners HealthCare Biobank, and further demonstrate the improvement of PRS-CS in prediction accuracy over alternative methods.
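As a quick orientation for readers unfamiliar with polygenic scores, the sketch below shows only the final scoring step: once posterior SNP effect sizes are available (the quantity PRS-CS infers), an individual's score is a dosage-weighted sum over SNPs. The shrinkage prior and LD modeling are not reproduced here, and all arrays are simulated placeholders.

```python
# Minimal sketch (not the PRS-CS algorithm itself): given posterior SNP effect
# sizes, a polygenic risk score is the dosage-weighted sum across SNPs.
import numpy as np

rng = np.random.default_rng(0)
n_individuals, n_snps = 5, 1000

# Allele dosages in {0, 1, 2}; illustrative random data.
dosages = rng.integers(0, 3, size=(n_individuals, n_snps))

# Posterior mean effect sizes (in PRS-CS these come from the continuous
# shrinkage prior applied to GWAS summary statistics); here just simulated.
posterior_betas = rng.normal(0.0, 0.01, size=n_snps)

prs = dosages @ posterior_betas   # one score per individual
print(prs)
```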

681 citations


Journal ArticleDOI
TL;DR: The proposed method can find the best hyperparameters for widely used machine learning models, such as the random forest algorithm, neural networks, and even the multi-grained cascade forest, while taking time cost into consideration.

496 citations


Journal ArticleDOI
TL;DR: A powerful individual-level data Bayesian multiple regression model (BayesR) is extended to one that utilises summary statistics from genome-wide association studies (GWAS) and it outperforms other summary statistic-based methods.
Abstract: Accurate prediction of an individual’s phenotype from their DNA sequence is one of the great promises of genomics and precision medicine. We extend a powerful individual-level data Bayesian multiple regression model (BayesR) to one that utilises summary statistics from genome-wide association studies (GWAS), SBayesR. In simulation and cross-validation using 12 real traits and 1.1 million variants on 350,000 individuals from the UK Biobank, SBayesR improves prediction accuracy relative to commonly used state-of-the-art summary statistics methods at a fraction of the computational resources. Furthermore, using summary statistics for variants from the largest GWAS meta-analysis (n ≈ 700,000) on height and BMI, we show that, on average across traits and two independent data sets, SBayesR improves prediction R2 by 5.2% relative to LDpred and by 26.5% relative to clumping and p value thresholding. Various approaches are being used for polygenic prediction including Bayesian multiple regression methods that require access to individual-level genotype data. Here, the authors extend BayesR to utilise GWAS summary statistics (SBayesR) and show that it outperforms other summary statistic-based methods.

274 citations


Journal ArticleDOI
TL;DR: This paper provides a worked example of using Dynamic Causal Modelling (DCM) and Parametric Empirical Bayes (PEB) to characterise inter-subject variability in neural circuitry (effective connectivity) and provides a tutorial style explanation of the underlying theory and assumptions.

220 citations


Journal ArticleDOI
TL;DR: It is proved that the VB posterior converges to the Kullback–Leibler (KL) minimizer of a normal distribution centered at the truth, and that the corresponding variational expectation of the parameter is consistent and asymptotically normal.
Abstract: A key challenge for modern Bayesian statistics is how to perform scalable inference of posterior distributions. To address this challenge, variational Bayes (VB) methods have emerged as a popular a...

167 citations


Journal ArticleDOI
TL;DR: This work proposes to conduct likelihood-free Bayesian inference about parameters with no prior selection of the relevant components of the summary statistics, bypassing the derivation of the associated tolerance level, by using the random forest methodology of Breiman (2001).
Abstract: Approximate Bayesian Computation (ABC) has grown into a standard methodology to handle Bayesian inference in models associated with intractable likelihood functions. Most ABC implementations require the selection of a summary statistic as the data itself is too large or too complex to be compared to simulated realisations from the assumed model. The dimension of this statistic is generally constrained to be close to the dimension of the model parameter for efficiency reasons. Furthermore, the tolerance level that governs the acceptance or rejection of parameter values needs to be calibrated and the range of calibration techniques available so far is mostly based on asymptotic arguments. We propose here to conduct Bayesian inference based on an arbitrarily large vector of summary statistics without imposing a selection of the relevant components and bypassing the derivation of a tolerance. The approach relies on the random forest methodology of Breiman (2001) when applied to regression. We advocate the derivation of a new random forest for each component of the parameter vector, a tool from which an approximation to the marginal posterior distribution can be derived. Correlations between parameter components are handled by separate random forests. This technology offers significant gains in terms of robustness to the choice of the summary statistics and of computing time, when compared with more standard ABC solutions.
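A minimal sketch of the random-forest regression idea the paper builds on, using a toy model rather than the paper's setup: simulate a reference table of (parameter, summary statistics) pairs, fit one forest per parameter component, and read the prediction at the observed summaries as an approximation to the posterior mean. The model, summaries, and prior below are illustrative assumptions.

```python
# Hedged sketch of the ABC random forest idea: regress the parameter on
# summary statistics computed from simulated data, then predict at the
# observed summaries. Toy normal-mean model, not the paper's examples.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

def simulate_summaries(theta, n=50):
    x = rng.normal(theta, 1.0, size=n)
    return [x.mean(), x.var(), np.median(x)]

# Reference table: draw parameters from the prior, simulate summaries.
thetas = rng.uniform(-5, 5, size=2000)
summaries = np.array([simulate_summaries(t) for t in thetas])

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(summaries, thetas)   # one forest per (scalar) parameter component

observed = np.array(simulate_summaries(1.5)).reshape(1, -1)
print("approximate posterior mean of theta:", rf.predict(observed)[0])
```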

163 citations


Journal ArticleDOI
TL;DR: The authors focus on two flexible modeling approaches, Bayesian and frequentist, for determining overall effect sizes in network meta-analysis, presenting the material so that it is easy to understand for Korean researchers who did not major in statistics.
Abstract: The objective of this study is to describe the general approaches to network meta-analysis that are available for quantitative data synthesis using R software. We conducted a network meta-analysis using two approaches: Bayesian and frequentist methods. The corresponding R packages were "gemtc" for the Bayesian approach and "netmeta" for the frequentist approach. In estimating a network meta-analysis model using a Bayesian framework, the "rjags" package is a common tool. "rjags" implements Markov chain Monte Carlo simulation with a graphical output. The estimated overall effect sizes, test for heterogeneity, moderator effects, and publication bias were reported using R software. The authors focus on two flexible models, Bayesian and frequentist, to determine overall effect sizes in network meta-analysis. This study focused on the practical methods of network meta-analysis rather than theoretical concepts, making the material easy to understand for Korean researchers who did not major in statistics. The authors hope that this study will help many Korean researchers to perform network meta-analyses and conduct related research more easily with R software.
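For readers who want to see the arithmetic that frequentist meta-analysis builds on, here is a hedged sketch of simple inverse-variance pooling for one pairwise comparison; it is not a replacement for the "gemtc" or "netmeta" workflows described in the abstract, and the study effects are made up.

```python
# Illustrative sketch only: basic fixed-effect inverse-variance pooling for a
# single pairwise comparison, the building block of frequentist meta-analysis.
import numpy as np

log_or = np.array([0.30, 0.10, 0.45])   # per-study log odds ratios (made up)
se = np.array([0.15, 0.20, 0.25])       # their standard errors (made up)

weights = 1.0 / se**2                   # inverse-variance weights
pooled = np.sum(weights * log_or) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))

print(f"pooled log OR = {pooled:.3f} (SE {pooled_se:.3f})")
print(f"95% CI: [{pooled - 1.96*pooled_se:.3f}, {pooled + 1.96*pooled_se:.3f}]")
```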

154 citations


Posted Content
TL;DR: This paper derives the optimal strategy for membership inference with a few assumptions on the distribution of the parameters, and shows that optimal attacks only depend on the loss function, and thus black-box attacks are as good as white-box attacks.
Abstract: Membership inference determines, given a sample and trained parameters of a machine learning model, whether the sample was part of the training set. In this paper, we derive the optimal strategy for membership inference with a few assumptions on the distribution of the parameters. We show that optimal attacks only depend on the loss function, and thus black-box attacks are as good as white-box attacks. As the optimal strategy is not tractable, we provide approximations of it leading to several inference methods, and show that existing membership inference methods are coarser approximations of this optimal strategy. Our membership attacks outperform the state of the art in various settings, ranging from a simple logistic regression to more complex architectures and datasets, such as ResNet-101 and Imagenet.
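A minimal sketch of the loss-thresholding approximation the paper motivates: because the optimal attack depends only on the loss, a simple attack predicts "member" when the per-sample loss falls below a threshold. The losses and threshold below are illustrative, not taken from the paper.

```python
# Sketch of a loss-thresholding membership attack: members of the training set
# typically incur lower loss than held-out samples.
import numpy as np

def membership_attack(per_sample_loss, threshold):
    """Return True for samples predicted to be in the training set."""
    return per_sample_loss < threshold

losses_train = np.array([0.05, 0.10, 0.02])   # typically small on members
losses_test = np.array([0.80, 1.20, 0.40])    # typically larger on non-members
threshold = 0.3

print(membership_attack(losses_train, threshold))  # mostly True
print(membership_attack(losses_test, threshold))   # mostly False
```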

141 citations


Book ChapterDOI
10 Jun 2019
TL;DR: In this chapter, the authors deal with likelihood inference for spatial point processes based on Markov chain Monte Carlo (MCMC), using the methods of R. A. Moyeed and A. J. Baddeley, C. J. Geyer and E. A. Thompson, A. E. Gelfand and B. P. Carlin, C. J. Geyer, and C. J. Geyer and J. Moller.
Abstract: This chapter deals with likelihood inference for spatial point processes using the methods of R. A. Moyeed and A. J. Baddeley, C. J. Geyer and E. A. Thompson, A. E. Gelfand and B. P. Carlin, C. J. Geyer, and C. J. Geyer and J. Moller, based on Markov chain Monte Carlo (MCMC). The MCMC, including the Gibbs sampler and the Metropolis, the Metropolis–Hastings, and the Metropolis–Hastings–Green algorithms, permits the simulation of any stochastic process specified by an unnormalized density. Thus the family of unnormalized densities is involved in both conditional likelihood inference and likelihood inference with missing data. Latent variables, random effects, and ordinary empirical Bayes models all involve missing data of some form. Missing data involve the same considerations as conditional families. The oldest general class of models specified by unnormalized densities is that of exponential families. Models specified by unnormalized densities present a problem for Bayesian inference.
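Since the chapter's central point is that MCMC needs the target density only up to a normalizing constant, a minimal random-walk Metropolis sketch may help; the target and tuning constants are illustrative choices, not the chapter's examples.

```python
# Minimal random-walk Metropolis sketch: the acceptance ratio uses the target
# density only up to its (unknown) normalizing constant.
import numpy as np

rng = np.random.default_rng(2)

def unnormalized_density(x):
    return np.exp(-0.5 * x**2) * (1 + 0.5 * np.cos(3 * x))  # unnormalized, > 0

def metropolis(n_samples, step=1.0, x0=0.0):
    samples, x = [], x0
    for _ in range(n_samples):
        proposal = x + step * rng.normal()
        accept_prob = min(1.0, unnormalized_density(proposal) / unnormalized_density(x))
        if rng.uniform() < accept_prob:
            x = proposal
        samples.append(x)
    return np.array(samples)

draws = metropolis(10_000)
print(draws.mean(), draws.std())
```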

127 citations


Posted Content
TL;DR: This paper predicts how certain the model prediction is based on the epistemic and aleatoric uncertainties and empirically shows how the uncertainty can decrease, allowing the decisions made by the network to become more deterministic as the training accuracy increases.
Abstract: Artificial Neural Networks are connectionist systems that perform a given task by learning on examples without having prior knowledge about the task. This is done by finding an optimal point estimate for the weights in every node. Generally, networks using point estimates as weights perform well with large datasets, but they fail to express uncertainty in regions with little or no data, leading to overconfident decisions. In this paper, a Bayesian Convolutional Neural Network (BayesCNN) using Variational Inference is proposed, which introduces a probability distribution over the weights. Furthermore, the proposed BayesCNN architecture is applied to tasks like Image Classification, Image Super-Resolution and Generative Adversarial Networks. The results are compared to point-estimate based architectures on the MNIST, CIFAR-10 and CIFAR-100 datasets for the Image Classification task, on the BSD300 dataset for the Image Super-Resolution task and on the CIFAR-10 dataset again for the Generative Adversarial Network task. BayesCNN is based on Bayes by Backprop which derives a variational approximation to the true posterior. We, therefore, introduce the idea of applying two convolutional operations, one for the mean and one for the variance. Our proposed method not only achieves performance equivalent to frequentist inference in identical architectures but also incorporates a measure of uncertainty and regularisation. It further eliminates the use of dropout in the model. Moreover, we predict how certain the model prediction is based on the epistemic and aleatoric uncertainties and empirically show how the uncertainty can decrease, allowing the decisions made by the network to become more deterministic as the training accuracy increases. Finally, we propose ways to prune the Bayesian architecture and to make it more computationally and time efficient.
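To make the "distribution over weights" idea concrete, here is a hedged sketch of the reparameterized weight sampling behind Bayes by Backprop, with a dense layer standing in for the paper's mean/variance convolutions; all shapes and numbers are illustrative.

```python
# Sketch of the weight-uncertainty idea behind Bayes by Backprop: each weight
# has a learned mean and a softplus-transformed scale, and every forward pass
# samples weights via the reparameterization trick.
import numpy as np

rng = np.random.default_rng(3)
in_dim, out_dim = 4, 3

mu = rng.normal(0, 0.1, size=(in_dim, out_dim))      # variational means
rho = rng.normal(-3, 0.1, size=(in_dim, out_dim))    # pre-softplus scales
sigma = np.log1p(np.exp(rho))                        # softplus -> positive std

def stochastic_forward(x):
    eps = rng.normal(size=mu.shape)
    w = mu + sigma * eps                             # reparameterized sample
    return x @ w

x = rng.normal(size=(2, in_dim))
print(stochastic_forward(x))                         # differs on every call
print(stochastic_forward(x))
```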

117 citations


Journal ArticleDOI
TL;DR: Here, the authors use Bayesian causal modeling and measures of neural activity to show how the brain dynamically codes for and combines sensory signals to draw causal inferences.
Abstract: Transforming the barrage of sensory signals into a coherent multisensory percept relies on solving the binding problem - deciding whether signals come from a common cause and should be integrated or, instead, segregated. Human observers typically arbitrate between integration and segregation consistent with Bayesian Causal Inference, but the neural mechanisms remain poorly understood. Here, we presented people with audiovisual sequences that varied in the number of flashes and beeps, then combined Bayesian modelling and EEG representational similarity analyses. Our data suggest that the brain initially represents the number of flashes and beeps independently. Later, it computes their numbers by averaging the forced-fusion and segregation estimates weighted by the probabilities of common and independent cause models (i.e. model averaging). Crucially, prestimulus oscillatory alpha power and phase correlate with observers' prior beliefs about the world's causal structure that guide their arbitration between sensory integration and segregation.
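The model-averaging computation described above reduces to a weighted combination of the estimates under the two causal hypotheses; the sketch below spells that out with placeholder numbers rather than fitted values.

```python
# Sketch of Bayesian model averaging over causal structures: the final estimate
# weights the forced-fusion and segregation estimates by the posterior
# probabilities of the common-cause and independent-cause models.
p_common = 0.7                    # posterior probability of a common cause
fused_estimate = 3.2              # estimate under forced fusion
segregated_estimate = 4.0         # estimate under full segregation

model_averaged = p_common * fused_estimate + (1 - p_common) * segregated_estimate
print(model_averaged)             # 3.44
```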

Proceedings Article
01 Jan 2019
TL;DR: This work introduces a novel deterministic method to approximate moments in neural networks, eliminating gradient variance, together with a hierarchical prior for parameters and a novel Empirical Bayes procedure for automatically selecting prior variances, and demonstrates good predictive performance over alternative approaches.
Abstract: Bayesian neural networks (BNNs) hold great promise as a flexible and principled solution to deal with uncertainty when learning from finite data. Among approaches to realize probabilistic inference in deep neural networks, variational Bayes (VB) is theoretically grounded, generally applicable, and computationally efficient. With wide recognition of potential advantages, why is it that variational Bayes has seen very limited practical use for BNNs in real applications? We argue that variational inference in neural networks is fragile: successful implementations require careful initialization and tuning of prior variances, as well as controlling the variance of Monte Carlo gradient estimates. We provide two innovations that aim to turn VB into a robust inference tool for Bayesian neural networks: first, we introduce a novel deterministic method to approximate moments in neural networks, eliminating gradient variance; second, we introduce a hierarchical prior for parameters and a novel Empirical Bayes procedure for automatically selecting prior variances. Combining these two innovations, the resulting method is highly efficient and robust. On the application of heteroscedastic regression we demonstrate good predictive performance over alternative approaches.

Journal ArticleDOI
TL;DR: After reading this tutorial and executing the associated code, researchers will be able to use their own data for the evaluation of hypotheses by means of the Bayes factor, not only in the context of ANOVA models, but also in the context of other statistical models.
Abstract: Learning about hypothesis evaluation using the Bayes factor could enhance psychological research. In contrast to null-hypothesis significance testing it renders the evidence in favor of each of the hypotheses under consideration (it can be used to quantify support for the null-hypothesis) instead of a dichotomous reject/do-not-reject decision; it can straightforwardly be used for the evaluation of multiple hypotheses without having to bother about the proper manner to account for multiple testing; and it allows continuous reevaluation of hypotheses after additional data have been collected (Bayesian updating). This tutorial addresses researchers who are considering evaluating their hypotheses by means of the Bayes factor. The focus is completely applied and each topic discussed is illustrated using Bayes factors for the evaluation of hypotheses in the context of an ANOVA model, obtained using the R package bain. Readers can execute all the analyses presented while reading this tutorial if they download bain and the R code used. It is explained in a completely nontechnical manner what the Bayes factor is, how it can be obtained, how Bayes factors should be interpreted, and what can be done with Bayes factors. After reading this tutorial and executing the associated code, researchers will be able to use their own data for the evaluation of hypotheses by means of the Bayes factor, not only in the context of ANOVA models, but also in the context of other statistical models.
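As a purely illustrative aside (this is not the method implemented in the R package bain), a Bayes factor for a point null can be computed in closed form for a normal mean with a conjugate prior via the Savage–Dickey density ratio; the data and prior below are made up.

```python
# Savage-Dickey sketch: for a point null nested in a normal-normal model,
# BF01 = posterior density at 0 / prior density at 0.
import numpy as np
from scipy.stats import norm

prior_mean, prior_sd = 0.0, 1.0
data = np.array([0.3, 0.1, 0.4, 0.2, 0.5])
sigma = 1.0                                   # known data SD

n = len(data)
post_var = 1.0 / (1.0 / prior_sd**2 + n / sigma**2)
post_mean = post_var * (prior_mean / prior_sd**2 + data.sum() / sigma**2)

bf01 = norm.pdf(0, post_mean, np.sqrt(post_var)) / norm.pdf(0, prior_mean, prior_sd)
print("BF01 (evidence for the null):", bf01)
```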

Journal ArticleDOI
TL;DR: This tutorial provides a practical introduction to Bayesian multilevel modeling by reanalyzing a phonetic data set containing formant values for 5 vowels of standard Indonesian, as spoken by 8 speakers, with several repetitions of each vowel.
Abstract: Purpose Bayesian multilevel models are increasingly used to overcome the limitations of frequentist approaches in the analysis of complex structured data. This tutorial introduces Bayesian multilevel modeling for the specific analysis of speech data, using the brms package developed in R. Method In this tutorial, we provide a practical introduction to Bayesian multilevel modeling by reanalyzing a phonetic data set containing formant (F1 and F2) values for 5 vowels of standard Indonesian (ISO 639-3:ind), as spoken by 8 speakers (4 females and 4 males), with several repetitions of each vowel. Results We first give an introductory overview of the Bayesian framework and multilevel modeling. We then show how Bayesian multilevel models can be fitted using the probabilistic programming language Stan and the R package brms, which provides an intuitive formula syntax. Conclusions Through this tutorial, we demonstrate some of the advantages of the Bayesian framework for statistical modeling and provide a detailed c...

Journal ArticleDOI
TL;DR: Risk modeling with BNs has advantages over regression-based approaches; three that are relevant to health outcomes research are discussed: the generation of network structures in which relationships between variables can be easily communicated; the ability to apply Bayes' theorem to conduct individual-level risk estimation; and the easy transformation of BNs into decision models.

Journal ArticleDOI
Xinyu Chen, Zhaocheng He, Yixian Chen, Yuhuan Lu, Jiawei Wang
TL;DR: This study presents a fully Bayesian framework for automatically learning parameters of this model using variational Bayes (VB) and demonstrates that the proposed Bayesian augmented tensor factorization (BATF) model achieves best imputation accuracies and outperforms the state-of-the-art baselines.
Abstract: Spatiotemporal traffic data, which represent multidimensional time series on considering different spatial locations, are ubiquitous in real-world transportation systems. However, the inevitable missing data problem makes data-driven intelligent transportation systems suffer from an incorrect response. Therefore, imputing missing values is of great importance but challenging as it is not easy to capture spatiotemporal traffic patterns, including explicit and latent features. In this study, we propose an augmented tensor factorization model by incorporating generic forms of domain knowledge from transportation systems. Specifically, we present a fully Bayesian framework for automatically learning parameters of this model using variational Bayes (VB). Relying on the publicly available urban traffic speed data set collected in Guangzhou, China, experiments on two types of missing data scenarios (i.e., random and non-random) demonstrate that the proposed Bayesian augmented tensor factorization (BATF) model achieves best imputation accuracies and outperforms the state-of-the-art baselines (e.g., Bayesian tensor factorization models). Besides, we discover interpretable patterns from the experimentally learned global parameter, biases, and latent factors that indeed conform to the dynamic of traffic states.

Journal ArticleDOI
TL;DR: In this paper, the authors consider the contraction property of the fractional posterior in a general misspecified framework and derive a sharp oracle inequality in multivariate convex regression problems, and also illustrate the theory in Gaussian process regression and density estimation problems.
Abstract: We consider the fractional posterior distribution that is obtained by updating a prior distribution via Bayes theorem with a fractional likelihood function, a usual likelihood function raised to a fractional power. First, we analyze the contraction property of the fractional posterior in a general misspecified framework. Our contraction results only require a prior mass condition on certain Kullback–Leibler (KL) neighborhood of the true parameter (or the KL divergence minimizer in the misspecified case), and obviate constructions of test functions and sieves commonly used in the literature for analyzing the contraction property of a regular posterior. We show through a counterexample that some condition controlling the complexity of the parameter space is necessary for the regular posterior to contract, rendering additional flexibility on the choice of the prior for the fractional posterior. Second, we derive a novel Bayesian oracle inequality based on a PAC-Bayes inequality in misspecified models. Our derivation reveals several advantages of averaging based Bayesian procedures over optimization based frequentist procedures. As an application of the Bayesian oracle inequality, we derive a sharp oracle inequality in multivariate convex regression problems. We also illustrate the theory in Gaussian process regression and density estimation problems.
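For reference, the fractional posterior the abstract describes can be written explicitly: the likelihood is tempered by a power α before Bayes' theorem is applied.

```latex
% Fractional posterior: the likelihood is raised to a power alpha in (0,1].
\[
  \pi_\alpha(\theta \mid X_1,\dots,X_n)
  \;=\;
  \frac{\prod_{i=1}^{n} p_\theta(X_i)^{\alpha}\,\pi(\theta)}
       {\int \prod_{i=1}^{n} p_{\theta'}(X_i)^{\alpha}\,\pi(\theta')\,d\theta'},
  \qquad \alpha \in (0,1].
\]
```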

Posted Content
TL;DR: An optimization-centric view on, and a novel generalization of, Bayesian inference is introduced, called the Rule of Three (RoT); the RoT is derived axiomatically and recovers existing posteriors as special cases, including the Bayesian posterior and its approximation by standard VI.
Abstract: We advocate an optimization-centric view on and introduce a novel generalization of Bayesian inference. Our inspiration is the representation of Bayes' rule as an infinite-dimensional optimization problem (Csiszar, 1975; Donsker and Varadhan, 1975; Zellner, 1988). First, we use it to prove an optimality result of standard Variational Inference (VI): Under the proposed view, the standard Evidence Lower Bound (ELBO) maximizing VI posterior is preferable to alternative approximations of the Bayesian posterior. Next, we argue for generalizing standard Bayesian inference. The need for this arises in situations of severe misalignment between reality and three assumptions underlying standard Bayesian inference: (1) Well-specified priors, (2) well-specified likelihoods, (3) the availability of infinite computing power. Our generalization addresses these shortcomings with three arguments and is called the Rule of Three (RoT). We derive it axiomatically and recover existing posteriors as special cases, including the Bayesian posterior and its approximation by standard VI. In contrast, approximations based on alternative ELBO-like objectives violate the axioms. Finally, we study a special case of the RoT that we call Generalized Variational Inference (GVI). GVI posteriors are a large and tractable family of belief distributions specified by three arguments: A loss, a divergence and a variational family. GVI posteriors have appealing properties, including consistency and an interpretation as approximate ELBO. The last part of the paper explores some attractive applications of GVI in popular machine learning models, including robustness and more appropriate marginals. After deriving black-box inference schemes for GVI posteriors, their predictive performance is investigated on Bayesian Neural Networks and Deep Gaussian Processes, where GVI can comprehensively improve upon existing methods.
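The "Bayes' rule as optimization" representation the abstract refers to can be stated compactly: the exact posterior is the minimizer, over all distributions q on the parameter space, of expected negative log-likelihood plus a KL penalty to the prior. GVI is described as replacing the loss, the divergence, and the feasible family in this objective.

```latex
% Variational representation of Bayes' rule (Csiszar; Donsker-Varadhan; Zellner):
\[
  q^{*}(\theta)
  \;=\;
  \operatorname*{arg\,min}_{q}
  \left\{
    \mathbb{E}_{q(\theta)}\!\left[ -\log p(x \mid \theta) \right]
    \;+\;
    \mathrm{KL}\!\left( q(\theta) \,\|\, \pi(\theta) \right)
  \right\}
  \;=\;
  p(\theta \mid x).
\]
```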

Journal ArticleDOI
TL;DR: bayNorm is introduced, a novel Bayesian approach for scaling and inference of scRNA-seq counts that improves accuracy and sensitivity of differential expression analysis and reduces batch effect compared with other existing methods.
Abstract: Motivation Normalization of single-cell RNA-sequencing (scRNA-seq) data is a prerequisite to their interpretation. The marked technical variability, high amounts of missing observations and batch effect typical of scRNA-seq datasets make this task particularly challenging. There is a need for an efficient and unified approach for normalization, imputation and batch effect correction. Results Here, we introduce bayNorm, a novel Bayesian approach for scaling and inference of scRNA-seq counts. The method's likelihood function follows a binomial model of mRNA capture, while priors are estimated from expression values across cells using an empirical Bayes approach. We first validate our assumptions by showing this model can reproduce different statistics observed in real scRNA-seq data. We demonstrate using publicly available scRNA-seq datasets and simulated expression data that bayNorm allows robust imputation of missing values generating realistic transcript distributions that match single molecule fluorescence in situ hybridization measurements. Moreover, by using priors informed by dataset structures, bayNorm improves accuracy and sensitivity of differential expression analysis and reduces batch effect compared with other existing methods. Altogether, bayNorm provides an efficient, integrated solution for global scaling normalization, imputation and true count recovery of gene expression measurements from scRNA-seq data. Availability and implementation The R package 'bayNorm' is published on Bioconductor at https://bioconductor.org/packages/release/bioc/html/bayNorm.html. The code for analyzing data in this article is available at https://github.com/WT215/bayNorm_papercode. Supplementary information Supplementary data are available at Bioinformatics online.

Journal ArticleDOI
TL;DR: In this article, the authors discuss Bayesian inference for the identification of elastoplastic material parameters; in addition to errors in the stress measurements, which are commonly considered, they also consider errors in the strain measurements, since a difference between the model and the experimental data may still be present even if the data are not contaminated by noise.

Proceedings Article
24 May 2019
TL;DR: In this article, the authors derive the optimal strategy for membership inference with a few assumptions on the distribution of the parameters and show that optimal attacks depend only on the loss function, and thus black-box attacks are as good as white-box attacks.
Abstract: Membership inference determines, given a sample and trained parameters of a machine learning model, whether the sample was part of the training set. In this paper, we derive the optimal strategy for membership inference with a few assumptions on the distribution of the parameters. We show that optimal attacks only depend on the loss function, and thus black-box attacks are as good as white-box attacks. As the optimal strategy is not tractable, we provide approximations of it leading to several inference methods, and show that existing membership inference methods are coarser approximations of this optimal strategy. Our membership attacks outperform the state of the art in various settings, ranging from a simple logistic regression to more complex architectures and datasets, such as ResNet-101 and Imagenet.

Journal ArticleDOI
TL;DR: In this paper, the authors compared the performance of different meta-analysis methods, including the DerSimonian-Laird approach, empirically and in a simulation study based on few studies and imbalanced study sizes, considering odds ratio (OR) and risk ratio (RR) effect sizes.
Abstract: Standard random-effects meta-analysis methods perform poorly when applied to few studies only. Such settings however are commonly encountered in practice. It is unclear, whether or to what extent small-sample-size behaviour can be improved by more sophisticated modeling. We consider likelihood-based methods, the DerSimonian-Laird approach, Empirical Bayes, several adjustment methods and a fully Bayesian approach. Confidence intervals are based on a normal approximation, or on adjustments based on the Student-t-distribution. In addition, a linear mixed model and two generalized linear mixed models (GLMMs) assuming binomial or Poisson distributed numbers of events per study arm are considered for pairwise binary meta-analyses. We extract an empirical data set of 40 meta-analyses from recent reviews published by the German Institute for Quality and Efficiency in Health Care (IQWiG). Methods are then compared empirically as well as in a simulation study, based on few studies, imbalanced study sizes, and considering odds-ratio (OR) and risk ratio (RR) effect sizes. Coverage probabilities and interval widths for the combined effect estimate are evaluated to compare the different approaches. Empirically, a majority of the identified meta-analyses include only 2 studies. Variation of methods or effect measures affects the estimation results. In the simulation study, coverage probability is, in the presence of heterogeneity and few studies, mostly below the nominal level for all frequentist methods based on normal approximation, in particular when sizes in meta-analyses are not balanced, but improve when confidence intervals are adjusted. Bayesian methods result in better coverage than the frequentist methods with normal approximation in all scenarios, except for some cases of very large heterogeneity where the coverage is slightly lower. Credible intervals are empirically and in the simulation study wider than unadjusted confidence intervals, but considerably narrower than adjusted ones, with some exceptions when considering RRs and small numbers of patients per trial-arm. Confidence intervals based on the GLMMs are, in general, slightly narrower than those from other frequentist methods. Some methods turned out impractical due to frequent numerical problems. In the presence of between-study heterogeneity, especially with unbalanced study sizes, caution is needed in applying meta-analytical methods to few studies, as either coverage probabilities might be compromised, or intervals are inconclusively wide. Bayesian estimation with a sensibly chosen prior for between-trial heterogeneity may offer a promising compromise.
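For concreteness, the DerSimonian–Laird estimate referred to above can be sketched in a few lines; the study effects and standard errors below are invented, and the normal-approximation confidence interval is exactly the kind the abstract reports as problematic with few studies.

```python
# Sketch of the DerSimonian-Laird random-effects meta-analysis estimate.
import numpy as np

y = np.array([0.2, 0.5, -0.1])        # per-study effect estimates (e.g. log OR)
se = np.array([0.25, 0.30, 0.20])     # their standard errors (made up)

w = 1.0 / se**2                       # fixed-effect (inverse-variance) weights
fe = np.sum(w * y) / np.sum(w)

q = np.sum(w * (y - fe) ** 2)         # Cochran's Q
df = len(y) - 1
c = np.sum(w) - np.sum(w**2) / np.sum(w)
tau2 = max(0.0, (q - df) / c)         # DL between-study variance estimate

w_re = 1.0 / (se**2 + tau2)           # random-effects weights
re = np.sum(w_re * y) / np.sum(w_re)
re_se = np.sqrt(1.0 / np.sum(w_re))
print(f"tau^2 = {tau2:.3f}, pooled effect = {re:.3f} (SE {re_se:.3f})")
```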

Journal ArticleDOI
15 Jan 2019 - Energy
TL;DR: A novel long-term probability forecasting model is proposed to predict the Chinese per-capita electricity consumption and its variation interval over the period 2010–2030; the case study shows that the proposed methodology has higher accuracy and adaptability.

Journal ArticleDOI
TL;DR: The results suggest that pTFCE is more robust to various ground truth shapes and provides a stricter control over cluster “leaking” than TFCE and, in many realistic cases, further improves its sensitivity.


Journal ArticleDOI
TL;DR: It is shown that the brain accomplishes Bayesian causal inference by dynamically encoding multiple spatial estimates in an audiovisual spatial localisation task, and the dynamic evolution of perceptual estimates reflects the hierarchical nature of Bayesian causal inference, a statistical computation, which is crucial for effective interactions with the environment.
Abstract: To form a percept of the environment, the brain needs to solve the binding problem-inferring whether signals come from a common cause and are integrated or come from independent causes and are segregated. Behaviourally, humans solve this problem near-optimally as predicted by Bayesian causal inference; but the neural mechanisms remain unclear. Combining Bayesian modelling, electroencephalography (EEG), and multivariate decoding in an audiovisual spatial localisation task, we show that the brain accomplishes Bayesian causal inference by dynamically encoding multiple spatial estimates. Initially, auditory and visual signal locations are estimated independently; next, an estimate is formed that combines information from vision and audition. Yet, it is only from 200 ms onwards that the brain integrates audiovisual signals weighted by their bottom-up sensory reliabilities and top-down task relevance into spatial priority maps that guide behavioural responses. As predicted by Bayesian causal inference, these spatial priority maps take into account the brain's uncertainty about the world's causal structure and flexibly arbitrate between sensory integration and segregation. The dynamic evolution of perceptual estimates thus reflects the hierarchical nature of Bayesian causal inference, a statistical computation, which is crucial for effective interactions with the environment.

Journal ArticleDOI
TL;DR: This paper provides a counterexample showing that, in general, the claim that maximum a posteriori estimators are a limiting case of Bayes estimators with 0–1 loss is false, and corrects the claim by providing a level-set condition for posterior densities under which the result holds.
Abstract: Maximum a posteriori and Bayes estimators are two common methods of point estimation in Bayesian statistics. It is commonly accepted that maximum a posteriori estimators are a limiting case of Bayes estimators with 0–1 loss. In this paper, we provide a counterexample which shows that in general this claim is false. We then correct the claim by providing a level-set condition for posterior densities such that the result holds. Since both estimators are defined in terms of optimization problems, the tools of variational analysis find a natural application to Bayesian point estimation.
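Spelling out the claim under discussion may help: with 0–1 loss at tolerance ε, the Bayes estimator maximizes the posterior mass of an ε-ball, and the MAP estimator is commonly presented as its ε → 0 limit; the paper's counterexample shows this limit need not hold without a level-set condition on the posterior density.

```latex
% The epsilon-ball Bayes estimator under 0-1 loss, and the MAP estimator it is
% commonly claimed to converge to as epsilon -> 0:
\[
  \hat{\theta}_\varepsilon(x)
  \;=\;
  \operatorname*{arg\,max}_{a}
  \int_{\|\theta - a\| \le \varepsilon} \pi(\theta \mid x)\, d\theta,
  \qquad
  \hat{\theta}_{\mathrm{MAP}}(x)
  \;=\;
  \operatorname*{arg\,max}_{\theta} \pi(\theta \mid x).
\]
```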

Proceedings ArticleDOI
13 May 2019
TL;DR: This work proposes a Bayesian deep learning model that outperforms the state-of-the-art misinformation detection models and develops a Stochastic Gradient Variational Bayes algorithm to approximate the analytically intractable posterior distribution.
Abstract: Social media platforms host a plethora of misinformation, and its potential negative influence on the public is a growing concern. This concern has drawn the attention of the research community to developing mechanisms to detect misinformation. The task of misinformation detection consists of classifying whether a claim is True or False. Most research concentrates on developing machine learning models, such as neural networks, that output a single value in order to predict the veracity of a claim. One of the major problems faced by these models is the inability to represent the uncertainty of the prediction, which is due to incomplete or finite available information about the claim being examined. We address this problem by proposing a Bayesian deep learning model. The Bayesian model outputs a distribution used to represent both the prediction and its uncertainty. In addition to the claim content, we also encode auxiliary information given by people's replies to the claim. First, the model encodes a claim to be verified and generates a prior belief distribution from which we sample a latent variable. Second, the model encodes all the people's replies to the claim in temporal order through a Long Short-Term Memory network in order to summarize their content. This summary is then used to update the prior belief, generating the posterior belief. Moreover, in order to train this model, we develop a Stochastic Gradient Variational Bayes algorithm to approximate the analytically intractable posterior distribution. Experiments conducted on two public datasets demonstrate that our model outperforms the state-of-the-art detection models.

Journal ArticleDOI
TL;DR: A hierarchical Bayesian inference (HBI) framework for concurrent model comparison, parameter estimation and inference at the population level, combining previous approaches, is proposed, and it is shown theoretically and experimentally that this framework has important advantages for both parameter estimation and model comparison.
Abstract: Computational modeling plays an important role in modern neuroscience research. Much previous research has relied on statistical methods, separately, to address two problems that are actually interdependent. First, given a particular computational model, Bayesian hierarchical techniques have been used to estimate individual variation in parameters over a population of subjects, leveraging their population-level distributions. Second, candidate models are themselves compared, and individual variation in the expressed model estimated, according to the fits of the models to each subject. The interdependence between these two problems arises because the relevant population for estimating parameters of a model depends on which other subjects express the model. Here, we propose a hierarchical Bayesian inference (HBI) framework for concurrent model comparison, parameter estimation and inference at the population level, combining previous approaches. We show theoretically and experimentally that this framework has important advantages for both parameter estimation and model comparison. The parameters estimated by the HBI show smaller errors compared to other methods. Model comparison by HBI is robust against outliers and is not biased towards overly simplistic models. Furthermore, the fully Bayesian approach of our theory enables researchers to make inferences on group-level parameters by performing an HBI t-test.

Journal ArticleDOI
TL;DR: This study is the first to compare male and female performance on many basic numerical tasks using both Bayesian and frequentist analyses, suggesting that a male advantage in foundational numerical skills is the exception rather than the rule.
Abstract: This study investigates gender differences in basic numerical skills that are predictive of math achievement. Previous research in this area is inconsistent and has relied upon traditional hypothesis testing, which does not allow for assertive conclusions to be made regarding nonsignificant findings. This study is the first to compare male and female performance (N = 1,391; ages 6-13) on many basic numerical tasks using both Bayesian and frequentist analyses. The results provide strong evidence of gender similarities on the majority of basic numerical tasks measured, suggesting that a male advantage in foundational numerical skills is the exception rather than the rule.