
Showing papers on "Bayesian inference published in 2019"


Proceedings Article
07 Feb 2019
TL;DR: In this article, the authors propose SWA-Gaussian (SWAG), an approach for uncertainty representation and calibration in deep learning that builds on Stochastic Weight Averaging (SWA), which computes the first moment of stochastic gradient descent (SGD) iterates with a modified learning rate schedule.
Abstract: We propose SWA-Gaussian (SWAG), a simple, scalable, and general purpose approach for uncertainty representation and calibration in deep learning. Stochastic Weight Averaging (SWA), which computes the first moment of stochastic gradient descent (SGD) iterates with a modified learning rate schedule, has recently been shown to improve generalization in deep learning. With SWAG, we fit a Gaussian using the SWA solution as the first moment and a low rank plus diagonal covariance also derived from the SGD iterates, forming an approximate posterior distribution over neural network weights; we then sample from this Gaussian distribution to perform Bayesian model averaging. We empirically find that SWAG approximates the shape of the true posterior, in accordance with results describing the stationary distribution of SGD iterates. Moreover, we demonstrate that SWAG performs well on a wide variety of tasks, including out of sample detection, calibration, and transfer learning, in comparison to many popular alternatives including variational inference, MC dropout, KFAC Laplace, and temperature scaling.
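A minimal NumPy sketch of the procedure the abstract describes, written for illustration rather than taken from the authors' code: SGD weight snapshots are reduced to a SWA mean plus a low-rank-plus-diagonal covariance, and weight samples are drawn from the resulting Gaussian for Bayesian model averaging. The array shapes, the rank of 10, and the equal weighting of the two covariance terms are assumptions made for this sketch.

```python
import numpy as np

def swag_moments(iterates, rank=10):
    """iterates: array of shape (T, d) holding flattened SGD weight snapshots."""
    theta = np.asarray(iterates)
    mean = theta.mean(axis=0)                          # SWA solution (first moment)
    sq_mean = (theta ** 2).mean(axis=0)
    diag_var = np.maximum(sq_mean - mean ** 2, 1e-12)  # diagonal second moment
    dev = theta[-rank:] - mean                         # deviation matrix, shape (K, d)
    return mean, diag_var, dev

def swag_sample(mean, diag_var, dev, rng):
    K, d = dev.shape
    z1 = rng.standard_normal(d)
    z2 = rng.standard_normal(K)
    # Sample from N(mean, 0.5*diag + 0.5/(K-1) * dev^T dev)
    return (mean
            + np.sqrt(0.5 * diag_var) * z1
            + dev.T @ z2 / np.sqrt(2.0 * (K - 1)))

rng = np.random.default_rng(0)
snapshots = rng.standard_normal((30, 5))               # stand-in for SGD iterates
mean, var, dev = swag_moments(snapshots, rank=10)
samples = [swag_sample(mean, var, dev, rng) for _ in range(20)]
# Bayesian model averaging would average predictions made with each weight sample.
```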

493 citations


Journal ArticleDOI
TL;DR: Bilby as discussed by the authors is a Bayesian inference library for gravitational-wave astronomy, which provides expert-level parameter estimation infrastructure with straightforward syntax and tools that facilitate use by beginners, allowing users to perform accurate and reliable gravitational-wave parameter estimation on both real, freely available data from LIGO/Virgo and simulated data.
Abstract: Bayesian parameter estimation is fast becoming the language of gravitational-wave astronomy. It is the method by which gravitational-wave data is used to infer the sources' astrophysical properties. We introduce a user-friendly Bayesian inference library for gravitational-wave astronomy, Bilby. This Python code provides expert-level parameter estimation infrastructure with straightforward syntax and tools that facilitate use by beginners. It allows users to perform accurate and reliable gravitational-wave parameter estimation on both real, freely available data from LIGO/Virgo and simulated data. We provide a suite of examples for the analysis of compact binary mergers and other types of signal models, including supernovae and the remnants of binary neutron star mergers. These examples illustrate how to change the signal model, implement new likelihood functions, and add new detectors. Bilby has additional functionality to do population studies using hierarchical Bayesian modeling. We provide an example in which we infer the shape of the black hole mass distribution from an ensemble of observations of binary black hole mergers.
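A hedged sketch of the kind of workflow the abstract describes, following the general pattern in Bilby's documentation (subclass bilby.Likelihood, define a prior dictionary, call bilby.run_sampler). The toy Gaussian model, parameter names, and sampler settings are illustrative assumptions, and argument names may differ between Bilby versions.

```python
import numpy as np
import bilby

data = np.random.normal(loc=1.0, scale=0.5, size=100)   # toy "observations"

class GaussianLikelihood(bilby.Likelihood):
    def __init__(self, data):
        super().__init__(parameters={"mu": None, "sigma": None})
        self.data = data

    def log_likelihood(self):
        mu, sigma = self.parameters["mu"], self.parameters["sigma"]
        return np.sum(-0.5 * ((self.data - mu) / sigma) ** 2
                      - np.log(sigma * np.sqrt(2 * np.pi)))

priors = {
    "mu": bilby.core.prior.Uniform(-5, 5, name="mu"),
    "sigma": bilby.core.prior.Uniform(0.1, 5, name="sigma"),
}

result = bilby.run_sampler(
    likelihood=GaussianLikelihood(data), priors=priors,
    sampler="dynesty", nlive=500, outdir="outdir", label="toy")
result.plot_corner()
```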

442 citations


Journal ArticleDOI
01 Jul 2019-Oikos
TL;DR: Ecologists can and should debate the appropriate form of prior information, but should consider weakly informative priors as the new ‘default’ prior for any Bayesian model.
Abstract: Throughout the last two decades, Bayesian statistical methods have proliferated throughout ecology and evolution. Numerous previous references established both philosophical and computational guidelines for implementing Bayesian methods. However, protocols for incorporating prior information, the defining characteristic of Bayesian philosophy, are nearly nonexistent in the ecological literature. Here, I hope to encourage the use of weakly informative priors in ecology and evolution by providing a ‘consumer's guide’ to weakly informative priors. The first section outlines three reasons why ecologists should abandon noninformative priors: 1) common flat priors are not always noninformative, 2) noninformative priors provide the same result as simpler frequentist methods, and 3) noninformative priors suffer from the same high type I and type M error rates as frequentist methods. The second section provides a guide for implementing informative priors, wherein I detail convenient ‘reference’ prior distributions for common statistical models (i.e. regression, ANOVA, hierarchical models). I then use simulations to visually demonstrate how informative priors influence posterior parameter estimates. With the guidelines provided here, I hope to encourage the use of weakly informative priors for Bayesian analyses in ecology. Ecologists can and should debate the appropriate form of prior information, but should consider weakly informative priors as the new ‘default’ prior for any Bayesian model.
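A small NumPy illustration (not from the paper) of the central point: on a grid over the slope of a simple linear regression with few observations, a weakly informative Normal(0, 1) prior pulls the posterior modestly toward zero relative to a flat prior. The data, prior scale, and fixed residual variance are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(scale=1.0, size=n)           # true slope is 0.5

beta = np.linspace(-5, 5, 2001)                        # grid over the slope
dbeta = beta[1] - beta[0]
# Gaussian likelihood with residual SD fixed at 1 and no intercept, for simplicity
loglik = np.array([np.sum(-0.5 * (y - b * x) ** 2) for b in beta])

def normalise(logp):
    p = np.exp(logp - logp.max())
    return p / (p.sum() * dbeta)

post_flat = normalise(loglik)                          # flat prior on the slope
post_weak = normalise(loglik - 0.5 * beta ** 2)        # weakly informative N(0, 1) prior

for name, post in [("flat", post_flat), ("weakly informative", post_weak)]:
    mean = np.sum(beta * post) * dbeta
    print(f"{name:>20s} prior: posterior mean slope = {mean:.2f}")
```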

246 citations


Journal ArticleDOI
TL;DR: The Bayesian probabilistic matrix factorization model by Salakhutdinov and Mnih is extended to higher-order tensors and applied to spatiotemporal traffic data imputation tasks; experiments show the proposed model can produce accurate imputations even under temporally correlated data corruption.
Abstract: The missing data problem is inevitable when collecting traffic data from intelligent transportation systems. Previous studies have shown the advantages of tensor completion-based approaches in solving multi-dimensional data imputation problems. In this paper, we extend the Bayesian probabilistic matrix factorization model by Salakhutdinov and Mnih (2008) to higher-order tensors and apply it for spatiotemporal traffic data imputation tasks. In doing so, we care about not only the model configuration but also the representation of data (i.e., matrix, third-order tensor and fourth-order tensor). Using a nine-week spatiotemporal traffic speed data set (road segment × day × time of day) collected in Guangzhou, China, we evaluate the performance of this fully Bayesian model and explore how different data representations affect imputation performance through extensive experiments. The results show the proposed model can produce accurate imputations even under temporally correlated data corruption. Our experiments also show that data representation is a crucial factor for model performance, and a third-order tensor structure outperforms the matrix and fourth-order tensor representations in preserving information in our data set. We hope this work could give insights to practitioners when performing spatiotemporal data imputation tasks.
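A simplified sketch of tensor-completion-based imputation in the spirit of the model described above: a plain CP factorization fitted by alternating least squares, with missing entries re-imputed from the current reconstruction at each pass (an EM-style stand-in). The paper's model additionally places priors on the factor matrices and infers them with MCMC; the dimensions, rank, and missing rate below are assumptions.

```python
import numpy as np

def kr(X, Y):
    """Column-wise Khatri-Rao product, rows ordered with Y's index varying fastest."""
    return np.einsum("ir,jr->ijr", X, Y).reshape(-1, X.shape[1])

rng = np.random.default_rng(0)
I, J, K, R = 30, 9, 24, 3                     # segments x days x times-of-day, rank 3
U, V, W = (rng.normal(size=(I, R)), rng.normal(size=(J, R)), rng.normal(size=(K, R)))
truth = np.einsum("ir,jr,kr->ijk", U, V, W)   # synthetic "complete" tensor
mask = rng.random(truth.shape) > 0.3          # ~30% of entries treated as missing
obs = np.where(mask, truth + 0.1 * rng.normal(size=truth.shape), 0.0)

X = np.where(mask, obs, obs[mask].mean())     # initial fill of missing entries
A = rng.normal(size=(I, R)); B = rng.normal(size=(J, R)); C = rng.normal(size=(K, R))
for _ in range(30):
    # Alternating least-squares updates on the three unfoldings of the filled tensor
    A = np.linalg.lstsq(kr(C, B), X.transpose(0, 2, 1).reshape(I, -1).T, rcond=None)[0].T
    B = np.linalg.lstsq(kr(C, A), X.transpose(1, 2, 0).reshape(J, -1).T, rcond=None)[0].T
    C = np.linalg.lstsq(kr(B, A), X.transpose(2, 1, 0).reshape(K, -1).T, rcond=None)[0].T
    recon = np.einsum("ir,jr,kr->ijk", A, B, C)
    X = np.where(mask, obs, recon)            # re-impute missing entries (EM-style)

rmse = np.sqrt(np.mean((recon - truth)[~mask] ** 2))
print(f"imputation RMSE on the held-out (missing) entries: {rmse:.3f}")
```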

207 citations


Journal ArticleDOI
27 Nov 2019
TL;DR: In this article, an alternative summation of the MultiNest draws, called importance nested sampling (INS), is presented, which can calculate the Bayesian evidence at up to an order of magnitude higher accuracy than vanilla NS with no change in the way MultiNest explores the parameter space.
Abstract: Bayesian inference involves two main computational challenges. First, in estimating the parameters of some model for the data, the posterior distribution may well be highly multi-modal: a regime in which the convergence to stationarity of traditional Markov Chain Monte Carlo (MCMC) techniques becomes incredibly slow. Second, in selecting between a set of competing models the necessary estimation of the Bayesian evidence for each is, by definition, a (possibly high-dimensional) integration over the entire parameter space; again this can be a daunting computational task, although new Monte Carlo (MC) integration algorithms offer solutions of ever increasing efficiency. Nested sampling (NS) is one such contemporary MC strategy targeted at calculation of the Bayesian evidence, but which also enables posterior inference as a by-product, thereby allowing simultaneous parameter estimation and model selection. The widely-used MultiNest algorithm presents a particularly efficient implementation of the NS technique for multi-modal posteriors. In this paper we discuss importance nested sampling (INS), an alternative summation of the MultiNest draws, which can calculate the Bayesian evidence at up to an order of magnitude higher accuracy than `vanilla' NS with no change in the way MultiNest explores the parameter space. This is accomplished by treating as a (pseudo-)importance sample the totality of points collected by MultiNest, including those previously discarded under the constrained likelihood sampling of the NS algorithm. We apply this technique to several challenging test problems and compare the accuracy of Bayesian evidences obtained with INS against those from vanilla NS.
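A toy nested-sampling sketch (not MultiNest, and without the INS refinement) showing how the evidence is accumulated from the lowest-likelihood live point and the expected prior-volume shrinkage at each iteration. The one-dimensional Gaussian likelihood, uniform prior, and rejection-based constrained sampling are assumptions chosen to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

def loglike(theta):                            # Gaussian likelihood centred at 0.5
    return -0.5 * ((theta - 0.5) / 0.05) ** 2 - np.log(0.05 * np.sqrt(2 * np.pi))

N = 100                                        # number of live points
live = rng.random(N)                           # draws from the Uniform(0, 1) prior
live_logL = loglike(live)
logZ, logX_prev = -np.inf, 0.0

for i in range(1, 801):
    worst = int(np.argmin(live_logL))          # lowest-likelihood live point
    logX = -i / N                              # expected log prior-volume shrinkage
    logw = np.log(np.exp(logX_prev) - np.exp(logX)) + live_logL[worst]
    logZ = np.logaddexp(logZ, logw)            # accumulate the evidence
    logX_prev = logX
    # Replace the worst point with a prior draw obeying the hard likelihood constraint
    # (plain rejection sampling; real implementations sample the constraint cleverly).
    while True:
        cand = rng.random()
        if loglike(cand) > live_logL[worst]:
            live[worst], live_logL[worst] = cand, loglike(cand)
            break

# Contribution of the remaining live points at termination
logZ = np.logaddexp(logZ, logX_prev - np.log(N) + np.logaddexp.reduce(live_logL))
print(f"nested-sampling log-evidence: {logZ:.2f} (analytic value is about 0)")
```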

204 citations


Journal ArticleDOI
TL;DR: This work introduces a novel approach to Bayesian inference that improves robustness to small departures from the model: rather than conditioning on the event that the observed data are generated by the model, one conditions on the event that the model generates data close to the observed data, in a distributional sense.
Abstract: The standard approach to Bayesian inference is based on the assumption that the distribution of the data belongs to the chosen model class. However, even a small violation of this assumption can ha...

183 citations


Journal ArticleDOI
TL;DR: It is proved that the VB posterior converges to the Kullback–Leibler (KL) minimizer of a normal distribution, centered at the truth, and that the corresponding variational expectation of the parameter is consistent and asymptotically normal.
Abstract: A key challenge for modern Bayesian statistics is how to perform scalable inference of posterior distributions. To address this challenge, variational Bayes (VB) methods have emerged as a popular a...

167 citations


Journal ArticleDOI
TL;DR: This work proposes to conduct likelihood-free Bayesian inference about parameters with no prior selection of the relevant components of the summary statistics, bypassing the derivation of the associated tolerance level, using the random forest methodology of Breiman (2001).
Abstract: Approximate Bayesian Computation (ABC) has grown into a standard methodology to handle Bayesian inference in models associated with intractable likelihood functions. Most ABC implementations require the selection of a summary statistic as the data itself is too large or too complex to be compared to simulated realisations from the assumed model. The dimension of this statistic is generally constrained to be close to the dimension of the model parameter for efficiency reasons. Furthermore, the tolerance level that governs the acceptance or rejection of parameter values needs to be calibrated and the range of calibration techniques available so far is mostly based on asymptotic arguments. We propose here to conduct Bayesian inference based on an arbitrarily large vector of summary statistics without imposing a selection of the relevant components and bypassing the derivation of a tolerance. The approach relies on the random forest methodology of Breiman (2001) when applied to regression. We advocate the derivation of a new random forest for each component of the parameter vector, a tool from which an approximation to the marginal posterior distribution can be derived. Correlations between parameter components are handled by separate random forests. This technology offers significant gains in terms of robustness to the choice of the summary statistics and of computing time, when compared with more standard ABC solutions.
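A hedged sketch of the idea using scikit-learn's RandomForestRegressor as a stand-in for the dedicated ABC random forest machinery: one regression forest per parameter component is trained on a reference table of prior draws and their simulated summary statistics, then evaluated at the observed summaries to approximate a posterior expectation. The toy Gaussian simulator, priors, and summary choices are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def simulate(mu, sigma, n=50):
    return rng.normal(mu, sigma, size=n)

def summaries(x):
    # deliberately many, partly redundant summary statistics
    qs = np.quantile(x, [0.1, 0.25, 0.5, 0.75, 0.9])
    return np.concatenate([[x.mean(), x.std(), x.min(), x.max()], qs])

# Reference table: prior draws plus the summaries of their simulated data sets
n_sim = 5000
mu = rng.normal(0, 2, n_sim)                  # prior on mu
sigma = rng.uniform(0.5, 3, n_sim)            # prior on sigma
table = np.array([summaries(simulate(m, s)) for m, s in zip(mu, sigma)])

x_obs = simulate(1.0, 2.0)                    # pretend these are the observed data
s_obs = summaries(x_obs).reshape(1, -1)

for name, target in [("mu", mu), ("sigma", sigma)]:
    rf = RandomForestRegressor(n_estimators=200, min_samples_leaf=5, random_state=0)
    rf.fit(table, target)                     # one forest per parameter component
    print(f"approximate posterior expectation of {name}: {rf.predict(s_obs)[0]:.2f}")
```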

163 citations


Journal ArticleDOI
TL;DR: Subsampling Markov chain Monte Carlo is substantially more efficient than standard MCMC in terms of sampling efficiency for a given computational budget, and it outperforms other subsampling methods for MCMC proposed in the literature.
Abstract: We propose subsampling Markov chain Monte Carlo (MCMC), an MCMC framework where the likelihood function for n observations is estimated from a random subset of m observations. We introduce a highly efficient unbiased estimator of the log-likelihood based on control variates, such that the computing cost is much smaller than that of the full log-likelihood in standard MCMC. The likelihood estimate is bias-corrected and used in two dependent pseudo-marginal algorithms to sample from a perturbed posterior, for which we derive the asymptotic error with respect to n and m, respectively. We propose a practical estimator of the error and show that the error is negligible even for a very small m in our applications. We demonstrate that subsampling MCMC is substantially more efficient than standard MCMC in terms of sampling efficiency for a given computational budget, and that it outperforms other subsampling methods for MCMC proposed in the literature. Supplementary materials for this article are availabl...
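A minimal NumPy sketch (not the authors' implementation) of the control-variate log-likelihood estimator the abstract refers to: a second-order Taylor surrogate around a fixed expansion point is summed exactly using precomputed sufficient statistics, and only the residuals are estimated from a subsample of size m. In the paper this estimator feeds bias-corrected pseudo-marginal samplers; here only the estimator is shown, with a toy Poisson model as an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 100_000, 500
y = rng.poisson(np.exp(1.0), size=n)                 # Poisson data with log-rate theta

def loglik_terms(theta, yy):                         # per-observation log-likelihoods
    return yy * theta - np.exp(theta)                # (constant log y! terms dropped)

theta_star = np.log(y.mean())                        # expansion point, e.g. the MLE
l_star = loglik_terms(theta_star, y)
g_star = y - np.exp(theta_star)                      # first derivative at theta_star
h_star = -np.exp(theta_star) * np.ones(n)            # second derivative at theta_star
sums = (l_star.sum(), g_star.sum(), h_star.sum())    # makes the surrogate sum O(1)

def estimate_loglik(theta):
    d = theta - theta_star
    q_sum = sums[0] + sums[1] * d + 0.5 * sums[2] * d ** 2
    idx = rng.choice(n, size=m)                      # random subsample of size m
    q_idx = l_star[idx] + g_star[idx] * d + 0.5 * h_star[idx] * d ** 2
    return q_sum + (n / m) * (loglik_terms(theta, y[idx]) - q_idx).sum()

theta = 1.05
exact = loglik_terms(theta, y).sum()
cv = estimate_loglik(theta)
naive = (n / m) * loglik_terms(theta, y[rng.choice(n, m)]).sum()
print(f"exact {exact:.1f} | control-variate {cv:.1f} | naive subsample {naive:.1f}")
```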

162 citations


Journal ArticleDOI
TL;DR: This work presents a new proof scheme that is quite straightforward with respect to the previous ones, has a much wider range of applicability, and sheds new light on the reasons for the validity of replica formulas in Bayesian inference.
Abstract: In recent years important progress has been achieved towards proving the validity of the replica predictions for the (asymptotic) mutual information (or “free energy”) in Bayesian inference problems. The proof techniques that have emerged appear to be quite general, even though they have been worked out on a case-by-case basis. Unfortunately, a common point between all these schemes is their relatively high level of technicality. We present a new proof scheme that is quite straightforward with respect to the previous ones. We call it the adaptive interpolation method because it can be seen as an extension of the interpolation method developed by Guerra and Toninelli in the context of spin glasses, with an interpolation path that is adaptive. In order to illustrate our method, we show how to prove the replica formula for three non-trivial inference problems. The first one is symmetric rank-one matrix estimation (or factorisation), which is the simplest problem considered here and the one for which the method is presented in full detail. Then we generalize to symmetric tensor estimation and random linear estimation. We believe that the present method has a much wider range of applicability and also sheds new light on the reasons for the validity of replica formulas in Bayesian inference.

147 citations


Journal Article
TL;DR: It is shown that the Unadjusted Langevin Algorithm can be formulated as a first order optimization algorithm of an objective functional defined on the Wasserstein space of order $2$, and a non-asymptotic analysis of this method to sample from a log-concave smooth target distribution is given.
Abstract: In this paper, we provide new insights on the Unadjusted Langevin Algorithm. We show that this method can be formulated as a first order optimization algorithm of an objective functional defined on the Wasserstein space of order $2$. Using this interpretation and techniques borrowed from convex optimization, we give a non-asymptotic analysis of this method to sample from a log-concave smooth target distribution on $\mathbb{R}^d$. Based on this interpretation, we propose two new methods for sampling from a non-smooth target distribution, which we analyze as well. Besides, these new algorithms are natural extensions of the Stochastic Gradient Langevin Dynamics (SGLD) algorithm, which is a popular extension of the Unadjusted Langevin Algorithm. Similar to SGLD, they only rely on approximations of the gradient of the target log density and can be used for large-scale Bayesian inference.
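A minimal sketch of the Unadjusted Langevin Algorithm on a toy log-concave target (a two-dimensional standard normal); SGLD would replace the exact gradient of the log-density with a stochastic minibatch estimate. The step size and chain length are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_log_target(theta):
    return -theta                               # gradient of log N(0, I)

gamma = 0.05                                    # step size
theta = np.zeros(2)
samples = []
for _ in range(20_000):
    noise = rng.standard_normal(2)
    theta = theta + gamma * grad_log_target(theta) + np.sqrt(2 * gamma) * noise
    samples.append(theta)

samples = np.array(samples[5_000:])             # discard a burn-in period
print("sample mean:", samples.mean(axis=0))     # close to 0
print("sample cov:\n", np.cov(samples.T))       # close to the identity (O(gamma) bias)
```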

Journal ArticleDOI
TL;DR: In this article, neural density estimators (NDEs) are used to learn the likelihood function from a set of simulated datasets, with active learning to adaptively acquire simulations in the most relevant regions of parameter space on-the-fly.
Abstract: Likelihood-free inference provides a framework for performing rigorous Bayesian inference using only forward simulations, properly accounting for all physical and observational effects that can be successfully included in the simulations. The key challenge for likelihood-free applications in cosmology, where simulation is typically expensive, is developing methods that can achieve high-fidelity posterior inference with as few simulations as possible. Density-estimation likelihood-free inference (DELFI) methods turn inference into a density estimation task on a set of simulated data-parameter pairs, and give orders of magnitude improvements over traditional Approximate Bayesian Computation approaches to likelihood-free inference. In this paper we use neural density estimators (NDEs) to learn the likelihood function from a set of simulated datasets, with active learning to adaptively acquire simulations in the most relevant regions of parameter space on-the-fly. We demonstrate the approach on a number of cosmological case studies, showing that for typical problems high-fidelity posterior inference can be achieved with just $\mathcal{O}(10^3)$ simulations or fewer. In addition to enabling efficient simulation-based inference, for simple problems where the form of the likelihood is known, DELFI offers a fast alternative to MCMC sampling, giving orders of magnitude speed-up in some cases. Finally, we introduce \textsc{pydelfi} -- a flexible public implementation of DELFI with NDEs and active learning -- available at \url{https://github.com/justinalsing/pydelfi}.
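A hedged sketch of the DELFI recipe with the neural density estimator deliberately replaced by a linear-Gaussian conditional density, which keeps the example tiny; pydelfi itself uses neural density estimators (and active learning), not this stand-in. The toy simulator, summary statistic, and flat prior are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulator(theta):                       # forward model: 20 noisy draws with mean theta
    return rng.normal(theta, 1.0, size=20)

def summary(x):
    return x.mean()                         # a single summary statistic

# 1) Build a training set of (parameter, simulated summary) pairs
theta_train = rng.uniform(-5, 5, size=2000)
s_train = np.array([summary(simulator(t)) for t in theta_train])

# 2) "Learn the likelihood" p(s | theta): here a linear mean plus constant variance
coeffs = np.polyfit(theta_train, s_train, deg=1)          # s ~ a*theta + b
resid_var = np.var(s_train - np.polyval(coeffs, theta_train))

def learned_loglike(s_obs, theta):
    mu = np.polyval(coeffs, theta)
    return -0.5 * (s_obs - mu) ** 2 / resid_var - 0.5 * np.log(2 * np.pi * resid_var)

# 3) Combine with the prior on a grid to get the posterior for the observed data
s_obs = summary(simulator(1.5))             # pretend theta = 1.5 generated the data
grid = np.linspace(-5, 5, 1001)
dg = grid[1] - grid[0]
log_post = learned_loglike(s_obs, grid)     # flat prior: posterior proportional to likelihood
post = np.exp(log_post - log_post.max())
post /= post.sum() * dg
print(f"posterior mean of theta: {np.sum(grid * post) * dg:.2f}")
```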

Journal ArticleDOI
TL;DR: This work reviews the computational roles played by internal models of the motor system and environmental dynamics, and the neural and behavioral evidence for their implementation in the brain, in the context of this theoretical formalism.
Abstract: Rationality principles such as optimal feedback control and Bayesian inference underpin a probabilistic framework that has accounted for a range of empirical phenomena in biological sensorimotor control. To facilitate the optimization of flexible and robust behaviors consistent with these theories, the ability to construct internal models of the motor system and environmental dynamics can be crucial. In the context of this theoretic formalism, we review the computational roles played by such internal models and the neural and behavioral evidence for their implementation in the brain.

Journal ArticleDOI
TL;DR: A dynamic distributed monitoring strategy is proposed to separate the dynamic variations from the steady states, and concurrently, monitor them to distinguish changes in the normal operating condition and real faults for large-scale nonstationary processes under closed-loop control.
Abstract: Large-scale processes under closed-loop control are commonly subjected to frequently varying conditions due to load changes or other causes, resulting in typical nonstationary characteristics. For closed-loop control processes, the normal changes in operation conditions may distort the static and dynamic variations in a different way from real faults. In this paper, a dynamic distributed monitoring strategy is proposed to separate the dynamic variations from the steady states, and concurrently, monitor them to distinguish changes in the normal operating condition and real faults for large-scale nonstationary processes under closed-loop control. First, large-scale nonstationary process variables are decomposed into different blocks to mine the local information. Second, the static and dynamic equilibrium relations are separated by probing into the cointegration analysis solution in each block. Third, the concurrent monitoring models are constructed to supervise both the steady variations and their dynamic counterparts for each block. Finally, the local monitoring results are combined by Bayesian inference to obtain global results, which enable description and monitoring of both static and dynamic equilibrium relations from the global and local viewpoints. The feasibility and performance of the proposed method are illustrated with a real industrial process, which is a 1000-MW ultra-supercritical thermal power unit.
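A hedged sketch (not the paper's exact scheme) of the Bayesian-inference step that combines local monitoring results: each block's monitoring statistic is converted into a posterior fault probability via Bayes' rule, using commonly assumed exponential forms for the conditional probabilities relative to the control limit, and the block probabilities are fused into a single plant-wide index. The statistics, limits, and significance level below are illustrative.

```python
import numpy as np

def block_fault_probability(T, limit, alpha=0.01):
    # Assumed conditional probabilities of the statistic under "normal" and "fault"
    p_T_given_normal = np.exp(-T / limit)
    p_T_given_fault = np.exp(-limit / T)
    prior_fault, prior_normal = alpha, 1 - alpha
    num = p_T_given_fault * prior_fault
    return num / (num + p_T_given_normal * prior_normal)   # Bayes' rule per block

# Example: three blocks, one of which shows an abnormal statistic
T_blocks = np.array([3.1, 4.0, 12.5])        # e.g. T^2 statistics per block
limits = np.array([5.0, 5.0, 5.0])           # 99% control limits per block
p_fault = block_fault_probability(T_blocks, limits)

# Probability-weighted fusion into a single plant-wide monitoring index
global_index = np.sum(p_fault ** 2) / np.sum(p_fault)
print("block fault probabilities:", np.round(p_fault, 3))
print(f"global monitoring index: {global_index:.3f} (compare with alpha = 0.01)")
```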

Journal ArticleDOI
04 Sep 2019-Neuron
TL;DR: It is found that prior statistics warp neural representations in the frontal cortex, allowing the mapping of sensory inputs to motor outputs to incorporate prior statistics in accordance with Bayesian inference.

Posted Content
TL;DR: It is demonstrated that SWAG performs well on a wide variety of tasks, including out of sample detection, calibration, and transfer learning, in comparison to many popular alternatives including MC dropout, KFAC Laplace, SGLD, and temperature scaling.
Abstract: We propose SWA-Gaussian (SWAG), a simple, scalable, and general purpose approach for uncertainty representation and calibration in deep learning. Stochastic Weight Averaging (SWA), which computes the first moment of stochastic gradient descent (SGD) iterates with a modified learning rate schedule, has recently been shown to improve generalization in deep learning. With SWAG, we fit a Gaussian using the SWA solution as the first moment and a low rank plus diagonal covariance also derived from the SGD iterates, forming an approximate posterior distribution over neural network weights; we then sample from this Gaussian distribution to perform Bayesian model averaging. We empirically find that SWAG approximates the shape of the true posterior, in accordance with results describing the stationary distribution of SGD iterates. Moreover, we demonstrate that SWAG performs well on a wide variety of tasks, including out of sample detection, calibration, and transfer learning, in comparison to many popular alternatives including MC dropout, KFAC Laplace, SGLD, and temperature scaling.

Journal ArticleDOI
TL;DR: A probabilistic trajectory prediction model is proposed that describes the uncertainty in future positions along the ship trajectories by continuous probability distributions; it achieves high prediction accuracy and meets the demands of real-time applications.

Book ChapterDOI
10 Jun 2019
TL;DR: In this article, the authors deal with likelihood inference for spatial point processes using Markov chain Monte Carlo (MCMC), following the methods of R. A. Moyeed and A. J. Baddeley, C. J. Geyer and E. A. Thompson, A. E. Gelfand and B. P. Carlin, C. J. Geyer, and C. J. Geyer and J. Moller.
Abstract: This chapter deals with likelihood inference for spatial point processes using the methods of R. A. Moyeed and A. J. Baddeley, C. J. Geyer and E. A. Thompson, A. E. Gelfand and B. P. Carlin, C. J. Geyer, and C. J. Geyer and J. Moller using Markov chain Monte Carlo (MCMC). The MCMC, including the Gibbs sampler and the Metropolis, the Metropolis–Hastings, and the Metropolis–Hastings–Green algorithms, permits the simulation of any stochastic process specified by an unnormalized density. Thus the family of unnormalized densities is involved in both conditional likelihood inference and likelihood inference with missing data. Latent variables, random effects, and ordinary empirical Bayes models all involve missing data of some form. Missing data involve the same considerations as conditional families. The oldest general class of models specified by unnormalized densities are exponential families. Models specified by unnormalized densities present a problem for Bayesian inference.
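A minimal random-walk Metropolis sketch of the chapter's central point, that MCMC only needs an unnormalised density; the one-dimensional target and proposal scale are assumptions, and a spatial point process would of course require a more elaborate state space and proposal.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_unnorm_density(x):
    return -0.5 * x ** 2 + np.log1p(np.cos(x) ** 2)   # any unnormalised density works

x, chain = 0.0, []
for _ in range(50_000):
    prop = x + rng.normal(scale=1.0)                  # symmetric random-walk proposal
    if np.log(rng.random()) < log_unnorm_density(prop) - log_unnorm_density(x):
        x = prop                                      # accept; otherwise keep current x
    chain.append(x)

chain = np.array(chain[10_000:])                      # discard a burn-in period
print(f"posterior mean ~ {chain.mean():.3f}, sd ~ {chain.std():.3f}")
```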

Posted Content
TL;DR: Automatic posterior transformation (APT) is presented, a new sequential neural posterior estimation method for simulation-based inference that can modify the posterior estimate using arbitrary, dynamically updated proposals, and is compatible with powerful flow-based density estimators.
Abstract: How can one perform Bayesian inference on stochastic simulators with intractable likelihoods? A recent approach is to learn the posterior from adaptively proposed simulations using neural network-based conditional density estimators. However, existing methods are limited to a narrow range of proposal distributions or require importance weighting that can limit performance in practice. Here we present automatic posterior transformation (APT), a new sequential neural posterior estimation method for simulation-based inference. APT can modify the posterior estimate using arbitrary, dynamically updated proposals, and is compatible with powerful flow-based density estimators. It is more flexible, scalable and efficient than previous simulation-based inference techniques. APT can operate directly on high-dimensional time series and image data, opening up new applications for likelihood-free inference.

Journal ArticleDOI
TL;DR: A Bayesian Estimator of Abrupt change, Seasonal change, and Trend (BEAST) is reported, which offers a new analytical option for robust changepoint detection and nonlinear trend analysis and will help exploit environmental time-series data for probing patterns and drivers of ecosystem dynamics.

Journal ArticleDOI
TL;DR: Nested sampling is introduced to phylogenetics and its performance is analysed under different scenarios and compared to established methods, leading to the conclusion that NS is a competitive and attractive algorithm for phylogenetic inference.
Abstract: Bayesian inference methods rely on numerical algorithms for both model selection and parameter inference. In general, these algorithms require a high computational effort to yield reliable estimates. One of the major challenges in phylogenetics is the estimation of the marginal likelihood. This quantity is commonly used for comparing different evolutionary models, but its calculation, even for simple models, incurs high computational cost. Another interesting challenge relates to the estimation of the posterior distribution. Often, long Markov chains are required to get sufficient samples to carry out parameter inference, especially for tree distributions. In general, these problems are addressed separately by using different procedures. Nested sampling (NS) is a Bayesian computation algorithm, which provides the means to estimate marginal likelihoods together with their uncertainties, and to sample from the posterior distribution at no extra cost. The methods currently used in phylogenetics for marginal likelihood estimation lack in practicality due to their dependence on many tuning parameters and their inability of most implementations to provide a direct way to calculate the uncertainties associated with the estimates, unlike NS. In this article, we introduce NS to phylogenetics. Its performance is analysed under different scenarios and compared to established methods. We conclude that NS is a competitive and attractive algorithm for phylogenetic inference. An implementation is available as a package for BEAST 2 under the LGPL licence, accessible at https://github.com/BEAST2-Dev/nested-sampling.

Journal ArticleDOI
TL;DR: New modules in the open-source PyCBC gravitational-wave astronomy toolkit that implement Bayesian inference for compact-object binary mergers are introduced, and it is demonstrated that the PyCBC Inference modules produce unbiased estimates of the parameters of a simulated population of binary black hole mergers.
Abstract: We introduce new modules in the open-source PyCBC gravitational-wave astronomy toolkit that implement Bayesian inference for compact-object binary mergers. We review the Bayesian inference methods implemented and describe the structure of the modules. We demonstrate that the PyCBC Inference modules produce unbiased estimates of the parameters of a simulated population of binary black hole mergers. We show that the posterior parameter distributions obtained using our new code agree well with the published estimates for binary black holes in the first LIGO-Virgo observing run.

Journal ArticleDOI
TL;DR: An efficient Markov chain Monte Carlo scheme is developed, exploiting boosting based on the ancillarity-sufficiency interweaving strategy to automatically reduce time-varying parameters to static ones, if the model is overfitting.

Journal ArticleDOI
TL;DR: This is an introduction to Bayesian inference with a focus on hierarchical models and hyper-parameters, and includes extensive appendices discussing the creation of credible intervals, Gaussian noise, explicit marginalisation, posterior predictive distributions, and selection effects.
Abstract: This is an introduction to Bayesian inference with a focus on hierarchical models and hyper-parameters. We write primarily for an audience of Bayesian novices, but we hope to provide useful insights for seasoned veterans as well. Examples are drawn from gravitational-wave astronomy, though we endeavour for the presentation to be understandable to a broader audience. We begin with a review of the fundamentals: likelihoods, priors, and posteriors. Next, we discuss Bayesian evidence, Bayes factors, odds ratios, and model selection. From there, we describe how posteriors are estimated using samplers such as Markov Chain Monte Carlo algorithms and nested sampling. Finally, we generalise the formalism to discuss hyper-parameters and hierarchical models. We include extensive appendices discussing the creation of credible intervals, Gaussian noise, explicit marginalisation, posterior predictive distributions, and selection effects.
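A worked toy example of the quantities reviewed above, computed on a grid: the likelihood of Gaussian-noise data with an unknown mean, a uniform prior, the posterior via Bayes' theorem, the evidence by numerical integration, and a 90% credible interval. The data set and prior range are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=25)                  # Gaussian-noise data, unknown mean

mu = np.linspace(-5, 5, 2001)                         # grid over the parameter
dmu = mu[1] - mu[0]
log_prior = np.full_like(mu, -np.log(10.0))           # Uniform(-5, 5) prior density
log_like = np.array([np.sum(-0.5 * (data - m) ** 2 - 0.5 * np.log(2 * np.pi)) for m in mu])
log_joint = log_prior + log_like

shift = log_joint.max()                               # guard against numerical underflow
evidence = np.sum(np.exp(log_joint - shift)) * dmu * np.exp(shift)
posterior = np.exp(log_joint - shift) / (np.sum(np.exp(log_joint - shift)) * dmu)

cdf = np.cumsum(posterior) * dmu                      # credible interval from the CDF
lo, hi = mu[np.searchsorted(cdf, 0.05)], mu[np.searchsorted(cdf, 0.95)]
print(f"log evidence: {np.log(evidence):.2f}")
print(f"posterior mean: {np.sum(mu * posterior) * dmu:.2f}, "
      f"90% credible interval: [{lo:.2f}, {hi:.2f}]")
```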

Book
02 Dec 2019
TL;DR: A book-length treatment of Bayesian regression, factor analysis, and source separation models, covering their likelihoods, conjugate and generalized priors, posteriors, estimation, inference, and interpretation, with FMRI case studies.
Abstract: Introduction Part I: FUNDAMENTALS STATISTICAL DISTRIBUTIONS Scalar Distributions Vector Distributions Matrix Distributions INTRODUCTORY BAYESIAN STATISTICS Discrete Scalar Variables Continuous Scalar Variables Continuous Vector Variables Continuous Matrix Variables PRIOR DISTRIBUTIONS Vague Priors Conjugate Priors Generalized Priors Correlation Priors HYPERPARAMETER ASSESSMENT Introduction Binomial Likelihood Scalar Normal Likelihood Multivariate Normal Likelihood Matrix Normal Likelihood BAYESIAN ESTIMATION METHODS Marginal Posterior Mean Maximum a Posteriori Advantages of ICM over Gibbs Sampling Advantages of Gibbs Sampling over ICM REGRESSION Introduction Normal Samples Simple Linear Regression Multiple Linear Regression Multivariate Linear Regression Part II: MODELS BAYESIAN REGRESSION Introduction The Bayesian Regression Model Likelihood Conjugate Priors and Posterior Conjugate Estimation and Inference Generalized Priors and Posterior Generalized Estimation and Inference Interpretation Discussion BAYESIAN FACTOR ANALYSIS Introduction The Bayesian Factor Analysis Model Likelihood Conjugate Priors and Posterior Conjugate Estimation and Inference Generalized Priors and Posterior Generalized Estimation and Inference Interpretation Discussion BAYESIAN SOURCE SEPARATION Introduction Source Separation Model Source Separation Likelihood Conjugate Priors and Posterior Conjugate Estimation and Inference Generalized Priors and Posterior Generalized Estimation and Inference Interpretation Discussion UNOBSERVABLE AND OBSERVABLE SOURCE SEPARATION Introduction Model Likelihood Conjugate Priors and Posterior Conjugate Estimation and Inference Generalized Priors and Posterior Generalized Estimation and Inference Interpretation Discussion FMRI CASE STUDY Introduction Model Priors and Posterior Estimation and Inference Simulated FMRI Experiment Real FMRI Experiment FMRI Conclusion Part III: GENERALIZATIONS DELAYED SOURCES AND DYNAMIC COEFFICIENTS Introduction Model Delayed Constant Mixing Delayed Nonconstant Mixing Instantaneous Nonconstant Mixing Likelihood Conjugate Priors and Posterior Conjugate Estimation and Inference Generalized Priors and Posterior Generalized Estimation and Inference Interpretation Discussion CORRELATED OBSERVATION AND SOURCE VECTORS Introduction Model Likelihood Conjugate Priors and Posterior Conjugate Estimation and Inference Posterior Conditionals Generalized Priors and Posterior Generalized Estimation and Inference Interpretation Discussion CONCLUSION Appendix A FMRI Activation Determination Appendix B FMRI Hyperparameter Assessment Bibliography Index

Journal ArticleDOI
TL;DR: A novel variational Bayesian learning method for the Dirichlet process (DP) mixture of the inverted Dirichlet distributions, which has been shown to be very flexible for modeling vectors with positive elements, allowing the automatic determination of the number of mixture components from data.
Abstract: In this paper, we develop a novel variational Bayesian learning method for the Dirichlet process (DP) mixture of the inverted Dirichlet distributions, which has been shown to be very flexible for modeling vectors with positive elements. The recently proposed extended variational inference (EVI) framework is adopted to derive an analytically tractable solution. The convergence of the proposed algorithm is theoretically guaranteed by introducing a single lower bound approximation to the original objective function in the EVI framework. In principle, the proposed model can be viewed as an infinite inverted Dirichlet mixture model that allows the automatic determination of the number of mixture components from data. Therefore, the problem of predetermining the optimal number of mixing components has been overcome. Moreover, the problems of overfitting and underfitting are avoided by the Bayesian estimation approach. Compared with several recently proposed DP-related methods and conventional applied methods, the good performance and effectiveness of the proposed method have been demonstrated with both synthesized data and real data evaluations.

Journal ArticleDOI
TL;DR: The focus is on meeting challenges that arise from system identification and damage assessment for the civil infrastructure but the presented theories also have a considerably broader applicability for inverse problems in science and technology.
Abstract: Bayesian inference provides a powerful approach to system identification and damage assessment for structures. The application of Bayesian method is motivated by the fact that inverse problems in s...

Proceedings ArticleDOI
01 Oct 2019
TL;DR: A framework for recognizing human actions from skeleton data is proposed by modeling the underlying dynamic process that generates the motion pattern and an adversarial prior is developed to regularize the model parameters to improve the generalization of the model.
Abstract: We propose a framework for recognizing human actions from skeleton data by modeling the underlying dynamic process that generates the motion pattern. We capture three major factors that contribute to the complexity of the motion pattern including spatial dependencies among body joints, temporal dependencies of body poses, and variation among subjects in action execution. We utilize graph convolution to extract structure-aware feature representation from pose data by exploiting the skeleton anatomy. A long short-term memory (LSTM) network is then used to capture the temporal dynamics of the data. Finally, the whole model is extended under the Bayesian framework to a probabilistic model in order to better capture the stochasticity and variation in the data. An adversarial prior is developed to regularize the model parameters to improve the generalization of the model. A Bayesian inference problem is formulated to solve the classification task. We demonstrate the benefit of this framework on several benchmark datasets with recognition under various generalization conditions.
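A hedged PyTorch sketch of the deterministic backbone the abstract describes (graph convolution over joints followed by an LSTM over frames and a classifier); the Bayesian treatment of the weights and the adversarial prior are omitted, and the layer sizes, joint count, and identity adjacency are placeholder assumptions.

```python
import torch
import torch.nn as nn

class GCNLSTMClassifier(nn.Module):
    def __init__(self, adj, in_dim=3, gcn_dim=64, lstm_dim=128, n_classes=60):
        super().__init__()
        self.register_buffer("adj", adj)              # normalised joint adjacency (J x J)
        self.gcn = nn.Linear(in_dim, gcn_dim)          # per-joint feature transform
        self.lstm = nn.LSTM(gcn_dim * adj.shape[0], lstm_dim, batch_first=True)
        self.head = nn.Linear(lstm_dim, n_classes)

    def forward(self, x):                               # x: (batch, frames, joints, in_dim)
        b, t, j, c = x.shape
        h = torch.einsum("jk,btkc->btjc", self.adj, x)  # aggregate neighbouring joints
        h = torch.relu(self.gcn(h))                     # structure-aware joint features
        h, _ = self.lstm(h.reshape(b, t, -1))           # temporal dynamics of the poses
        return self.head(h[:, -1])                      # classify from the last state

joints = 25
adj = torch.eye(joints)                                 # stand-in for the skeleton graph
model = GCNLSTMClassifier(adj)
logits = model(torch.randn(4, 30, joints, 3))           # 4 clips, 30 frames, 3-D joints
print(logits.shape)                                      # torch.Size([4, 60])
```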

Journal ArticleDOI
TL;DR: After reading this tutorial and executing the associated code, researchers will be able to use their own data for the evaluation of hypotheses by means of the Bayes factor, not only in the context of ANOVA models, but also in the context of other statistical models.
Abstract: Learning about hypothesis evaluation using the Bayes factor could enhance psychological research. In contrast to null-hypothesis significance testing it renders the evidence in favor of each of the hypotheses under consideration (it can be used to quantify support for the null-hypothesis) instead of a dichotomous reject/do-not-reject decision; it can straightforwardly be used for the evaluation of multiple hypotheses without having to bother about the proper manner to account for multiple testing; and it allows continuous reevaluation of hypotheses after additional data have been collected (Bayesian updating). This tutorial addresses researchers considering evaluating their hypotheses by means of the Bayes factor. The focus is completely applied and each topic discussed is illustrated using Bayes factors for the evaluation of hypotheses in the context of an ANOVA model, obtained using the R package bain. Readers can execute all the analyses presented while reading this tutorial if they download bain and the R-codes used. It will be elaborated in a completely nontechnical manner: what the Bayes factor is, how it can be obtained, how Bayes factors should be interpreted, and what can be done with Bayes factors. After reading this tutorial and executing the associated code, researchers will be able to use their own data for the evaluation of hypotheses by means of the Bayes factor, not only in the context of ANOVA models, but also in the context of other statistical models. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
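A generic Python illustration of what a Bayes factor is, since the tutorial's own code uses the R package bain: the marginal likelihoods of two hypotheses about a normal mean are compared, H0: mu = 0 against H1 with a standard-normal prior on mu. The data and prior are assumptions; this is not the informative-hypothesis machinery that bain provides.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(0.3, 1.0, size=30)                    # data with known unit variance

def loglike(mu):
    return np.sum(-0.5 * (y - mu) ** 2 - 0.5 * np.log(2 * np.pi))

log_m0 = loglike(0.0)                                # marginal likelihood under H0: mu = 0

mu_grid = np.linspace(-5, 5, 4001)                   # integrate against the N(0, 1) prior
dmu = mu_grid[1] - mu_grid[0]
prior = np.exp(-0.5 * mu_grid ** 2) / np.sqrt(2 * np.pi)
like_ratio = np.exp([loglike(m) - log_m0 for m in mu_grid])   # rescaled to avoid underflow
log_m1 = log_m0 + np.log(np.sum(like_ratio * prior) * dmu)

bf01 = np.exp(log_m0 - log_m1)                       # Bayes factor of H0 against H1
print(f"Bayes factor BF01 = {bf01:.2f} "
      f"({'evidence favours H0' if bf01 > 1 else 'evidence favours H1'})")
```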

Journal ArticleDOI
TL;DR: It is shown that Bayesian inference enables more reliable prediction with quantitative uncertainty analysis and helps to improve the quality of prediction in deep neural networks.
Abstract: Deep neural networks have been increasingly used in various chemical fields. In the nature of a data-driven approach, their performance strongly depends on data used in training. Therefore, models developed in data-deficient situations can cause highly uncertain predictions, leading to vulnerable decision making. Here, we show that Bayesian inference enables more reliable prediction with quantitative uncertainty analysis. Decomposition of the predictive uncertainty into model- and data-driven uncertainties allows us to elucidate the source of errors for further improvements. For molecular applications, we devised a Bayesian graph convolutional network (GCN) and evaluated its performance for molecular property predictions. Our study on the classification problem of bio-activity and toxicity shows that the confidence of prediction can be quantified in terms of the predictive uncertainty, leading to more accurate virtual screening of drug candidates than standard GCNs. The result of log P prediction illustrates that data noise affects the data-driven uncertainty more significantly than the model-driven one. Based on this finding, we could identify artefacts that arose from quantum mechanical calculations in the Harvard Clean Energy Project dataset. Consequently, the Bayesian GCN is critical for molecular applications under data-deficient conditions.
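A NumPy sketch (not the paper's Bayesian GCN) of the uncertainty decomposition described above: given repeated stochastic forward passes for one input, the total predictive entropy splits into an expected per-pass entropy (the data-driven, aleatoric part) and the remainder (the model-driven, epistemic part, i.e. the mutual information). The sampled logits below are synthetic stand-ins for network outputs.

```python
import numpy as np

def entropy(p, axis=-1):
    return -np.sum(p * np.log(p + 1e-12), axis=axis)

rng = np.random.default_rng(0)
T, n_classes = 100, 2
# Stand-in for T sampled class-probability vectors for one molecule (e.g. MC dropout)
logits = rng.normal(loc=[1.0, 0.0], scale=0.8, size=(T, n_classes))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

mean_probs = probs.mean(axis=0)
total = entropy(mean_probs)                 # total predictive uncertainty
aleatoric = entropy(probs).mean()           # data-driven part: expected entropy
epistemic = total - aleatoric               # model-driven part: mutual information
print(f"total {total:.3f} = aleatoric {aleatoric:.3f} + epistemic {epistemic:.3f}")
```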