Showing papers on "Bayesian probability" published in 2017


Journal ArticleDOI
TL;DR: The brms package implements Bayesian multilevel models in R using the probabilistic programming language Stan, allowing users to fit linear, robust linear, binomial, Poisson, survival, ordinal, zero-inflated, hurdle, and even non-linear models, all in a multilevel context.
Abstract: The brms package implements Bayesian multilevel models in R using the probabilistic programming language Stan. A wide range of distributions and link functions are supported, allowing users to fit - among others - linear, robust linear, binomial, Poisson, survival, ordinal, zero-inflated, hurdle, and even non-linear models all in a multilevel context. Further modeling options include autocorrelation of the response variable, user defined covariance structures, censored data, as well as meta-analytic standard errors. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their beliefs. In addition, model fit can easily be assessed and compared with the Watanabe-Akaike information criterion and leave-one-out cross-validation.

4,353 citations
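
For orientation, the model class brms covers can be summarized by a generic Bayesian multilevel model; the family, link, and priors below are placeholders for whatever the user specifies, not brms defaults:

```latex
y_i \mid \theta \sim f\!\left(g^{-1}\!\big(\mathbf{x}_i^\top \boldsymbol{\beta} + \mathbf{z}_i^\top \mathbf{u}_{j[i]}\big),\, \phi\right),
\qquad
\mathbf{u}_j \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_u),
```

where f is the response distribution (Gaussian, binomial, Poisson, ...), g is the link function, u_{j[i]} is the group-level effect for the group containing observation i, and the user places priors on the coefficients, the group-level covariance, and any auxiliary parameter phi.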


Journal ArticleDOI
TL;DR: Variational inference (VI), including the mean-field variant discussed by the authors, approximates probability densities through optimization; it is used in many applications and tends to be faster than classical methods such as Markov chain Monte Carlo sampling.
Abstract: One of the core problems of modern statistics is to approximate difficult-to-compute probability densities. This problem is especially important in Bayesian statistics, which frames all inference about unknown quantities as a calculation involving the posterior density. In this article, we review variational inference (VI), a method from machine learning that approximates probability densities through optimization. VI has been used in many applications and tends to be faster than classical methods, such as Markov chain Monte Carlo sampling. The idea behind VI is to first posit a family of densities and then to find a member of that family which is close to the target density. Closeness is measured by Kullback–Leibler divergence. We review the ideas behind mean-field variational inference, discuss the special case of VI applied to exponential family models, present a full example with a Bayesian mixture of Gaussians, and derive a variant that uses stochastic optimization to scale up to massive data...

3,421 citations
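
The optimization at the heart of VI can be stated in two standard formulas (generic to the VI literature rather than specific to this article): choose the member of a family Q closest in KL divergence to the posterior, which is equivalent to maximizing the evidence lower bound (ELBO):

```latex
q^{*}(\mathbf{z}) \;=\; \operatorname*{arg\,min}_{q \in \mathcal{Q}} \, \mathrm{KL}\big(q(\mathbf{z}) \,\|\, p(\mathbf{z} \mid \mathbf{x})\big),
\qquad
\mathrm{ELBO}(q) \;=\; \mathbb{E}_{q}\big[\log p(\mathbf{x}, \mathbf{z})\big] - \mathbb{E}_{q}\big[\log q(\mathbf{z})\big].
```

Since log p(x) = ELBO(q) + KL(q(z) || p(z|x)) and log p(x) does not depend on q, maximizing the ELBO minimizes the KL divergence to the target.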


Journal ArticleDOI
TL;DR: bModelTest allows a Bayesian approach to inferring and marginalizing site models within a phylogenetic analysis, so that the site model does not need to be pre-determined by likelihood-based methods, as is now often the case in practice.
Abstract: Reconstructing phylogenies through Bayesian methods has many benefits, which include providing a mathematically sound framework, providing realistic estimates of uncertainty and being able to incorporate different sources of information based on formal principles. Bayesian phylogenetic analyses are popular for interpreting nucleotide sequence data; however, for such studies one needs to specify a site model and associated substitution model. Often, the parameters of the site model are of no interest and an ad-hoc or additional likelihood-based analysis is used to select a single site model. bModelTest allows for a Bayesian approach to inferring and marginalizing site models in a phylogenetic analysis. It is based on trans-dimensional Markov chain Monte Carlo (MCMC) proposals that allow switching between substitution models as well as estimating the posterior probability for gamma-distributed rate heterogeneity, a proportion of invariable sites and unequal base frequencies. The model can be used with the full set of time-reversible models on nucleotides, but we also introduce and demonstrate the use of two subsets of time-reversible substitution models. With the new method the site model can be inferred (and marginalized) during the MCMC analysis and does not need to be pre-determined, as is now often the case in practice, by likelihood-based methods. The method is implemented in the bModelTest package of the popular BEAST 2 software, which is open source, licensed under the GNU Lesser General Public License and allows joint site model and tree inference under a wide range of models.

528 citations


Journal ArticleDOI
TL;DR: Integrated nested Laplace approximation (INLA), as reviewed in this paper, builds a nested version of the classical Laplace method, which approximates the integrand with a second-order Taylor expansion around the mode and computes the integral analytically, to perform approximate Bayesian inference for latent Gaussian models.
Abstract: The key operation in Bayesian inference is to compute high-dimensional integrals. An old approximate technique is the Laplace method or approximation, which dates back to Pierre-Simon Laplace (1774). This simple idea approximates the integrand with a second-order Taylor expansion around the mode and computes the integral analytically. By developing a nested version of this classical idea, combined with modern numerical techniques for sparse matrices, we obtain the approach of integrated nested Laplace approximations (INLA) to do approximate Bayesian inference for latent Gaussian models (LGMs). LGMs represent an important model abstraction for Bayesian inference and include a large proportion of the statistical models used today. In this review, we discuss the reasons for the success of the INLA approach, the R-INLA package, why it is so accurate, why the approximations are very quick to compute, and why LGMs make such a useful concept for Bayesian computing.

458 citations
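
For reference, the classical Laplace approximation that INLA nests takes the textbook form below (stated without the INLA-specific nesting and sparse-matrix machinery):

```latex
\int_{\mathbb{R}^d} e^{\,g(\boldsymbol{\theta})} \, d\boldsymbol{\theta}
\;\approx\;
e^{\,g(\hat{\boldsymbol{\theta}})} \, (2\pi)^{d/2} \, \big| -\nabla^2 g(\hat{\boldsymbol{\theta}}) \big|^{-1/2},
```

where θ̂ is the mode of g and the determinant term comes from the second-order Taylor expansion around that mode.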


Book
26 Jun 2017
TL;DR: This authoritative text draws on theoretical advances of the past twenty years to synthesize all aspects of Bayesian nonparametrics, from prior construction to computation and large sample behavior of posteriors, making it valuable for both graduate students and researchers in statistics and machine learning.
Abstract: Explosive growth in computing power has made Bayesian methods for infinite-dimensional models - Bayesian nonparametrics - a nearly universal framework for inference, finding practical use in numerous subject areas. Written by leading researchers, this authoritative text draws on theoretical advances of the past twenty years to synthesize all aspects of Bayesian nonparametrics, from prior construction to computation and large sample behavior of posteriors. Because understanding the behavior of posteriors is critical to selecting priors that work, the large sample theory is developed systematically, illustrated by various examples of model and prior combinations. Precise sufficient conditions are given, with complete proofs, that ensure desirable posterior properties and behavior. Each chapter ends with historical notes and numerous exercises to deepen and consolidate the reader's understanding, making the book valuable for both graduate students and researchers in statistics and machine learning, as well as in application areas such as econometrics and biostatistics.

458 citations


Posted Content
TL;DR: In this paper, the authors extend Variational Dropout to the case when dropout rates are unbounded, propose a way to reduce the variance of the gradient estimator, and report the first experimental results with individual dropout rates per weight.
Abstract: We explore a recently proposed Variational Dropout technique that provided an elegant Bayesian interpretation to Gaussian Dropout. We extend Variational Dropout to the case when dropout rates are unbounded, propose a way to reduce the variance of the gradient estimator and report the first experimental results with individual dropout rates per weight. Interestingly, it leads to extremely sparse solutions both in fully-connected and convolutional layers. This effect is similar to the automatic relevance determination effect in empirical Bayes but has a number of advantages. We reduce the number of parameters up to 280 times on LeNet architectures and up to 68 times on VGG-like networks with a negligible decrease of accuracy.

424 citations
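
To make the mechanism concrete, here is a minimal PyTorch sketch of a sparsifying variational-dropout linear layer in the spirit of this paper; the KL approximation constants follow the commonly cited form of the sparse variational dropout objective, the log α > 3 pruning threshold is the conventional choice, and none of this is the authors' reference implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VariationalDropoutLinear(nn.Module):
    """Linear layer with a fully factorized Gaussian posterior per weight.

    The per-weight dropout rate alpha = sigma^2 / theta^2 is left unbounded;
    weights whose alpha grows large carry no information and can be pruned.
    """

    def __init__(self, in_features, out_features, threshold=3.0):
        super().__init__()
        self.theta = nn.Parameter(0.01 * torch.randn(out_features, in_features))
        self.log_sigma2 = nn.Parameter(torch.full((out_features, in_features), -10.0))
        self.threshold = threshold  # prune weights where log(alpha) exceeds this

    def log_alpha(self):
        return self.log_sigma2 - torch.log(self.theta ** 2 + 1e-8)

    def forward(self, x):
        if self.training:
            # Local reparameterization: sample pre-activations, not weights.
            mean = F.linear(x, self.theta)
            std = torch.sqrt(F.linear(x ** 2, self.log_sigma2.exp()) + 1e-8)
            return mean + std * torch.randn_like(mean)
        # Deterministic test-time pass with high-alpha weights zeroed out.
        mask = (self.log_alpha() < self.threshold).float()
        return F.linear(x, self.theta * mask)

    def kl(self):
        # Approximate KL(q || p) for the log-uniform prior, using the commonly
        # cited sigmoid fit with constants k1, k2, k3; note that
        # 0.5*log(1 + alpha^{-1}) equals 0.5*softplus(-log_alpha).
        k1, k2, k3 = 0.63576, 1.87320, 1.48695
        la = self.log_alpha()
        neg_kl = k1 * torch.sigmoid(k2 + k3 * la) - 0.5 * F.softplus(-la) - k1
        return -neg_kl.sum()
```

Training would minimize the task loss plus a (possibly annealed) weight on the summed kl() terms across such layers; after training, most log-alpha values drift above the threshold, which is what produces the reported sparsity.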


Journal ArticleDOI
TL;DR: The authors show how visualization aids every stage of practical Bayesian data analysis (model building, inference, model checking and evaluation, and model expansion) and argue that it is indispensable when drawing inferences from modern, high-dimensional models.
Abstract: Bayesian data analysis is about more than just computing a posterior distribution, and Bayesian visualization is about more than trace plots of Markov chains. Practical Bayesian data analysis, like all data analysis, is an iterative process of model building, inference, model checking and evaluation, and model expansion. Visualization is helpful in each of these stages of the Bayesian workflow and it is indispensable when drawing inferences from the types of modern, high-dimensional models that are used by applied researchers.

390 citations
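
As one concrete instance of the visual model checking this workflow relies on, here is a minimal posterior predictive check for a toy normal model; it assumes only NumPy and Matplotlib, and the data and model are invented for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
y = rng.gamma(2.0, 2.0, size=200)   # observed data (deliberately skewed)
n = len(y)

# Posterior draws of (mu, sigma) for a normal model under a flat prior:
# SS / sigma^2 ~ chi^2_{n-1}, then mu | sigma ~ N(ybar, sigma / sqrt(n)).
sigma_draws = np.sqrt(np.sum((y - y.mean()) ** 2) / rng.chisquare(n - 1, size=500))
mu_draws = rng.normal(y.mean(), sigma_draws / np.sqrt(n))

# Posterior predictive replicates: one fake dataset per posterior draw.
y_rep = rng.normal(mu_draws[:, None], sigma_draws[:, None], size=(500, n))

# Overlay replicated histograms on the observed data; a shape mismatch
# (here, skewness) is exactly what this plot is meant to reveal.
for r in y_rep[:50]:
    plt.hist(r, bins=30, histtype='step', alpha=0.1, color='C0', density=True)
plt.hist(y, bins=30, histtype='step', color='C3', lw=2, density=True)
plt.xlabel('y')
plt.title('Posterior predictive check: normal model vs. skewed data')
plt.show()
```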


Proceedings ArticleDOI
01 Jan 2017
TL;DR: Bayesian SegNet uses Monte Carlo sampling with dropout at test time to generate a posterior distribution over pixel class labels; modelling this uncertainty improves segmentation performance by 2-3% across a number of datasets and architectures.
Abstract: We present a deep learning framework for probabilistic pixel-wise semantic segmentation, which we term Bayesian SegNet. Semantic segmentation is an important tool for visual scene understanding and a meaningful measure of uncertainty is essential for decision making. Our contribution is a practical system which is able to predict pixel-wise class labels with a measure of model uncertainty using Bayesian deep learning. We achieve this by Monte Carlo sampling with dropout at test time to generate a posterior distribution of pixel class labels. In addition, we show that modelling uncertainty improves segmentation performance by 2-3% across a number of datasets and architectures such as SegNet, FCN, Dilation Network and DenseNet.

276 citations
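
A minimal sketch of the paper's core mechanism, Monte Carlo dropout at test time, assuming PyTorch; the tiny network below is a hypothetical stand-in for SegNet or any other architecture containing dropout layers:

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model, x, num_samples=20):
    """Run num_samples stochastic forward passes with dropout kept active.

    Returns the mean softmax probabilities (the prediction) and their
    per-pixel variance (a model-uncertainty estimate).
    """
    model.eval()
    for m in model.modules():            # re-enable dropout at test time
        if isinstance(m, nn.Dropout):
            m.train()
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=1)
                             for _ in range(num_samples)])
    return probs.mean(dim=0), probs.var(dim=0)

# Toy stand-in for a segmentation net: conv -> dropout -> conv (hypothetical).
net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                    nn.Dropout(0.5), nn.Conv2d(16, 5, 3, padding=1))
mean_p, var_p = mc_dropout_predict(net, torch.randn(1, 3, 32, 32))
labels = mean_p.argmax(dim=1)            # predicted per-pixel class labels
```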


Journal ArticleDOI
TL;DR: The authors provide a Bayes factor approach to multiway analysis of variance (ANOVA) that allows researchers to state graded evidence for effects or invariances as determined by the data.
Abstract: This article provides a Bayes factor approach to multiway analysis of variance (ANOVA) that allows researchers to state graded evidence for effects or invariances as determined by the data. ANOVA is conceptualized as a hierarchical model where levels are clustered within factors. The development is comprehensive in that it includes Bayes factors for fixed and random effects and for within-subjects, between-subjects, and mixed designs. Different model construction and comparison strategies are discussed, and an example is provided. We show how Bayes factors may be computed with the BayesFactor package in R and with the JASP statistical package.

274 citations
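
For reference, the Bayes factor that expresses such graded evidence is the standard ratio of marginal likelihoods:

```latex
\mathrm{BF}_{10} \;=\; \frac{p(\mathbf{y} \mid \mathcal{M}_1)}{p(\mathbf{y} \mid \mathcal{M}_0)}
\;=\;
\frac{\int p(\mathbf{y} \mid \boldsymbol{\theta}_1, \mathcal{M}_1)\, p(\boldsymbol{\theta}_1 \mid \mathcal{M}_1)\, d\boldsymbol{\theta}_1}
     {\int p(\mathbf{y} \mid \boldsymbol{\theta}_0, \mathcal{M}_0)\, p(\boldsymbol{\theta}_0 \mid \mathcal{M}_0)\, d\boldsymbol{\theta}_0},
```

so BF_10 > 1 is evidence for the effect model M1 and BF_10 < 1 is evidence for the invariance model M0.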


Journal ArticleDOI
TL;DR: This paper resolves an apparent paradox in prior modeling: a model encoding true prior information should be chosen without reference to the model of the measurement process, but almost all common prior modeling techniques are implicitly motivated by a reference likelihood.
Abstract: A key sticking point of Bayesian analysis is the choice of prior distribution, and there is a vast literature on potential defaults including uniform priors, Jeffreys' priors, reference priors, maximum entropy priors, and weakly informative priors. These methods, however, often manifest a key conceptual tension in prior modeling: a model encoding true prior information should be chosen without reference to the model of the measurement process, but almost all common prior modeling techniques are implicitly motivated by a reference likelihood. In this paper we resolve this apparent paradox by placing the choice of prior into the context of the entire Bayesian analysis, from inference to prediction to model evaluation.

264 citations


Journal ArticleDOI
TL;DR: This article uses simulations of abstract rule learning and approximate Bayesian inference to show that minimizing (expected) variational free energy leads to active sampling of novel contingencies and closes explanatory gaps in generative models of the world, thereby reducing uncertainty and satisfying curiosity.
Abstract: This article offers a formal account of curiosity and insight in terms of active (Bayesian) inference. It deals with the dual problem of inferring states of the world and learning its statistical structure. In contrast to current trends in machine learning (e.g., deep learning), we focus on how people attain insight and understanding using just a handful of observations, which are solicited through curious behavior. We use simulations of abstract rule learning and approximate Bayesian inference to show that minimizing (expected) variational free energy leads to active sampling of novel contingencies. This epistemic behavior closes explanatory gaps in generative models of the world, thereby reducing uncertainty and satisfying curiosity. We then move from epistemic learning to model selection or structure learning to show how abductive processes emerge when agents test plausible hypotheses about symmetries (i.e., invariances or rules) in their generative models. The ensuing Bayesian model reduction evinces ...

Journal ArticleDOI
TL;DR: A succinct checklist, the WAMBS-checklist (When to worry and how to Avoid the Misuse of Bayesian Statistics), is developed to describe 10 main points that should be thoroughly checked when applying Bayesian analysis.
Abstract: Bayesian statistical methods are slowly creeping into all fields of science and are becoming ever more popular in applied research. Although it is very attractive to use Bayesian statistics, our personal experience has led us to believe that naively applying Bayesian methods can be dangerous for at least 3 main reasons: the potential influence of priors, misinterpretation of Bayesian features and results, and improper reporting of Bayesian results. To deal with these 3 points of potential danger, we have developed a succinct checklist: the WAMBS-checklist (When to worry and how to Avoid the Misuse of Bayesian Statistics). The purpose of the questionnaire is to describe 10 main points that should be thoroughly checked when applying Bayesian analysis. We provide an account of "when to worry" for each of these issues related to: (a) issues to check before estimating the model, (b) issues to check after estimating the model but before interpreting results, (c) understanding the influence of priors, and (d) actions to take after interpreting results. To accompany these key points of concern, we will present diagnostic tools that can be used in conjunction with the development and assessment of a Bayesian model. We also include examples of how to interpret results when "problems" in estimation arise, as well as syntax and instructions for implementation. Our aim is to stress the importance of openness and transparency of all aspects of Bayesian estimation, and it is our hope that the WAMBS questionnaire can aid in this process.

Journal ArticleDOI
TL;DR: It is empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting.
Abstract: We empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting. We use the standard linear model, which assumes homoskedasticity, whereas the data are heteroskedastic (though, significantly, there are no outliers). As sample size increases, the posterior puts its mass on worse and worse models of ever higher dimension. This is caused by hypercompression, the phenomenon that the posterior puts its mass on distributions that have much larger KL divergence from the ground truth than their average, i.e. the Bayes predictive distribution. To remedy the problem, we equip the likelihood in Bayes' theorem with an exponent called the learning rate, and we propose the SafeBayesian method to learn the learning rate from the data. SafeBayes tends to select small learning rates, and regularizes more, as soon as hypercompression takes place. Its results on our data are quite encouraging.
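
Writing out the "likelihood with an exponent" makes the role of the learning rate explicit; in standard notation the generalized (tempered) posterior is:

```latex
\pi_{\eta}(\theta \mid y_{1:n}) \;\propto\; \pi(\theta) \left( \prod_{i=1}^{n} p(y_i \mid \theta) \right)^{\!\eta},
```

where η = 1 recovers standard Bayes and η < 1 downweights the likelihood, regularizing more; SafeBayes chooses η from the data, selecting small η as soon as hypercompression takes place.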

Journal ArticleDOI
TL;DR: It is found in this review that the use of Bayes has increased and broadened in the sense that this methodology can be used in a flexible manner to tackle many different forms of questions.
Abstract: Although the statistical tools most often used by researchers in the field of psychology over the last 25 years are based on frequentist statistics, it is often claimed that the alternative Bayesian approach to statistics is gaining in popularity. In the current article, we investigated this claim by performing the very first systematic review of Bayesian psychological articles published between 1990 and 2015 (n = 1,579). We aim to provide a thorough presentation of the role Bayesian statistics plays in psychology. This historical assessment allows us to identify trends and see how Bayesian methods have been integrated into psychological research in the context of different statistical frameworks (e.g., hypothesis testing, cognitive models, IRT, SEM, etc.). We also describe take-home messages and provide "big-picture" recommendations to the field as Bayesian statistics becomes more popular. Our review indicated that Bayesian statistics is used in a variety of contexts across subfields of psychology and related disciplines. There are many different reasons why one might choose to use Bayes (e.g., the use of priors, estimating otherwise intractable models, modeling uncertainty, etc.). We found in this review that the use of Bayes has increased and broadened in the sense that this methodology can be used in a flexible manner to tackle many different forms of questions. We hope this presentation opens the door for a larger discussion regarding the current state of Bayesian statistics, as well as future trends.

Posted Content
TL;DR: An overview of recent trends in variational inference is given and a summary of promising future research directions is provided.
Abstract: Many modern unsupervised or semi-supervised machine learning algorithms rely on Bayesian probabilistic models. These models are usually intractable and thus require approximate inference. Variational inference (VI) lets us approximate a high-dimensional Bayesian posterior with a simpler variational distribution by solving an optimization problem. This approach has been successfully used in various models and large-scale applications. In this review, we give an overview of recent trends in variational inference. We first introduce standard mean field variational inference, then review recent advances focusing on the following aspects: (a) scalable VI, which includes stochastic approximations, (b) generic VI, which extends the applicability of VI to a large class of otherwise intractable models, such as non-conjugate models, (c) accurate VI, which includes variational models beyond the mean field approximation or with atypical divergences, and (d) amortized VI, which implements the inference over local latent variables with inference networks. Finally, we provide a summary of promising future research directions.
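
A minimal sketch of the stochastic, gradient-based VI that several of these advances build on, assuming PyTorch: a Gaussian variational distribution is fitted to a toy unnormalized posterior by maximizing a Monte Carlo estimate of the ELBO via the reparameterization trick (the target density is made up for illustration):

```python
import math
import torch

def log_target(z):
    # Toy unnormalized log posterior density (hypothetical target).
    return -0.5 * z ** 2 + 0.3 * torch.tanh(z)

mu = torch.tensor(0.0, requires_grad=True)
log_sigma = torch.tensor(0.0, requires_grad=True)
opt = torch.optim.Adam([mu, log_sigma], lr=0.05)

for step in range(2000):
    eps = torch.randn(64)
    z = mu + log_sigma.exp() * eps   # reparameterization: z = mu + sigma * eps
    # Monte Carlo ELBO: E_q[log p(z)] plus the entropy of q = N(mu, sigma^2).
    elbo = log_target(z).mean() + log_sigma + 0.5 * math.log(2 * math.pi * math.e)
    loss = -elbo
    opt.zero_grad()
    loss.backward()
    opt.step()

print(float(mu), float(log_sigma.exp()))   # fitted variational mean and sd
```

Replacing the full-data log target with minibatch estimates gives the "scalable VI" of item (a); amortized VI of item (d) would instead predict mu and log_sigma with an inference network.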

Journal ArticleDOI
TL;DR: A new post-classification method with iterative slow feature analysis (ISFA) and Bayesian soft fusion is proposed to obtain reliable and accurate change detection maps; it achieves clearly higher change detection accuracy than the current state-of-the-art methods.

Journal ArticleDOI
TL;DR: A review of Monte Carlo-based methods for Bayesian data analysis can be found in this paper, where the authors explain the basics of Bayesian theory and discuss how to set up data analysis problems within this framework.
Abstract: Markov chain Monte Carlo–based Bayesian data analysis has now become the method of choice for analyzing and interpreting data in almost all disciplines of science. In astronomy, over the past decade, we have also seen a steady increase in the number of papers that employ Monte Carlo–based Bayesian analysis. New, efficient Monte Carlo–based methods are continuously being developed and explored. In this review, we first explain the basics of Bayesian theory and discuss how to set up data analysis problems within this framework. Next, we provide an overview of various Monte Carlo–based methods for performing Bayesian data analysis. Finally, we discuss advanced ideas that enable us to tackle complex problems and thus hold great promise for the future. We also distribute downloadable computer software, written in Python (https://github.com/sanjibs/bmcmc), that implements some of the algorithms and examples discussed here.
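
To make the basic machinery concrete, here is a generic random-walk Metropolis sampler in Python; this is the textbook algorithm such reviews cover, not the API of the bmcmc package itself:

```python
import numpy as np

def metropolis(log_post, x0, n_samples=5000, step=0.5, seed=0):
    """Random-walk Metropolis: propose x' = x + step * N(0, I) and accept
    with probability min(1, p(x') / p(x)), computed on the log scale."""
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    lp = log_post(x)
    chain = np.empty((n_samples, x.size))
    for i in range(n_samples):
        prop = x + step * rng.standard_normal(x.size)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis acceptance
            x, lp = prop, lp_prop
        chain[i] = x
    return chain

# Example: sample a standard normal posterior (illustrative target only).
samples = metropolis(lambda x: -0.5 * np.sum(x ** 2), x0=[3.0])
print(samples[1000:].mean(), samples[1000:].std())  # ~0, ~1 after burn-in
```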

Journal ArticleDOI
TL;DR: The authors survey the development of approximate inference techniques for fundamental problems in signal processing, such as localization of objects in wireless sensor networks and the Internet of Things, and multiple source reconstruction from electroencephalograms.
Abstract: A fundamental problem in signal processing is the estimation of unknown parameters or functions from noisy observations. Important examples include localization of objects in wireless sensor networks [1] and the Internet of Things [2]; multiple source reconstruction from electroencephalograms [3]; estimation of power spectral density for speech enhancement [4]; or inference in genomic signal processing [5]. Within the Bayesian signal processing framework, these problems are addressed by constructing posterior probability distributions of the unknowns. The posteriors combine optimally all of the information about the unknowns in the observations with the information that is present in their prior probability distributions. Given the posterior, one often wants to make inference about the unknowns, e.g., if we are estimating parameters, finding the values that maximize their posterior or the values that minimize some cost function given the uncertainty of the parameters. Unfortunately, obtaining closed-form solutions to these types of problems is infeasible in most practical applications, and therefore, developing approximate inference techniques is of utmost interest.

Journal ArticleDOI
TL;DR: The Bayesian advantages of the newly developed statistical software program JASP are discussed using real data on the relation between Quality of Life and Executive Functioning in children with Autism Spectrum Disorder.
Abstract: We illustrate the Bayesian approach to data analysis using the newly developed statistical software program JASP. With JASP, researchers are able to take advantage of the benefits that the Bayesian framework has to offer in terms of parameter estimation and hypothesis testing. The Bayesian advantages are discussed using real data on the relation between Quality of Life and Executive Functioning in children with Autism Spectrum Disorder.

Journal ArticleDOI
TL;DR: RWTY is an R package that implements established and new methods for diagnosing phylogenetic MCMC convergence in a single convenient interface, and it remains convenient to use for large data sets.
Abstract: Bayesian inference using Markov chain Monte Carlo (MCMC) has become one of the primary methods used to infer phylogenies from sequence data. Assessing convergence is a crucial component of these analyses, as it establishes the reliability of the posterior distribution estimates of the tree topology and model parameters sampled from the MCMC. Numerous tests and visualizations have been developed for this purpose, but many of the most popular methods are implemented in ways that make them inconvenient to use for large data sets. RWTY is an R package that implements established and new methods for diagnosing phylogenetic MCMC convergence in a single convenient interface.

Posted Content
TL;DR: This paper proposes a Bayesian causal forest model for estimating heterogeneous treatment effects from observational data, geared specifically towards situations with small effect sizes, heterogeneous effects, and strong confounding.
Abstract: This paper presents a novel nonlinear regression model for estimating heterogeneous treatment effects from observational data, geared specifically towards situations with small effect sizes, heterogeneous effects, and strong confounding. Standard nonlinear regression models, which may work quite well for prediction, have two notable weaknesses when used to estimate heterogeneous treatment effects. First, they can yield badly biased estimates of treatment effects when fit to data with strong confounding. The Bayesian causal forest model presented in this paper avoids this problem by directly incorporating an estimate of the propensity function in the specification of the response model, implicitly inducing a covariate-dependent prior on the regression function. Second, standard approaches to response surface modeling do not provide adequate control over the strength of regularization over effect heterogeneity. The Bayesian causal forest model permits treatment effect heterogeneity to be regularized separately from the prognostic effect of control variables, making it possible to informatively "shrink to homogeneity". We illustrate these benefits via the reanalysis of an observational study assessing the causal effects of smoking on medical expenditures as well as extensive simulation studies.
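
The response-surface structure described here can be written compactly as below; the notation is a paraphrase of the abstract (with π̂(x) the estimated propensity function and z the treatment indicator), not the paper's exact parameterization:

```latex
y_i \;=\; \mu\!\left(x_i, \hat{\pi}(x_i)\right) \;+\; \tau(x_i)\, z_i \;+\; \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2),
```

where the prognostic function μ and the treatment-effect function τ receive separate regression-tree priors, so regularization of the effect heterogeneity τ can be controlled independently of μ, allowing informative shrinkage "to homogeneity".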

Journal Article
TL;DR: The method, Bayesian Rule Sets (BRS), is applied to characterize and predict user behavior with respect to in-vehicle context-aware personalized recommender systems and has a major advantage over classical associative classification methods and decision trees.
Abstract: We present a machine learning algorithm for building classifiers that are comprised of a small number of short rules. These are restricted disjunctive normal form models. An example of a classifier of this form is as follows: If X satisfies (condition A AND condition B) OR (condition C) OR ..., then Y = 1. Models of this form have the advantage of being interpretable to human experts since they produce a set of rules that concisely describe a specific class. We present two probabilistic models with prior parameters that the user can set to encourage the model to have a desired size and shape, to conform with a domain-specific definition of interpretability. We provide a scalable MAP inference approach and develop theoretical bounds to reduce computation by iteratively pruning the search space. We apply our method (Bayesian Rule Sets - BRS) to characterize and predict user behavior with respect to in-vehicle context-aware personalized recommender systems. Our method has a major advantage over classical associative classification methods and decision trees in that it does not greedily grow the model.
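
A minimal sketch of what such a classifier looks like at prediction time, in Python; the rules and feature names are invented, and the paper's actual contribution, the MAP inference that learns the rules, is not shown:

```python
# A rule set in restricted disjunctive normal form: a list of rules, each
# rule a list of (feature, required_value) conditions that must all hold
# (AND); the classifier predicts Y = 1 if any rule fires (OR).
rule_set = [
    [("weather", "sunny"), ("passenger", "alone")],   # hypothetical rule 1
    [("time_of_day", "morning")],                     # hypothetical rule 2
]

def predict(rule_set, x):
    return int(any(all(x.get(feature) == value for feature, value in rule)
                   for rule in rule_set))

x = {"weather": "sunny", "passenger": "alone", "time_of_day": "evening"}
print(predict(rule_set, x))   # -> 1, because rule 1 fires
```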

Journal ArticleDOI
TL;DR: It is argued that predictive coding is an algorithmic/representational motif that can serve several different computational goals of which Bayesian inference is but one, and that while Bayesian inference can utilize predictive coding, it can also be realized by a variety of other representations.

Journal ArticleDOI
TL;DR: This work presents a nonparametric Bayesian method to infer the modular structure of empirical networks, including the number of modules and their hierarchical organization, and exposes a direct equivalence between this microcanonical approach and alternative derivations based on the canonical SBM.
Abstract: A principled approach to characterize the hidden structure of networks is to formulate generative models and then infer their parameters from data. When the desired structure is composed of modules or "communities," a suitable choice for this task is the stochastic block model (SBM), where nodes are divided into groups, and the placement of edges is conditioned on the group memberships. Here, we present a nonparametric Bayesian method to infer the modular structure of empirical networks, including the number of modules and their hierarchical organization. We focus on a microcanonical variant of the SBM, where the structure is imposed via hard constraints, i.e., the generated networks are not allowed to violate the patterns imposed by the model. We show how this simple model variation allows simultaneously for two important improvements over more traditional inference approaches: (1) deeper Bayesian hierarchies, with noninformative priors replaced by sequences of priors and hyperpriors, which not only remove limitations that seriously degrade the inference on large networks but also reveal structures at multiple scales; (2) a very efficient inference algorithm that scales well not only for networks with a large number of nodes and edges but also with an unlimited number of modules. We show also how this approach can be used to sample modular hierarchies from the posterior distribution, as well as to perform model selection. We discuss and analyze the differences between sampling from the posterior and simply finding the single parameter estimate that maximizes it. Furthermore, we expose a direct equivalence between our microcanonical approach and alternative derivations based on the canonical SBM.
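
This approach is implemented in the graph-tool Python library (the author's software); a minimal usage sketch, assuming graph-tool is installed and using one of its bundled example networks, might look like this:

```python
import graph_tool.all as gt

# Load a small bundled network and fit the nested (hierarchical) SBM by
# minimizing the description length, i.e., maximizing the posterior.
g = gt.collection.data["football"]
state = gt.minimize_nested_blockmodel_dl(g)

state.print_summary()                      # group counts per hierarchy level
b = state.get_levels()[0].get_blocks()     # inferred group of each node
```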

Posted Content
TL;DR: This work shows that a simple adaptation of truncated backpropagation through time can yield good quality uncertainty estimates and superior regularisation at only a small extra computational cost during training, and demonstrates how a novel kind of posterior approximation yields further improvements to the performance of Bayesian RNNs.
Abstract: In this work we explore a straightforward variational Bayes scheme for Recurrent Neural Networks. Firstly, we show that a simple adaptation of truncated backpropagation through time can yield good quality uncertainty estimates and superior regularisation at only a small extra computational cost during training, also reducing the amount of parameters by 80%. Secondly, we demonstrate how a novel kind of posterior approximation yields further improvements to the performance of Bayesian RNNs. We incorporate local gradient information into the approximate posterior to sharpen it around the current batch statistics. We show how this technique is not exclusive to recurrent neural networks and can be applied more widely to train Bayesian neural networks. We also empirically demonstrate how Bayesian RNNs are superior to traditional RNNs on a language modelling benchmark and an image captioning task, as well as showing how each of these methods improve our model over a variety of other schemes for training them. We also introduce a new benchmark for studying uncertainty for language models so future methods can be easily compared.

Journal ArticleDOI
TL;DR: In this paper, a Bayesian method for inferring the species phylogeny under the multispecies coalescent (MSC) model was developed, which integrates over gene trees, naturally taking account of the uncertainty of gene tree topology and branch lengths given the sequence data.
Abstract: We develop a Bayesian method for inferring the species phylogeny under the multispecies coalescent (MSC) model. To improve the mixing properties of the Markov chain Monte Carlo (MCMC) algorithm that traverses the space of species trees, we implement two efficient MCMC proposals: the first is based on the Subtree Pruning and Regrafting (SPR) algorithm and the second is based on a node-slider algorithm. Like the Nearest-Neighbor Interchange (NNI) algorithm we implemented previously, both new algorithms propose changes to the species tree, while simultaneously altering the gene trees at multiple genetic loci to automatically avoid conflicts with the newly proposed species tree. The method integrates over gene trees, naturally taking account of the uncertainty of gene tree topology and branch lengths given the sequence data. A simulation study was performed to examine the statistical properties of the new method. The method was found to show excellent statistical performance, inferring the correct species tree with near certainty when 10 loci were included in the dataset. The prior on species trees has some impact, particularly for small numbers of loci. We analyzed several previously published datasets (both real and simulated) for rattlesnakes and Philippine shrews, in comparison with alternative methods. The results suggest that the Bayesian coalescent-based method is statistically more efficient than heuristic methods based on summary statistics, and that our implementation is computationally more efficient than alternative full-likelihood methods under the MSC. Parameter estimates for the rattlesnake data suggest drastically different evolutionary dynamics between the nuclear and mitochondrial loci, even though they support largely consistent species trees. We discuss the different challenges facing the marginal likelihood calculation and transmodel MCMC as alternative strategies for estimating posterior probabilities for species trees. [Bayes factor; Bayesian inference; MCMC; multispecies coalescent; nodeslider; species tree; SPR.].

Journal ArticleDOI
TL;DR: A Bayesian framework for face sketch synthesis is proposed, which provides a systematic interpretation for understanding the common properties and intrinsic differences of different methods from the perspective of probabilistic graphical models.
Abstract: Exemplar-based face sketch synthesis has been widely applied to both digital entertainment and law enforcement. In this paper, we propose a Bayesian framework for face sketch synthesis, which provides a systematic interpretation for understanding the common properties and intrinsic difference in different methods from the perspective of probabilistic graphical models. The proposed Bayesian framework consists of two parts: the neighbor selection model and the weight computation model. Within the proposed framework, we further propose a Bayesian face sketch synthesis method. The essential rationale behind the proposed Bayesian method is that we take the spatial neighboring constraint between adjacent image patches into consideration for both aforementioned models, while the state-of-the-art methods neglect the constraint either in the neighbor selection model or in the weight computation model. Extensive experiments on the Chinese University of Hong Kong face sketch database demonstrate that the proposed Bayesian method could achieve superior performance compared with the state-of-the-art methods in terms of both subjective perceptions and objective evaluations.

Journal ArticleDOI
TL;DR: The major features of Bayesian phylogenetic inference are summarized and Bayesian computation using Markov chain Monte Carlo sampling, the diagnosis of an MCMC run, and ways of summarizing the MCMC sample are discussed.
Abstract: Bayesian methods have become very popular in molecular phylogenetics due to the availability of user-friendly software for running sophisticated models of evolution. However, Bayesian phylogenetic models are complex, and analyses are often carried out using default settings, which may not be appropriate. Here we summarize the major features of Bayesian phylogenetic inference and discuss Bayesian computation using Markov chain Monte Carlo (MCMC) sampling, the diagnosis of an MCMC run, and ways of summarizing the MCMC sample. We discuss the specification of the prior, the choice of the substitution model and partitioning of the data. Finally, we provide a list of common Bayesian phylogenetic software packages and recommend appropriate applications.

Posted Content
TL;DR: Using the decomposition of uncertainty into aleatoric and epistemic components for decision-making purposes, a novel risk-sensitive criterion for reinforcement learning is defined to identify policies that balance expected cost, model-bias and noise aversion.
Abstract: Bayesian neural networks with latent variables are scalable and flexible probabilistic models: They account for uncertainty in the estimation of the network weights and, by making use of latent variables, can capture complex noise patterns in the data. We show how to extract and decompose uncertainty into epistemic and aleatoric components for decision-making purposes. This allows us to successfully identify informative points for active learning of functions with heteroscedastic and bimodal noise. Using the decomposition we further define a novel risk-sensitive criterion for reinforcement learning to identify policies that balance expected cost, model-bias and noise aversion.
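
The epistemic/aleatoric split described here follows the law of total variance; a minimal NumPy sketch, assuming we already have a predictive mean and a noise variance from each of M posterior weight samples at a fixed input (the numbers below are synthetic placeholders):

```python
import numpy as np

# Hypothetical outputs from M forward passes with sampled network weights:
# each pass m yields a predictive mean mu_m(x) and a noise variance s2_m(x).
rng = np.random.default_rng(0)
means = rng.normal(1.0, 0.3, size=100)        # mu_m(x), m = 1..M
noise_vars = rng.uniform(0.4, 0.6, size=100)  # sigma_m^2(x)

epistemic = means.var()        # Var_m[mu_m]: uncertainty about the model
aleatoric = noise_vars.mean()  # E_m[sigma_m^2]: irreducible observation noise
total = epistemic + aleatoric  # law of total variance for the prediction

print(epistemic, aleatoric, total)
```

Active learning then targets inputs with high epistemic variance, while the paper's risk-sensitive criterion penalizes the two components separately to trade off model-bias and noise aversion.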