
Showing papers on "Bayesian inference" published in 2010


Journal ArticleDOI
TL;DR: It is demonstrated that both BEST and the new Bayesian Markov chain Monte Carlo method for the multispecies coalescent have much better estimation accuracy for species tree topology than concatenation, and the method outperforms BEST in divergence time and population size estimation.
Abstract: Until recently, it has been common practice for a phylogenetic analysis to use a single gene sequence from a single individual organism as a proxy for an entire species. With technological advances, it is now becoming more common to collect data sets containing multiple gene loci and multiple individuals per species. These data sets often reveal the need to directly model intraspecies polymorphism and incomplete lineage sorting in phylogenetic estimation procedures. For a single species, coalescent theory is widely used in contemporary population genetics to model intraspecific gene trees. Here, we present a Bayesian Markov chain Monte Carlo method for the multispecies coalescent. Our method coestimates multiple gene trees embedded in a shared species tree along with the effective population size of both extant and ancestral species. The inference is made possible by multilocus data from multiple individuals per species. Using a multiindividual data set and a series of simulations of rapid species radiations, we demonstrate the efficacy of our new method. These simulations give some insight into the behavior of the method as a function of sampled individuals, sampled loci, and sequence length. Finally, we compare our new method to both an existing method (BEST 2.2) with similar goals and the supermatrix (concatenation) method. We demonstrate that both BEST and our method have much better estimation accuracy for species tree topology than concatenation, and our method outperforms BEST in divergence time and population size estimation.

2,401 citations


Journal ArticleDOI
TL;DR: Although the method arose in population genetics, ABC is increasingly used in other fields, including epidemiology, systems biology, ecology, and agent-based modeling, and many of these applications are briefly described.
Abstract: In the past 10 years a statistical technique, approximate Bayesian computation (ABC), has been developed that can be used to infer parameters and choose between models in the complicated scenarios that are often considered in the environmental sciences. For example, based on gene sequence and microsatellite data, the method has been used to choose between competing models of human demographic history as well as to infer growth rates, times of divergence, and other parameters. The method fits naturally in the Bayesian inferential framework, and a brief overview is given of the key concepts. Three main approaches to ABC have been developed, and these are described and compared. Although the method arose in population genetics, ABC is increasingly used in other fields, including epidemiology, systems biology, ecology, and agent-based modeling, and many of these applications are briefly described.

981 citations
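To make the rejection flavour of ABC described above concrete, here is a minimal Python sketch that infers a Poisson rate without ever evaluating a likelihood. The prior, the single summary statistic, and the tolerance are arbitrary choices made for this illustration; they are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Observed" data: 100 counts from a Poisson process with an unknown rate.
observed = rng.poisson(lam=4.0, size=100)

def summary(x):
    # Single summary statistic: the sample mean.
    return x.mean()

def abc_rejection(observed, n_proposals=50_000, tolerance=0.1):
    """Plain rejection ABC: keep the prior draws whose simulated data give
    a summary statistic close to the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_proposals):
        lam = rng.uniform(0.0, 10.0)                       # draw from the prior
        simulated = rng.poisson(lam, size=observed.size)   # simulate a data set
        if abs(summary(simulated) - s_obs) < tolerance:    # compare summaries
            accepted.append(lam)
    return np.array(accepted)

posterior_sample = abc_rejection(observed)
print(f"accepted {posterior_sample.size} draws; "
      f"approximate posterior mean {posterior_sample.mean():.2f}")
```

Tightening the tolerance or adding summary statistics quickly shrinks the acceptance rate, which is the practical cost that the more elaborate ABC variants reviewed in the paper aim to reduce.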


Journal ArticleDOI
TL;DR: Three approaches for making the naive Bayes classifier discrimination-free are presented: modifying the probability of the decision being positive, training one model for every sensitive attribute value and balancing them, and adding a latent variable to the Bayesian model that represents the unbiased label and optimizing the model parameters for likelihood using expectation maximization.
Abstract: In this paper, we investigate how to modify the naive Bayes classifier in order to perform classification that is restricted to be independent with respect to a given sensitive attribute. Such independency restrictions occur naturally when the decision process leading to the labels in the data-set was biased; e.g., due to gender or racial discrimination. This setting is motivated by many cases in which there exist laws that disallow a decision that is partly based on discrimination. Naive application of machine learning techniques would result in huge fines for companies. We present three approaches for making the naive Bayes classifier discrimination-free: (i) modifying the probability of the decision being positive, (ii) training one model for every sensitive attribute value and balancing them, and (iii) adding a latent variable to the Bayesian model that represents the unbiased label and optimizing the model parameters for likelihood using expectation maximization. We present experiments for the three approaches on both artificial and real-life data.

750 citations
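Approach (i) above modifies the probability of a positive decision so that it no longer depends on the sensitive attribute. The sketch below captures only the spirit of that idea with a cruder stand-in, namely per-group decision thresholds adjusted until the positive rates match; it is not the paper's algorithm, and the synthetic data, feature names and threshold step are invented for the illustration.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

rng = np.random.default_rng(1)

# Toy data: two binary features, a binary sensitive attribute s, and a
# label generated with a built-in bias against the s == 0 group.
n = 5000
s = rng.integers(0, 2, n)
x = rng.integers(0, 2, (n, 2))
y = ((x.sum(axis=1) + s + rng.random(n) * 0.5) >= 2).astype(int)

features = np.column_stack([x, s])
clf = BernoulliNB().fit(features, y)
p_pos = clf.predict_proba(features)[:, 1]

def positive_rate(scores, group, threshold):
    return (scores[group] >= threshold).mean()

# Plain classifier: one global threshold, unequal positive rates per group.
print("biased gap:",
      positive_rate(p_pos, s == 1, 0.5) - positive_rate(p_pos, s == 0, 0.5))

# Simplified "modify the positive-decision probability" idea: lower the
# threshold for the disadvantaged group until the positive rates match.
threshold_0 = 0.5
while (positive_rate(p_pos, s == 0, threshold_0)
       < positive_rate(p_pos, s == 1, 0.5)) and threshold_0 > 0.0:
    threshold_0 -= 0.01

print("adjusted gap:",
      positive_rate(p_pos, s == 1, 0.5) - positive_rate(p_pos, s == 0, threshold_0))
```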


Journal ArticleDOI
TL;DR: A combination of two further approaches is proposed: family-level inference, and Bayesian model averaging within families to provide inferences about parameters that are independent of further assumptions about model structure.
Abstract: Mathematical models of scientific data can be formally compared using Bayesian model evidence. Previous applications in the biological sciences have mainly focussed on model selection in which one first selects the model with the highest evidence and then makes inferences based on the parameters of that model. This "best model" approach is very useful but can become brittle if there are a large number of models to compare, and if different subjects use different models. To overcome this shortcoming we propose the combination of two further approaches: (i) family level inference and (ii) Bayesian model averaging within families. Family level inference removes uncertainty about aspects of model structure other than the characteristic of interest. For example: What are the inputs to the system? Is processing serial or parallel? Is it linear or nonlinear? Is it mediated by a single, crucial connection? We apply Bayesian model averaging within families to provide inferences about parameters that are independent of further assumptions about model structure. We illustrate the methods using Dynamic Causal Models of brain imaging data.

680 citations
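A minimal numerical sketch of the two ideas, using made-up log evidences: family posteriors are obtained by summing posterior model probabilities within each family, and a parameter is then averaged over the models of a family with those same weights. The model names, evidences and parameter values are hypothetical.

```python
import numpy as np

# Hypothetical log model evidences for four models (one subject), grouped
# into two families, e.g. "serial" vs. "parallel" processing.
log_evidence = {"serial_A": -210.0, "serial_B": -212.5,
                "parallel_A": -208.0, "parallel_B": -209.0}
families = {"serial": ["serial_A", "serial_B"],
            "parallel": ["parallel_A", "parallel_B"]}

# Posterior model probabilities under a flat prior over models.
names = list(log_evidence)
log_ev = np.array([log_evidence[m] for m in names])
post = np.exp(log_ev - log_ev.max())
post /= post.sum()
model_post = dict(zip(names, post))

# Family-level inference: sum posterior mass over the models in each family.
family_post = {f: sum(model_post[m] for m in members)
               for f, members in families.items()}
print("family posteriors:", family_post)

# Bayesian model averaging within the winning family: average a parameter
# (hypothetical posterior means per model) with model-posterior weights.
param_mean = {"parallel_A": 0.8, "parallel_B": 0.5}
weights = np.array([model_post[m] for m in param_mean])
weights /= weights.sum()
bma_estimate = sum(w * param_mean[m] for w, m in zip(weights, param_mean))
print("BMA estimate within 'parallel' family:", round(bma_estimate, 3))
```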


Journal ArticleDOI
TL;DR: In this article, the authors focus on the total predictive uncertainty and its decomposition into input and structural components under different inference scenarios, and highlight the inherent limitations of inferring inaccurate hydrologic models using rainfall-runoff data with large unknown errors.
Abstract: [1] Meaningful quantification of data and structural uncertainties in conceptual rainfall-runoff modeling is a major scientific and engineering challenge. This paper focuses on the total predictive uncertainty and its decomposition into input and structural components under different inference scenarios. Several Bayesian inference schemes are investigated, differing in the treatment of rainfall and structural uncertainties, and in the precision of the priors describing rainfall uncertainty. Compared with traditional lumped additive error approaches, the quantification of the total predictive uncertainty in the runoff is improved when rainfall and/or structural errors are characterized explicitly. However, the decomposition of the total uncertainty into individual sources is more challenging. In particular, poor identifiability may arise when the inference scheme represents rainfall and structural errors using separate probabilistic models. The inference becomes ill-posed unless sufficiently precise prior knowledge of data uncertainty is supplied; this ill-posedness can often be detected from the behavior of the Monte Carlo sampling algorithm. Moreover, the priors on the data quality must also be sufficiently accurate if the inference is to be reliable and support meaningful uncertainty decomposition. Our findings highlight the inherent limitations of inferring inaccurate hydrologic models using rainfall-runoff data with large unknown errors. Bayesian total error analysis can overcome these problems using independent prior information. The need for deriving independent descriptions of the uncertainties in the input and output data is clearly demonstrated.

622 citations


Journal ArticleDOI
TL;DR: A performance-optimizing Bayesian model that takes the underlying distribution of samples into account provided an accurate description of subjects' performance, variability and bias and suggests that the CNS incorporates knowledge about temporal uncertainty to adapt internal timing mechanisms to the temporal statistics of the environment.
Abstract: The authors find that a person's estimate of a time interval exhibits biases that depend on both its duration and the distribution from which it is drawn. This behavioral pattern could be described using a Bayesian model. These findings suggest that internal timing mechanisms can adapt to the temporal statistics of the environment to minimize uncertainty. We use our sense of time to identify temporal relationships between events and to anticipate actions. The degree to which we can exploit temporal contingencies depends on the variability of our measurements of time. We asked humans to reproduce time intervals drawn from different underlying distributions. As expected, production times were more variable for longer intervals. However, production times exhibited a systematic regression toward the mean. Consequently, estimates for a sample interval differed depending on the distribution from which it was drawn. A performance-optimizing Bayesian model that takes the underlying distribution of samples into account provided an accurate description of subjects' performance, variability and bias. This finding suggests that the CNS incorporates knowledge about temporal uncertainty to adapt internal timing mechanisms to the temporal statistics of the environment.

612 citations
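The following sketch implements the kind of observer model the abstract describes, under assumed ingredients: a uniform prior over the sampled intervals, Gaussian measurement noise whose standard deviation grows with the interval (scalar variability), and a posterior-mean estimate. The interval range, Weber fraction and trial counts are illustrative, not the paper's values.

```python
import numpy as np

# Grid of possible sample intervals (ms) and a uniform prior over the
# range used in a hypothetical condition.
t = np.linspace(400, 1200, 801)
prior = np.where((t >= 500) & (t <= 850), 1.0, 0.0)
prior /= prior.sum()

weber = 0.15  # scalar variability: measurement noise sd grows with duration

def bayes_estimate(true_interval, rng):
    """Noisy measurement of an interval followed by a posterior-mean estimate."""
    m = rng.normal(true_interval, weber * true_interval)        # measurement
    likelihood = np.exp(-0.5 * ((m - t) / (weber * t)) ** 2)    # p(m | t)
    posterior = likelihood * prior
    posterior /= posterior.sum()
    return np.sum(t * posterior)                                 # posterior mean

rng = np.random.default_rng(2)
for true_t in (510, 675, 840):
    est = np.mean([bayes_estimate(true_t, rng) for _ in range(200)])
    print(f"true {true_t} ms -> mean estimate {est:.0f} ms")
# Short intervals tend to be overestimated and long ones underestimated:
# the regression toward the mean of the prior described in the abstract.
```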


Journal ArticleDOI
TL;DR: A Bayesian statistical approach is presented to infer continuous phylogeographic diffusion using random walk models while simultaneously reconstructing the evolutionary history in time from molecular sequence data and demonstrates increased statistical efficiency in spatial reconstructions of overdispersed random walks.
Abstract: Research aimed at understanding the geographic context of evolutionary histories is burgeoning across biological disciplines. Recent endeavors attempt to interpret contemporaneous genetic variation in the light of increasingly detailed geographical and environmental observations. Such interest has promoted the development of phylogeographic inference techniques that explicitly aim to integrate such heterogeneous data. One promising development involves reconstructing phylogeographic history on a continuous landscape. Here, we present a Bayesian statistical approach to infer continuous phylogeographic diffusion using random walk models while simultaneously reconstructing the evolutionary history in time from molecular sequence data. Moreover, by accommodating branch-specific variation in dispersal rates, we relax the most restrictive assumption of the standard Brownian diffusion process and demonstrate increased statistical efficiency in spatial reconstructions of overdispersed random walks by analyzing both simulated and real viral genetic data. We further illustrate how drawing inference about summary statistics from a fully specified stochastic process over both sequence evolution and spatial movement reveals important characteristics of a rabies epidemic. Together with recent advances in discrete phylogeographic inference, the continuous model developments furnish a flexible statistical framework for biogeographical reconstructions that is easily expanded upon to accommodate various landscape genetic features.

594 citations


Journal ArticleDOI
TL;DR: In this paper, a generalized likelihood function is presented to estimate both the parameter and predictive uncertainty of hydrologic models, which can be used for handling complex residual errors in other models.
Abstract: Estimation of parameter and predictive uncertainty of hydrologic models has traditionally relied on several simplifying assumptions. Residual errors are often assumed to be independent and to be adequately described by a Gaussian probability distribution with a mean of zero and a constant variance. Here we investigate to what extent estimates of parameter and predictive uncertainty are affected when these assumptions are relaxed. A formal generalized likelihood function is presented, which extends the applicability of previously used likelihood functions to situations where residual errors are correlated, heteroscedastic, and non-Gaussian with varying degrees of kurtosis and skewness. The approach focuses on a correct statistical description of the data and the total model residuals, without separating out various error sources. Application to Bayesian uncertainty analysis of a conceptual rainfall-runoff model simultaneously identifies the hydrologic model parameters and the appropriate statistical distribution of the residual errors. When applied to daily rainfall-runoff data from a humid basin we find that (1) residual errors are much better described by a heteroscedastic, first-order, auto-correlated error model with a Laplacian distribution function characterized by heavier tails than a Gaussian distribution; and (2) compared to a standard least-squares approach, proper representation of the statistical distribution of residual errors yields tighter predictive uncertainty bands and different parameter uncertainty estimates that are less sensitive to the particular time period used for inference. Application to daily rainfall-runoff data from a semiarid basin with more significant residual errors and systematic underprediction of peak flows shows that (1) multiplicative bias factors can be used to compensate for some of the largest errors and (2) a skewed error distribution yields improved estimates of predictive uncertainty in this semiarid basin with near-zero flows. We conclude that the presented methodology provides improved estimates of parameter and total prediction uncertainty and should be useful for handling complex residual errors in other hydrologic regression models as well.

510 citations
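As a rough sketch of what a likelihood of this generalized kind can look like, the function below scores residuals with a linearly heteroscedastic standard deviation, an AR(1) term removed from the raw residuals, and a Laplace (heavier-tailed) density. It is a stripped-down stand-in for illustration only; the paper's formulation is more general (it also handles skewness and arbitrary kurtosis), and all parameter values here are invented.

```python
import numpy as np

def generalized_log_likelihood(obs, sim, sigma_a, sigma_b, phi):
    """Log-likelihood of residuals with (i) linear heteroscedasticity,
    sd_t = sigma_a + sigma_b * sim_t, (ii) AR(1) autocorrelation phi removed
    from the raw residuals, and (iii) a Laplace (heavy-tailed) error density."""
    residuals = obs - sim
    innovations = residuals[1:] - phi * residuals[:-1]   # decorrelated errors
    sd = sigma_a + sigma_b * sim[1:]
    scale = sd / np.sqrt(2.0)                            # Laplace scale giving that sd
    return np.sum(-np.log(2.0 * scale) - np.abs(innovations) / scale)

# Tiny synthetic check: "observations" are the simulation plus Laplace noise
# whose spread grows with the simulated flow.
rng = np.random.default_rng(3)
sim = 10.0 + 5.0 * np.sin(np.linspace(0, 6, 200)) ** 2
obs = sim + rng.laplace(scale=0.1 + 0.05 * sim)
print(generalized_log_likelihood(obs, sim, sigma_a=0.1, sigma_b=0.05, phi=0.0))
```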


Journal ArticleDOI
TL;DR: A machine-learning approach to the estimation of the posterior density is proposed, introducing two innovations: it fits a nonlinear conditional heteroscedastic regression of the parameter on the summary statistics, and then adaptively improves estimation using importance sampling.
Abstract: Approximate Bayesian inference on the basis of summary statistics is well-suited to complex problems for which the likelihood is either mathematically or computationally intractable. However, the methods that use rejection suffer from the curse of dimensionality when the number of summary statistics is increased. Here we propose a machine-learning approach to the estimation of the posterior density by introducing two innovations. The new method fits a nonlinear conditional heteroscedastic regression of the parameter on the summary statistics, and then adaptively improves estimation using importance sampling. The new algorithm is compared to the state-of-the-art approximate Bayesian methods, and achieves considerable reduction of the computational burden in two examples of inference in statistical genetics and in a queueing model.

504 citations
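The paper's contribution is a nonlinear, heteroscedastic (neural-network based) regression adjustment with importance sampling. The sketch below shows only the simpler local-linear adjustment that it generalizes, on a toy normal-mean problem; the prior, acceptance quantile and problem setup are invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy problem: infer the mean of a Normal(theta, 1) from 50 observations,
# summarized by their sample mean (which is Normal(theta, 1/sqrt(50))).
observed = rng.normal(loc=2.0, scale=1.0, size=50)
s_obs = observed.mean()

# Step 1: rejection ABC on the summary statistic.
theta = rng.uniform(-10, 10, size=200_000)                # draws from the prior
s_sim = rng.normal(theta, 1.0 / np.sqrt(50))              # simulated summaries
dist = np.abs(s_sim - s_obs)
keep = dist < np.quantile(dist, 0.01)                     # keep the closest 1%
theta_acc, s_acc = theta[keep], s_sim[keep]

# Step 2: regression adjustment. Regress accepted parameters on their
# summaries and shift them to the value they "would have had" at s_obs.
slope, intercept = np.polyfit(s_acc, theta_acc, deg=1)
theta_adj = theta_acc + slope * (s_obs - s_acc)

print("rejection-only posterior mean:     ", round(theta_acc.mean(), 3))
print("regression-adjusted posterior mean:", round(theta_adj.mean(), 3))
```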


Journal ArticleDOI
TL;DR: This application of Bayes' Theorem automatically applies a quantitative Ockham's razor that penalizes the data‐fit of more complex model classes that extract more information from the data.
Abstract: Probability logic with Bayesian updating provides a rigorous framework to quantify modeling uncertainty and perform system identification. It uses probability as a multi-valued propositional logic for plausible reasoning where the probability of a model is a measure of its relative plausibility within a set of models. System identification is thus viewed as inference about plausible system models and not as a quixotic quest for the true model. Instead of using system data to estimate the model parameters, Bayes' Theorem is used to update the relative plausibility of each model in a model class, which is a set of input–output probability models for the system and a probability distribution over this set that expresses the initial plausibility of each model. Robust predictive analyses informed by the system data use the entire model class with the probabilistic predictions of each model being weighed by its posterior probability. Additional robustness to modeling uncertainty comes from combining the robust predictions of each model class in a set of candidates for the system, where each contribution is weighed by the posterior probability of the model class. This application of Bayes' Theorem automatically applies a quantitative Ockham's razor that penalizes the data-fit of more complex model classes that extract more information from the data. Robust analyses involve integrals over parameter spaces that usually must be evaluated numerically by Laplace's method of asymptotic approximation or by Markov Chain Monte Carlo methods. An illustrative application is given using synthetic data corresponding to a structural health monitoring benchmark structure.

497 citations
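A small sketch of the evidence-based Ockham's razor on synthetic data, using the Laplace asymptotic approximation mentioned in the abstract to compare a linear and a cubic model class. The data, priors and noise level are invented, and the finite-difference Hessian is just a convenient stand-in for an analytic one.

```python
import numpy as np
from scipy import optimize

rng = np.random.default_rng(5)
x = np.linspace(-1, 1, 40)
y = 1.5 * x + rng.normal(scale=0.3, size=x.size)        # the truth is linear

def make_log_posterior(degree):
    """Unnormalized log posterior for a polynomial model of given degree:
    Gaussian likelihood (known sd 0.3) and N(0, 10^2) priors on coefficients."""
    def log_post(theta):
        pred = np.polyval(theta, x)
        loglik = (-0.5 * np.sum(((y - pred) / 0.3) ** 2)
                  - x.size * np.log(0.3 * np.sqrt(2 * np.pi)))
        logprior = (-0.5 * np.sum((theta / 10.0) ** 2)
                    - theta.size * np.log(10.0 * np.sqrt(2 * np.pi)))
        return loglik + logprior
    return log_post

def laplace_log_evidence(degree, eps=1e-4):
    d = degree + 1
    neg = lambda th: -make_log_posterior(degree)(th)
    fit = optimize.minimize(neg, np.zeros(d))
    # Finite-difference Hessian of the negative log posterior at the MAP.
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            ei, ej = np.eye(d)[i] * eps, np.eye(d)[j] * eps
            H[i, j] = (neg(fit.x + ei + ej) - neg(fit.x + ei - ej)
                       - neg(fit.x - ei + ej) + neg(fit.x - ei - ej)) / (4 * eps**2)
    # Laplace approximation:
    # log p(D|M) ~= log p(D, theta_MAP | M) + (d/2) log(2*pi) - 0.5 log|H|
    return -fit.fun + 0.5 * d * np.log(2 * np.pi) - 0.5 * np.log(np.linalg.det(H))

for degree in (1, 3):
    print(f"degree {degree}: Laplace log evidence ~ {laplace_log_evidence(degree):.1f}")
# The cubic fits the noise slightly better but extracts more information from
# the data, so its evidence is typically lower: the automatic Ockham's razor.
```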


Journal ArticleDOI
TL;DR: In this paper, a two-state mixture Gaussian model is used to perform asymptotically optimal Bayesian inference using belief propagation decoding, which represents the CS encoding matrix as a graphical model.
Abstract: Compressive sensing (CS) is an emerging field based on the revelation that a small collection of linear projections of a sparse signal contains enough information for stable, sub-Nyquist signal acquisition. When a statistical characterization of the signal is available, Bayesian inference can complement conventional CS methods based on linear programming or greedy algorithms. We perform asymptotically optimal Bayesian inference using belief propagation (BP) decoding, which represents the CS encoding matrix as a graphical model. Fast computation is obtained by reducing the size of the graphical model with sparse encoding matrices. To decode a length-N signal containing K large coefficients, our CS-BP decoding algorithm uses O(K log(N)) measurements and O(N log²(N)) computation. Finally, although we focus on a two-state mixture Gaussian model, CS-BP is easily adapted to other signal models.

Book
19 Jul 2010
TL;DR: An overview of statistical approaches for clinical trials, covering comparisons between Bayesian and frequentist approaches, adaptivity in clinical trials, features and use of the Bayesian adaptive approach, and the basics of Bayesian inference.
Abstract: Statistical Approaches for Clinical Trials Introduction Comparisons between Bayesian and frequentist approaches Adaptivity in clinical trials Features and use of the Bayesian adaptive approach Basics of Bayesian Inference Introduction to Bayes' theorem Bayesian inference Bayesian computation Hierarchical modeling and metaanalysis Principles of Bayesian clinical trial design Appendix: R Macros Phase I Studies Rule-based designs for determining the MTD Model-based designs for determining the MTD Efficacy versus toxicity Combination therapy Appendix: R Macros Phase II Studies Standard designs Predictive probability Sequential stopping Adaptive randomization and dose allocation Dose ranging and optimal biologic dosing Hierarchical models for Phase II designs Decision theoretic designs Case studies: BATTLE and ISPY-2 Appendix: R Macros Phase III Studies Introduction to confirmatory studies Bayesian adaptive confirmatory trials Arm dropping Modeling and prediction Prior distributions and the paradigm clash Phase III cancer trials Phase II/III seamless trials Case study: Ablation device to treat atrial fibrillation Appendix: R Macros Special Topics Incorporating historical data Equivalence studies Multiplicity Subgroup analysis Appendix: R Macros References Author Index Subject Index
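As one concrete Bayesian ingredient from the phase II material listed above, the sketch below computes a predictive probability of trial success for a single-arm, binary-endpoint trial under a Beta-Binomial model. The interim counts, null response rate and decision cutoffs are made up for the illustration and are not an example from the book.

```python
import numpy as np
from scipy import stats

# Hypothetical interim data in a single-arm phase II trial.
n_interim, successes = 20, 12          # 12 responders out of 20 so far
n_max = 40                             # planned total sample size
p0 = 0.40                              # null response rate to beat
a0, b0 = 1.0, 1.0                      # Beta(1, 1) prior on the response rate

rng = np.random.default_rng(6)
n_remaining = n_max - n_interim

# Monte Carlo predictive probability of eventual "success", where success
# means the final posterior Pr(p > p0 | all data) exceeds 0.95.
draws = 100_000
p = rng.beta(a0 + successes, b0 + n_interim - successes, size=draws)  # posterior draws
future = rng.binomial(n_remaining, p)                                 # predicted future responders
final_post_prob = 1.0 - stats.beta.cdf(p0, a0 + successes + future,
                                       b0 + n_max - successes - future)
predictive_prob = np.mean(final_post_prob > 0.95)
print(f"predictive probability of trial success: {predictive_prob:.2f}")
```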

Book
05 Apr 2010
TL;DR: This book discusses Bayesian Model Class Selection using Eigenvalue-Eigenvector Measurements, a relationship between the Hessian and Covariance Matrix for Gaussian Random Variables, and the Conditional PDF for Prediction.
Abstract: Contents Preface Nomenclature 1 Introduction 1.1 Thomas Bayes and Bayesian Methods in Engineering 1.2 Purpose of Model Updating 1.3 Source of Uncertainty and Bayesian Updating 1.4 Organization of the Book 2 Basic Concepts and Bayesian Probabilistic Framework 2.1 Conditional Probability and Basic Concepts 2.2 Bayesian Model Updating with Input-output Measurements 2.3 Deterministic versus Probabilistic Methods 2.4 Regression Problems 2.5 Numerical Representation of the Updated PDF 2.6 Application to Temperature Effects on Structural Behavior 2.7 Application to Noise Parameters Selection for Kalman Filter 2.8 Application to Prediction of Particulate Matter Concentration 3 Bayesian Spectral Density Approach 3.1 Modal and Model Updating of Dynamical Systems 3.2 Random Vibration Analysis 3.3 Bayesian Spectral Density Approach 3.4 Numerical Verifications 3.5 Optimal Sensor Placement 3.6 Updating of a Nonlinear Oscillator 3.7 Application to Structural Behavior under Typhoons 3.8 Application to Hydraulic Jump 4 Bayesian Time-domain Approach 4.1 Introduction 4.2 Exact Bayesian Formulation and its Computational Difficulties 4.3 Random Vibration Analysis of Nonstationary Response 4.4 Bayesian Updating with Approximated PDF Expansion 4.5 Numerical Verification 4.6 Application to Model Updating with Unmeasured Earthquake Ground Motion 4.7 Concluding Remarks 4.8 Comparison of Spectral Density Approach and Time-domain Approach 4.9 Extended Readings 5 Model Updating Using Eigenvalue-Eigenvector Measurements 5.1 Introduction 5.2 Formulation 5.3 Linear Optimization Problems 5.4 Iterative Algorithm 5.5 Uncertainty Estimation 5.6 Applications to Structural Health Monitoring 5.7 Concluding Remarks 6 Bayesian Model Class Selection 6.1 Introduction 6.2 Bayesian Model Class Selection 6.3 Model Class Selection for Regression Problems 6.4 Application to Modal Updating 6.5 Application to Seismic Attenuation Empirical Relationship 6.6 Prior Distributions - Revisited 6.7 Final Remarks A Relationship between the Hessian and Covariance Matrix for Gaussian Random Variables B Contours of Marginal PDFs for Gaussian Random Variables C Conditional PDF for Prediction C.1 Two Random Variables C.2 General Cases References Index

Journal ArticleDOI
TL;DR: A new method for relaxing the assumption of a strict molecular clock using Markov chain Monte Carlo to implement Bayesian model averaging over random local molecular clocks is presented, suggesting that large sequence datasets may only require a small number of local molecular clock models to reconcile their branch lengths with a time scale.
Abstract: Relaxed molecular clock models allow divergence time dating and "relaxed phylogenetic" inference, in which a time tree is estimated in the face of unequal rates across lineages. We present a new method for relaxing the assumption of a strict molecular clock using Markov chain Monte Carlo to implement Bayesian model averaging over random local molecular clocks. The new method approaches the problem of rate variation among lineages by proposing a series of local molecular clocks, each extending over a subregion of the full phylogeny. Each branch in a phylogeny (subtending a clade) is a possible location for a change of rate from one local clock to a new one. Thus, including both the global molecular clock and the unconstrained model results, there are a total of 2^(2n-2) possible rate models available for averaging with 1, 2, ..., 2n - 2 different rate categories. We propose an efficient method to sample this model space while simultaneously estimating the phylogeny. The new method conveniently allows a direct test of the strict molecular clock, in which one rate rules them all, against a large array of alternative local molecular clock models. We illustrate the method's utility on three example data sets involving mammal, primate and influenza evolution. Finally, we explore methods to visualize the complex posterior distribution that results from inference under such models. The examples suggest that large sequence datasets may only require a small number of local molecular clocks to reconcile their branch lengths with a time scale. All of the analyses described here are implemented in the open access software package BEAST 1.5.4 ( http://beast-mcmc.googlecode.com/ ).

Journal ArticleDOI
TL;DR: The authors argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism, and examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision.
Abstract: A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework.

Book
01 Jan 2010
TL;DR: In this article, the authors describe tools commonly used in transportation data analysis, including count and discrete dependent variable models, mixed logit models, logistic regression, and ordered probability models.
Abstract: Now in its second edition, this book describes tools that are commonly used in transportation data analysis. The first part of the text provides statistical fundamentals while the second part presents continuous dependent variable models. With a focus on count and discrete dependent variable models, the third part features new chapters on mixed logit models, logistic regression, and ordered probability models. The last section provides additional coverage of Bayesian statistical modeling, including Bayesian inference and Markov chain Monte Carlo methods. Data sets are available online to use with the modeling techniques discussed.

Book
17 Jun 2010
TL;DR: This monograph discusses VARs, factor augmented VARs and time-varying parameter extensions and shows how Bayesian inference proceeds and offers advice on how to use these models and methods in practice.
Abstract: Macroeconomic practitioners frequently work with multivariate time series models such as VARs, factor augmented VARs as well as time-varying parameter versions of these models (including variants with multivariate stochastic volatility). These models have a large number of parameters and, thus, over-parameterization problems may arise. Bayesian methods have become increasingly popular as a way of overcoming these problems. In this monograph, we discuss VARs, factor augmented VARs and time-varying parameter extensions and show how Bayesian inference proceeds. Apart from the simplest of VARs, Bayesian inference requires the use of Markov chain Monte Carlo methods developed for state space models and we describe these algorithms. The focus is on the empirical macroeconomist and we offer advice on how to use these models and methods in practice and include empirical illustrations. A website provides Matlab code for carrying out Bayesian inference in these models.
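A compact sketch of the simplest case mentioned above: a small VAR with a conjugate-style shrinkage prior whose posterior mean has a closed form, so no MCMC is needed. The prior here is a plain normal prior centered on zero, implemented as a ridge formula with lambda = sigma^2 / tau^2; the tightness values and simulated data are invented, and this is far simpler than the richer priors and state-space samplers the monograph covers.

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulate a small bivariate VAR(1): y_t = A y_{t-1} + e_t.
A_true = np.array([[0.5, 0.1],
                   [0.0, 0.8]])
T = 200
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = A_true @ y[t - 1] + rng.normal(scale=0.5, size=2)

X, Y = y[:-1], y[1:]                     # lagged regressors and targets

# Posterior mean under an independent N(0, tau^2) prior on each coefficient
# and a known error sd sigma: a ridge formula with lambda = sigma^2 / tau^2
# (tighter prior = heavier shrinkage toward zero).
sigma, tau = 0.5, 0.5
lam = sigma ** 2 / tau ** 2
A_post = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ Y).T

A_ols = np.linalg.solve(X.T @ X, X.T @ Y).T
print("OLS estimate:\n", A_ols.round(2))
print("posterior-mean (shrunk) estimate:\n", A_post.round(2))
```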

Journal Article
TL;DR: A probabilistic formulation of PCA provides a good foundation for handling missing values, and formulas for doing that are provided, and a novel fast algorithm is introduced and extended to variational Bayesian learning.
Abstract: Principal component analysis (PCA) is a classical data analysis technique that finds linear transformations of data that retain the maximal amount of variance. We study a case where some of the data values are missing, and show that this problem has many features which are usually associated with nonlinear models, such as overfitting and bad locally optimal solutions. A probabilistic formulation of PCA provides a good foundation for handling missing values, and we provide formulas for doing that. In case of high dimensional and very sparse data, overfitting becomes a severe problem and traditional algorithms for PCA are very slow. We introduce a novel fast algorithm and extend it to variational Bayesian learning. Different versions of PCA are compared in artificial experiments, demonstrating the effects of regularization and modeling of posterior variance. The scalability of the proposed algorithm is demonstrated by applying it to the Netflix problem.
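The sketch below is not the paper's variational Bayesian algorithm, only the basic alternating scheme that such methods build on: fill in the missing entries from the current low-rank reconstruction, refit the principal components by SVD, and repeat. The data dimensions, rank and missingness rate are invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(8)

# Low-rank data (rank 2) with 30% of the entries missing.
U = rng.normal(size=(100, 2))
V = rng.normal(size=(2, 20))
X = U @ V + 0.05 * rng.normal(size=(100, 20))
mask = rng.random(X.shape) < 0.3          # True where the value is missing
X_obs = np.where(mask, np.nan, X)

def iterative_pca_impute(X_obs, rank=2, n_iter=50):
    """Alternate between an SVD of the completed matrix and refilling the
    missing entries from the rank-`rank` reconstruction."""
    missing = np.isnan(X_obs)
    filled = np.where(missing, np.nanmean(X_obs, axis=0), X_obs)  # start at column means
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        u, s, vt = np.linalg.svd(filled - mean, full_matrices=False)
        recon = (u[:, :rank] * s[:rank]) @ vt[:rank] + mean
        filled = np.where(missing, recon, X_obs)                  # keep observed values fixed
    return filled

completed = iterative_pca_impute(X_obs)
rmse = np.sqrt(np.mean((completed[mask] - X[mask]) ** 2))
print(f"RMSE on the held-out (missing) entries: {rmse:.3f}")
```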

Journal ArticleDOI
TL;DR: ABCtoolbox allows a user to perform all the necessary steps of a full ABC analysis, from parameter sampling from prior distributions, data simulations, computation of summary statistics, estimation of posterior distributions, model choice, validation of the estimation procedure, and visualization of the results.
Abstract: The estimation of demographic parameters from genetic data often requires the computation of likelihoods. However, the likelihood function is computationally intractable for many realistic evolutionary models, and the use of Bayesian inference has therefore been limited to very simple models. The situation changed recently with the advent of Approximate Bayesian Computation (ABC) algorithms allowing one to obtain parameter posterior distributions based on simulations not requiring likelihood computations. Here we present ABCtoolbox, a series of open source programs to perform Approximate Bayesian Computations (ABC). It implements various ABC algorithms including rejection sampling, MCMC without likelihood, a Particle-based sampler and ABC-GLM. ABCtoolbox is bundled with, but not limited to, a program that allows parameter inference in a population genetics context and the simultaneous use of different types of markers with different ploidy levels. In addition, ABCtoolbox can also interact with most simulation and summary statistics computation programs. The usability of the ABCtoolbox is demonstrated by inferring the evolutionary history of two evolutionary lineages of Microtus arvalis. Using nuclear microsatellites and mitochondrial sequence data in the same estimation procedure enabled us to infer sex-specific population sizes and migration rates and to find that males show smaller population sizes but much higher levels of migration than females. ABCtoolbox allows a user to perform all the necessary steps of a full ABC analysis, from parameter sampling from prior distributions, data simulations, computation of summary statistics, estimation of posterior distributions, model choice, validation of the estimation procedure, and visualization of the results.

Journal ArticleDOI
TL;DR: It is argued that Monte Carlo methods provide a source of rational process models that connect optimal solutions to psychological processes, and it is proposed that a particle filter with a single particle provides a good description of human inferences.
Abstract: Rational models of cognition typically consider the abstract computational problems posed by the environment, assuming that people are capable of optimally solving those problems. This differs from more traditional formal models of cognition, which focus on the psychological processes responsible for behavior. A basic challenge for rational models is thus explaining how optimal solutions can be approximated by psychological processes. We outline a general strategy for answering this question, namely to explore the psychological plausibility of approximation algorithms developed in computer science and statistics. In particular, we argue that Monte Carlo methods provide a source of rational process models that connect optimal solutions to psychological processes. We support this argument through a detailed example, applying this approach to Anderson's (1990, 1991) rational model of categorization (RMC), which involves a particularly challenging computational problem. Drawing on a connection between the RMC and ideas from nonparametric Bayesian statistics, we propose 2 alternative algorithms for approximate inference in this model. The algorithms we consider include Gibbs sampling, a procedure appropriate when all stimuli are presented simultaneously, and particle filters, which sequentially approximate the posterior distribution with a small number of samples that are updated as new data become available. Applying these algorithms to several existing datasets shows that a particle filter with a single particle provides a good description of human inferences.
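To make the particle-filter idea concrete, here is a generic bootstrap particle filter on a toy Gaussian random-walk tracking problem. The state-space model and noise levels are invented; this is not the authors' categorization model (whose single-particle version updates its particle conditionally on each observation), but it shows the algorithm class and how the approximation degrades as the number of particles shrinks.

```python
import numpy as np

rng = np.random.default_rng(9)

# Toy state-space model: hidden Gaussian random walk observed with noise.
T, q, r = 100, 0.1, 0.5
x = np.cumsum(rng.normal(scale=q, size=T))           # hidden states
obs = x + rng.normal(scale=r, size=T)                # noisy observations

def particle_filter(obs, num_particles=500):
    particles = np.zeros(num_particles)
    estimates = []
    for y in obs:
        particles = particles + rng.normal(scale=q, size=num_particles)  # propagate
        weights = np.exp(-0.5 * ((y - particles) / r) ** 2)              # likelihood weights
        weights /= weights.sum()
        estimates.append(np.sum(weights * particles))                    # posterior mean
        idx = rng.choice(num_particles, size=num_particles, p=weights)   # resample
        particles = particles[idx]
    return np.array(estimates)

for n in (500, 5):
    est = particle_filter(obs, num_particles=n)
    print(f"{n:3d} particle(s): RMSE vs hidden state = "
          f"{np.sqrt(np.mean((est - x) ** 2)):.3f}")
```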

Journal ArticleDOI
TL;DR: The third edition of this textbook has more in-depth coverage of many biostatistical problems, especially in Chapters 7 and 8, and it is at a slightly higher statistical level with enough materials to choose from for a course at either the intermediate level or advanced research level.

Journal ArticleDOI
TL;DR: It seems that a common computational strategy, which is highly consistent with a normative model of causal inference, is exploited by the perceptual system in a variety of domains.

Journal ArticleDOI
TL;DR: It is shown that online Bayesian inference within a model that assumes an unbounded number of latent causes can characterize a diverse set of behavioral results from such manipulations, some of which pose problems for the model of Redish et al. (2007).
Abstract: A. Redish et al. (2007) proposed a reinforcement learning model of context-dependent learning and extinction in conditioning experiments, using the idea of "state classification" to categorize new observations into states. In the current article, the authors propose an interpretation of this idea in terms of normative statistical inference. They focus on renewal and latent inhibition, 2 conditioning paradigms in which contextual manipulations have been studied extensively, and show that online Bayesian inference within a model that assumes an unbounded number of latent causes can characterize a diverse set of behavioral results from such manipulations, some of which pose problems for the model of Redish et al. Moreover, in both paradigms, context dependence is absent in younger animals, or if hippocampal lesions are made prior to training. The authors suggest an explanation in terms of a restricted capacity to infer new causes.
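Models that assume an unbounded number of latent causes typically place a Chinese restaurant process prior over cause assignments. The sketch below samples from that prior to show how new causes keep being created at a rate governed by the concentration parameter; it is only the prior, not the authors' full inference model, and the parameter values are arbitrary.

```python
import numpy as np

def sample_crp(n_observations, alpha, rng):
    """Sample latent-cause assignments from a Chinese restaurant process:
    each new observation joins an existing cause with probability
    proportional to its count, or creates a new cause with prob. proportional
    to alpha."""
    assignments = []
    counts = []                      # counts[k] = observations assigned to cause k
    for _ in range(n_observations):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):         # a brand-new latent cause
            counts.append(1)
        else:
            counts[k] += 1
        assignments.append(k)
        # In the full model, these prior probabilities would be multiplied by
        # the likelihood of the current observation under each cause before
        # sampling the assignment.
    return assignments, counts

rng = np.random.default_rng(10)
for alpha in (0.1, 1.0, 5.0):
    _, counts = sample_crp(200, alpha, rng)
    print(f"alpha={alpha}: {len(counts)} latent causes created for 200 observations")
```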

BookDOI
16 Dec 2010
TL;DR: Bayesian Artificial Intelligence, Second Edition provides a practical and accessible introduction to the main concepts, foundation, and applications of Bayesian networks, focusing on both the causal discovery of networks and Bayesian inference procedures.
Abstract: Updated and expanded, Bayesian Artificial Intelligence, Second Edition provides a practical and accessible introduction to the main concepts, foundation, and applications of Bayesian networks. It focuses on both the causal discovery of networks and Bayesian inference procedures. Adopting a causal interpretation of Bayesian networks, the authors discuss the use of Bayesian networks for causal modeling. They also draw on their own applied research to illustrate various applications of the technology. New to the Second Edition: a new chapter on Bayesian network classifiers; a new section on object-oriented Bayesian networks; a new section that addresses foundational problems with causal discovery and Markov blanket discovery; a new section that covers methods of evaluating causal discovery programs; discussions of many common modeling errors; new applications and case studies; and more coverage on the uses of causal interventions to understand and reason with causal Bayesian networks. Illustrated with real case studies, the second edition of this bestseller continues to cover the groundwork of Bayesian networks. It presents the elements of Bayesian network technology, automated causal discovery, and learning probabilities from data and shows how to employ these technologies to develop probabilistic expert systems. Web Resource: The book's website at www.csse.monash.edu.au/bai/book/book.html offers a variety of supplemental materials, including example Bayesian networks and data sets. Instructors can email the authors for sample solutions to many of the problems in the text.

Journal ArticleDOI
TL;DR: A neural model of action selection and decision making based on the theory of partially observable Markov decision processes (POMDPs) is proposed and suggests an important role for interactions between the neocortex and the basal ganglia in learning the mapping between probabilistic sensory representations and actions that maximize rewards.
Abstract: A fundamental problem faced by animals is learning to select actions based on noisy sensory information and incomplete knowledge of the world. It has been suggested that the brain engages in Bayesian inference during perception but how such probabilistic representations are used to select actions has remained unclear. Here we propose a neural model of action selection and decision making based on the theory of partially observable Markov decision processes (POMDPs). Actions are selected based not on a single “optimal” estimate of state but on the posterior distribution over states (the “belief” state). We show how such a model provides a unified framework for explaining experimental results in decision making that involve both information gathering and overt actions. The model utilizes temporal difference (TD) learning for maximizing expected reward. The resulting neural architecture posits an active role for the neocortex in belief computation while ascribing a role to the basal ganglia in belief representation, value computation, and action selection. When applied to the random dots motion discrimination task, model neurons representing belief exhibit responses similar to those of LIP neurons in primate neocortex. The appropriate threshold for switching from information gathering to overt actions emerges naturally during reward maximization. Additionally, the time course of reward prediction error in the model shares similarities with dopaminergic responses in the basal ganglia during the random dots task. For tasks with a deadline, the model learns a decision making strategy that changes with elapsed time, predicting a collapsing decision threshold consistent with some experimental studies. The model provides a new framework for understanding neural decision making and suggests an important role for interactions between the neocortex and the basal ganglia in learning the mapping between probabilistic sensory representations and actions that maximize rewards.
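The core probabilistic step in such a model is the belief update over hidden states. The toy two-state example below (a caricature of a random-dots task with hidden "leftward"/"rightward" motion and noisy samples) shows that POMDP belief update; the observation probabilities and the closing comment about a threshold policy are invented for the illustration.

```python
import numpy as np

# Two hidden states: 0 = "motion is leftward", 1 = "motion is rightward".
# Observations: 0 = a leftward-looking sample, 1 = a rightward-looking sample.
# p(observation | state): each sample agrees with the true state 60% of the time.
observation_model = np.array([[0.6, 0.4],
                              [0.4, 0.6]])
transition_model = np.eye(2)          # the true motion direction does not change

def belief_update(belief, observation):
    """Bayesian POMDP belief update: predict with the transition model,
    then reweight by the likelihood of the new observation and normalize."""
    predicted = transition_model.T @ belief
    posterior = observation_model[:, observation] * predicted
    return posterior / posterior.sum()

belief = np.array([0.5, 0.5])                    # start uncertain
rng = np.random.default_rng(11)
true_state = 1
for step in range(12):
    obs = rng.random() < observation_model[true_state, 1]   # sample an observation
    belief = belief_update(belief, int(obs))
    print(f"step {step + 1:2d}: P(rightward) = {belief[1]:.3f}")
# An action policy would read out this belief, e.g. commit to a choice
# once P(rightward) crosses a learned threshold.
```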

Journal ArticleDOI
TL;DR: Toni et al. developed a model selection framework based on approximate Bayesian computation employing sequential Monte Carlo sampling, which can be applied across a wide range of biological scenarios, and illustrated its use on real data describing influenza dynamics and the JAK-STAT signalling pathway.
Abstract: Motivation: Computer simulations have become an important tool across the biomedical sciences and beyond. For many important problems several different models or hypotheses exist and choosing which one best describes reality or observed data is not straightforward. We therefore require suitable statistical tools that allow us to choose rationally between different mechanistic models of, e.g. signal transduction or gene regulation networks. This is particularly challenging in systems biology where only a small number of molecular species can be assayed at any given time and all measurements are subject to measurement uncertainty. Results: Here, we develop such a model selection framework based on approximate Bayesian computation and employing sequential Monte Carlo sampling. We show that our approach can be applied across a wide range of biological scenarios, and we illustrate its use on real data describing influenza dynamics and the JAK-STAT signalling pathway. Bayesian model selection strikes a balance between the complexity of the simulation models and their ability to describe observed data. The present approach enables us to employ the whole formal apparatus to any system that can be (efficiently) simulated, even when exact likelihoods are computationally intractable. Contact:ttoni@imperial.ac.uk; m.stumpf@imperial.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
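The paper develops an ABC scheme with sequential Monte Carlo; the sketch below shows only the plain rejection version of ABC model choice that such schemes improve on, for two invented models of count data. The model index is sampled like any other parameter, and its acceptance frequency approximates its posterior probability; all priors, summaries and tolerances here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(12)

# "Observed" data actually come from a negative binomial (overdispersed) model.
observed = rng.negative_binomial(n=2, p=0.25, size=100)

def simulate(model, rng):
    """Two candidate mechanisms for the counts, each with its own prior."""
    if model == 0:                      # Poisson model
        lam = rng.uniform(0, 20)
        return rng.poisson(lam, size=100)
    else:                               # negative binomial model
        r, p = rng.uniform(1, 5), rng.uniform(0.1, 0.9)
        return rng.negative_binomial(n=r, p=p, size=100)

def summaries(x):
    return np.array([x.mean(), x.var()])

s_obs = summaries(observed)
accepted = {0: 0, 1: 0}
for _ in range(50_000):
    m = rng.integers(0, 2)                              # model index from its prior
    s_sim = summaries(simulate(m, rng))
    if np.linalg.norm((s_sim - s_obs) / (s_obs + 1e-9)) < 0.15:   # relative distance
        accepted[m] += 1

total = sum(accepted.values())
for m in (0, 1):
    print(f"model {m}: approximate posterior probability "
          f"{accepted[m] / max(total, 1):.2f}")
# The data's variance far exceeds its mean, which a Poisson model cannot
# match, so nearly all accepted simulations come from model 1.
```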

Journal ArticleDOI
TL;DR: In this paper, the authors compare and evaluate Bayesian predictive distributions from alternative models, using as an illustration five alternative models of asset returns applied to daily S&P 500 returns from the period 1976 through 2005.

Journal ArticleDOI
TL;DR: It is concluded that Bayesian inference is now practically feasible for GLMMs and provides an attractive alternative to likelihood-based approaches such as penalized quasi-likelihood.
Abstract: Generalized linear mixed models (GLMMs) continue to grow in popularity due to their ability to directly acknowledge multiple levels of dependency and model different data types. For small sample sizes especially, likelihood-based inference can be unreliable with variance components being particularly difficult to estimate. A Bayesian approach is appealing but has been hampered by the lack of a fast implementation, and the difficulty in specifying prior distributions with variance components again being particularly problematic. Here, we briefly review previous approaches to computation in Bayesian implementations of GLMMs and illustrate in detail, the use of integrated nested Laplace approximations in this context. We consider a number of examples, carefully specifying prior distributions on meaningful quantities in each case. The examples cover a wide range of data types including those requiring smoothing over time and a relatively complicated spline model for which we examine our prior specification in terms of the implied degrees of freedom. We conclude that Bayesian inference is now practically feasible for GLMMs and provides an attractive alternative to likelihood-based approaches such as penalized quasi-likelihood. As with likelihood-based approaches, great care is required in the analysis of clustered binary data since approximation strategies may be less accurate for such data.

Journal ArticleDOI
TL;DR: Theoretical properties of surprise are discussed, in particular how it differs from and complements Shannon's definition of information.
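Bayesian surprise is commonly quantified as the Kullback-Leibler divergence between the posterior and prior beliefs; assuming that reading of the paper, the sketch below computes both surprise and Shannon self-information for a Beta-Bernoulli observer (an unlikely event is informative, but only surprising if it moves beliefs). The prior counts are invented.

```python
import numpy as np
from scipy.special import betaln, digamma

def kl_beta(a1, b1, a2, b2):
    """KL divergence KL( Beta(a1,b1) || Beta(a2,b2) ) in nats, closed form."""
    return (betaln(a2, b2) - betaln(a1, b1)
            + (a1 - a2) * digamma(a1)
            + (b1 - b2) * digamma(b1)
            + (a2 + b2 - a1 - b1) * digamma(a1 + b1))

def surprise_and_information(a, b, observation):
    """Bayesian surprise = KL(posterior || prior) for a Beta-Bernoulli observer;
    Shannon self-information = -log p(observation) under the prior predictive."""
    a_post, b_post = a + observation, b + (1 - observation)
    surprise = kl_beta(a_post, b_post, a, b)
    p_obs = a / (a + b) if observation == 1 else b / (a + b)
    return surprise, -np.log(p_obs)

# An observer who has already seen about 50 heads and 2 tails gets one more flip.
for obs, label in [(1, "expected head"), (0, "unexpected tail")]:
    s, info = surprise_and_information(a=50, b=2, observation=obs)
    print(f"{label}: Bayesian surprise = {s:.3f} nats, "
          f"Shannon information = {info:.3f} nats")
```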

Journal ArticleDOI
TL;DR: The results highlight the critical importance of fossil calibrations to molecular dating and the need for probabilistic modeling of fossil depositions, preservations, and sampling to provide statistical summaries of information in the fossil record concerning species divergence times.
Abstract: Bayesian inference provides a powerful framework for integrating different sources of information (in particular, molecules and fossils) to derive estimates of species divergence times. Indeed, it is currently the only framework that can adequately account for uncertainties in fossil calibrations. We use 2 Bayesian Markov chain Monte Carlo programs, MULTIDIVTIME and MCMCTREE, to analyze 3 empirical datasets to estimate divergence times in amphibians, actinopterygians, and felids. We evaluate the impact of various factors, including the priors on rates and times, fossil calibrations, substitution model, the violation of the molecular clock and the rate-drift model, and the exact and approximate likelihood calculation. Assuming the molecular clock caused seriously biased time estimates when the clock is violated, but 2 different rate-drift models produced similar estimates. The prior on times, which incorporates fossil-calibration information, had the greatest impact on posterior time estimation. In particular, the strategies used by the 2 programs to incorporate minimum- and maximum-age bounds led to very different time priors and were responsible for large differences in posterior time estimates in a previous study. The results highlight the critical importance of fossil calibrations to molecular dating and the need for probabilistic modeling of fossil depositions, preservations, and sampling to provide statistical summaries of information in the fossil record concerning species divergence times.