
Showing papers on "Bayesian inference published in 2012"


Book
22 Dec 2012
TL;DR: An overview of statistical decision theory, which emphasizes the use and application of the philosophical ideas and mathematical structure of decision theory.
Abstract: 1. Basic concepts 2. Utility and loss 3. Prior information and subjective probability 4. Bayesian analysis 5. Minimax analysis 6. Invariance 7. Preposterior and sequential analysis 8. Complete and essentially complete classes Appendices.

5,573 citations


Journal ArticleDOI
TL;DR: Bayes factors have been advocated as superior to p-values for assessing statistical evidence in data, and are applied here to examples including power-law models of skill acquisition.
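As a toy illustration of the contrast between the two measures of evidence (not from the paper; the data are assumed for the example): a Bayes factor compares marginal likelihoods under two hypotheses. Below, binomial data with 61 successes in 100 trials are weighed against a point null theta = 0.5 versus a uniform Beta(1, 1) prior under the alternative.

```python
import math

# assumed toy data: k successes in n trials; H0: theta = 0.5 vs H1: theta ~ Beta(1, 1)
k, n = 61, 100

def log_beta_fn(a, b):
    # log of the Beta function via log-gamma
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

# log marginal likelihoods (the binomial coefficient cancels in the ratio)
log_m1 = log_beta_fn(1 + k, 1 + n - k) - log_beta_fn(1, 1)
log_m0 = k * math.log(0.5) + (n - k) * math.log(0.5)
bf10 = math.exp(log_m1 - log_m0)   # ~1.4: barely any evidence for H1
```

Although the two-sided p-value for these data falls below 0.05, the Bayes factor is close to 1, which is exactly the kind of divergence that motivates the Bayes-factor critique of p-values.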

1,369 citations


Journal ArticleDOI
TL;DR: libFM is a software implementation of factorization machines featuring stochastic gradient descent (SGD) and alternating least-squares (ALS) optimization, as well as Bayesian inference using Markov chain Monte Carlo (MCMC).
Abstract: Factorization approaches provide high accuracy in several important prediction problems, for example, recommender systems. However, applying factorization approaches to a new prediction problem is a nontrivial task and requires a lot of expert knowledge. Typically, a new model is developed, a learning algorithm is derived, and the approach has to be implemented. Factorization machines (FM) are a generic approach since they can mimic most factorization models just by feature engineering. This way, factorization machines combine the generality of feature engineering with the superiority of factorization models in estimating interactions between categorical variables of large domain. libFM is a software implementation for factorization machines that features stochastic gradient descent (SGD) and alternating least-squares (ALS) optimization, as well as Bayesian inference using Markov chain Monte Carlo (MCMC). This article summarizes the recent research on factorization machines both in terms of modeling and learning, provides extensions for the ALS and MCMC algorithms, and describes the software tool libFM.
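The second-order FM model itself is compact: y(x) = w0 + <w, x> + sum over i<j of <v_i, v_j> x_i x_j, and the pairwise term can be evaluated in O(nk) rather than O(n^2 k). A minimal sketch of the model equation (not libFM itself):

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order factorization machine prediction.

    The pairwise term uses the O(n*k) identity
    0.5 * sum_f [(sum_i V[i,f] x_i)^2 - sum_i V[i,f]^2 x_i^2].
    """
    s = V.T @ x                                        # shape (k,)
    pair = 0.5 * (s @ s - ((V ** 2).T @ (x ** 2)).sum())
    return w0 + w @ x + pair

# check against the naive O(n^2 * k) pairwise sum
rng = np.random.default_rng(0)
n, k = 5, 3
x, w, V = rng.normal(size=n), rng.normal(size=n), rng.normal(size=(n, k))
naive = 0.2 + w @ x + sum(V[i] @ V[j] * x[i] * x[j]
                          for i in range(n) for j in range(i + 1, n))
assert np.isclose(fm_predict(x, 0.2, w, V), naive)
```

The O(nk) reformulation is what makes SGD, ALS, and MCMC learning of the factor matrix V tractable for sparse, high-dimensional inputs.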

1,271 citations


Journal ArticleDOI
TL;DR: It is shown that PS and SS sampling substantially outperform these estimators and adjust the conclusions made concerning previous analyses for the three real-world data sets that were reanalyzed.
Abstract: Recent developments in marginal likelihood estimation for model selection in the field of Bayesian phylogenetics and molecular evolution have emphasized the poor performance of the harmonic mean estimator (HME). Although these studies have shown the merits of new approaches applied to standard normally distributed examples and small real-world data sets, not much is currently known concerning the performance and computational issues of these methods when fitting complex evolutionary and population genetic models to empirical real-world data sets. Further, these approaches have not yet seen widespread application in the field due to the lack of implementations of these computationally demanding techniques in commonly used phylogenetic packages. We here investigate the performance of some of these new marginal likelihood estimators, specifically, path sampling (PS) and stepping-stone (SS) sampling for comparing models of demographic change and relaxed molecular clocks, using synthetic data and real-world examples for which unexpected inferences were made using the HME. Given the drastically increased computational demands of PS and SS sampling, we also investigate a posterior simulation-based analogue of Akaike’s information criterion (AIC) through Markov chain Monte Carlo (MCMC), a model comparison approach that shares with the HME the appealing feature of having a low computational overhead over the original MCMC analysis. We confirm that the HME systematically overestimates the marginal likelihood and fails to yield reliable model classification and show that the AICM performs better and may be a useful initial evaluation of model choice but that it is also, to a lesser degree, unreliable. We show that PS and SS sampling substantially outperform these estimators and adjust the conclusions made concerning previous analyses for the three real-world data sets that we reanalyzed. 
The methods used in this article are now available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses.
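The estimators compared above can be illustrated on a conjugate toy model where the marginal likelihood is available in closed form. The sketch below (not BEAST; a normal likelihood with a normal prior, so power posteriors can be sampled directly instead of by MCMC) contrasts stepping-stone sampling with the harmonic mean estimator:

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(0.5, 1.0, size=20)        # data: y_i ~ N(theta, 1), prior theta ~ N(0, 1)
n, s = len(y), y.sum()

def log_mean_exp(a):
    m = a.max()
    return m + np.log(np.exp(a - m).mean())

def loglik(theta):                       # full-data log-likelihood, vectorised over theta
    return -0.5 * n * np.log(2 * np.pi) - 0.5 * ((y[:, None] - theta) ** 2).sum(axis=0)

def sample_power_posterior(beta, m):     # prior x likelihood^beta stays normal (conjugacy)
    prec = 1.0 + beta * n
    return rng.normal(beta * s / prec, np.sqrt(1.0 / prec), size=m)

# exact log marginal likelihood (Sherman-Morrison on marginal cov = I + 11^T)
log_z = (-0.5 * n * np.log(2 * np.pi) - 0.5 * np.log(1 + n)
         - 0.5 * (y @ y - s ** 2 / (1 + n)))

# stepping-stone: accumulate log ratios Z_{b1}/Z_{b0} along a temperature ladder
betas = np.linspace(0.0, 1.0, 33)
m = 4000
log_ss = 0.0
for b0, b1 in zip(betas[:-1], betas[1:]):
    log_ss += log_mean_exp((b1 - b0) * loglik(sample_power_posterior(b0, m)))

# harmonic mean estimator: posterior samples only, cheap but unreliable
log_hme = -log_mean_exp(-loglik(sample_power_posterior(1.0, m)))
```

The stepping-stone estimate tracks the exact value closely here, while the HME depends on the rare posterior samples with low likelihood and is known to overestimate the marginal likelihood.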

988 citations


Book
02 Oct 2012
TL;DR: A hands-on introduction to Bayesian analysis with the BUGS language, covering probability and Monte Carlo simulation, Bayesian inference and MCMC, prior distributions, regression and hierarchical models, model checking and comparison, and the various BUGS implementations.
Abstract (table of contents):
Introduction: Probability and Parameters: Probability; Probability distributions; Calculating properties of probability distributions; Monte Carlo integration
Monte Carlo Simulations Using BUGS: Introduction to BUGS; DoodleBUGS; Using BUGS to simulate from distributions; Transformations of random variables; Complex calculations using Monte Carlo; Multivariate Monte Carlo analysis; Predictions with unknown parameters
Introduction to Bayesian Inference: Bayesian learning; Posterior predictive distributions; Conjugate Bayesian inference; Inference about a discrete parameter; Combinations of conjugate analyses; Bayesian and classical methods
Introduction to Markov Chain Monte Carlo Methods: Bayesian computation; Initial values; Convergence; Efficiency and accuracy; Beyond MCMC
Prior Distributions: Different purposes of priors; Vague, 'objective' and 'reference' priors; Representation of informative priors; Mixture of prior distributions; Sensitivity analysis
Regression Models: Linear regression with normal errors; Linear regression with non-normal errors; Nonlinear regression with normal errors; Multivariate responses; Generalised linear regression models; Inference on functions of parameters; Further reading
Categorical Data: 2 x 2 tables; Multinomial models; Ordinal regression; Further reading
Model Checking and Comparison: Introduction; Deviance; Residuals; Predictive checks and Bayesian p-values; Model assessment by embedding in larger models; Model comparison using deviances; Bayes factors; Model uncertainty; Discussion on model comparison; Prior-data conflict
Issues in Modelling: Missing data; Prediction; Measurement error; Cutting feedback; New distributions; Censored, truncated and grouped observations; Constrained parameters; Bootstrapping; Ranking
Hierarchical Models: Exchangeability; Priors; Hierarchical regression models; Hierarchical models for variances; Redundant parameterisations; More general formulations; Checking of hierarchical models; Comparison of hierarchical models; Further resources
Specialised Models: Time-to-event data; Time series models; Spatial models; Evidence synthesis; Differential equation and pharmacokinetic models; Finite mixture and latent class models; Piecewise parametric models; Bayesian nonparametric models
Different Implementations of BUGS: Introduction; BUGS engines and interfaces; Expert systems and MCMC methods; Classic BUGS; WinBUGS; OpenBUGS; JAGS
Appendix A: BUGS Language Syntax: Introduction; Distributions; Deterministic functions; Repetition; Multivariate quantities; Indexing; Data transformations; Commenting
Appendix B: Functions in BUGS: Standard functions; Trigonometric functions; Matrix algebra; Distribution utilities and model checking; Functionals and differential equations; Miscellaneous
Appendix C: Distributions in BUGS: Continuous univariate, unrestricted range; Continuous univariate, restricted to be positive; Continuous univariate, restricted to a finite interval; Continuous multivariate distributions; Discrete univariate distributions; Discrete multivariate distributions
Bibliography. Index.

772 citations


Journal ArticleDOI
TL;DR: A comparison with recent implementations of path sampling and stepping-stone sampling shows reassuringly that MAP identification and its Bayes factor provide similar performance to PS and SS and that these approaches considerably outperform HME, sHME, and AICM in selecting the correct underlying clock model.
Abstract: Recent implementations of path sampling (PS) and stepping-stone sampling (SS) have been shown to outperform the harmonic mean estimator (HME) and a posterior simulation-based analog of Akaike’s information criterion through Markov chain Monte Carlo (AICM), in Bayesian model selection of demographic and molecular clock models. Almost simultaneously, a Bayesian model averaging approach was developed that avoids conditioning on a single model but averages over a set of relaxed clock models. This approach returns estimates of the posterior probability of each clock model through which one can estimate the Bayes factor in favor of the maximum a posteriori (MAP) clock model; however, this Bayes factor estimate may suffer when the posterior probability of the MAP model approaches 1. Here, we compare these two recent developments with the HME, stabilized/smoothed HME (sHME), and AICM, using both synthetic and empirical data. Our comparison shows reassuringly that MAP identification and its Bayes factor provide similar performance to PS and SS and that these approaches considerably outperform HME, sHME, and AICM in selecting the correct underlying clock model. We also illustrate the importance of using proper priors on a large set of empirical data sets.

556 citations


Journal ArticleDOI
TL;DR: This work shows how to construct appropriate summary statistics for ABC in a semi‐automatic manner, and shows that optimal summary statistics are the posterior means of the parameters.
Abstract: Summary. Many modern statistical applications involve inference for complex stochastic models, where it is easy to simulate from the models, but impossible to calculate likelihoods. Approximate Bayesian computation (ABC) is a method of inference for such models. It replaces calculation of the likelihood by a step which involves simulating artificial data for different parameter values, and comparing summary statistics of the simulated data with summary statistics of the observed data. Here we show how to construct appropriate summary statistics for ABC in a semi-automatic manner. We aim for summary statistics which will enable inference about certain parameters of interest to be as accurate as possible. Theoretical results show that optimal summary statistics are the posterior means of the parameters. Although these cannot be calculated analytically, we use an extra stage of simulation to estimate how the posterior means vary as a function of the data; and we then use these estimates of our summary statistics within ABC. Empirical results show that our approach is a robust method for choosing summary statistics that can result in substantially more accurate ABC analyses than the ad hoc choices of summary statistics that have been proposed in the literature. We also demonstrate advantages over two alternative methods of simulation-based inference.
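A minimal sketch of the two-stage idea on a hypothetical toy model (not the authors' code): first regress the parameter on simulated data, so the fitted value approximates the posterior mean and serves as the summary statistic; then run rejection ABC with that summary.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate(theta, n=10):                 # toy model: y_i ~ N(theta, 1)
    return rng.normal(theta, 1.0, size=n)

y_obs = simulate(2.0)

# Stage 1: regress theta on simulated data; the fitted value approximates
# E[theta | y], which the theory identifies as the optimal summary.
thetas = rng.uniform(-5, 5, size=5000)     # draws from a uniform prior
X = np.stack([simulate(t) for t in thetas])
A = np.column_stack([np.ones(len(thetas)), X])
coef, *_ = np.linalg.lstsq(A, thetas, rcond=None)
summary = lambda y: coef[0] + y @ coef[1:]

# Stage 2: rejection ABC using the learned one-dimensional summary.
s_obs = summary(y_obs)
thetas2 = rng.uniform(-5, 5, size=20000)
dist = np.array([abs(summary(simulate(t)) - s_obs) for t in thetas2])
accepted = thetas2[dist <= np.quantile(dist, 0.01)]
post_mean = accepted.mean()                # should sit near the sample mean of y_obs
```

In this Gaussian toy the learned summary essentially rediscovers the sample mean; the value of the semi-automatic construction shows up in models where no such sufficient statistic is obvious.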

527 citations


Journal ArticleDOI
TL;DR: This work designs a sensorimotor task that requires subjects to compensate visuomotor shifts in a three-dimensional virtual reality setup and finds that model selection procedures based on Bayesian statistics provided a better explanation for subjects' choice behavior than simple non-probabilistic heuristics.
Abstract: Sensorimotor control is thought to rely on predictive internal models in order to cope efficiently with uncertain environments. Recently, it has been shown that humans not only learn different internal models for different tasks, but that they also extract common structure between tasks. This raises the question of how the motor system selects between different structures or models, when each model can be associated with a range of different task-specific parameters. Here we design a sensorimotor task that requires subjects to compensate visuomotor shifts in a three-dimensional virtual reality setup, where one of the dimensions can be mapped to a model variable and the other dimension to the parameter variable. By introducing probe trials that are neutral in the parameter dimension, we can directly test for model selection. We found that model selection procedures based on Bayesian statistics provided a better explanation for subjects’ choice behavior than simple non-probabilistic heuristics. Our experimental design lends itself to the general study of model selection in a sensorimotor context as it allows model and parameter variables to be queried separately from subjects.

500 citations


Journal ArticleDOI
TL;DR: MT‐DREAM(ZS), which combines the strengths of multiple‐try sampling, snooker updating, and sampling from an archive of past states, is introduced; it is especially designed to solve high‐dimensional search problems and achieves particularly large performance improvements over other adaptive MCMC approaches when using distributed computing.
Abstract: [1] Spatially distributed hydrologic models are increasingly being used to study and predict soil moisture flow, groundwater recharge, surface runoff, and river discharge. The usefulness and applicability of such complex models is increasingly held back by the potentially many hundreds (thousands) of parameters that require calibration against some historical record of data. The current generation of search and optimization algorithms is typically not powerful enough to deal with a very large number of variables and summarize parameter and predictive uncertainty. We have previously presented a general-purpose Markov chain Monte Carlo (MCMC) algorithm for Bayesian inference of the posterior probability density function of hydrologic model parameters. This method, entitled differential evolution adaptive Metropolis (DREAM), runs multiple different Markov chains in parallel and uses a discrete proposal distribution to evolve the sampler to the posterior distribution. The DREAM approach maintains detailed balance and shows excellent performance on complex, multimodal search problems. Here we present our latest algorithmic developments and introduce MT-DREAM(ZS), which combines the strengths of multiple-try sampling, snooker updating, and sampling from an archive of past states. This new code is especially designed to solve high-dimensional search problems and receives particularly spectacular performance improvement over other adaptive MCMC approaches when using distributed computing. Four different case studies with increasing dimensionality up to 241 parameters are used to illustrate the advantages of MT-DREAM(ZS).
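The core differential-evolution proposal used by the DREAM family can be sketched in a few lines. This is a bare-bones DE-MC sampler on a toy Gaussian target, not the MT-DREAM(ZS) code: the jump direction for each chain is the difference of two other randomly chosen chains.

```python
import numpy as np

rng = np.random.default_rng(3)

def log_post(x):                            # toy target: standard bivariate Gaussian
    return -0.5 * (x ** 2).sum()

d, n_chains, n_iter, burn = 2, 10, 3000, 500
gamma = 2.38 / np.sqrt(2 * d)               # standard DE-MC jump scale
X = rng.normal(size=(n_chains, d))
samples = []
for it in range(n_iter):
    for i in range(n_chains):
        # difference of two other chains defines the proposal direction
        a, b = rng.choice([j for j in range(n_chains) if j != i], 2, replace=False)
        prop = X[i] + gamma * (X[a] - X[b]) + 1e-4 * rng.normal(size=d)
        if np.log(rng.random()) < log_post(prop) - log_post(X[i]):
            X[i] = prop
    if it >= burn:
        samples.append(X.copy())
samples = np.concatenate(samples)           # pooled post-burn-in draws
```

Because the proposal scale and orientation come from the current population of chains, the sampler adapts automatically to the target's covariance, which is the property the DREAM variants build on; MT-DREAM(ZS) additionally adds multiple-try proposals, snooker moves, and sampling from an archive of past states.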

456 citations


Journal ArticleDOI
TL;DR: This work addresses the solution of large-scale statistical inverse problems in the framework of Bayesian inference with a so-called Stochastic Monte Carlo method.
Abstract: We address the solution of large-scale statistical inverse problems in the framework of Bayesian inference. The Markov chain Monte Carlo (MCMC) method is the most popular approach for sampling the posterior probability distribution that describes the solution of the statistical inverse problem. MCMC methods face two central difficulties when applied to large-scale inverse problems: first, the forward models (typically in the form of partial differential equations) that map uncertain parameters to observable quantities make the evaluation of the probability density at any point in parameter space very expensive; and second, the high-dimensional parameter spaces that arise upon discretization of infinite-dimensional parameter fields make the exploration of the probability density function prohibitive. The challenge for MCMC methods is to construct proposal functions that simultaneously provide a good approximation of the target density while being inexpensive to manipulate. Here we present a so-called Stoch...

411 citations


01 Jan 2012
TL;DR: This article provides a simple and intuitive derivation of the Kalman filter, with the aim of teaching this useful tool to students from disciplines that do not require a strong mathematical background.
Abstract: This article provides a simple and intuitive derivation of the Kalman filter, with the aim of teaching this useful tool to students from disciplines that do not require a strong mathematical background. The most complicated level of mathematics required to understand this derivation is the ability to multiply two Gaussian functions together and reduce the result to a compact form. The Kalman filter is over 50 years old but is still one of the most important and common data fusion algorithms in use today. Named after Rudolf E. Kalman, the great success of the Kalman filter is due to its small computational requirement, elegant recursive properties, and its status as the optimal estimator for one-dimensional linear systems with Gaussian error statistics [1]. Typical uses of the Kalman filter include smoothing noisy data and providing estimates of parameters of interest. Applications include global positioning system receivers, phase-locked loops in radio equipment, smoothing the output from laptop trackpads, and many more. From a theoretical standpoint, the Kalman filter is an algorithm permitting exact inference in a linear dynamical system, which is a Bayesian model similar to a hidden Markov model but where the state space of the latent variables is continuous and where all latent and observed variables have a Gaussian distribution (often a multivariate Gaussian distribution). The aim of this lecture note is to permit people who find this description confusing or terrifying to understand the basis of the Kalman filter via a simple and intuitive derivation.
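In the one-dimensional case the whole filter reduces to a few lines: each step is exactly the Gaussian-times-Gaussian multiplication the derivation builds on. A sketch with assumed noise settings, tracking a constant state:

```python
import numpy as np

def kalman_1d(zs, x0=0.0, p0=1e6, q=1e-4, r=0.1):
    """Scalar Kalman filter tracking a (nearly) constant state.

    Each step multiplies the predicted Gaussian N(x, p + q) by the
    measurement Gaussian N(z, r); the product is again Gaussian with
    the precision-weighted mean and variance computed below.
    """
    x, p, out = x0, p0, []
    for z in zs:
        p = p + q                  # predict: process noise inflates variance
        k = p / (p + r)            # Kalman gain
        x = x + k * (z - x)        # update: pull estimate toward measurement
        p = (1 - k) * p            # posterior variance shrinks
        out.append(x)
    return np.array(out)

rng = np.random.default_rng(4)
truth = 1.5
zs = truth + np.sqrt(0.1) * rng.normal(size=200)   # noisy measurements
est = kalman_1d(zs)
```

With a large initial variance p0 the first measurement dominates, after which the gain settles to a small steady-state value and the filter behaves like an exponentially weighted average of recent measurements.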

Journal ArticleDOI
TL;DR: This work explores the idea that saccadic eye movements are optimal experiments, in which data are gathered to test hypotheses or beliefs about how those data are caused, and provides a plausible model of visual search that can be motivated from the basic principles of self-organized behavior.
Abstract: If perception corresponds to hypothesis testing (Gregory, 1980); then visual searches might be construed as experiments that generate sensory data. In this work, we explore the idea that saccadic eye movements are optimal experiments, in which data are gathered to test hypotheses or beliefs about how those data are caused. This provides a plausible model of visual search that can be motivated from the basic principles of self-organized behavior: namely, the imperative to minimize the entropy of hidden states of the world and their sensory consequences. This imperative is met if agents sample hidden states of the world efficiently. This efficient sampling of salient information can be derived in a fairly straightforward way, using approximate Bayesian inference and variational free-energy minimization. Simulations of the resulting active inference scheme reproduce sequential eye movements that are reminiscent of empirically observed saccades and provide some counterintuitive insights into the way that sensory evidence is accumulated or assimilated into beliefs about the world.

Journal ArticleDOI
TL;DR: This book gives the reader a thorough appreciation of asymptotics through the use of lots of practical examples and down-to-earth explanations and shows the application to statistical inference.
Abstract: and shows the application to statistical inference. Saddle point approximations such as the method of Darboux and Hayman’s approximation and application of these methods cause the reader to be an active participant rather than a passive learner. The final chapter is devoted to the summation of series and addresses methods for accelerating the speed of convergence for these methods. Probably the major goal of this book is to introduce the hows and whys of asymptotic theory which are seldom taught in the traditional asymptotic courses at the doctoral level. This book gives the reader a thorough appreciation of asymptotics through the use of lots of practical examples and down-to-earth explanations. While it may not be able to serve as an essential text, students may find it very useful as a reference book.

Posted Content
TL;DR: This work presents an alternative algorithm based on stochastic optimization that allows for direct optimization of the variational lower bound and demonstrates the approach on two non-conjugate models: logistic regression and an approximation to the HDP.
Abstract: Mean-field variational inference is a method for approximate Bayesian posterior inference. It approximates a full posterior distribution with a factorized set of distributions by maximizing a lower bound on the marginal likelihood. This requires the ability to integrate a sum of terms in the log joint likelihood using this factorized distribution. Often not all integrals are in closed form, which is typically handled by using a lower bound. We present an alternative algorithm based on stochastic optimization that allows for direct optimization of the variational lower bound. This method uses control variates to reduce the variance of the stochastic search gradient, in which existing lower bounds can play an important role. We demonstrate the approach on two non-conjugate models: logistic regression and an approximation to the HDP.
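The variance-reduction idea can be sketched on a one-parameter problem (a toy example, not the paper's logistic regression or HDP experiments): the score function has zero expectation under q, so any multiple of it is a valid control variate for the stochastic search gradient of the variational lower bound.

```python
import numpy as np

rng = np.random.default_rng(5)

def log_p(x):                      # unnormalised target: N(3, 1)
    return -0.5 * (x - 3.0) ** 2

mu, lr, m = 0.0, 0.05, 200         # variational family q = N(mu, 1)
for step in range(2000):
    x = mu + rng.normal(size=m)    # Monte Carlo samples from q
    score = x - mu                 # d/dmu log q(x)
    # score-function gradient of the lower bound: score * (log p - log q),
    # with mu-independent constants dropped
    f = score * (log_p(x) + 0.5 * (x - mu) ** 2)
    # E_q[score] = 0, so subtracting a*score leaves the gradient unbiased
    a = np.cov(f, score)[0, 1] / score.var()
    mu += lr * (f - a * score).mean()
```

The regression coefficient a is chosen to minimise the variance of the corrected estimator; with it, the stochastic optimisation drives mu to the target mean of 3 even with modest sample sizes per step.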

Journal ArticleDOI
TL;DR: This work presents a novel method for joint inversion of receiver functions and surface wave dispersion data, using a transdimensional Bayesian formulation and shows that the Hierarchical Bayes procedure is a powerful tool in this situation, able to evaluate the level of information brought by different data types in the misfit, thus removing the arbitrary choice of weighting factors.
Abstract: We present a novel method for joint inversion of receiver functions and surface wave dispersion data, using a transdimensional Bayesian formulation. This class of algorithm treats the number of model parameters (e.g. number of layers) as an unknown in the problem. The dimension of the model space is variable and a Markov chain Monte Carlo (MCMC) scheme is used to provide a parsimonious solution that fully quantifies the degree of knowledge one has about seismic structure (i.e. constraints on the model, resolution, and trade-offs). The level of data noise (i.e. the covariance matrix of data errors) effectively controls the information recoverable from the data and here it naturally determines the complexity of the model (i.e. the number of model parameters). However, it is often difficult to quantify the data noise appropriately, particularly in the case of seismic waveform inversion where data errors are correlated. Here we address the issue of noise estimation using an extended Hierarchical Bayesian formulation, which allows both the variance and covariance of data noise to be treated as unknowns in the inversion. In this way it is possible to let the data infer the appropriate level of data fit. In the context of joint inversions, assessment of uncertainty for different data types becomes crucial in the evaluation of the misfit function. We show that the Hierarchical Bayes procedure is a powerful tool in this situation, because it is able to evaluate the level of information brought by different data types in the misfit, thus removing the arbitrary choice of weighting factors. After illustrating the method with synthetic tests, a real data application is shown where teleseismic receiver functions and ambient noise surface wave dispersion measurements from the WOMBAT array (South-East Australia) are jointly inverted to provide a probabilistic 1D model of shear-wave velocity beneath a given station.

Journal ArticleDOI
TL;DR: A unified review of Bayesian predictive model assessment and selection methods, and of methods closely related to them, with an emphasis on how each method approximates the expected utility of using a Bayesian model for the purpose of predicting future data.
Abstract: To date, several methods exist in the statistical literature for model assessment, which purport themselves specifically as Bayesian predictive methods. The decision theoretic assumptions on which these methods are based are not always clearly stated in the original articles, however. The aim of this survey is to provide a unified review of Bayesian predictive model assessment and selection methods, and of methods closely related to them. We review the various assumptions that are made in this context and discuss the connections between different approaches, with an emphasis on how each method approximates the expected utility of using a Bayesian model for the purpose of predicting future data.

Journal ArticleDOI
10 Sep 2012-Chance
TL;DR: Technical aspects are not the focus of Principles of Applied Statistics, so this also explains why it does not dwell intently on nonparametric models.
Abstract: Paperback: 276 pages Publisher: Cambridge University Press and Institute of Mathematical Statistics Year: 2010 Language: English ISBN-13: 978-0-5211-9249-1 Large-Scale Inference: Empirical Bayes Me...

Journal ArticleDOI
TL;DR: This tutorial describes the mean-field variational Bayesian approximation to inference in graphical models, using modern machine learning terminology rather than statistical physics concepts, and derives local node updates and reviews the recent Variational Message Passing framework.
Abstract: This tutorial describes the mean-field variational Bayesian approximation to inference in graphical models, using modern machine learning terminology rather than statistical physics concepts. It begins by seeking to find an approximate mean-field distribution close to the target joint in the KL-divergence sense. It then derives local node updates and reviews the recent Variational Message Passing framework.
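A minimal worked example of a mean-field update is the classic bivariate Gaussian case: the coordinate updates below follow from making E_q[log p] stationary in one factor at a time, given the target's precision matrix (this illustrates the basic mean-field iteration, not the Variational Message Passing framework itself).

```python
import numpy as np

mu = np.array([1.0, -1.0])                               # target mean
Lam = np.linalg.inv(np.array([[1.0, 0.8],
                              [0.8, 1.0]]))              # target precision

m = np.zeros(2)                  # means of the factorised q_i = N(m_i, 1/Lam_ii)
for _ in range(50):
    # each update makes E_q[log p] stationary in one coordinate
    m[0] = mu[0] - Lam[0, 1] / Lam[0, 0] * (m[1] - mu[1])
    m[1] = mu[1] - Lam[1, 0] / Lam[1, 1] * (m[0] - mu[0])
# the factorised posterior recovers the exact mean, but its variances
# 1/Lam_ii understate the true marginal variances (here 1.0)
```

The iteration converges geometrically, and the final comment illustrates the well-known property that mean-field approximations to correlated Gaussians get the mean right while underestimating marginal variances.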

Journal ArticleDOI
TL;DR: A new approach to Bayesian inference is presented that entirely avoids Markov chain simulation, by constructing a map that pushes forward the prior measure to the posterior measure, and demonstrates the accuracy and efficiency of the approach on nonlinear inverse problems of varying dimension.

BookDOI
09 Jul 2012
TL;DR: This work presents a reformulation of the stochastic optimal control problem in terms of KL divergence minimisation, not only providing a unifying perspective of previous approaches in this area, but also demonstrating that the formalism leads to novel practical approaches to the control problem.
Abstract: We present a reformulation of the stochastic optimal control problem in terms of KL divergence minimisation, not only providing a unifying perspective of previous approaches in this area, but also demonstrating that the formalism leads to novel practical approaches to the control problem. Specifically, a natural relaxation of the dual formulation gives rise to exact iterative solutions to the finite and infinite horizon stochastic optimal control problem, while direct application of Bayesian inference methods yields instances of risk sensitive control. We furthermore study corresponding formulations in the reinforcement learning setting and present model free algorithms for problems with both discrete and continuous state and action spaces. Evaluation of the proposed methods on the standard Gridworld and Cart-Pole benchmarks verifies the theoretical insights and shows that the proposed methods improve upon current approaches.

Journal ArticleDOI
TL;DR: Modifications of Bayesian model selection methods by imposing nonlocal prior densities on model parameters are proposed and it is demonstrated that these model selection procedures perform as well or better than commonly used penalized likelihood methods in a range of simulation settings.
Abstract: Standard assumptions incorporated into Bayesian model selection procedures result in procedures that are not competitive with commonly used penalized likelihood methods. We propose modifications of these methods by imposing nonlocal prior densities on model parameters. We show that the resulting model selection procedures are consistent in linear model settings when the number of possible covariates p is bounded by the number of observations n, a property that has not been extended to other model selection procedures. In addition to consistently identifying the true model, the proposed procedures provide accurate estimates of the posterior probability that each identified model is correct. Through simulation studies, we demonstrate that these model selection procedures perform as well or better than commonly used penalized likelihood methods in a range of simulation settings. Proofs of the primary theorems are provided in the Supplementary Material that is available online.

Journal ArticleDOI
TL;DR: Starting from a matrix factorization formulation and enforcing the low-rank constraint as a sparsity constraint on the estimates, the proposed recovery algorithms determine the correct rank while providing high recovery performance.
Abstract: Recovery of low-rank matrices has recently seen significant activity in many areas of science and engineering, motivated by recent theoretical results for exact reconstruction guarantees and interesting practical applications. In this paper, we present novel recovery algorithms for estimating low-rank matrices in matrix completion and robust principal component analysis based on sparse Bayesian learning (SBL) principles. Starting from a matrix factorization formulation and enforcing the low-rank constraint in the estimates as a sparsity constraint, we develop an approach that is very effective in determining the correct rank while providing high recovery performance. We provide connections with existing methods in other similar problems and empirical results and comparisons with current state-of-the-art methods that illustrate the effectiveness of this approach.

Journal ArticleDOI
TL;DR: The technique can handle noisy data, potentially from multiple sources, and fuse it into a robust common probabilistic representation of the robot’s surroundings, and provides inferences with associated variances into occluded regions and between sensor beams, even with relatively few observations.
Abstract: We introduce a new statistical modelling technique for building occupancy maps. The problem of mapping is addressed as a classification task where the robot's environment is classified into regions of occupancy and free space. This is obtained by employing a modified Gaussian process as a non-parametric Bayesian learning technique to exploit the fact that real-world environments inherently possess structure. This structure introduces dependencies between points on the map which are not accounted for by many common mapping techniques such as occupancy grids. Our approach is an 'anytime' algorithm that is capable of generating accurate representations of large environments at arbitrary resolutions to suit many applications. It also provides inferences with associated variances into occluded regions and between sensor beams, even with relatively few observations. Crucially, the technique can handle noisy data, potentially from multiple sources, and fuse it into a robust common probabilistic representation of the robot's surroundings. We demonstrate the benefits of our approach on simulated datasets with known ground truth and in outdoor urban environments.

Journal ArticleDOI
TL;DR: In this paper, the authors formalize the most general and compelling of the various criteria that have been suggested, together with a new criterion, and illustrate the potential of these criteria in determining objective model selection priors by considering their application to the problem of variable selection.
Abstract: In objective Bayesian model selection, no single criterion has emerged as dominant in defining objective prior distributions. Indeed, many criteria have been separately proposed and utilized to propose differing prior choices. We first formalize the most general and compelling of the various criteria that have been suggested, together with a new criterion. We then illustrate the potential of these criteria in determining objective model selection priors by considering their application to the problem of variable selection in normal linear models. This results in a new model selection objective prior with a number of compelling properties.

Journal ArticleDOI
TL;DR: This tutorial explains the foundation of approximate Bayesian computation (ABC), an approach to Bayesian inference that does not require the specification of a likelihood function, and hence that can be used to estimate posterior distributions of parameters for simulation-based models.
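The likelihood-free idea behind ABC can be sketched in a few lines of rejection sampling: draw a parameter from the prior, simulate data from it, and keep the draw only if the simulation matches the observed summary. The coin-bias example below is our own toy illustration, not taken from the tutorial.

```python
import random

# ABC rejection sampling: estimate the posterior of a coin's bias theta
# without ever evaluating a likelihood -- we only need to SIMULATE data.

def simulate(theta, n, rng):
    """Forward model: number of heads in n flips with bias theta."""
    return sum(rng.random() < theta for _ in range(n))

def abc_rejection(observed_heads, n, n_draws=20000, tol=0, seed=0):
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()                  # draw from the Uniform(0,1) prior
        sim = simulate(theta, n, rng)         # simulate a dataset from theta
        if abs(sim - observed_heads) <= tol:  # accept if the summary matches
            accepted.append(theta)
    return accepted

# Observed: 7 heads in 10 flips. The exact posterior is Beta(8, 4),
# so the accepted draws should have mean near 8/12.
post = abc_rejection(observed_heads=7, n=10)
post_mean = sum(post) / len(post)
```

With `tol=0` this recovers the exact posterior; in realistic simulation-based models the summary statistic and tolerance trade accuracy against acceptance rate.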

Journal ArticleDOI
TL;DR: A novel method for DE analysis across replicates is proposed which propagates uncertainty from the sample-level model while modelling biological variance using an expression-level-dependent prior, and the advantages of this method are demonstrated.
Abstract: Motivation: High-throughput sequencing enables expression analysis at the level of individual transcripts. The analysis of transcriptome expression levels and differential expression (DE) estimation requires a probabilistic approach to properly account for ambiguity caused by shared exons and finite read sampling as well as the intrinsic biological variance of transcript expression. Results: We present Bayesian inference of transcripts from sequencing data (BitSeq), a Bayesian approach for estimation of transcript expression level from RNA-seq experiments. Inferred relative expression is represented by Markov chain Monte Carlo samples from the posterior probability distribution of a generative model of the read data. We propose a novel method for DE analysis across replicates which propagates uncertainty from the sample-level model while modelling biological variance using an expression-level-dependent prior. We demonstrate the advantages of our method using simulated data as well as an RNA-seq dataset with technical and biological replication for both studied conditions. Availability: The implementation of the transcriptome expression estimation and differential expression analysis, BitSeq, has been written in C++ and Python. The software is available online from http://code.google.com/p/bitseq/; version 0.4 was used for generating results presented in this article. Contact: glaus@cs.man.ac.uk, antti.honkela@hiit.fi or m.rattray@sheffield.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.


Proceedings Article
21 Mar 2012
TL;DR: Bayesian model averaging is the coherent Bayesian way of combining multiple models only under certain restrictive assumptions; this paper instead explores a general framework for Bayesian model combination (which differs from model averaging) in the context of classification.
Abstract: Bayesian model averaging linearly mixes the probabilistic predictions of multiple models, each weighted by its posterior probability. This is the coherent Bayesian way of combining multiple models only under certain restrictive assumptions, which we outline. We explore a general framework for Bayesian model combination (which differs from model averaging) in the context of classification. This framework explicitly models the relationship between each model’s output and the unknown true label. The framework does not require that the models be probabilistic (they can even be human assessors), that they share prior information or receive the same training data, or that they be independent in their errors. Finally, the Bayesian combiner does not need to believe any of the models is in fact correct. We test several variants of this classifier combination procedure starting from a classic statistical model proposed by Dawid and Skene (1979) and using MCMC to add more complex but important features to the model. Comparisons on several data sets to simpler methods like majority voting show that the Bayesian methods not only perform well but result in interpretable diagnostics on the data points and the models.
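The Dawid and Skene (1979) starting point mentioned above can be sketched with a simple EM version for binary labels (the paper itself builds richer MCMC-based variants): jointly estimate each input model's accuracy and the unknown true labels, instead of trusting a plain majority vote. All names and data below are illustrative.

```python
# Toy Dawid-Skene-style combiner (binary labels, EM). Each "rater" could
# be a classifier or a human assessor; none need be probabilistic.

def dawid_skene_binary(votes, n_iter=20):
    """votes[i][j] = label in {0, 1} given by input j on item i."""
    n_items, n_raters = len(votes), len(votes[0])
    # Initialise posteriors P(true label = 1) from the majority vote.
    p1 = [sum(row) / n_raters for row in votes]
    acc = [0.8] * n_raters                        # per-rater accuracy estimate
    for _ in range(n_iter):
        # M-step: accuracy = expected fraction of agreements with the truth.
        for j in range(n_raters):
            agree = sum(p1[i] if votes[i][j] == 1 else 1 - p1[i]
                        for i in range(n_items))
            acc[j] = agree / n_items
        # E-step: posterior probability that item i's true label is 1.
        for i in range(n_items):
            like1 = like0 = 1.0
            for j in range(n_raters):
                a = acc[j]
                like1 *= a if votes[i][j] == 1 else 1 - a
                like0 *= a if votes[i][j] == 0 else 1 - a
            p1[i] = like1 / (like1 + like0)
    return p1, acc

# Three reliable inputs and one adversarial one; true labels are 1,1,0,0,1.
votes = [[1, 1, 1, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 1],
         [0, 0, 0, 1],
         [1, 1, 1, 0]]
p1, acc = dawid_skene_binary(votes)
labels = [int(p > 0.5) for p in p1]
```

The learned per-rater accuracies are exactly the kind of interpretable diagnostic on models and data points that the abstract highlights: the adversarial rater's estimated accuracy collapses, so its votes are effectively inverted rather than merely down-weighted.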

Journal ArticleDOI
TL;DR: This article describes the mechanics and rationale of four different approaches to the statistical testing of electrophysiological data: (1) the Neyman-Pearson approach, (2) the permutation-based approach, (3) the bootstrap-based approach, and (4) the Bayesian approach.
Abstract: This article describes the mechanics and rationale of four different approaches to the statistical testing of electrophysiological data: (1) the Neyman-Pearson approach, (2) the permutation-based approach, (3) the bootstrap-based approach, and (4) the Bayesian approach. These approaches are evaluated from the perspective of electrophysiological studies, which involve multivariate (i.e., spatiotemporal) observations in which source-level signals are picked up to a certain extent by all sensors. Besides formal statistical techniques, there are also techniques that do not involve probability calculations but are very useful in dealing with multivariate data (i.e., verification of data-based predictions, cross-validation, and localizers). Moreover, data-based decision making can also be informed by mechanistic evidence that is provided by the structure in the data.
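Of the four approaches, the permutation-based one is the easiest to sketch: the null distribution of a test statistic is built by repeatedly exchanging condition labels. The toy two-condition data below are our own illustration, not from the article.

```python
import random

# Minimal permutation test for a difference in means between two conditions.
# Data and parameter choices are illustrative.

def permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided p-value for a mean difference under label exchange."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = a + b
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                  # exchange the condition labels
        pa, pb = pooled[:len(a)], pooled[len(a):]
        if abs(sum(pa) / len(pa) - sum(pb) / len(pb)) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)        # add-one to avoid p = 0

cond_a = [2.1, 2.5, 2.3, 2.8, 2.6, 2.4]
cond_b = [1.1, 1.4, 1.2, 1.5, 1.3, 1.0]
p = permutation_test(cond_a, cond_b)
```

For spatiotemporal data, the same label-exchange scheme is typically combined with a maximum statistic (e.g. over sensors and time points) to control for multiple comparisons.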

Journal ArticleDOI
TL;DR: The experimental results show that the proposed algorithm outperforms many state-of-the-art algorithms and solves the inverse problem automatically: prior information on the number of clusters and the size of each cluster is not required.
