
Showing papers on "Bayes' theorem published in 2010"


Journal Article
TL;DR: In this article, the authors theoretically compare the Bayes cross-validation loss and the widely applicable information criterion and prove two theorems: (1) the Bayes cross-validation loss is asymptotically equivalent to the widely applicable information criterion as a random variable, and (2) the sum of the Bayes generalization error and the Bayes cross-validation error is asymptotically equal to 2λ/n, where λ is the real log canonical threshold and n is the number of training samples.
Abstract: In regular statistical models, the leave-one-out cross-validation is asymptotically equivalent to the Akaike information criterion. However, since many learning machines are singular statistical models, the asymptotic behavior of the cross-validation remains unknown. In previous studies, we established the singular learning theory and proposed a widely applicable information criterion, the expectation value of which is asymptotically equal to the average Bayes generalization loss. In the present paper, we theoretically compare the Bayes cross-validation loss and the widely applicable information criterion and prove two theorems. First, the Bayes cross-validation loss is asymptotically equivalent to the widely applicable information criterion as a random variable. Therefore, model selection and hyperparameter optimization using these two values are asymptotically equivalent. Second, the sum of the Bayes generalization error and the Bayes cross-validation error is asymptotically equal to 2λ/n, where λ is the real log canonical threshold and n is the number of training samples. Therefore the relation between the cross-validation error and the generalization error is determined by the algebraic geometrical structure of a learning machine. We also clarify that the deviance information criteria are different from the Bayes cross-validation and the widely applicable information criterion.

1,527 citations
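
Both quantities compared in this paper can be estimated from posterior samples. Below is a minimal sketch (not the authors' code; the array names are illustrative) of how the widely applicable information criterion and the importance-sampling estimate of the Bayes leave-one-out cross-validation loss are commonly computed from an S x n matrix of pointwise log-likelihoods.

```python
import numpy as np
from scipy.special import logsumexp

def waic_and_cv(log_lik):
    """log_lik: shape (S, n), entries log p(y_i | theta_s) for S posterior draws and n observations."""
    S, n = log_lik.shape
    # Bayes training loss term: log of the posterior-averaged predictive density per point
    lppd_i = logsumexp(log_lik, axis=0) - np.log(S)
    # functional variance: posterior variance of the pointwise log-likelihood
    v_i = np.var(log_lik, axis=0)
    waic = -np.mean(lppd_i) + np.mean(v_i)
    # importance-sampling identity: p(y_i | y_{-i}) = 1 / E_post[1 / p(y_i | theta)]
    cv_loss = np.mean(logsumexp(-log_lik, axis=0) - np.log(S))
    return waic, cv_loss
```

The paper's first theorem says these two numbers agree asymptotically as random variables, so using either for model selection or hyperparameter tuning is asymptotically equivalent.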


MonographDOI
01 Aug 2010
TL;DR: This book discusses empirical Bayes and the James-Stein estimator, large-scale hypothesis testing and false discovery rate control, and prediction and effect size estimation.
Abstract: Introduction and foreword; 1 Empirical Bayes and the James-Stein estimator; 2 Large-scale hypothesis testing; 3 Significance testing algorithms; 4 False discovery rate control; 5 Local false discovery rates; 6 Theoretical, permutation and empirical null distributions; 7 Estimation accuracy; 8 Correlation questions; 9 Sets of cases (enrichment); 10 Combination, relevance, and comparability; 11 Prediction and effect size estimation; A Exponential families; B Programs and data sets; Bibliography; Index

861 citations
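
As a pointer to the book's opening chapter, here is a small, self-contained sketch (not from the book) of the positive-part James-Stein estimator for a vector of normal means with known variance, shrinking the raw estimates toward their grand mean.

```python
import numpy as np

def james_stein(z, sigma2=1.0):
    """Positive-part James-Stein shrinkage of z_i ~ N(mu_i, sigma2) toward the grand mean (needs len(z) > 3)."""
    z = np.asarray(z, dtype=float)
    zbar = z.mean()
    s = np.sum((z - zbar) ** 2)
    shrink = max(0.0, 1.0 - (z.size - 3) * sigma2 / s)   # shrinkage factor, floored at zero
    return zbar + shrink * (z - zbar)

rng = np.random.default_rng(0)
mu = rng.normal(0.0, 1.0, size=20)   # hypothetical true means
z = rng.normal(mu, 1.0)              # one noisy observation per mean
print(np.sum((z - mu) ** 2), np.sum((james_stein(z) - mu) ** 2))  # shrinkage usually has lower total error
```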


Journal ArticleDOI
TL;DR: A framework for defining patterns of differential expression is proposed and a novel algorithm, baySeq, is developed, which uses an empirical Bayes approach to detect these patterns of differential expression within a set of sequencing samples.
Abstract: High throughput sequencing has become an important technology for studying expression levels in many types of genomic, and particularly transcriptomic, data. One key way of analysing such data is to look for elements of the data which display particular patterns of differential expression in order to take these forward for further analysis and validation. We propose a framework for defining patterns of differential expression and develop a novel algorithm, baySeq, which uses an empirical Bayes approach to detect these patterns of differential expression within a set of sequencing samples. The method assumes a negative binomial distribution for the data and derives an empirically determined prior distribution from the entire dataset. We examine the performance of the method on real and simulated data. Our method performs at least as well, and often better, than existing methods for analyses of pairwise differential expression in both real and simulated data. When we compare methods for the analysis of data from experimental designs involving multiple sample groups, our method again shows substantial gains in performance. We believe that this approach thus represents an important step forward for the analysis of count data from sequencing experiments.

792 citations
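
The following is a toy caricature of the underlying model comparison, not baySeq itself (baySeq integrates over an empirically estimated prior rather than plugging in point estimates, and estimates dispersions from the data): for each gene, counts from two groups are scored under a negative binomial model with either a shared mean (no differential expression) or group-specific means, and the two likelihoods are combined with an assumed prior probability of differential expression.

```python
import numpy as np
from scipy.stats import nbinom

def log_nb(counts, mean, size):
    """Negative binomial log-likelihood in the mean/dispersion ('size') parametrization."""
    p = size / (size + mean)
    return nbinom.logpmf(counts, size, p).sum()

def posterior_de(counts_a, counts_b, size=10.0, prior_de=0.1):
    """Toy plug-in posterior probability that the two groups have different means."""
    a, b = np.asarray(counts_a), np.asarray(counts_b)
    both = np.concatenate([a, b])
    log_m0 = log_nb(both, both.mean(), size)                        # shared mean (no DE)
    log_m1 = log_nb(a, a.mean(), size) + log_nb(b, b.mean(), size)  # group-specific means (DE)
    log_post = np.log(prior_de) + log_m1
    log_norm = np.logaddexp(np.log(1.0 - prior_de) + log_m0, log_post)
    return np.exp(log_post - log_norm)

print(posterior_de([5, 7, 6, 8], [30, 25, 40, 35]))   # clearly different groups -> probability near 1
```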


Journal ArticleDOI
TL;DR: A combination of two further approaches is proposed: family level inference and Bayesian model averaging within families, which provides inferences about parameters that are independent of further assumptions about model structure.
Abstract: Mathematical models of scientific data can be formally compared using Bayesian model evidence. Previous applications in the biological sciences have mainly focussed on model selection in which one first selects the model with the highest evidence and then makes inferences based on the parameters of that model. This "best model" approach is very useful but can become brittle if there are a large number of models to compare, and if different subjects use different models. To overcome this shortcoming we propose the combination of two further approaches: (i) family level inference and (ii) Bayesian model averaging within families. Family level inference removes uncertainty about aspects of model structure other than the characteristic of interest. For example: What are the inputs to the system? Is processing serial or parallel? Is it linear or nonlinear? Is it mediated by a single, crucial connection? We apply Bayesian model averaging within families to provide inferences about parameters that are independent of further assumptions about model structure. We illustrate the methods using Dynamic Causal Models of brain imaging data.

680 citations
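
A minimal sketch (illustrative; not the authors' SPM implementation, and the numbers are hypothetical) of the two steps: summing posterior model probabilities within families, then Bayesian model averaging of a parameter across the models of the winning family, weighted by those probabilities.

```python
import numpy as np

# hypothetical log model evidences and per-model posterior means of one parameter
log_evidence = np.array([-100.0, -98.5, -97.0, -101.2])
theta_mean   = np.array([0.20, 0.35, 0.32, 0.05])
family       = np.array([0, 0, 1, 1])          # e.g. family 0 = linear models, family 1 = nonlinear

# posterior model probabilities (flat prior over models)
w = np.exp(log_evidence - log_evidence.max())
post_model = w / w.sum()

# family-level inference: sum model probabilities within each family
post_family = np.array([post_model[family == f].sum() for f in (0, 1)])

# Bayesian model averaging within the winning family
f_best = post_family.argmax()
in_f = family == f_best
bma_theta = np.sum(post_model[in_f] * theta_mean[in_f]) / post_model[in_f].sum()

print(post_family, bma_theta)
```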


Journal ArticleDOI
TL;DR: In this article, the multiplicity-correction effect of standard Bayesian variable-selection priors in linear regression is investigated, and empirical-Bayes and fully Bayesian approaches to variable selection are contrasted through examples, theoretical results and simulations.
Abstract: This paper studies the multiplicity-correction effect of standard Bayesian variable-selection priors in linear regression. Our first goal is to clarify when, and how, multiplicity correction happens automatically in Bayesian analysis, and to distinguish this correction from the Bayesian Ockham’s-razor effect. Our second goal is to contrast empirical-Bayes and fully Bayesian approaches to variable selection through examples, theoretical results and simulations. Considerable differences between the two approaches are found. In particular, we prove a theorem that characterizes a surprising asymptotic discrepancy between fully Bayes and empirical Bayes. This discrepancy arises from a different source than the failure to account for hyperparameter uncertainty in the empirical-Bayes estimate. Indeed, even at the extreme, when the empirical-Bayes estimate converges asymptotically to the true variable-inclusion probability, the potential for a serious difference remains.

620 citations
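
A small illustration of the standard construction discussed in this literature (not code from the paper) of how a beta-binomial prior on the inclusion indicators penalizes larger model spaces automatically: with a uniform prior on the common inclusion probability, the prior mass of any particular model with k of p candidate variables is 1/((p+1)·C(p,k)), so adding spurious candidate variables deflates the prior probability of every individual model.

```python
from math import comb

def model_prior(k, p):
    """Prior probability of a specific model with k of p variables included,
    under inclusion probability pi ~ Uniform(0, 1) (beta-binomial on model size)."""
    return 1.0 / ((p + 1) * comb(p, k))

for p in (10, 100):
    print(p, model_prior(2, p))   # the same 2-variable model loses prior mass as p grows
```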


Journal ArticleDOI
TL;DR: A performance-optimizing Bayesian model that takes the underlying distribution of samples into account provided an accurate description of subjects' performance, variability and bias, suggesting that the CNS incorporates knowledge about temporal uncertainty to adapt internal timing mechanisms to the temporal statistics of the environment.
Abstract: The authors find that a person's estimate of a time interval exhibits biases that depend on both its duration and the distribution from which it is drawn. This behavioral pattern could be described using a Bayesian model. These findings suggest that internal timing mechanisms can adapt to the temporal statistics of the environment to minimize uncertainty. We use our sense of time to identify temporal relationships between events and to anticipate actions. The degree to which we can exploit temporal contingencies depends on the variability of our measurements of time. We asked humans to reproduce time intervals drawn from different underlying distributions. As expected, production times were more variable for longer intervals. However, production times exhibited a systematic regression toward the mean. Consequently, estimates for a sample interval differed depending on the distribution from which it was drawn. A performance-optimizing Bayesian model that takes the underlying distribution of samples into account provided an accurate description of subjects' performance, variability and bias. This finding suggests that the CNS incorporates knowledge about temporal uncertainty to adapt internal timing mechanisms to the temporal statistics of the environment.

612 citations
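
A toy Bayesian-observer sketch (illustrative only; the paper's model additionally uses scalar measurement and production noise, and the interval ranges here are made up) showing why posterior-mean estimates regress toward the middle of the prior range: a uniform prior over sample intervals combined with a noisy measurement pulls estimates of short intervals up and long intervals down.

```python
import numpy as np

def bls_estimate(measurement, t_min=0.5, t_max=1.0, sigma=0.1, grid=2000):
    """Posterior-mean (Bayes least-squares) estimate of a sample interval, given a noisy
    measurement, a uniform prior on [t_min, t_max], and Gaussian measurement noise sigma."""
    t = np.linspace(t_min, t_max, grid)                       # candidate intervals (seconds)
    likelihood = np.exp(-0.5 * ((measurement - t) / sigma) ** 2)
    posterior = likelihood / np.trapz(likelihood, t)
    return np.trapz(t * posterior, t)

for m in (0.55, 0.75, 0.95):
    print(m, round(bls_estimate(m), 3))   # short intervals are pulled up, long ones down
```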


Journal ArticleDOI
TL;DR: This application of Bayes' Theorem automatically applies a quantitative Ockham's razor that penalizes the data‐fit of more complex model classes that extract more information from the data.
Abstract: Probability logic with Bayesian updating provides a rigorous framework to quantify modeling uncertainty and perform system identification. It uses probability as a multi-valued propositional logic for plausible reasoning where the probability of a model is a measure of its relative plausibility within a set of models. System identification is thus viewed as inference about plausible system models and not as a quixotic quest for the true model. Instead of using system data to estimate the model parameters, Bayes' Theorem is used to update the relative plausibility of each model in a model class, which is a set of input–output probability models for the system and a probability distribution over this set that expresses the initial plausibility of each model. Robust predictive analyses informed by the system data use the entire model class with the probabilistic predictions of each model being weighed by its posterior probability. Additional robustness to modeling uncertainty comes from combining the robust predictions of each model class in a set of candidates for the system, where each contribution is weighed by the posterior probability of the model class. This application of Bayes' Theorem automatically applies a quantitative Ockham's razor that penalizes the data-fit of more complex model classes that extract more information from the data. Robust analyses involve integrals over parameter spaces that usually must be evaluated numerically by Laplace's method of asymptotic approximation or by Markov Chain Monte Carlo methods. An illustrative application is given using synthetic data corresponding to a structural health monitoring benchmark structure.

497 citations
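
The updating over model classes described here is Bayes' Theorem applied one level up; in standard notation (assumed here, not copied from the paper), the posterior plausibility of model class M_j within a set of candidates and its evidence are

```latex
\[
P(\mathcal{M}_j \mid D) = \frac{p(D \mid \mathcal{M}_j)\, P(\mathcal{M}_j)}{\sum_k p(D \mid \mathcal{M}_k)\, P(\mathcal{M}_k)},
\qquad
p(D \mid \mathcal{M}_j) = \int p(D \mid \theta_j, \mathcal{M}_j)\, p(\theta_j \mid \mathcal{M}_j)\, d\theta_j .
\]
```

The evidence integral is what enforces the quantitative Ockham's razor: its logarithm can be written as the posterior mean of the log-likelihood (data fit) minus the relative entropy of the posterior with respect to the prior (information extracted from the data), so model classes that extract more information from the data are penalized.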


Proceedings Article
21 Jun 2010
TL;DR: This paper formalizes and analyzes MLC within a probabilistic setting, and proposes a new method for MLC that generalizes and outperforms another approach, called classifier chains, that was recently introduced in the literature.
Abstract: In the realm of multilabel classification (MLC), it has become an opinio communis that optimal predictive performance can only be achieved by learners that explicitly take label dependence into account. The goal of this paper is to elaborate on this postulate in a critical way. To this end, we formalize and analyze MLC within a probabilistic setting. Thus, it becomes possible to look at the problem from the point of view of risk minimization and Bayes optimal prediction. Moreover, inspired by our probabilistic setting, we propose a new method for MLC that generalizes and outperforms another approach, called classifier chains, that was recently introduced in the literature.

480 citations
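
A small illustration (not the authors' implementation; the distribution is hypothetical) of a standard observation in the risk-minimization view of MLC: under Hamming loss the Bayes-optimal prediction takes each label's marginal mode, whereas under subset 0/1 loss it takes the joint mode, and the two can differ when labels are dependent.

```python
# hypothetical conditional distribution over two binary labels (y1, y2) for one instance x
p = {(0, 0): 0.30, (0, 1): 0.25, (1, 0): 0.05, (1, 1): 0.40}

# subset 0/1 loss: the Bayes-optimal prediction is the joint mode
joint_mode = max(p, key=p.get)                                    # -> (1, 1)

# Hamming loss: the Bayes-optimal prediction takes each label's marginal mode
p_y1 = sum(v for (y1, _), v in p.items() if y1 == 1)              # 0.45
p_y2 = sum(v for (_, y2), v in p.items() if y2 == 1)              # 0.65
marginal_modes = (int(p_y1 > 0.5), int(p_y2 > 0.5))               # -> (0, 1)

print(joint_mode, marginal_modes)   # the two risk minimizers differ when labels are dependent
```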


Book
19 Jul 2010
TL;DR: This book covers statistical approaches for clinical trials, including comparisons between Bayesian and frequentist approaches, adaptivity in clinical trials, features and use of the Bayesian adaptive approach, and the basics of Bayesian inference.
Abstract: Statistical Approaches for Clinical Trials: Introduction; Comparisons between Bayesian and frequentist approaches; Adaptivity in clinical trials; Features and use of the Bayesian adaptive approach. Basics of Bayesian Inference: Introduction to Bayes' theorem; Bayesian inference; Bayesian computation; Hierarchical modeling and metaanalysis; Principles of Bayesian clinical trial design; Appendix: R Macros. Phase I Studies: Rule-based designs for determining the MTD; Model-based designs for determining the MTD; Efficacy versus toxicity; Combination therapy; Appendix: R Macros. Phase II Studies: Standard designs; Predictive probability; Sequential stopping; Adaptive randomization and dose allocation; Dose ranging and optimal biologic dosing; Hierarchical models for Phase II designs; Decision theoretic designs; Case studies: BATTLE and ISPY-2; Appendix: R Macros. Phase III Studies: Introduction to confirmatory studies; Bayesian adaptive confirmatory trials; Arm dropping; Modeling and prediction; Prior distributions and the paradigm clash; Phase III cancer trials; Phase II/III seamless trials; Case study: Ablation device to treat atrial fibrillation; Appendix: R Macros. Special Topics: Incorporating historical data; Equivalence studies; Multiplicity; Subgroup analysis; Appendix: R Macros. References; Author Index; Subject Index

452 citations
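
As a flavor of the "Basics of Bayesian Inference" and "Predictive probability" material, here is a minimal, self-contained sketch (not one of the book's R macros; the numbers are hypothetical) of a Beta-Binomial update and the posterior predictive probability of future responses in a single-arm Phase II setting.

```python
from scipy.stats import betabinom

# Beta(a, b) prior on the response rate; x responses observed among n patients so far
a, b = 1.0, 1.0
x, n = 8, 20
a_post, b_post = a + x, b + n - x           # conjugate Beta posterior

# posterior predictive: probability of at least k more responses among m future patients
m, k = 20, 10
pred_prob = 1.0 - betabinom.cdf(k - 1, m, a_post, b_post)
print(pred_prob)
```

Quantities like this predictive probability are what Bayesian adaptive designs monitor when deciding whether to stop or continue a trial.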


Book
05 Apr 2010
TL;DR: This book discusses Bayesian model updating using eigenvalue-eigenvector measurements, Bayesian model class selection, the relationship between the Hessian and covariance matrix for Gaussian random variables, and the conditional PDF for prediction.
Abstract: Contents Preface Nomenclature 1 Introduction 1.1 Thomas Bayes and Bayesian Methods in Engineering 1.2 Purpose of Model Updating 1.3 Source of Uncertainty and Bayesian Updating 1.4 Organization of the Book 2 Basic Concepts and Bayesian Probabilistic Framework 2.1 Conditional Probability and Basic Concepts 2.2 Bayesian Model Updating with Input-output Measurements 2.3 Deterministic versus Probabilistic Methods 2.4 Regression Problems 2.5 Numerical Representation of the Updated PDF 2.6 Application to Temperature Effects on Structural Behavior 2.7 Application to Noise Parameters Selection for Kalman Filter 2.8 Application to Prediction of Particulate Matter Concentration 3 Bayesian Spectral Density Approach 3.1 Modal and Model Updating of Dynamical Systems 3.2 Random Vibration Analysis 3.3 Bayesian Spectral Density Approach 3.4 Numerical Verifications 3.5 Optimal Sensor Placement 3.6 Updating of a Nonlinear Oscillator 3.7 Application to Structural Behavior under Typhoons 3.8 Application to Hydraulic Jump 4 Bayesian Time-domain Approach 4.1 Introduction 4.2 Exact Bayesian Formulation and its Computational Difficulties 4.3 Random Vibration Analysis of Nonstationary Response 4.4 Bayesian Updating with Approximated PDF Expansion 4.5 Numerical Verification 4.6 Application to Model Updating with Unmeasured Earthquake Ground Motion 4.7 Concluding Remarks 4.8 Comparison of Spectral Density Approach and Time-domain Approach 4.9 Extended Readings 5 Model Updating Using Eigenvalue-Eigenvector Measurements 5.1 Introduction 5.2 Formulation 5.3 Linear Optimization Problems 5.4 Iterative Algorithm 5.5 Uncertainty Estimation 5.6 Applications to Structural Health Monitoring 5.7 Concluding Remarks 6 Bayesian Model Class Selection 6.1 Introduction 6.2 Bayesian Model Class Selection 6.3 Model Class Selection for Regression Problems 6.4 Application to Modal Updating 6.5 Application to Seismic Attenuation Empirical Relationship 6.6 Prior Distributions - Revisited 6.7 Final Remarks A Relationship between the Hessian and Covariance Matrix for Gaussian Random Variables B Contours of Marginal PDFs for Gaussian Random Variables C Conditional PDF for Prediction C.1 Two Random Variables C.2 General Cases References Index

450 citations


Journal ArticleDOI
TL;DR: In this paper, the authors examine philosophical problems and sampling deficiencies associated with current Bayesian hypothesis testing methodology, paying particular attention to objective Bayes methodology, and propose two new classes of prior densities that ameliorate the imbalance in convergence rates that is inherited by most Bayesian tests.
Abstract: We examine philosophical problems and sampling deficiencies that are associated with current Bayesian hypothesis testing methodology, paying particular attention to objective Bayes methodology. Because the prior densities that are used to define alternative hypotheses in many Bayesian tests assign non-negligible probability to regions of the parameter space that are consistent with null hypotheses, resulting tests provide exponential accumulation of evidence in favour of true alternative hypotheses, but only sublinear accumulation of evidence in favour of true null hypotheses. Thus, it is often impossible for such tests to provide strong evidence in favour of a true null hypothesis, even when moderately large sample sizes have been obtained. We review asymptotic convergence rates of Bayes factors in testing precise null hypotheses and propose two new classes of prior densities that ameliorate the imbalance in convergence rates that is inherited by most Bayesian tests. Using members of these classes, we obtain analytic expressions for Bayes factors in linear models and derive approximations to Bayes factors in large sample settings.
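
For reference, the Bayes factor whose convergence rates are at issue takes the standard form (notation assumed here, not copied from the paper): for a precise null H_0: θ = θ_0 against an alternative with prior density π_1,

```latex
\[
\mathrm{BF}_{10}(x) = \frac{m_1(x)}{m_0(x)}
  = \frac{\int f(x \mid \theta)\, \pi_1(\theta)\, d\theta}{f(x \mid \theta_0)} .
\]
```

Under a true alternative the Bayes factor typically grows exponentially in the sample size, whereas under a true null it shrinks only at a sublinear (polynomial) rate, which is the asymmetry the proposed prior classes are designed to reduce.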

Journal ArticleDOI
TL;DR: Several advantages of Bayesian data analysis over traditional null-hypothesis significance testing are reviewed.

Journal ArticleDOI
TL;DR: It is concluded that Bayesian inference is now practically feasible for GLMMs and provides an attractive alternative to likelihood-based approaches such as penalized quasi-likelihood.
Abstract: Generalized linear mixed models (GLMMs) continue to grow in popularity due to their ability to directly acknowledge multiple levels of dependency and model different data types. For small sample sizes especially, likelihood-based inference can be unreliable with variance components being particularly difficult to estimate. A Bayesian approach is appealing but has been hampered by the lack of a fast implementation, and the difficulty in specifying prior distributions with variance components again being particularly problematic. Here, we briefly review previous approaches to computation in Bayesian implementations of GLMMs and illustrate in detail, the use of integrated nested Laplace approximations in this context. We consider a number of examples, carefully specifying prior distributions on meaningful quantities in each case. The examples cover a wide range of data types including those requiring smoothing over time and a relatively complicated spline model for which we examine our prior specification in terms of the implied degrees of freedom. We conclude that Bayesian inference is now practically feasible for GLMMs and provides an attractive alternative to likelihood-based approaches such as penalized quasi-likelihood. As with likelihood-based approaches, great care is required in the analysis of clustered binary data since approximation strategies may be less accurate for such data.

Journal ArticleDOI
TL;DR: A Bayesian hierarchical approach that explicitly specifies the multilevel structure and reliably yields parameter estimates is introduced and recommended to properly accommodate the potential cross-group heterogeneity and spatiotemporal correlation due to the multilevel data structure.

Journal ArticleDOI
TL;DR: Data from more than 100 individuals in an auditory-visual spatial localization task are used to gain insight into the decision-making strategy of human observers in a low-level perceptual task, by examining which of three plausible strategies best accounts for each observer's behavior.
Abstract: The question of which strategy is employed in human decision making has been studied extensively in the context of cognitive tasks; however, this question has not been investigated systematically in the context of perceptual tasks. The goal of this study was to gain insight into the decision-making strategy used by human observers in a low-level perceptual task. Data from more than 100 individuals who participated in an auditory-visual spatial localization task was evaluated to examine which of three plausible strategies could account for each observer's behavior the best. This task is very suitable for exploring this question because it involves an implicit inference about whether the auditory and visual stimuli were caused by the same object or independent objects, and provides different strategies of how using the inference about causes can lead to distinctly different spatial estimates and response patterns. For example, employing the commonly used cost function of minimizing the mean squared error of spatial estimates would result in a weighted averaging of estimates corresponding to different causal structures. A strategy that would minimize the error in the inferred causal structure would result in the selection of the most likely causal structure and sticking with it in the subsequent inference of location-"model selection." A third strategy is one that selects a causal structure in proportion to its probability, thus attempting to match the probability of the inferred causal structure. This type of probability matching strategy has been reported to be used by participants predominantly in cognitive tasks. Comparing these three strategies, the behavior of the vast majority of observers in this perceptual task was most consistent with probability matching. While this appears to be a suboptimal strategy and hence a surprising choice for the perceptual system to adopt, we discuss potential advantages of such a strategy for perception.

Journal ArticleDOI
01 Jan 2010-Genetics
TL;DR: This work proposes a reformulation of the ABC regression adjustment in terms of a general linear model (GLM), which allows integration into the sound theoretical framework of Bayesian statistics and the use of its methods, including model selection via Bayes factors, and applies the approach to the question of population subdivision among western chimpanzees.
Abstract: Until recently, the use of Bayesian inference was limited to a few cases because for many realistic probability models the likelihood function cannot be calculated analytically. The situation changed with the advent of likelihood-free inference algorithms, often subsumed under the term approximate Bayesian computation (ABC). A key innovation was the use of a postsampling regression adjustment, allowing larger tolerance values and as such shifting computation time to realistic orders of magnitude. Here we propose a reformulation of the regression adjustment in terms of a general linear model (GLM). This allows the integration into the sound theoretical framework of Bayesian statistics and the use of its methods, including model selection via Bayes factors. We then apply the proposed methodology to the question of population subdivision among western chimpanzees, Pan troglodytes verus.

Journal ArticleDOI
TL;DR: In this article, a representative consumer uses Bayes' law to learn about parameters of several models and to construct probabilities with which to perform ongoing model averaging, and the arrival of signals induces the consumer to alter his posterior distribution over models and parameters.
Abstract: A representative consumer uses Bayes’ law to learn about parameters of several models and to construct probabilities with which to perform ongoing model averaging. The arrival of signals induces the consumer to alter his posterior distribution over models and parameters. The consumer’s specification doubts induce him to slant probabilities pessimistically. The pessimistic probabilities tilt toward a model that puts long-run risks into consumption growth. That contributes a countercyclical history-dependent component to prices of risk. Keywords: learning, Bayes’ law, robustness, risk sensitivity, pessimism, prices of risk. JEL classification: C11, C44, C72, E44, G12.

Journal ArticleDOI
TL;DR: The main application is an evaluation of the conversion of road segments from a four-lane to a three-lane cross-section with two-way left-turn lanes (also known as road diets); the results of an earlier application pertaining to the evaluation of conversion of rural intersections from unsignalized to signalized control are summarized.

Journal ArticleDOI
TL;DR: Ramachandran probability distributions are presented for residues in protein loops from a high-resolution data set with filtering based on calculated electron densities, and are shown to improve protein loop conformation prediction significantly.
Abstract: Distributions of the backbone dihedral angles of proteins have been studied for over 40 years. While many statistical analyses have been presented, only a handful of probability densities are publicly available for use in structure validation and structure prediction methods. The available distributions differ in a number of important ways, which determine their usefulness for various purposes. These include: 1) input data size and criteria for structure inclusion (resolution, R-factor, etc.); 2) filtering of suspect conformations and outliers using B-factors or other features; 3) secondary structure of input data (e.g., whether helix and sheet are included; whether beta turns are included); 4) the method used for determining probability densities ranging from simple histograms to modern nonparametric density estimation; and 5) whether they include nearest neighbor effects on the distribution of conformations in different regions of the Ramachandran map. In this work, Ramachandran probability distributions are presented for residues in protein loops from a high-resolution data set with filtering based on calculated electron densities. Distributions for all 20 amino acids (with cis and trans proline treated separately) have been determined, as well as 420 left-neighbor and 420 right-neighbor dependent distributions. The neighbor-independent and neighbor-dependent probability densities have been accurately estimated using Bayesian nonparametric statistical analysis based on the Dirichlet process. In particular, we used hierarchical Dirichlet process priors, which allow sharing of information between densities for a particular residue type and different neighbor residue types. The resulting distributions are tested in a loop modeling benchmark with the program Rosetta, and are shown to improve protein loop conformation prediction significantly. The distributions are available at http://dunbrack.fccc.edu/hdp.

Journal ArticleDOI
TL;DR: Atlas-SNP2 is developed, a computational tool that detects and accounts for systematic sequencing errors caused by context-related variables in a logistic regression model learned from training data sets and estimates the posterior error probability for each substitution through a Bayesian formula.
Abstract: Accurate identification of genetic variants from next-generation sequencing (NGS) data is essential for immediate large-scale genomic endeavors such as the 1000 Genomes Project, and is crucial for further genetic analysis based on the discoveries. The key challenge in single nucleotide polymorphism (SNP) discovery is to distinguish true individual variants (occurring at a low frequency) from sequencing errors (often occurring at frequencies orders of magnitude higher). Therefore, knowledge of the error probabilities of base calls is essential. We have developed Atlas-SNP2, a computational tool that detects and accounts for systematic sequencing errors caused by context-related variables in a logistic regression model learned from training data sets. Subsequently, it estimates the posterior error probability for each substitution through a Bayesian formula that integrates prior knowledge of the overall sequencing error probability and the estimated SNP rate with the results from the logistic regression model for the given substitutions. The estimated posterior SNP probability can be used to distinguish true SNPs from sequencing errors. Validation results show that Atlas-SNP2 achieves a false-positive rate of lower than 10%, with an approximately 5% or lower false-negative rate.
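
The "Bayesian formula" step can be pictured with a short sketch (illustrative only; the actual tool derives the error-model term from a trained logistic regression and uses its own priors): the posterior probability that a candidate substitution is a true SNP combines the prior SNP rate with the probability of the observed evidence under the SNP and sequencing-error hypotheses.

```python
def posterior_snp_prob(p_data_given_snp, p_data_given_error, snp_rate=0.001):
    """Bayes' theorem for a single candidate substitution.

    p_data_given_snp:   probability of the observed reads if the site is a real SNP
    p_data_given_error: probability of the observed reads if it is a sequencing error
                        (in Atlas-SNP2 this term comes from the logistic regression model)
    snp_rate:           assumed prior probability that a given site carries a SNP
    """
    num = p_data_given_snp * snp_rate
    den = num + p_data_given_error * (1.0 - snp_rate)
    return num / den

print(posterior_snp_prob(0.6, 1e-5))   # strong evidence overcomes the small prior (~0.98)
print(posterior_snp_prob(0.6, 0.2))    # evidence compatible with error: posterior stays low (~0.003)
```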

Journal ArticleDOI
TL;DR: Bayesian inference and prediction are developed for the inverse Weibull distribution under Type-II censoring, and a simulation study is performed to compare the proposed Bayes estimators with the maximum likelihood estimators.

Journal ArticleDOI
TL;DR: A novel and extensive data collection and modeling effort to define accident models for two-lane road sections based on a unique combination of exposure, geometry, consistency and context variables directly related to the safety performance is described.

Journal ArticleDOI
TL;DR: A Python package, ABC-SysBio, that implements parameter inference and model selection for dynamical systems in an approximate Bayesian computation (ABC) framework and is designed to work with models written in Systems Biology Markup Language (SBML).
Abstract: Motivation: The growing field of systems biology has driven demand for flexible tools to model and simulate biological systems. Two established problems in the modeling of biological processes are model selection and the estimation of associated parameters. A number of statistical approaches, both frequentist and Bayesian, have been proposed to answer these questions. Results: Here we present a Python package, ABC-SysBio, that implements parameter inference and model selection for dynamical systems in an approximate Bayesian computation (ABC) framework. ABC-SysBio combines three algorithms: ABC rejection sampler, ABC SMC for parameter inference and ABC SMC for model selection. It is designed to work with models written in Systems Biology Markup Language (SBML). Deterministic and stochastic models can be analyzed in ABC-SysBio. Availability: http://abc-sysbio.sourceforge.net Contact: christopher.barnes@imperial.ac.uk; ttoni@imperial.ac.uk
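
A minimal sketch of the ABC rejection sampler, the simplest of the three algorithms combined in ABC-SysBio (this is a generic toy example with a Poisson "simulator" and illustrative names, not package code): parameters are drawn from the prior and kept only when the simulated summary statistic falls within a tolerance of the observed one.

```python
import numpy as np

rng = np.random.default_rng(1)
observed = rng.poisson(4.0, size=50)          # pretend data from an unknown rate
obs_stat = observed.mean()                    # summary statistic

def simulate(theta, n=50):
    return rng.poisson(theta, size=n).mean()

def abc_rejection(n_samples=5000, eps=0.2):
    """Draw theta from the prior; accept it if the simulated summary is within eps of the data."""
    accepted = []
    for _ in range(n_samples):
        theta = rng.uniform(0.0, 10.0)        # prior on the rate parameter
        if abs(simulate(theta) - obs_stat) <= eps:
            accepted.append(theta)
    return np.array(accepted)

post = abc_rejection()
print(post.mean(), post.std())                # approximate posterior for the rate
```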

Journal ArticleDOI
TL;DR: The Bayes methods and Bonferroni corrections reduced the frequency of false-positive tests in random matrices, but did not always correctly identify the non-random pair in a seeded matrix, and all of the methods were vulnerable to identifying spurious secondary associations in the seeded matrices.
Abstract: A statistical challenge in community ecology is to identify segregated and aggregated pairs of species from a binary presence-absence matrix, which often contains hundreds or thousands of such potential pairs. A similar challenge is found in genomics and proteomics, where the expression of thousands of genes in microarrays must be statistically analyzed. Here we adapt the empirical Bayes method to identify statistically significant species pairs in a binary presence-absence matrix. We evaluated the performance of a simple confidence interval, a sequential Bonferroni test, and two tests based on the mean and the confidence interval of an empirical Bayes method. Observed patterns were compared to patterns generated from null model randomizations that preserved matrix row and column totals. We evaluated these four methods with random matrices and also with random matrices that had been seeded with an additional segregated or aggregated species pair. The Bayes methods and Bonferroni corrections reduced the frequency of false-positive tests (type I error) in random matrices, but did not always correctly identify the non-random pair in a seeded matrix (type II error). All of the methods were vulnerable to identifying spurious secondary associations in the seeded matrices. When applied to a set of 272 published presence-absence matrices, even the most conservative tests indicated a fourfold increase in the frequency of perfectly segregated "checkerboard" species pairs compared to the null expectation, and a greater predominance of segregated versus aggregated species pairs. The tests did not reveal a large number of significant species pairs in the Vanuatu bird matrix, but in the much smaller Galapagos bird matrix they correctly identified a concentration of segregated species pairs in the genus Geospiza. The Bayesian methods provide for increased selectivity in identifying non-random species pairs, but the analyses will be most powerful if investigators can use a priori biological criteria to identify potential sets of interacting species.

Journal ArticleDOI
TL;DR: Model-based gene set analysis (MGSA) is presented that analyzes all categories at once by embedding them in a Bayesian network, in which gene response is modeled as a function of the activation of biological categories.
Abstract: The interpretation of data-driven experiments in genomics often involves a search for biological categories that are enriched for the responder genes identified by the experiments. However, knowledge bases such as the Gene Ontology (GO) contain hundreds or thousands of categories with very high overlap between categories. Thus, enrichment analysis performed on one category at a time frequently returns large numbers of correlated categories, leaving the choice of the most relevant ones to the user’s interpretation. Here we present model-based gene set analysis (MGSA) that analyzes all categories at once by embedding them in a Bayesian network, in which gene response is modeled as a function of the activation of biological categories. Probabilistic inference is used to identify the active categories. The Bayesian modeling approach naturally takes category overlap into account and avoids the need for multiple testing corrections met in single-category enrichment analysis. On simulated data, MGSA identifies active categories with up to 95% precision at a recall of 20% for moderate settings of noise, leading to a 10-fold precision improvement over single-category statistical enrichment analysis. Application to a gene expression data set in yeast demonstrates that the method provides high-level, summarized views of core biological processes and correctly eliminates confounding associations.

Journal ArticleDOI
TL;DR: The validity, reliability, and responsiveness of elicitation methods have been infrequently evaluated, and strategies to reduce the effects of bias on elicitation should be used.

Journal ArticleDOI
TL;DR: It is demonstrated that the Bayesian information criterion and decision theory are the most appropriate model-selection criteria because of their high accuracy and precision and that accurate model selection will serve to improve the reliability of phylogenetic inference and related analyses.
Abstract: Explicit evolutionary models are required in maximum-likelihood and Bayesian inference, the two methods that are overwhelmingly used in phylogenetic studies of DNA sequence data. Appropriate selection of nucleotide substitution models is important because the use of incorrect models can mislead phylogenetic inference. To better understand the performance of different model-selection criteria, we used 33,600 simulated data sets to analyse the accuracy, precision, dissimilarity, and biases of the hierarchical likelihood-ratio test, Akaike information criterion, Bayesian information criterion, and decision theory. We demonstrate that the Bayesian information criterion and decision theory are the most appropriate model-selection criteria because of their high accuracy and precision. Our results also indicate that in some situations different models are selected by different criteria for the same dataset. Such dissimilarity was the highest between the hierarchical likelihood-ratio test and Akaike information criterion, and lowest between the Bayesian information criterion and decision theory. The hierarchical likelihood-ratio test performed poorly when the true model included a proportion of invariable sites, while the Bayesian information criterion and decision theory generally exhibited similar performance to each other. Our results indicate that the Bayesian information criterion and decision theory should be preferred for model selection. Together with model-adequacy tests, accurate model selection will serve to improve the reliability of phylogenetic inference and related analyses.
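
For reference, the two information criteria compared here take the standard forms (these are textbook formulas rather than expressions quoted from the paper), with L̂ the maximized likelihood, k the number of free parameters, and n the number of observations (alignment sites):

```latex
\[
\mathrm{AIC} = -2 \ln \hat{L} + 2k,
\qquad
\mathrm{BIC} = -2 \ln \hat{L} + k \ln n .
\]
```

Lower values indicate a preferred substitution model; the BIC penalty grows with sample size and therefore tends to favor simpler models than AIC does.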

Journal ArticleDOI
TL;DR: This paper proposes a flexible Bayesian quantile regression model that assumes that the error distribution is an infinite mixture of Gaussian densities subject to a stochastic constraint that enables inference on the quantile of interest.
Abstract: Quantile regression has emerged as a useful supplement to ordinary mean regression. Traditional frequentist quantile regression makes very minimal assumptions on the form of the error distribution and thus is able to accommodate nonnormal errors, which are common in many applications. However, inference for these models is challenging, particularly for clustered or censored data. A Bayesian approach enables exact inference and is well suited to incorporate clustered, missing, or censored data. In this paper, we propose a flexible Bayesian quantile regression model. We assume that the error distribution is an infinite mixture of Gaussian densities subject to a stochastic constraint that enables inference on the quantile of interest. This method outperforms the traditional frequentist method under a wide array of simulated data models. We extend the proposed approach to analyze clustered data. Here, we differentiate between and develop conditional and marginal models for clustered data. We apply our methods to analyze a multipatient apnea duration data set.

Book
03 Nov 2010
TL;DR: This book covers the core theory of statistical inference, including exponential families, unbiased, equivariant, and Bayesian estimation, empirical Bayes and shrinkage estimators, estimating equations and maximum likelihood, hypothesis testing and optimal tests in higher dimensions, and large-sample theory including likelihood ratio tests.
Abstract: Probability and Measure; Exponential Families; Risk, Sufficiency, Completeness, and Ancillarity; Unbiased Estimation; Curved Exponential Families; Conditional Distributions; Bayesian Estimation; Large-Sample Theory; Estimating Equations and Maximum Likelihood; Equivariant Estimation; Empirical Bayes and Shrinkage Estimators; Hypothesis Testing; Optimal Tests in Higher Dimensions; General Linear Model; Bayesian Inference: Modeling and Computation; Asymptotic Optimality; Large-Sample Theory for Likelihood Ratio Tests; Nonparametric Regression; Bootstrap Methods; Sequential Methods.