Open Access · Journal Article

FUBAR: A Fast, Unconstrained Bayesian AppRoximation for Inferring Selection

TL;DR
This work presents an approximate hierarchical Bayesian method using a Markov chain Monte Carlo (MCMC) routine that ensures robustness against model misspecification by averaging over a large number of predefined site classes, and leaves the distribution of selection parameters essentially unconstrained.
Abstract
Model-based analyses of natural selection often categorize sites into a relatively small number of site classes. Forcing each site to belong to one of these classes places unrealistic constraints on the distribution of selection parameters, which can result in misleading inference due to model misspecification. We present an approximate hierarchical Bayesian method using a Markov chain Monte Carlo (MCMC) routine that ensures robustness against model misspecification by averaging over a large number of predefined site classes. This leaves the distribution of selection parameters essentially unconstrained, and also allows sites experiencing positive and purifying selection to be identified orders of magnitude faster than by existing methods. We demonstrate that popular random effects likelihood methods can produce misleading results when sites assigned to the same site class experience different levels of positive or purifying selection—an unavoidable scenario when using a small number of site classes. Our Fast Unconstrained Bayesian AppRoximation (FUBAR) is unaffected by this problem, while achieving higher power than existing unconstrained (fixed effects likelihood) methods. The speed advantage of FUBAR allows us to analyze larger data sets than other methods: We illustrate this on a large influenza hemagglutinin data set (3,142 sequences). FUBAR is available as a batch file within the latest HyPhy distribution (http://www.hyphy.org), as well as on the Datamonkey web server (http://www.datamonkey.org/).
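The idea of averaging over many predefined site classes can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the grid of (synonymous, nonsynonymous) rate pairs, the Dirichlet prior on class weights, the simple Gibbs sampler standing in for the paper's MCMC routine, and the hypothetical function names. The per-site conditional likelihoods are assumed to be precomputed (in a real analysis they would come from a codon model evaluated on a phylogeny, which is not shown), and toy values are used instead.

```python
import random


def make_grid(rates):
    """Predefined site classes: every (alpha, beta) pair, where alpha is the
    synonymous rate and beta the nonsynonymous rate (assumed grid layout)."""
    return [(alpha, beta) for alpha in rates for beta in rates]


def sample_dirichlet(concentrations, rng):
    """Draw one Dirichlet sample by normalizing independent gamma draws."""
    draws = [rng.gammavariate(c, 1.0) for c in concentrations]
    total = sum(draws)
    return [d / total for d in draws]


def posterior_positive(cond_lik, grid, n_iter=500, burn_in=100, prior=0.5, seed=0):
    """Estimate P(beta > alpha | data) for each site.

    cond_lik[s][k] is the likelihood of site s under grid class k
    (assumed precomputed). Class weights are sampled with a Gibbs
    sampler, an illustrative stand-in for the paper's MCMC routine.
    """
    rng = random.Random(seed)
    n_classes = len(grid)
    is_positive = [beta > alpha for alpha, beta in grid]
    weights = [1.0 / n_classes] * n_classes
    totals = [0.0] * len(cond_lik)
    for it in range(n_iter):
        counts = [0] * n_classes
        for s, lik in enumerate(cond_lik):
            # Posterior over classes for this site given the current weights.
            joint = [w * l for w, l in zip(weights, lik)]
            norm = sum(joint)
            probs = [j / norm for j in joint]
            # Sample a class assignment for the site.
            k = rng.choices(range(n_classes), weights=probs)[0]
            counts[k] += 1
            if it >= burn_in:
                totals[s] += sum(p for p, pos in zip(probs, is_positive) if pos)
        # Update the class weights given the sampled assignments.
        weights = sample_dirichlet([prior + c for c in counts], rng)
    kept = n_iter - burn_in
    return [t / kept for t in totals]


# Toy data: 5 sites favoring a positive-selection class, 15 favoring a purifying one.
grid = make_grid([0.5, 1.0, 2.0])


def toy_site(favored):
    return [1.0 if point == favored else 0.01 for point in grid]


cond_lik = [toy_site((0.5, 2.0))] * 5 + [toy_site((2.0, 0.5))] * 15
posteriors = posterior_positive(cond_lik, grid)
```

Because the grid is fixed in advance, the expensive likelihood evaluations happen once per grid point, and the sampler only has to estimate the class weights. In this sketch a site would be flagged as positively selected when its entry in `posteriors` exceeds a chosen threshold.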


Citations
Journal Article

The 2019-new coronavirus epidemic: Evidence for virus evolution.

TL;DR: The phylogenetic tree showed that 2019‐nCoV significantly clustered with bat SARS‐like coronavirus sequence isolated in 2015, whereas structural analysis revealed mutation in Spike Glycoprotein and nucleocapsid protein.
Journal Article

Datamonkey 2.0: A Modern Web Application for Characterizing Selective and Other Evolutionary Processes.

TL;DR: Describes the release of Datamonkey 2.0, a completely re-engineered version of the Datamonkey web server for analyzing evolutionary signatures in sequence data, and HyPhy Vision, an accompanying JavaScript application for visualizing analysis results.
Journal Article

Gene-Wide Identification of Episodic Selection

TL;DR: A new approach to identifying gene-wide evidence of episodic positive selection, where the non-synonymous substitution rate is transiently greater than the synonymous rate, and a computationally inexpensive evidence metric for identifying sites subject to episodic positive selection on any foreground branches.
Journal Article

Stability-mediated epistasis constrains the evolution of an influenza protein

TL;DR: This work created all intermediates along a 39-mutation evolutionary trajectory of influenza nucleoprotein, and introduced each mutation individually into the parent, painting a coherent portrait of epistasis during nucleoprotein evolution.
Journal Article

COVID-2019: The role of the nsp2 and nsp3 in its pathogenesis.

TL;DR: The Open Reading Frame 1ab of COVID‐2019 was analyzed for evidence of mutations caused by selective pressure on the virus. A stabilizing mutation in the endosome‐associated‐protein‐like domain of the nsp2 protein could account for the high contagiousness of COVID‐2019, while a destabilizing mutation in the nsp3 protein could suggest a potential mechanism differentiating COVID‐2019 from SARS.
References
Journal Article

A Coefficient of Agreement for Nominal Scales

TL;DR: This article presents a procedure for having two or more judges independently categorize a sample of units and for determining the degree, significance, and sampling stability of their agreement, i.e., the extent to which these judgments are reproducible (reliable).
Book

Bayesian Data Analysis

TL;DR: Detailed notes on Bayesian Computation Basics of Markov Chain Simulation, Regression Models, and Asymptotic Theorems are provided.
Journal Article

Evolutionary trees from DNA sequences: A maximum likelihood approach

TL;DR: A computationally feasible method for finding such maximum likelihood estimates is developed, and a computer program is available that allows the testing of hypotheses about the constancy of evolutionary rates by likelihood ratio tests.
Journal Article

FastTree 2--approximately maximum-likelihood trees for large alignments.

TL;DR: Improvements to FastTree are described that improve its accuracy without sacrificing scalability, and FastTree 2 allows the inference of maximum-likelihood phylogenies for huge alignments.

Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy

TL;DR: This work derives an equivalent form, called minimal-redundancy-maximal-relevance criterion (mRMR), for first-order incremental feature selection, and presents a two-stage feature selection algorithm by combining mRMR and other more sophisticated feature selectors (e.g., wrappers).