FUBAR: A Fast, Unconstrained Bayesian AppRoximation for inferring selection
Ben Murrell, Sasha Moola, Amandla Mabona, Thomas Weighill, Daniel J. Sheward, Sergei L. Kosakovsky Pond, Konrad Scheffler
TLDR
This work presents an approximate hierarchical Bayesian method using a Markov chain Monte Carlo (MCMC) routine that ensures robustness against model misspecification by averaging over a large number of predefined site classes, and leaves the distribution of selection parameters essentially unconstrained.
Abstract:
Model-based analyses of natural selection often categorize sites into a relatively small number of site classes. Forcing each site to belong to one of these classes places unrealistic constraints on the distribution of selection parameters, which can result in misleading inference due to model misspecification. We present an approximate hierarchical Bayesian method using a Markov chain Monte Carlo (MCMC) routine that ensures robustness against model misspecification by averaging over a large number of predefined site classes. This leaves the distribution of selection parameters essentially unconstrained, and also allows sites experiencing positive and purifying selection to be identified orders of magnitude faster than by existing methods. We demonstrate that popular random effects likelihood methods can produce misleading results when sites assigned to the same site class experience different levels of positive or purifying selection, an unavoidable scenario when using a small number of site classes. Our Fast Unconstrained Bayesian AppRoximation (FUBAR) is unaffected by this problem, while achieving higher power than existing unconstrained (fixed effects likelihood) methods. The speed advantage of FUBAR allows us to analyze larger data sets than other methods: we illustrate this on a large influenza hemagglutinin data set (3,142 sequences). FUBAR is available as a batch file within the latest HyPhy distribution (http://www.hyphy.org), as well as on the Datamonkey web server (http://www.datamonkey.org/).
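The idea of averaging over many predefined site classes can be illustrated with a small sketch. It is not the HyPhy implementation: the abstract does not specify the grid, the prior, or the sampler, so the 3×3 grid of (synonymous, non-synonymous) rate pairs, the symmetric Dirichlet prior on class weights, the Gibbs sampler, and the toy likelihood table below are all assumptions chosen for illustration. The function name `fubar_like_posteriors` is hypothetical.

```python
import random


def fubar_like_posteriors(site_lik, grid, n_iter=500, burn_in=100,
                          conc=0.5, seed=0):
    """Per-site posterior probability that beta > alpha (toy sketch).

    site_lik[s][k] is the likelihood of site s under grid class k, assumed
    to be precomputed. grid[k] is an (alpha, beta) rate pair. Class weights
    get a symmetric Dirichlet(conc) prior (an assumption of this sketch)
    and are sampled with a simple Gibbs sampler.
    """
    rng = random.Random(seed)
    n_classes = len(grid)
    positive = [beta > alpha for alpha, beta in grid]
    weights = [1.0 / n_classes] * n_classes
    post_sum = [0.0] * len(site_lik)
    kept = 0

    for it in range(n_iter):
        counts = [0] * n_classes
        keep = it >= burn_in
        for s, lik in enumerate(site_lik):
            # Unnormalised posterior over classes for this site.
            joint = [w * l for w, l in zip(weights, lik)]
            if keep:
                pos_mass = sum(j for j, p in zip(joint, positive) if p)
                post_sum[s] += pos_mass / sum(joint)
            # Sample a class allocation for the site.
            k = rng.choices(range(n_classes), weights=joint)[0]
            counts[k] += 1
        if keep:
            kept += 1
        # Sample new weights from Dirichlet(conc + counts) via gamma draws.
        draws = [rng.gammavariate(conc + c, 1.0) for c in counts]
        total = sum(draws)
        weights = [d / total for d in draws]

    return [p / kept for p in post_sum]


# Toy data (entirely synthetic): 20 sites whose likelihood peaks at a class
# with beta > alpha, and 20 sites peaking at a class with beta < alpha.
rates = [0.5, 1.0, 2.0]
grid = [(a, b) for a in rates for b in rates]


def peaked(k):
    return [1.0 if i == k else 1e-3 for i in range(len(grid))]


pos_k = grid.index((0.5, 2.0))
neg_k = grid.index((2.0, 0.5))
site_lik = [peaked(pos_k)] * 20 + [peaked(neg_k)] * 20
post = fubar_like_posteriors(site_lik, grid)
print(round(post[0], 3), round(post[-1], 3))
```

Because the per-class likelihoods are fixed before sampling starts, each iteration only reweights a precomputed table. This is one plausible reading of why a grid-based approach can be fast, though the abstract itself does not spell out the mechanism.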