Showing papers in "Molecular Biology and Evolution in 2020"

Open Access

Journal Article•DOI•

IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era.

Bui Quang Minh¹, Heiko A. Schmidt², Olga Chernomor², Dominik Schrempf², Dominik Schrempf³, Michael D. Woodhams⁴, Arndt von Haeseler², Arndt von Haeseler⁵, Robert Lanfear¹•Institutions (5)

Australian National University¹, Medical University of Vienna², Eötvös Loránd University³, University of Tasmania⁴, University of Vienna⁵

01 May 2020-Molecular Biology and Evolution

TL;DR: Some notable features of IQ-TREE version 2 are described and the key advantages over other software are highlighted.

Abstract: IQ-TREE (http://www.iqtree.org, last accessed February 6, 2020) is a user-friendly and widely used software package for phylogenetic inference using maximum likelihood. Since the release of version 1 in 2014, we have continuously expanded IQ-TREE to integrate a plethora of new models of sequence evolution and efficient computational approaches of phylogenetic inference to deal with genomic data. Here, we describe notable features of IQ-TREE version 2 and highlight the key advantages over other software.

4,337 citations

Journal Article•DOI•

Molecular Evolutionary Genetics Analysis (MEGA) for macOS.

Glen Stecher¹, Koichiro Tamura², Sudhir Kumar¹, Sudhir Kumar³•Institutions (3)

Temple University¹, Tokyo Metropolitan University², King Abdulaziz University³

01 Apr 2020-Molecular Biology and Evolution

TL;DR: The macOS version of the MEGA software, which eliminates the need for virtualization and emulation programs, has a native Cocoa graphical user interface that is programmed to provide a consistent user experience across macOS, Windows, and Linux.

Abstract: The Molecular Evolutionary Genetics Analysis (MEGA) software enables comparative analysis of molecular sequences in phylogenetics and evolutionary medicine. Here, we introduce the macOS version of the MEGA software. This new version eliminates the need for virtualization and emulation programs previously required to use MEGA on Apple computers. MEGA for macOS utilizes memory and computing resources efficiently for conducting evolutionary analyses on macOS. It has a native Cocoa graphical user interface that is programmed to provide a consistent user experience across macOS, Windows, and Linux. MEGA for macOS is available from www.megasoftware.net free of charge.

896 citations

Journal Article•DOI•

ModelTest-NG: a new and scalable tool for the selection of DNA and protein evolutionary models

Diego Darriba¹, David Posada², Alexey M. Kozlov¹, Alexandros Stamatakis¹, Alexandros Stamatakis³, Benoit Morel¹, Tomas Flouri⁴•Institutions (4)

Heidelberg Institute for Theoretical Studies¹, University of Vigo², Karlsruhe Institute of Technology³, University College London⁴

01 Jan 2020-Molecular Biology and Evolution

TL;DR: ModelTest-NG is a reimplementation from scratch of jModelTest and ProtTest, two popular tools for selecting the best-fit nucleotide and amino acid substitution models, respectively, and introduces several new features, such as ascertainment bias correction, mixture, and free-rate models, or the automatic processing of single partitions.

Abstract: ModelTest-NG is a reimplementation from scratch of jModelTest and ProtTest, two popular tools for selecting the best-fit nucleotide and amino acid substitution models, respectively. ModelTest-NG is one to two orders of magnitude faster than jModelTest and ProtTest but equally accurate and introduces several new features, such as ascertainment bias correction, mixture, and free-rate models, or the automatic processing of single partitions. ModelTest-NG is available under a GNU GPL3 license at https://github.com/ddarriba/modeltest, last accessed September 2, 2019.

783 citations

Journal Article•DOI•

Treeio: An R Package for Phylogenetic Tree Input and Output with Richly Annotated and Associated Data.

Li-Gen Wang¹, Tommy Tsan-Yuk Lam², Shuangbin Xu¹, Zehan Dai¹, Lang Zhou¹, Tingze Feng¹, Pingfan Guo¹, Casey W. Dunn³, Bradley R Jones⁴, Tyler Bradley⁵, Hongbo Zhu², Yi Guan², Yong Jiang¹, Guangchuang Yu¹•Institutions (5)

Southern Medical University¹, University of Hong Kong², Yale University³, University of British Columbia⁴, Drexel University⁵

01 Feb 2020-Molecular Biology and Evolution

TL;DR: The treeio package is designed to connect phylogenetic tree input and output, and can link external data to phylogenies and merge tree data obtained from different sources, enabling analyses of phylogeny-associated data from different disciplines in an evolutionary context.

Abstract: Phylogenetic trees and data are often stored in incompatible and inconsistent formats. The outputs of software tools that contain trees with analysis findings are often not compatible with each other, making it hard to integrate the results of different analyses in a comparative study. The treeio package is designed to connect phylogenetic tree input and output. It supports extracting phylogenetic trees as well as the outputs of commonly used analytical software. It can link external data to phylogenies and merge tree data obtained from different sources, enabling analyses of phylogeny-associated data from different disciplines in an evolutionary context. Treeio also supports export of a phylogenetic tree with heterogeneous-associated data to a single tree file, including BEAST compatible NEXUS and jtree formats; these facilitate data sharing as well as file format conversion for downstream analysis. The treeio package is designed to work with the tidytree and ggtree packages. Tree data can be processed using the tidy interface with tidytree and visualized by ggtree. The treeio package is released within the Bioconductor and rOpenSci projects. It is available at https://www.bioconductor.org/packages/treeio/.

274 citations

Journal Article•DOI•

New Methods to Calculate Concordance Factors for Phylogenomic Datasets.

Bui Quang Minh¹, Matthew W. Hahn², Robert Lanfear¹•Institutions (2)

Australian National University¹, Indiana University²

01 Sep 2020-Molecular Biology and Evolution

TL;DR: gCF and sCF complement classical measures of branch support in phylogenetics by providing a full description of underlying disagreement among loci and sites, and are implemented in the IQ-TREE software package.

Abstract: We implement two measures for quantifying genealogical concordance in phylogenomic data sets: the gene concordance factor (gCF) and the novel site concordance factor (sCF). For every branch of a reference tree, gCF is defined as the percentage of "decisive" gene trees containing that branch. This measure is already in wide usage, but here we introduce a package that calculates it while accounting for variable taxon coverage among gene trees. sCF is a new measure defined as the percentage of decisive sites supporting a branch in the reference tree. gCF and sCF complement classical measures of branch support in phylogenetics by providing a full description of underlying disagreement among loci and sites. An easy to use implementation and tutorial is freely available in the IQ-TREE software package (http://www.iqtree.org/doc/Concordance-Factor, last accessed May 13, 2020).
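
To make the gCF definition above concrete, here is a minimal Python sketch. It is not IQ-TREE's implementation. It assumes each gene tree is supplied as a taxon set plus the set of its internal-branch splits (one side of each branch), and it assumes a gene tree counts as "decisive" for a reference branch when it contains at least one taxon from each of the four clades surrounding that branch. The abstract does not spell out that criterion, so treat it as an assumption. The names `GeneTree` and `gene_concordance_factor` are hypothetical.

```python
from typing import FrozenSet, Iterable, NamedTuple, Optional, Sequence, Set


class GeneTree(NamedTuple):
    """Hypothetical container: the taxa in a gene tree and its splits."""
    taxa: FrozenSet[str]
    splits: Set[FrozenSet[str]]  # one side of each internal branch


def gene_concordance_factor(
    clades: Sequence[Iterable[str]],
    gene_trees: Iterable[GeneTree],
) -> Optional[float]:
    """Percentage of decisive gene trees that contain the branch AB|CD.

    `clades` holds the four taxon groups (A, B, C, D) around the reference
    branch. Gene-tree taxa are assumed to be a subset of the reference taxa.
    """
    a, b, c, d = (frozenset(group) for group in clades)
    decisive = 0
    concordant = 0
    for tree in gene_trees:
        # Restrict each clade to the taxa actually sampled in this gene tree.
        parts = [group & tree.taxa for group in (a, b, c, d)]
        if not all(parts):
            continue  # assumed rule: not decisive if any clade is unsampled
        decisive += 1
        left = parts[0] | parts[1]
        right = tree.taxa - left
        # The branch is present if either side of the split is recorded.
        if left in tree.splits or right in tree.splits:
            concordant += 1
    if decisive == 0:
        return None  # undefined when no gene tree is decisive
    return 100.0 * concordant / decisive


if __name__ == "__main__":
    branch = [{"a"}, {"b"}, {"c"}, {"d", "e"}]
    trees = [
        GeneTree(frozenset("abcde"), {frozenset("ab"), frozenset("de")}),
        GeneTree(frozenset("abcd"), {frozenset("ac")}),
        GeneTree(frozenset("abc"), set()),  # lacks clade D: not decisive
    ]
    print(gene_concordance_factor(branch, trees))
```

Restricting the clades to each gene tree's own taxa is what lets the measure tolerate variable taxon coverage: a gene tree with missing taxa is still scored as long as all four clades are represented.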

267 citations

Journal Article•DOI•

RASP 4: ancestral state reconstruction tool for multiple genes and characters

Yan Yu¹, Christopher Blair², Christopher Blair³, Xing-Jin He¹•Institutions (3)

Sichuan University¹, New York City College of Technology², The Graduate Center, CUNY³

01 Feb 2020-Molecular Biology and Evolution

TL;DR: RASP is software for reconstructing ancestral states through phylogenetic trees; it can apply generalized statistical ancestral reconstruction methods to phylogenies, explore the phylogenetic signal of characters to particular trees, calculate distances between trees, and cluster trees into groups.

Abstract: With the continual progress of sequencing techniques, genome-scale data are increasingly used in phylogenetic studies. With more data from throughout the genome, the relationship between genes and different kinds of characters is receiving more attention. Here, we present version 4 of RASP, a software to reconstruct ancestral states through phylogenetic trees. RASP can apply generalized statistical ancestral reconstruction methods to phylogenies, explore the phylogenetic signal of characters to particular trees, calculate distances between trees, and cluster trees into groups. RASP 4 has an improved graphic user interface and is freely available from http://mnh.scu.edu.cn/soft/blog/RASP (program) and https://github.com/sculab/RASP (source code).

259 citations

Journal Article•DOI•

HyPhy 2.5-A Customizable Platform for Evolutionary Hypothesis Testing Using Phylogenies.

Sergei L Kosakovsky Pond¹, Art F. Y. Poon², Ryan Velazquez¹, Steven Weaver¹, N. Lance Hepler, Ben Murrell³, Stephen D. Shank¹, Brittany Rife Magalis¹, Dave Bouvier⁴, Anton Nekrutenko⁴, Sadie R Wisotsky⁵, Sadie R Wisotsky¹, Stephanie J. Spielman¹, Stephanie J. Spielman⁶, Simon D. W. Frost⁷, Simon D. W. Frost⁸, Spencer V. Muse⁵•Institutions (8)

Temple University¹, University of Western Ontario², Karolinska Institutet³, Pennsylvania State University⁴, North Carolina State University⁵, Rowan University⁶, University of Cambridge⁷, The Turing Institute⁸

01 Jan 2020-Molecular Biology and Evolution

TL;DR: The 2.5 release of HyPhy includes a completely re-engineered computational core and analysis library that introduces new classes of evolutionary models and statistical tests, delivers substantial performance and stability enhancements, improves usability, streamlines end-to-end analysis workflows, makes it easier to develop custom analyses, and is mostly backward compatible with previous HyPhy releases.

Abstract: HYpothesis testing using PHYlogenies (HyPhy) is a scriptable, open-source package for fitting a broad range of evolutionary models to multiple sequence alignments, and for conducting subsequent parameter estimation and hypothesis testing, primarily in the maximum likelihood statistical framework. It has become a popular choice for characterizing various aspects of the evolutionary process: natural selection, evolutionary rates, recombination, and coevolution. The 2.5 release (available from www.hyphy.org) includes a completely re-engineered computational core and analysis library that introduces new classes of evolutionary models and statistical tests, delivers substantial performance and stability enhancements, improves usability, streamlines end-to-end analysis workflows, makes it easier to develop custom analyses, and is mostly backward compatible with previous HyPhy releases.

252 citations

Journal Article•DOI•

Corrigendum to: IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era.

Bui Quang Minh, Heiko A. Schmidt, Olga Chernomor, Dominik Schrempf, Michael D. Woodhams, Arndt von Haeseler, Robert Lanfear

01 Aug 2020-Molecular Biology and Evolution

197 citations

Journal Article•DOI•

Extreme Genomic CpG Deficiency in SARS-CoV-2 and Evasion of Host Antiviral Defense.

Xuhua Xia¹•Institutions (1)

University of Ottawa¹

01 Sep 2020-Molecular Biology and Evolution

TL;DR: It is shown that SARS-CoV-2 has the most extreme CpG deficiency among all known betacoronavirus genomes, and viral surveys focused on decreasing CpG in viral RNA genomes may provide important clues about the selective environments and viral defenses in the original hosts.

Abstract: Wild mammalian species, including bats, constitute the natural reservoir of betacoronavirus (including SARS, MERS, and the deadly SARS-CoV-2). Different hosts or host tissues provide different cellular environments, especially different antiviral and RNA modification activities that can alter RNA modification signatures observed in the viral RNA genome. The zinc finger antiviral protein (ZAP) binds specifically to CpG dinucleotides and recruits other proteins to degrade a variety of viral RNA genomes. Many mammalian RNA viruses have evolved CpG deficiency. Increasing CpG dinucleotides in these low-CpG viral genomes in the presence of ZAP consistently leads to decreased viral replication and virulence. Because ZAP exhibits tissue-specific expression, viruses infecting different tissues are expected to have different CpG signatures, suggesting a means to identify viral tissue-switching events. The author shows that SARS-CoV-2 has the most extreme CpG deficiency in all known betacoronavirus genomes. This suggests that SARS-CoV-2 may have evolved in a new host (or new host tissue) with high ZAP expression. A survey of CpG deficiency in viral genomes identified a virulent canine coronavirus (alphacoronavirus) as possessing the most extreme CpG deficiency, comparable with that observed in SARS-CoV-2. This suggests that the canine tissue infected by the canine coronavirus may provide a cellular environment strongly selecting against CpG. Thus, viral surveys focused on decreasing CpG in viral RNA genomes may provide important clues about the selective environments and viral defenses in the original hosts.
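
As an illustration of how CpG deficiency can be quantified, the Python sketch below computes a standard observed/expected CpG ratio, the CG dinucleotide frequency divided by the product of the C and G nucleotide frequencies. The abstract does not state which index the paper uses, so this choice is an assumption, and the function name `cpg_index` is hypothetical. Values well below 1 indicate CpG deficiency.

```python
def cpg_index(sequence: str) -> float:
    """Observed/expected CpG ratio: P(CG) / (P(C) * P(G)).

    RNA input is accepted: U is mapped to T before counting.
    """
    seq = sequence.upper().replace("U", "T")
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    p_c = seq.count("C") / n
    p_g = seq.count("G") / n
    if p_c == 0 or p_g == 0:
        raise ValueError("index is undefined when C or G is absent")
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    p_cg = seq.count("CG") / (n - 1)  # n - 1 dinucleotide positions
    return p_cg / (p_c * p_g)


if __name__ == "__main__":
    # Toy sequences, not real viral genomes.
    print(cpg_index("GGCC"))  # no CG dinucleotide at all
    print(cpg_index("ACGT"))
```

Comparing this ratio across genomes, rather than raw CG counts, controls for differences in base composition between viruses.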

145 citations

Journal Article•DOI•

Genomic Evidence for Complex Domestication History of the Cultivated Tomato in Latin America.

Hamid Razifard¹, Alexis Ramos², Audrey L Della Valle¹, Cooper Bodary³, Erika Goetz³, Elizabeth J Manser¹, Xiang Li⁴, Lei Zhang², Sofia Visa³, Denise M. Tieman⁴, Esther van der Knaap², Ana L. Caicedo¹•Institutions (4)

University of Massachusetts Amherst¹, University of Georgia², College of Wooster³, University of Florida⁴

01 Apr 2020-Molecular Biology and Evolution

TL;DR: The results suggest that the origin of SLC may predate domestication, and that many traits considered typical of cultivated tomatoes arose in South American SLC, but were lost or diminished once these partially domesticated forms spread northward.

Abstract: The process of plant domestication is often protracted, involving underexplored intermediate stages with important implications for the evolutionary trajectories of domestication traits. Previously, tomato domestication history has been thought to involve two major transitions: one from wild Solanum pimpinellifolium L. to a semidomesticated intermediate, S. lycopersicum L. var. cerasiforme (SLC) in South America, and a second transition from SLC to fully domesticated S. lycopersicum L. var. lycopersicum in Mesoamerica. In this study, we employ population genomic methods to reconstruct tomato domestication history, focusing on the evolutionary changes occurring in the intermediate stages. Our results suggest that the origin of SLC may predate domestication, and that many traits considered typical of cultivated tomatoes arose in South American SLC, but were lost or diminished once these partially domesticated forms spread northward. These traits were then likely reselected in a convergent fashion in the common cultivated tomato, prior to its expansion around the world. Based on these findings, we reveal complexities in the intermediate stage of tomato domestication and provide insight on trajectories of genes and phenotypes involved in tomato domestication syndrome. Our results also allow us to identify underexplored germplasm that harbors useful alleles for crop improvement.

106 citations

Journal Article•DOI•

ASTRAL-Pro: Quartet-Based Species-Tree Inference despite Paralogy.

Chao Zhang¹, Celine Scornavacca², Erin K. Molloy³, Siavash Mirarab¹•Institutions (3)

University of California, San Diego¹, University of Montpellier², University of Illinois at Urbana–Champaign³

01 Nov 2020-Molecular Biology and Evolution

TL;DR: This work proposes a measure of quartet similarity between single-copy and multicopy trees that accounts for orthology and paralogy and introduces a method called ASTRAL-Pro (ASTRAL for PaRalogs and Orthologs) to find the species tree that optimizes the measure using dynamic programing.

Abstract: Phylogenetic inference from genome-wide data (phylogenomics) has revolutionized the study of evolution because it enables accounting for discordance among evolutionary histories across the genome. To this end, summary methods have been developed to allow accurate and scalable inference of species trees from gene trees. However, most of these methods, including the widely used ASTRAL, can only handle single-copy gene trees and do not attempt to model gene duplication and gene loss. As a result, most phylogenomic studies have focused on single-copy genes and have discarded large parts of the data. Here, we first propose a measure of quartet similarity between single-copy and multicopy trees that accounts for orthology and paralogy. We then introduce a method called ASTRAL-Pro (ASTRAL for PaRalogs and Orthologs) to find the species tree that optimizes our quartet similarity measure using dynamic programing. By studying its performance on an extensive collection of simulated data sets and on real data sets, we show that ASTRAL-Pro is more accurate than alternative methods.

Journal Article•DOI•

A Bayesian Implementation of the Multispecies Coalescent Model with Introgression for Phylogenomic Analysis.

Tomas Flouri¹, Xiyun Jiao¹, Bruce Rannala², Ziheng Yang¹•Institutions (2)

University College London¹, University of California, Davis²

01 Apr 2020-Molecular Biology and Evolution

TL;DR: The multispecies-coalescent-with-introgression model accommodates deep coalescence and introgression and provides a natural framework for inference using genomic sequence data, and computer simulation confirms the good statistical properties of the method.

Abstract: Recent analyses suggest that cross-species gene flow or introgression is common in nature, especially during species divergences. Genomic sequence data can be used to infer introgression events and to estimate the timing and intensity of introgression, providing an important means to advance our understanding of the role of gene flow in speciation. Here, we implement the multispecies-coalescent-with-introgression model, an extension of the multispecies-coalescent model to incorporate introgression, in our Bayesian Markov chain Monte Carlo program Bpp. The multispecies-coalescent-with-introgression model accommodates deep coalescence (or incomplete lineage sorting) and introgression and provides a natural framework for inference using genomic sequence data. Computer simulation confirms the good statistical properties of the method, although hundreds or thousands of loci are typically needed to estimate introgression probabilities reliably. Reanalysis of data sets from the purple cone spruce confirms the hypothesis of homoploid hybrid speciation. We estimated the introgression probability using the genomic sequence data from six mosquito species in the Anopheles gambiae species complex, which varies considerably across the genome, likely driven by differential selection against introgressed alleles.

Journal Article•DOI•

Recent Demographic History Inferred by High-Resolution Analysis of Linkage Disequilibrium.

Enrique Santiago¹, Irene Novo², Antonio F. Pardiñas³, María Saura, J Wang⁴, Armando Caballero²•Institutions (4)

University of Oviedo¹, University of Vigo², Cardiff University³, Zoological Society of London⁴

16 Dec 2020-Molecular Biology and Evolution

TL;DR: A theoretical and computational framework to infer the demographic history of a population within the past 100 generations from the observed spectrum of linkage disequilibrium (LD) of pairs of loci over a wide range of recombination rates in a sample of contemporary individuals is developed.

Abstract: Inferring changes in effective population size (Ne) in the recent past is of special interest for conservation of endangered species and for human history research. Current methods for estimating the very recent historical Ne are unable to detect complex demographic trajectories involving multiple episodes of bottlenecks, drops, and expansions. We develop a theoretical and computational framework to infer the demographic history of a population within the past 100 generations from the observed spectrum of linkage disequilibrium (LD) of pairs of loci over a wide range of recombination rates in a sample of contemporary individuals. The cumulative contributions of all of the previous generations to the observed LD are included in our model, and a genetic algorithm is used to search for the sequence of historical Ne values that best explains the observed LD spectrum. The method can be applied from large samples to samples of fewer than ten individuals using a variety of genotyping and DNA sequencing data: haploid, diploid with phased or unphased genotypes and pseudohaploid data from low-coverage sequencing. The method was tested by computer simulation for sensitivity to genotyping errors, temporal heterogeneity of samples, population admixture, and structural division into subpopulations, showing high tolerance to deviations from the assumptions of the model. Computer simulations also show that the proposed method outperforms other leading approaches when the inference concerns recent timeframes. Analysis of data from a variety of human and animal populations gave results in agreement with previous estimations by other methods or with records of historical events.
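
The method above takes as input the linkage disequilibrium observed between pairs of loci. As a small illustration of that raw ingredient only, the Python sketch below computes the classical r² statistic for one pair of biallelic loci from phased haploid data. The abstract does not say which LD statistic the authors use, and nothing here reproduces their model or genetic algorithm; the function name `ld_r_squared` and the 0/1 allele coding are assumptions made for the example.

```python
from typing import Sequence, Tuple


def ld_r_squared(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """r^2 between two biallelic loci from phased haplotypes.

    Each haplotype is a pair (allele at locus A, allele at locus B),
    with alleles coded 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n  # frequency of allele 1 at A
    p_b = sum(b for _, b in haplotypes) / n  # frequency of allele 1 at B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom


if __name__ == "__main__":
    linked = [(0, 0), (0, 0), (1, 1), (1, 1)]
    independent = [(0, 0), (0, 1), (1, 0), (1, 1)]
    print(ld_r_squared(linked), ld_r_squared(independent))
```

Binning such pairwise values by the recombination rate between the two loci gives the kind of LD spectrum the abstract refers to; how the paper handles unphased or pseudohaploid data is not covered by this sketch.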

Journal Article•DOI•

Ancestral Hybridization Facilitated Species Diversification in the Lake Malawi Cichlid Fish Adaptive Radiation.

Hannes Svardal, Fu Xiang Quah¹, Milan Malinsky², Benjamin P. Ngatunga, Eric A. Miska³, Eric A. Miska¹, Walter Salzburger², Martin J. Genner⁴, George F. Turner⁵, Richard Durbin³, Richard Durbin¹•Institutions (5)

Wellcome Trust Sanger Institute¹, University of Basel², University of Cambridge³, University of Bristol⁴, Bangor University⁵

01 Apr 2020-Molecular Biology and Evolution

TL;DR: The results reinforce the role of ancestral hybridization in explosive diversification by demonstrating its significance in one of the largest recent vertebrate adaptive radiations.

Abstract: The adaptive radiation of cichlid fishes in East African Lake Malawi encompasses over 500 species that are believed to have evolved within the last 800,000 years from a common founder population. It has been proposed that hybridization between ancestral lineages can provide the genetic raw material to fuel such exceptionally high diversification rates, and evidence for this has recently been presented for the Lake Victoria region cichlid superflock. Here, we report that Lake Malawi cichlid genomes also show evidence of hybridization between two lineages that split 3-4 Ma, today represented by Lake Victoria cichlids and the riverine Astatotilapia sp. "ruaha blue." The two ancestries in Malawi cichlid genomes are present in large blocks of several kilobases, but there is little variation in this pattern between Malawi cichlid species, suggesting that the large-scale mosaic structure of the genomes was largely established prior to the radiation. Nevertheless, tens of thousands of polymorphic variants apparently derived from the hybridization are interspersed in the genomes. These loci show a striking excess of differentiation across ecological subgroups in the Lake Malawi cichlid assemblage, and parental alleles sort differentially into benthic and pelagic Malawi cichlid lineages, consistent with strong differential selection on these loci during species divergence. Furthermore, these loci are enriched for genes involved in immune response and vision, including opsin genes previously identified as important for speciation. Our results reinforce the role of ancestral hybridization in explosive diversification by demonstrating its significance in one of the largest recent vertebrate adaptive radiations.

Journal Article•DOI•

Performing Highly Efficient Genome Scans for Local Adaptation with R Package pcadapt Version 4.

Florian Privé¹, Florian Privé², Keurcien Luu¹, Bjarni J. Vilhjálmsson², Michael G. B. Blum¹•Institutions (2)

University of Grenoble¹, Aarhus University²

01 Jul 2020-Molecular Biology and Evolution

TL;DR: Version 4 of pcadapt, an R package for performing genome scans for local adaptation, substantially improves computational efficiency by using a different format for storing genotypes and a different algorithm for computing principal components of the genotype matrix.

Abstract: R package pcadapt is a user-friendly R package for performing genome scans for local adaptation. Here, we present version 4 of pcadapt which substantially improves computational efficiency while providing similar results. This improvement is made possible by using a different format for storing genotypes and a different algorithm for computing principal components of the genotype matrix, which is the most computationally demanding step in method pcadapt. These changes are seamlessly integrated into the existing pcadapt package, and users will experience a large reduction in computation time (by a factor of 20-60 in our analyses) as compared with previous versions.

Journal Article•DOI•

Genomic Analysis of European Drosophila melanogaster Populations Reveals Longitudinal Structure, Continent-Wide Selection, and Previously Unknown DNA Viruses

Martin Kapun, Maite G. Barrón¹, Fabian Staubach², Darren J. Obbard³, R. Axel W. Wiberg⁴, R. Axel W. Wiberg⁵, Jorge Vieira⁶, Jorge Vieira⁷, Clément Goubert⁸, Clément Goubert⁹, Omar Rota-Stabelli, Maaria Kankare¹⁰, María Bogaerts-Márquez¹, Annabelle Haudry⁹, Lena Waidele², Iryna Kozeretska¹¹, Iryna Kozeretska¹², Elena G. Pasyukova, Volker Loeschcke¹³, Marta Pascual¹⁴, Cristina P. Vieira⁶, Cristina P. Vieira⁷, Svitlana Serga¹², Catherine Montchamp-Moreau¹⁵, Jessica K. Abbott¹⁶, Patricia Gibert⁹, Damiano Porcelli, Nico Posnien¹⁷, Alejandro Sánchez-Gracia¹⁴, Sonja Grath¹⁸, Élio Sucena¹⁹, Élio Sucena²⁰, Alan O. Bergland²¹, Maria Pilar Garcia Guerreiro²², Banu Sebnem Onder²³, Eliza Argyridou¹⁸, Lain Guio¹, Mads Fristrup Schou¹⁶, Mads Fristrup Schou¹³, Bart Deplancke²⁴, Cristina Vieira⁹, Michael G. Ritchie⁵, Bas J. Zwaan²⁵, Eran Tauber²⁶, Dorcas J. Orengo¹⁴, Eva Puerma¹⁴, Montserrat Aguadé¹⁴, Paul Schmidt²⁷, John Parsch¹⁸, Andrea J. Betancourt²⁸, Thomas Flatt²⁹, Thomas Flatt³⁰, Josefa González¹•Institutions (30)

Pompeu Fabra University¹, University of Freiburg², University of Edinburgh³, University of Basel⁴, University of St Andrews⁵, Instituto de Biologia Molecular e Celular⁶, University of Porto⁷, Cornell University⁸, University of Lyon⁹, University of Jyväskylä¹⁰, Ministry of Education and Science of Ukraine¹¹, Taras Shevchenko National University of Kyiv¹², Aarhus University¹³, University of Barcelona¹⁴, Université Paris-Saclay¹⁵, Lund University¹⁶, University of Göttingen¹⁷, Ludwig Maximilian University of Munich¹⁸, University of Lisbon¹⁹, Instituto Gulbenkian de Ciência²⁰, University of Virginia²¹, Autonomous University of Barcelona²², Hacettepe University²³, École Polytechnique Fédérale de Lausanne²⁴, Wageningen University and Research Centre²⁵, University of Haifa²⁶, University of Pennsylvania²⁷, University of Liverpool²⁸, University of Lausanne²⁹, University of Fribourg³⁰

01 Sep 2020-Molecular Biology and Evolution

TL;DR: These analyses uncover longitudinal population structure, provide evidence for continent-wide selective sweeps, identify candidate genes for local climate adaptation, and document clines in chromosomal inversion and transposable element frequencies in European Drosophila melanogaster.

Abstract: Genetic variation is the fuel of evolution, with standing genetic variation especially important for short-term evolution and local adaptation. To date, studies of spatiotemporal patterns of genetic variation in natural populations have been challenging, as comprehensive sampling is logistically difficult, and sequencing of entire populations costly. Here, we address these issues using a collaborative approach, sequencing 48 pooled population samples from 32 locations, and perform the first continent-wide genomic analysis of genetic variation in European Drosophila melanogaster. Our analyses uncover longitudinal population structure, provide evidence for continent-wide selective sweeps, identify candidate genes for local climate adaptation, and document clines in chromosomal inversion and transposable element frequencies. We also characterize variation among populations in the composition of the fly microbiome, and identify five new DNA viruses in our samples.

Journal Article•DOI•

Predicting the Landscape of Recombination Using Deep Learning

Jeffrey R. Adrion¹, Jared Galloway¹, Andrew D. Kern¹•Institutions (1)

University of Oregon¹

01 Jun 2020-Molecular Biology and Evolution

TL;DR: This work describes ReLERNN, a deep learning method for estimating a genome-wide recombination map that is accurate even with small numbers of pooled or individually sequenced genomes and that maintains high accuracy in the face of demographic model misspecification, missing genotype calls, and genome inaccessibility.

Abstract: Accurately inferring the genome-wide landscape of recombination rates in natural populations is a central aim in genomics, as patterns of linkage influence everything from genetic mapping to understanding evolutionary history. Here, we describe recombination landscape estimation using recurrent neural networks (ReLERNN), a deep learning method for estimating a genome-wide recombination map that is accurate even with small numbers of pooled or individually sequenced genomes. Rather than use summaries of linkage disequilibrium as its input, ReLERNN takes columns from a genotype alignment, which are then modeled as a sequence across the genome using a recurrent neural network. We demonstrate that ReLERNN improves accuracy and reduces bias relative to existing methods and maintains high accuracy in the face of demographic model misspecification, missing genotype calls, and genome inaccessibility. We apply ReLERNN to natural populations of African Drosophila melanogaster and show that genome-wide recombination landscapes, although largely correlated among populations, exhibit important population-specific differences. Lastly, we connect the inferred patterns of recombination with the frequencies of major inversions segregating in natural Drosophila populations.

Journal Article•DOI•

GeneRax: A Tool for Species-Tree-Aware Maximum Likelihood-Based Gene Family Tree Inference under Gene Duplication, Transfer, and Loss.

Benoit Morel¹, Alexey M. Kozlov¹, Alexandros Stamatakis¹, Alexandros Stamatakis², Gergely J. Szöllősi³, Gergely J. Szöllősi⁴•Institutions (4)

Heidelberg Institute for Theoretical Studies¹, Karlsruhe Institute of Technology², Eötvös Loránd University³, Hungarian Academy of Sciences⁴

01 Sep 2020-Molecular Biology and Evolution

TL;DR: GeneRax is the first maximum likelihood species-tree-aware phylogenetic inference software that simultaneously accounts for substitutions at the sequence level as well as gene-level events, such as duplication, transfer, and loss, relying on established maximum likelihood optimization algorithms.

Abstract: Inferring phylogenetic trees for individual homologous gene families is difficult because alignments are often too short, and thus contain insufficient signal, while substitution models inevitably fail to capture the complexity of the evolutionary processes. To overcome these challenges, species-tree-aware methods also leverage information from a putative species tree. However, only few methods are available that implement a full likelihood framework or account for horizontal gene transfers. Furthermore, these methods often require expensive data preprocessing (e.g., computing bootstrap trees) and rely on approximations and heuristics that limit the degree of tree space exploration. Here, we present GeneRax, the first maximum likelihood species-tree-aware phylogenetic inference software. It simultaneously accounts for substitutions at the sequence level as well as gene level events, such as duplication, transfer, and loss relying on established maximum likelihood optimization algorithms. GeneRax can infer rooted phylogenetic trees for multiple gene families, directly from the per-gene sequence alignments and a rooted, yet undated, species tree. We show that compared with competing tools, on simulated data GeneRax infers trees that are the closest to the true tree in 90% of the simulations in terms of relative Robinson-Foulds distance. On empirical data sets, GeneRax is the fastest among all tested methods when starting from aligned sequences, and it infers trees with the highest likelihood score, based on our model. GeneRax completed tree inferences and reconciliations for 1,099 Cyanobacteria families in 8 min on 512 CPU cores. Thus, its parallelization scheme enables large-scale analyses. GeneRax is available under GNU GPL at https://github.com/BenoitMorel/GeneRax (last accessed June 17, 2020).
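
The accuracy measure mentioned in this abstract, relative Robinson-Foulds distance, can be illustrated with a minimal sketch. It assumes rooted trees written as nested tuples of leaf names and uses a clade-based (rooted) variant of the distance, normalised by the total number of non-trivial clades in both trees. The function names and the example trees are hypothetical and are not taken from GeneRax.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of leaf names."""
    found = set()

    def leaves(node):
        # A leaf is a plain string; an internal node is a tuple of children.
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree on these leaves
    return found


def relative_rf(tree_a, tree_b):
    """Clade-based Robinson-Foulds distance scaled to the range [0, 1]."""
    clades_a, clades_b = clades(tree_a), clades(tree_b)
    total = len(clades_a) + len(clades_b)
    return len(clades_a ^ clades_b) / total if total else 0.0


# Hypothetical example: the two trees disagree on one clade each.
true_tree = ((("A", "B"), "C"), ("D", "E"))
inferred_tree = ((("A", "C"), "B"), ("D", "E"))
print(relative_rf(true_tree, inferred_tree))
```

A value of 0 means the two trees contain exactly the same clades, and 1 means they share none.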

Journal Article•DOI•

Stress-Driven Transposable Element De-repression Dynamics and Virulence Evolution in a Fungal Pathogen.

Simone Fouché¹, Thomas Badet, Ursula Oggenfuss, Clémence Plissonneau¹, Carolina Sardinha Francisco¹, Daniel Croll•Institutions (1)

ETH Zurich¹

01 Jan 2020-Molecular Biology and Evolution

TL;DR: The complexity of TE responsiveness to stress across genetic backgrounds and genomic locations demonstrates substantial intra-specific genetic variation to control TEs with consequences for virulence.

Abstract: Transposable elements (TEs) are drivers of genome evolution and affect the expression landscape of the host genome. Stress is a major factor inducing TE activity; however, the regulatory mechanisms underlying de-repression are poorly understood. Plant pathogens are excellent models to dissect the impact of stress on TEs. The process of plant infection induces stress for the pathogen, and virulence factors (i.e., effectors) located in TE-rich regions become expressed. To dissect TE de-repression dynamics and contributions to virulence, we analyzed the TE expression landscape of four strains of the major wheat pathogen Zymoseptoria tritici. We experimentally exposed strains to nutrient starvation and host infection stress. Contrary to expectations, we show that the two distinct conditions induce the expression of different sets of TEs. In particular, the most highly expressed TEs, including miniature inverted-repeat transposable element and long terminal repeat-Gypsy element, show highly distinct de-repression across stress conditions. Both the genomic context of TEs and the genetic background (i.e., different strains harboring the same TEs) were major predictors of de-repression under stress. Gene expression profiles under stress varied significantly depending on the proximity to the closest TEs, and genomic defenses against TEs were largely ineffective to prevent de-repression. Next, we analyzed the locus encoding the Avr3D1 effector. We show that the insertion and subsequent silencing of TEs in close proximity likely contributed to reduced expression and virulence on a specific wheat cultivar. The complexity of TE responsiveness to stress across genetic backgrounds and genomic locations demonstrates substantial intraspecific genetic variation to control TEs with consequences for virulence.

Journal Article•DOI•

Asterid Phylogenomics/Phylotranscriptomics Uncover Morphological Evolutionary Histories and Support Phylogenetic Placement for Numerous Whole-Genome Duplications

Caifei Zhang¹, Taikui Zhang¹, Federico Luebert²,³, Yezi Xiang¹, Chien Hsun Huang¹, Yi Hu⁴, Mathew Rees⁵, Michael W. Frohlich⁵, Ji Qi¹, Maximilian Weigend³, Hong Ma⁴•Institutions (5)

Fudan University¹, University of Chile², University of Bonn³, Pennsylvania State University⁴, Royal Botanic Gardens⁵

01 Nov 2020-Molecular Biology and Evolution

TL;DR: An Aptian (Early Cretaceous) origin of asterids and the origin of all orders before the K-Pg boundary are supported, and ancestral state reconstruction at the family level suggests that the asterid ancestor was a woody terrestrial plant with simple leaves and bisexual, actinomorphic flowers with free petals and free anthers.

Abstract: Asterids are one of the most successful angiosperm lineages, exhibiting extensive morphological diversity and including a number of important crops. Despite their biological prominence and value to humans, the deep asterid phylogeny has not been fully resolved, and the evolutionary landscape underlying their radiation remains unknown. To resolve the asterid phylogeny, we sequenced 213 transcriptomes/genomes and combined them with other data sets, representing all accepted orders and nearly all families of asterids. We show fully supported monophyly of asterids, Berberidopsidales as sister to asterids, monophyly of all orders except Icacinales, Aquifoliales, and Bruniales, and monophyly of all families except Icacinaceae and Ehretiaceae. Novel taxon placements benefited from the expanded sampling with living collections from botanical gardens, resolving hitherto uncertain relationships. The remaining ambiguous placements here are likely due to limited sampling and could be addressed in the future with relevant additional taxa. Using our well-resolved phylogeny as reference, divergence time estimates support an Aptian (Early Cretaceous) origin of asterids and the origin of all orders before the Cretaceous-Paleogene boundary. Ancestral state reconstruction at the family level suggests that the asterid ancestor was a woody terrestrial plant with simple leaves, bisexual, and actinomorphic flowers with free petals and free anthers, a superior ovary with a style, and drupaceous fruits. Whole-genome duplication (WGD) analyses provide strong evidence for 33 WGDs in asterids and one in Berberidopsidales, including four suprafamilial and seven familial/subfamilial WGDs. Our results advance the understanding of asterid phylogeny and provide numerous novel evolutionary insights into their diversification and morphological evolution.

Journal Article•DOI•

Genomic Epidemiology, Evolution, and Transmission Dynamics of Porcine Deltacoronavirus.

Wan Ting He¹, Xiang Ji²,³, Wei He¹, Simon Dellicour⁴,⁵, Shilei Wang¹, Gairu Li¹, Letian Zhang¹, Marius Gilbert⁴, Henan Zhu², Gang Xing⁶, Michael Veit⁷, Zhen Huang, Guan-Zhu Han⁸, Yao-Wei Huang⁶, Marc A. Suchard², Guy Baele⁵, Philippe Lemey⁵, Shuo Su¹•Institutions (8)

Nanjing Agricultural University¹, University of California, Los Angeles², Tulane University³, Université libre de Bruxelles⁴, Katholieke Universiteit Leuven⁵, Zhejiang University⁶, Free University of Berlin⁷, Nanjing Normal University⁸

01 Sep 2020-Molecular Biology and Evolution

TL;DR: An integrated analysis of full genome sequence data from 21 newly sequenced viruses, along with comprehensive epidemiological surveillance data collected globally over the last 15 years found four distinct phylogenetic lineages of PDCoV, which differ in their geographic circulation patterns.

Abstract: The emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has shown once again that coronaviruses (CoVs) in animals are potential sources for epidemics in humans. Porcine deltacoronavirus (PDCoV) is an emerging enteropathogen of swine with a worldwide distribution. Here, we implemented and described an approach to analyze the epidemiology of PDCoV following its emergence in the pig population. We performed an integrated analysis of full genome sequence data from 21 newly sequenced viruses, along with comprehensive epidemiological surveillance data collected globally over the last 15 years. We found four distinct phylogenetic lineages of PDCoV, which differ in their geographic circulation patterns. Interestingly, we identified more frequent intra- and interlineage recombination and higher virus genetic diversity in the Chinese lineages compared with the USA lineage where pigs are raised in different farming systems and ecological environments. Most recombination breakpoints are located in the ORF1ab gene rather than in genes encoding structural proteins. We also identified five amino acids under positive selection in the spike protein suggesting a role for adaptive evolution. According to structural mapping, three positively selected sites are located in the N-terminal domain of the S1 subunit, which is most likely involved in binding to a carbohydrate receptor, whereas the other two are located in or near the fusion peptide of the S2 subunit and thus might affect membrane fusion. Finally, our phylogeographic investigations highlighted notable South-North transmission as well as frequent long-distance dispersal events in China that could implicate human-mediated transmission. Our findings provide new insights into the evolution and dispersal of PDCoV that contribute to our understanding of the critical factors involved in CoV emergence.

Journal Article•DOI•

Distinct Expression and Methylation Patterns for Genes with Different Fates following a Single Whole-Genome Duplication in Flowering Plants

Tao Shi¹, Razgar Seyed Rahmani², Paul F. Gugger³, Muhua H. Wang⁴, Hui Li¹, Yue Zhang¹, Zhi-Zhong Li¹, Qing-Feng Wang¹, Yves Van de Peer, Kathleen Marchal², Jin-Ming Chen¹•Institutions (4)

Chinese Academy of Sciences¹, Ghent University², University of Maryland Center for Environmental Science³, Sun Yat-sen University⁴

01 Aug 2020-Molecular Biology and Evolution

TL;DR: After a WGD genes that returned to single copies show the highest levels and breadth of expression, gene body methylation, and intron numbers, whereas the long-retained duplicates exhibit the highest degrees of protein–protein interactions and protein lengths and the lowest methylation in gene flanking regions.

Abstract: For most sequenced flowering plants, multiple whole-genome duplications (WGDs) are found. Duplicated genes following WGD often have different fates: they can quickly disappear again, be retained for long(er) periods, or subsequently undergo small-scale duplications. However, how different expression, epigenetic regulation, and functional constraints are associated with these different gene fates following a WGD still requires further investigation due to successive WGDs in angiosperms complicating the gene trajectories. In this study, we investigate lotus (Nelumbo nucifera), an angiosperm with a single WGD during the K-Pg boundary. Based on improved intraspecific-synteny identification by a chromosome-level assembly, transcriptome, and bisulfite sequencing, we explore not only the fundamental distinctions in genomic features, expression, and methylation patterns of genes with different fates after a WGD but also the factors that shape post-WGD expression divergence and expression bias between duplicates. We found that after a WGD genes that returned to single copies show the highest levels and breadth of expression, gene body methylation, and intron numbers, whereas the long-retained duplicates exhibit the highest degrees of protein-protein interactions and protein lengths and the lowest methylation in gene flanking regions. For those long-retained duplicate pairs, the degree of expression divergence correlates with their sequence divergence, degree in protein-protein interactions, and expression level, whereas their biases in expression level reflecting subgenome dominance are associated with the bias of subgenome fractionation. Overall, our study on the paleopolyploid nature of lotus highlights the impact of different functional constraints on gene fate and duplicate divergence following a single WGD in plants.

Journal Article•DOI•

Extensive Ethnolinguistic Diversity in Vietnam Reflects Multiple Sources of Genetic Diversity.

Dang Liu¹, Nguyen Thuy Duong², Nguyen Dang Ton², Nguyen Van Phong², Brigitte Pakendorf³, Nong Van Hai², Mark Stoneking¹•Institutions (3)

Max Planck Society¹, Vietnam Academy of Science and Technology², University of Lyon³

01 Sep 2020-Molecular Biology and Evolution

TL;DR: It is found that Vietnamese ethnolinguistic groups harbor multiple sources of genetic diversity that likely reflect different sources for the ancestry associated with each language family, including Austro-Asiatic groups that shifted to Austronesian languages during the past 2,500 years.

Abstract: Vietnam features extensive ethnolinguistic diversity and occupies a key position in Mainland Southeast Asia. Yet, the genetic diversity of Vietnam remains relatively unexplored, especially with genome-wide data, because previous studies have focused mainly on the majority Kinh group. Here, we analyze newly generated genome-wide single-nucleotide polymorphism data for the Kinh and 21 additional ethnic groups in Vietnam, encompassing all five major language families in Mainland Southeast Asia. In addition to analyzing the allele and haplotype sharing within the Vietnamese groups, we incorporate published data from both nearby modern populations and ancient samples for comparison. In contrast to previous studies that suggested a largely indigenous origin for Vietnamese genetic diversity, we find that Vietnamese ethnolinguistic groups harbor multiple sources of genetic diversity that likely reflect different sources for the ancestry associated with each language family. However, linguistic diversity does not completely match genetic diversity: There have been extensive interactions between the Hmong-Mien and Tai-Kadai groups; different Austro-Asiatic groups show different affinities with other ethnolinguistic groups; and we identified a likely case of cultural diffusion in which some Austro-Asiatic groups shifted to Austronesian languages during the past 2,500 years. Overall, our results highlight the importance of genome-wide data from dense sampling of ethnolinguistic groups in providing new insights into the genetic diversity and history of an ethnolinguistically diverse region, such as Vietnam.

Journal Article•DOI•

Bayesian Evaluation of Temporal Signal in Measurably Evolving Populations

Sebastián Duchêne¹, Philippe Lemey², Tanja Stadler³, Simon Y. W. Ho⁴,⁵, David A. Duchêne⁶, Vijaykrishna Dhanasekaran⁷, Guy Baele²•Institutions (7)

University of Melbourne¹, Katholieke Universiteit Leuven², ETH Zurich³, Swiss Institute of Bioinformatics⁴, University of Sydney⁵, Australian National University⁶, Monash University⁷

01 Nov 2020-Molecular Biology and Evolution

TL;DR: The results indicate that BETS is an effective alternative to other tests of temporal signal, which has the key advantage of allowing a coherent assessment of the entire model, including the molecular clock and tree prior which are essential aspects of Bayesian phylodynamic analyses.

Abstract: Phylogenetic methods can use the sampling times of molecular sequence data to calibrate the molecular clock, enabling the estimation of evolutionary rates and timescales for rapidly evolving pathogens and data sets containing ancient DNA samples. A key aspect of such calibrations is whether a sufficient amount of molecular evolution has occurred over the sampling time window, that is, whether the data can be treated as having come from a measurably evolving population. Here, we investigate the performance of a fully Bayesian evaluation of temporal signal (BETS) in sequence data. The method involves comparing the fit to the data of two models: a model in which the data are accompanied by the actual (heterochronous) sampling times, and a model in which the samples are constrained to be contemporaneous (isochronous). We conducted simulations under a wide range of conditions to demonstrate that BETS accurately classifies data sets according to whether they contain temporal signal or not, even when there is substantial among-lineage rate variation. We explore the behavior of this classification in analyses of five empirical data sets: modern samples of A/H1N1 influenza virus, the bacterium Bordetella pertussis, coronaviruses from mammalian hosts, ancient DNA from Hepatitis B virus, and mitochondrial genomes of dog species. Our results indicate that BETS is an effective alternative to other tests of temporal signal. In particular, this method has the key advantage of allowing a coherent assessment of the entire model, including the molecular clock and tree prior, which are essential aspects of Bayesian phylodynamic analyses.
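
The two-model comparison described in this abstract can be sketched as a log Bayes factor. This is a minimal illustration, assuming that the fit of each model is summarised by a log marginal likelihood estimated elsewhere. The function names, the decision threshold of 0, and the numeric values are hypothetical and do not come from the paper.

```python
def log_bayes_factor(log_ml_heterochronous, log_ml_isochronous):
    """Log Bayes factor in favour of the model that uses the real sampling times."""
    return log_ml_heterochronous - log_ml_isochronous


def has_temporal_signal(log_ml_heterochronous, log_ml_isochronous, threshold=0.0):
    """Classify a data set as having temporal signal if the log Bayes factor
    exceeds an assumed threshold (0.0 here is an illustrative default)."""
    return log_bayes_factor(log_ml_heterochronous, log_ml_isochronous) > threshold


# Made-up log marginal likelihoods for the two competing models.
print(log_bayes_factor(-1000.0, -1010.0))
print(has_temporal_signal(-1000.0, -1010.0))
```

A positive log Bayes factor favours the heterochronous model; how large it must be to count as convincing is a modelling choice, which is why the threshold is exposed as a parameter.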

Journal Article•DOI•

Evolutionary Genomics of Structural Variation in Asian Rice (Oryza sativa) Domestication

Yixuan Kou¹,², Yi Liao¹, Tuomas Toivainen³,¹, Yuanda Lv¹, Xinmin Tian⁴, J. J. Emerson¹, Brandon S. Gaut¹, Yongfeng Zhou¹•Institutions (4)

University of California, Irvine¹, Jiangxi Agricultural University², University of Helsinki³, Xinjiang University⁴

16 Dec 2020-Molecular Biology and Evolution

TL;DR: Structural variants are discovered across a population sample of 347 high-coverage, resequenced genomes of Asian rice and its wild ancestor, and hundreds of genes gained and lost during domestication are detected, some of which were enriched for traits of agronomic interest.

Abstract: Structural variants (SVs) are a largely unstudied feature of plant genome evolution, despite the fact that SVs contribute substantially to phenotypes. In this study, we discovered SVs across a population sample of 347 high-coverage, resequenced genomes of Asian rice (Oryza sativa) and its wild ancestor (O. rufipogon). In addition to this short-read data set, we also inferred SVs from whole-genome assemblies and long-read data. Comparisons among data sets revealed different features of genome variability. For example, genome alignment identified a large (∼4.3 Mb) inversion in indica rice varieties relative to japonica varieties, and long-read analyses suggest that ∼9% of genes from the outgroup (O. longistaminata) are hemizygous. We focused, however, on the resequencing sample to investigate the population genomics of SVs. Clustering analyses with SVs recapitulated the rice cultivar groups that were also inferred from SNPs. However, the site-frequency spectrum of each SV type (which included inversions, duplications, deletions, translocations, and mobile element insertions) was skewed toward lower frequency variants than synonymous SNPs, suggesting that SVs may be predominantly deleterious. Among transposable elements, SINE and mariner insertions were found at especially low frequency. We also used SVs to study domestication by contrasting between rice and O. rufipogon. Cultivated genomes contained ∼25% more derived SVs and mobile element insertions than O. rufipogon, indicating that SVs contribute to the cost of domestication in rice. Peaks of SV divergence were enriched for known domestication genes, but we also detected hundreds of genes gained and lost during domestication, some of which were enriched for traits of agronomic interest.
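
The site-frequency-spectrum comparison in this abstract can be illustrated with a short sketch. It assumes each variant is described by its derived-allele count in a sample of `n_chromosomes` chromosomes, and it summarises the skew toward low frequencies as the proportion of singletons. The function names and the toy counts are hypothetical and are not data from the study.

```python
from collections import Counter


def site_frequency_spectrum(derived_counts, n_chromosomes):
    """Unfolded SFS: entry k-1 is the number of variants whose derived allele
    appears on exactly k chromosomes, for k = 1 .. n_chromosomes - 1.
    Variants that are absent (0) or fixed (n_chromosomes) are ignored."""
    counts = Counter(derived_counts)
    return [counts.get(k, 0) for k in range(1, n_chromosomes)]


def proportion_rare(sfs, max_count=1):
    """Share of segregating variants with a derived count of at most max_count."""
    total = sum(sfs)
    return sum(sfs[:max_count]) / total if total else 0.0


# Toy derived-allele counts in a sample of 6 chromosomes.
sv_counts = [1, 1, 1, 2, 1, 3]
snp_counts = [1, 2, 3, 4, 2, 5]

sv_sfs = site_frequency_spectrum(sv_counts, 6)
snp_sfs = site_frequency_spectrum(snp_counts, 6)
print(sv_sfs, proportion_rare(sv_sfs))
print(snp_sfs, proportion_rare(snp_sfs))
```

A larger proportion of rare variants in one class than in a putatively neutral reference class is the kind of skew that is commonly read as evidence of purifying selection.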

Journal Article•DOI•

Complex Evolution of Insect Insulin Receptors and Homologous Decoy Receptors, and Functional Significance of Their Multiplicity.

Vlastimil Smýkal¹, Martin Pivarci¹, Jan Provaznik¹, Olga Bazalová¹, Pavel Jedlička¹, Ondřej Lukšan¹, Aleš Horák¹, Hana Vaněčková¹, Vladimir Benes, Ivan Fiala¹, Robert Hanus¹, David Doležel²,¹•Institutions (2)

Academy of Sciences of the Czech Republic¹, Sewanee: The University of the South²

01 Jun 2020-Molecular Biology and Evolution

TL;DR: A comprehensive phylogenetic analysis of insect InR sequences in 118 species from 23 orders is presented and the role of three InRs identified in the linden bug, Pyrrhocoris apterus, in wing polymorphism control is investigated, suggesting an independent establishment of insulin/insulin-like growth factor signaling control over wing development.

Abstract: Evidence accumulates that the functional plasticity of insulin and insulin-like growth factor signaling in insects could spring, among others, from the multiplicity of insulin receptors (InRs). Their multiple variants may be implemented in the control of insect polyphenism, such as wing or caste polyphenism. Here, we present a comprehensive phylogenetic analysis of insect InR sequences in 118 species from 23 orders and investigate the role of three InRs identified in the linden bug, Pyrrhocoris apterus, in wing polymorphism control. We identified two gene clusters (Clusters I and II) resulting from an ancestral duplication in a late ancestor of winged insects, which remained conserved in most lineages, only in some of them being subject to further duplications or losses. One remarkable yet neglected feature of InR evolution is the loss of the tyrosine kinase catalytic domain, giving rise to decoys of InR in both clusters. Within the Cluster I, we confirmed the presence of the secreted decoy of insulin receptor in all studied Muscomorpha. More importantly, we described a new tyrosine kinase-less gene (DR2) in the Cluster II, conserved in apical Holometabola for ∼300 My. We differentially silenced the three P. apterus InRs and confirmed their participation in wing polymorphism control. We observed a pattern of Cluster I and Cluster II InRs impact on wing development, which differed from that postulated in planthoppers, suggesting an independent establishment of insulin/insulin-like growth factor signaling control over wing development, leading to idiosyncrasies in the co-option of multiple InRs in polyphenism control in different taxa.

Journal Article•DOI•

Adaptive Introgression across Semipermeable Species Boundaries between Local Helicoverpa zea and Invasive Helicoverpa armigera Moths.

Wendy A. Valencia-Montoya¹,², Samia Elfekih³,⁴, Henry L. North², Joana I. Meier², Ian A. Warren², Wee Tek Tay⁵, Karl H.J. Gordon⁵, Alexandre Specht⁶, Silvana V. Paula-Moraes⁷, Rahul V. Rane⁴,³, Tom Walsh⁵, Chris D. Jiggins²•Institutions (7)

Harvard University¹, University of Cambridge², Australian Animal Health Laboratory³, University of Melbourne⁴, Commonwealth Scientific and Industrial Research Organisation⁵, Empresa Brasileira de Pesquisa Agropecuária⁶, University of Florida⁷

01 Sep 2020-Molecular Biology and Evolution

TL;DR: The worst-case scenario for an invasive species is documented, in which there are now two pest species instead of one, and the native species has acquired resistance to pyrethroid insecticides through introgression.

Abstract: Hybridization between invasive and native species has raised global concern, given the dramatic increase in species range shifts and pest outbreaks due to anthropogenic dispersal. Nevertheless, secondary contact between sister lineages of local and invasive species provides a natural laboratory to understand the factors that determine introgression and the maintenance or loss of species barriers. Here, we characterize the early evolutionary outcomes following secondary contact between invasive Helicoverpa armigera and native H. zea in Brazil. We carried out whole-genome resequencing of Helicoverpa moths from Brazil in two temporal samples: during the outbreak of H. armigera in 2013 and 2017. There is evidence for a burst of hybridization and widespread introgression from local H. zea into invasive H. armigera coinciding with H. armigera expansion in 2013. However, in H. armigera, the admixture proportion and the length of introgressed blocks were significantly reduced between 2013 and 2017, suggesting selection against admixture. In contrast to the genome-wide pattern, there was striking evidence for adaptive introgression of a single region from the invasive H. armigera into local H. zea, including an insecticide resistance allele that increased in frequency over time. In summary, despite extensive gene flow after secondary contact, the species boundaries are largely maintained except for the single introgressed region containing the insecticide-resistant locus. We document the worst-case scenario for an invasive species, in which there are now two pest species instead of one, and the native species has acquired resistance to pyrethroid insecticides through introgression.

Journal Article•DOI•

HLA Heterozygote Advantage against HIV-1 Is Driven by Quantitative and Qualitative Differences in HLA Allele-Specific Peptide Presentation

Jatin Arora¹, Federica Pierini¹, Paul J. McLaren²,³, Mary Carrington⁴, Jacques Fellay⁵,⁶,⁷, Tobias L. Lenz¹•Institutions (7)

Max Planck Society¹, University of Manitoba², Public Health Agency of Canada³, Massachusetts Institute of Technology⁴, École Polytechnique Fédérale de Lausanne⁵, University of Lausanne⁶, Swiss Institute of Bioinformatics⁷

01 Mar 2020-Molecular Biology and Evolution

TL;DR: HLA heterozygotes were also more likely to carry certain HLA alleles, including the highly protective HLA-B*57:01 variant, indicating that HLA heterozygote advantage ultimately results from a combination of quantitative and qualitative effects in antigen presentation.

Abstract: Pathogen-mediated balancing selection is regarded as a key driver of host immunogenetic diversity. A hallmark for balancing selection in humans is the heterozygote advantage at genes of the human leukocyte antigen (HLA), resulting in improved HIV-1 control. However, the actual mechanism of the observed heterozygote advantage is still elusive. HLA heterozygotes may present a broader array of antigenic viral peptides to immune cells, possibly resulting in a more efficient cytotoxic T-cell response. Alternatively, heterozygosity may simply increase the chance to carry the most protective HLA alleles, as individual HLA alleles are known to differ substantially in their association with HIV-1 control. Here, we used data from 6,311 HIV-1-infected individuals to explore the relative contribution of quantitative and qualitative aspects of peptide presentation in HLA heterozygote advantage against HIV. Screening the entire HIV-1 proteome, we observed that heterozygous individuals exhibited a broader array of HIV-1 peptides presented by their HLA class I alleles. In addition, viral load was negatively correlated with the breadth of the HIV-1 peptide repertoire bound by an individual's HLA variants, particularly at HLA-B. This suggests that heterozygote advantage at HLA-B is at least in part mediated by quantitative peptide presentation. We also observed higher HIV-1 sequence diversity among HLA-B heterozygous individuals, suggesting stronger evolutionary pressure from HLA heterozygosity. However, HLA heterozygotes were also more likely to carry certain HLA alleles, including the highly protective HLA-B*57:01 variant, indicating that HLA heterozygote advantage ultimately results from a combination of quantitative and qualitative effects in antigen presentation.
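
The quantitative argument in this abstract, that two different alleles can present a broader set of peptides than two copies of the same allele, can be sketched as a set union. This is a toy illustration only: the allele labels, peptide labels, and the `repertoire_breadth` helper are hypothetical, and real analyses rely on peptide-binding predictions rather than hand-written sets.

```python
def repertoire_breadth(genotype, presented):
    """Number of distinct peptides presented by the alleles in a genotype.

    genotype  -- pair of allele labels carried at one locus
    presented -- mapping from allele label to the set of peptides it presents
    """
    peptides = set()
    for allele in genotype:
        peptides |= presented[allele]
    return len(peptides)


# Hypothetical alleles and peptides; the two alleles share one peptide.
presented = {
    "allele_1": {"pep_a", "pep_b", "pep_c"},
    "allele_2": {"pep_c", "pep_d", "pep_e"},
}

print(repertoire_breadth(("allele_1", "allele_1"), presented))  # homozygote
print(repertoire_breadth(("allele_1", "allele_2"), presented))  # heterozygote
```

Because the union counts shared peptides once, the heterozygote gains breadth only to the extent that its two alleles present different peptides.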

Journal Article•DOI•

Origin and Evolution of Carboxysome Positioning Systems in Cyanobacteria.

Joshua S. MacCready¹, Joseph L. Basalla¹, Anthony G. Vecchiarelli¹•Institutions (1)

University of Michigan¹

01 May 2020-Molecular Biology and Evolution

TL;DR: It is shown that the McdAB system is widespread among β-cyanobacteria, often clustering with carboxysome-related components, and is absent in α-cyanobacteria.

Abstract: Carboxysomes are protein-based organelles that are essential for allowing cyanobacteria to fix CO2. Previously, we identified a two-component system, McdAB, responsible for equidistantly positioning carboxysomes in the model cyanobacterium Synechococcus elongatus PCC 7942 (MacCready JS, Hakim P, Young EJ, Hu L, Liu J, Osteryoung KW, Vecchiarelli AG, Ducat DC. 2018. Protein gradients on the nucleoid position the carbon-fixing organelles of cyanobacteria. eLife 7:pii:e39723). McdA, a ParA-type ATPase, nonspecifically binds the nucleoid in the presence of ATP. McdB, a novel factor that directly binds carboxysomes, displaces McdA from the nucleoid. Removal of McdA from the nucleoid in the vicinity of carboxysomes by McdB causes a global break in McdA symmetry, and carboxysome motion occurs via a Brownian-ratchet-based mechanism toward the highest concentration of McdA. Despite the importance for cyanobacteria to properly position their carboxysomes, whether the McdAB system is widespread among cyanobacteria remains an open question. Here, we show that the McdAB system is widespread among β-cyanobacteria, often clustering with carboxysome-related components, and is absent in α-cyanobacteria. Moreover, we show that two distinct McdAB systems exist in β-cyanobacteria, with Type 2 systems being the most ancestral and abundant, and Type 1 systems, like that of S. elongatus, possibly being acquired more recently. Lastly, all McdB proteins share the sequence signatures of a protein capable of undergoing liquid-liquid phase separation. Indeed, we find that representatives of both McdB types undergo liquid-liquid phase separation in vitro, the first example of a ParA-type ATPase partner protein to exhibit this behavior. Our results have broader implications for understanding carboxysome evolution, biogenesis, homeostasis, and positioning in cyanobacteria.

Journal Article•DOI•

A Whole-Genome Scan for Association with Invasion Success in the Fruit Fly Drosophila suzukii Using Contrasts of Allele Frequencies Corrected for Population Structure

Laure Olazcuaga¹, Anne Loiseau¹, Hugues Parrinello², Mathilde Paris³, Antoine Fraimout¹, Christelle Guédot⁴, Lauren M Diepenbrock⁵, Marc Kenis⁶, Jinping Zhang⁶, Xiao Chen⁷, Nicolas Borowiec⁸, Benoit Facon, Heidrun Vogt⁹, Donald K. Price¹⁰, Heiko Vogel¹¹, Benjamin Prud'homme³, Arnaud Estoup¹, Mathieu Gautier¹•Institutions (11)

SupAgro¹, University of Montpellier², Aix-Marseille University³, University of Wisconsin-Madison⁴, North Carolina State University⁵, CABI⁶, Yunnan Agricultural University⁷, Centre national de la recherche scientifique⁸, Julius Kühn-Institut⁹, University of Nevada, Las Vegas¹⁰, Max Planck Society¹¹

01 Aug 2020-Molecular Biology and Evolution

TL;DR: The genome response of the spotted wing drosophila Drosophila suzukii is characterized during the worldwide invasion of this pest insect species, by conducting a genome-wide association study to identify genes involved in adaptive processes during invasion.

Abstract: Evidence is accumulating that evolutionary changes are not only common during biological invasions but may also contribute directly to invasion success. The genomic basis of such changes is still largely unexplored. Yet, understanding the genomic response to invasion may help to predict the conditions under which invasiveness can be enhanced or suppressed. Here we characterized the genome response of the spotted wing drosophila Drosophila suzukii during the worldwide invasion of this pest insect species, by conducting a genome-wide association study to identify genes involved in adaptive processes during invasion. Genomic data from 22 population samples were analyzed to detect genetic variants associated with the status (invasive versus native) of the sampled populations based on a newly developed statistic, which we call C2, that contrasts allele frequencies corrected for population structure. We evaluated this new statistical framework using simulated data sets and implemented it in an upgraded version of the program BayPass. We identified a relatively small set of single nucleotide polymorphisms (SNPs) that show a highly significant association with the invasive status of D. suzukii populations. In particular, two genes, RhoGEF64C and cpo, contained SNPs significantly associated with the invasive status in the two separate main invasion routes of D. suzukii. Our methodological approaches can be applied to any other invasive species, and more generally to any evolutionary model for species characterized by non-equilibrium demographic conditions for which binary covariables of interest can be defined at the population level.
