Showing papers in "Molecular Biology and Evolution in 2013"

MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability

TL;DR: This version of MAFFT has several new features, including options for adding unaligned sequences into an existing alignment, adjustment of direction in nucleotide alignment, constrained alignment and parallel processing, which were implemented after the previous major update.

Abstract: We report a major update of the MAFFT multiple sequence alignment program. This version has several new features, including options for adding unaligned sequences into an existing alignment, adjustment of direction in nucleotide alignment, constrained alignment and parallel processing, which were implemented after the previous major update. This report shows actual examples to explain how these features work, alone and in combination. Some examples incorrectly aligned by MAFFT are also shown to clarify its limitations. We discuss how to avoid misalignments, and our ongoing efforts to overcome such limitations.

27,771 citations

Journal Article•DOI•

Ultrafast Approximation for Phylogenetic Bootstrap

Bui Quang Minh¹, Minh Anh Thi Nguyen², Arndt von Haeseler¹•Institutions (2)

Medical University of Vienna¹, University of Groningen²

TL;DR: This work proposes an ultrafast bootstrap approximation approach (UFBoot) to compute the support of phylogenetic groups in maximum likelihood (ML) based trees and offers an efficient and easy-to-use software to perform the UFBoot analysis with ML tree inference.

Abstract: Nonparametric bootstrap has been a widely used tool in phylogenetic analysis to assess the clade support of phylogenetic trees. However, with the rapidly growing amount of data, this task remains a computational bottleneck. Recently, approximation methods such as the RAxML rapid bootstrap (RBS) and the Shimodaira-Hasegawa-like approximate likelihood ratio test have been introduced to speed up the bootstrap. Here, we suggest an ultrafast bootstrap approximation approach (UFBoot) to compute the support of phylogenetic groups in maximum likelihood (ML) based trees. To achieve this, we combine the resampling estimated log-likelihood method with a simple but effective collection scheme of candidate trees. We also propose a stopping rule that assesses the convergence of branch support values to automatically determine when to stop collecting candidate trees. UFBoot achieves a median speed up of 3.1 (range: 0.66-33.3) to 10.2 (range: 1.32-41.4) compared with RAxML RBS for real DNA and amino acid alignments, respectively. Moreover, our extensive simulations show that UFBoot is robust against moderate model violations and the support values obtained appear to be relatively unbiased compared with the conservative standard bootstrap. This provides a more direct interpretation of the bootstrap support. We offer an efficient and easy-to-use software (available at http://www.cibiv.at/software/iqtree) to perform the UFBoot analysis with ML tree inference.

2,469 citations

Journal Article•DOI•

Building Phylogenetic Trees from Molecular Data with MEGA

Barry G. Hall

TL;DR: A step-by-step protocol is presented in sufficient detail to allow a novice to start with a sequence of interest and to build a publication-quality tree illustrating the evolution of an appropriate set of homologs of that sequence.

Abstract: Phylogenetic analysis is sometimes regarded as being an intimidating, complex process that requires expertise and years of experience. In fact, it is a fairly straightforward process that can be learned quickly and applied effectively. This Protocol describes the several steps required to produce a phylogenetic tree from molecular data for novices. In the example illustrated here, the program MEGA is used to implement all those steps, thereby eliminating the need to learn several programs, and to deal with multiple file formats from one step to another (Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 28:2731‐2739). The first step, identification of a set of homologous sequences and downloading those sequences, is implemented by MEGA’s own browser built on top of the Google Chrome toolkit. For the second step, alignment of those sequences, MEGA offers two different algorithms: ClustalW and MUSCLE. For the third step, construction of a phylogenetic tree from the aligned sequences, MEGA offers many different methods. Here we illustrate the maximum likelihood method, beginning with MEGA’s Models feature, which permits selecting the most suitable substitution model. Finally, MEGA provides a powerful and flexible interface for the final step, actually drawing the tree for publication. Here a step-by-step protocol is presented in sufficient detail to allow a novice to start with a sequence of interest and to build a publication-quality tree illustrating the evolution of an appropriate set of homologs of that sequence. MEGA is available for use on PCs and Macs from www.megasoftware.net.

1,057 citations

Journal Article•DOI•

DAMBE5: a comprehensive software package for data analysis in molecular biology and evolution

Xuhua Xia¹•Institutions (1)

University of Ottawa¹

05 Apr 2013-Molecular Biology and Evolution

TL;DR: Since its first release in 2001 as mainly a software package for phylogenetic analysis, data analysis for molecular biology and evolution (DAMBE) has gained many new functions that may be classified into six categories.

Abstract: Since its first release in 2001 as mainly a software package for phylogenetic analysis, data analysis for molecular biology and evolution (DAMBE) has gained many new functions that may be classified into six categories: 1) sequence retrieval, editing, manipulation, and conversion among more than 20 standard sequence formats including MEGA, NEXUS, PHYLIP, GenBank, and the new NeXML format for interoperability, 2) motif characterization and discovery functions such as position weight matrix and Gibbs sampler, 3) descriptive genomic analysis tools with improved versions of codon adaptation index, effective number of codons, protein isoelectric point profiling, RNA and protein secondary structure prediction and calculation of minimum folding energy, and genomic skew plots with optimized window size, 4) molecular phylogenetics including sequence alignment, testing substitution saturation, distance-based, maximum parsimony, and maximum-likelihood methods for tree reconstructions, testing the molecular clock hypothesis with either a phylogeny or with relative-rate tests, dating gene duplication and speciation events, choosing the best-fit substitution models, and estimating rate heterogeneity over sites, 5) phylogeny-based comparative methods for continuous and discrete variables, and 6) graphic functions including secondary structure display, optimized skew plot, hydrophobicity plot, and many other plots of amino acid properties along a protein sequence, tree display and drawing by dragging nodes to each other, and visual searching of the maximum parsimony tree. DAMBE features a graphic, user-friendly, and intuitive interface and is freely available from http://dambe.bio.uottawa.ca (last accessed April 16, 2013).

989 citations

Journal Article•DOI•

FUBAR : A Fast, Unconstrained Bayesian AppRoximation for inferring selection

Ben Murrell¹, Sasha Moola¹,², Amandla Mabona¹,², Thomas Weighill¹, Daniel J. Sheward², Sergei L. Kosakovsky Pond³, Konrad Scheffler¹,³•Institutions (3)

Stellenbosch University¹, University of Cape Town², University of California, San Diego³

TL;DR: This work presents an approximate hierarchical Bayesian method using a Markov chain Monte Carlo (MCMC) routine that ensures robustness against model misspecification by averaging over a large number of predefined site classes, and leaves the distribution of selection parameters essentially unconstrained.

Abstract: Model-based analyses of natural selection often categorize sites into a relatively small number of site classes. Forcing each site to belong to one of these classes places unrealistic constraints on the distribution of selection parameters, which can result in misleading inference due to model misspecification. We present an approximate hierarchical Bayesian method using a Markov chain Monte Carlo (MCMC) routine that ensures robustness against model misspecification by averaging over a large number of predefined site classes. This leaves the distribution of selection parameters essentially unconstrained, and also allows sites experiencing positive and purifying selection to be identified orders of magnitude faster than by existing methods. We demonstrate that popular random effects likelihood methods can produce misleading results when sites assigned to the same site class experience different levels of positive or purifying selection—an unavoidable scenario when using a small number of site classes. Our Fast Unconstrained Bayesian AppRoximation (FUBAR) is unaffected by this problem, while achieving higher power than existing unconstrained (fixed effects likelihood) methods. The speed advantage of FUBAR allows us to analyze larger data sets than other methods: We illustrate this on a large influenza hemagglutinin data set (3,142 sequences). FUBAR is available as a batch file within the latest HyPhy distribution (http://www.hyphy.org), as well as on the Datamonkey web server (http://www.datamonkey.org/).

939 citations

Journal Article•DOI•

Testing for Associations between Loci and Environmental Gradients Using Latent Factor Mixed Models

Eric Frichot¹, Sean D. Schoville¹, Guillaume Bouchard², Olivier François¹•Institutions (2)

Centre national de la recherche scientifique¹, Xerox²

01 Jul 2013-Molecular Biology and Evolution

TL;DR: New algorithms based on population genetics, ecological modeling, and statistical learning techniques are proposed to screen genomes for signatures of local adaptation and demonstrate that LFMM can efficiently estimate random effects due to population history and isolation-by-distance patterns when computing gene-environment correlations.

Abstract: Adaptation to local environments often occurs through natural selection acting on a large number of loci, each having a weak phenotypic effect. One way to detect these loci is to identify genetic polymorphisms that exhibit high correlation with environmental variables used as proxies for ecological pressures. Here, we propose new algorithms based on population genetics, ecological modeling, and statistical learning techniques to screen genomes for signatures of local adaptation. Implemented in the computer program “latent factor mixed model” (LFMM), these algorithms employ an approach in which population structure is introduced using unobserved variables. These fast and computationally efficient algorithms detect correlations between environmental and genetic variation while simultaneously inferring background levels of population structure. Comparing these new algorithms with related methods provides evidence that LFMM can efficiently estimate random effects due to population history and isolation-by-distance patterns when computing gene-environment correlations, and decrease the number of false-positive associations in genome scans. We then apply these models to plant and human genetic data, identifying several genes with functions related to development that exhibit strong correlations with climatic gradients.
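
As a rough illustration of the starting point this abstract describes (identifying polymorphisms that correlate strongly with an environmental variable), the sketch below ranks loci by the absolute Pearson correlation between genotype and environment. It is not LFMM: it omits the latent factors that LFMM uses to account for population structure, so on real data it would be expected to produce more false positives. The function names and the toy data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic locus or constant environment: no signal
    return sxy / math.sqrt(sxx * syy)

def rank_loci(genotypes, environment):
    """Rank loci by |correlation| with the environmental variable.

    genotypes: dict mapping locus name -> allele counts (0/1/2), one per individual.
    environment: one environmental value per individual, in the same order.
    """
    scores = {locus: pearson(g, environment) for locus, g in genotypes.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Toy data (hypothetical): 6 individuals sampled along a temperature gradient.
environment = [5.0, 7.0, 9.0, 11.0, 13.0, 15.0]
genotypes = {
    "locus_A": [0, 0, 1, 1, 2, 2],  # tracks the gradient
    "locus_B": [1, 0, 2, 1, 0, 1],  # essentially unrelated
    "locus_C": [1, 1, 1, 1, 1, 1],  # monomorphic
}
ranking = rank_loci(genotypes, environment)
for locus, r in ranking:
    print(f"{locus}: r = {r:+.3f}")
```

In this toy example `locus_A` ranks first. A naive scan like this is the baseline that methods modeling population structure aim to improve on.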

605 citations

Journal Article•DOI•

MitoFish and MitoAnnotator: A Mitochondrial Genome Database of Fish with an Accurate and Automatic Annotation Pipeline

Wataru Iwasaki¹, Tsukasa Fukunaga¹, Ryota Isagozawa¹, Koichiro Yamada, Yasunobu Maeda¹, Takashi P. Satoh, Tetsuya Sado², Kohji Mabuchi¹, Hirohiko Takeshima¹, Masaki Miya², Mutsumi Nishida¹•Institutions (2)

University of Tokyo¹, American Museum of Natural History²

01 Nov 2013-Molecular Biology and Evolution

TL;DR: MitoFish contains re-annotations of previously sequenced fish mitogenomes, enabling researchers to refer to them when they find annotations that are likely to be erroneous or while conducting comparative mitogenomic analyses.

Abstract: Mitofish is a database of fish mitochondrial genomes (mitogenomes) that includes powerful and precise de novo annotations for mitogenome sequences. Fish occupy an important position in the evolution of vertebrates and the ecology of the hydrosphere, and mitogenomic sequence data have served as a rich source of information for resolving fish phylogenies and identifying new fish species. The importance of a mitogenomic database continues to grow at a rapid pace as massive amounts of mitogenomic data are generated with the advent of new sequencing technologies. A severe bottleneck seems likely to occur with regard to mitogenome annotation because of the overwhelming pace of data accumulation and the intrinsic difficulties in annotating sequences with degenerating transfer RNA structures, divergent start/stop codons of the coding elements, and the overlapping of adjacent elements. To ease this data backlog, we developed an annotation pipeline named MitoAnnotator. MitoAnnotator automatically annotates a fish mitogenome with a high degree of accuracy in approximately 5 min; thus, it is readily applicable to data sets of dozens of sequences. MitoFish also contains re-annotations of previously sequenced fish mitogenomes, enabling researchers to refer to them when they find annotations that are likely to be erroneous or while conducting comparative mitogenomic analyses. For users who need more information on the taxonomy, habitats, phenotypes, or life cycles of fish, MitoFish provides links to related databases. MitoFish and MitoAnnotator are freely available at http://mitofish.aori.u-tokyo.ac.jp/ (last accessed August 28, 2013); all of the data can be batch downloaded, and the annotation pipeline can be used via a web interface.

590 citations

Journal Article•DOI•

Estimating Gene Gain and Loss Rates in the Presence of Error in Genome Assembly and Annotation Using CAFE 3

Mira V. Han¹, Gregg W.C. Thomas², Jose Lugo-Martinez², Matthew W. Hahn²•Institutions (2)

National Evolutionary Synthesis Center¹, Indiana University²

TL;DR: The results show that errors in genome annotation do lead to higher inferred rates of gene gain and loss but that CAFE 3 sufficiently accounts for these errors to provide accurate estimates of important evolutionary parameters.

Abstract: Current sequencing methods produce large amounts of data, but genome assemblies constructed from these data are often fragmented and incomplete. Incomplete and error-filled assemblies result in many annotation errors, especially in the number of genes present in a genome. This means that methods attempting to estimate rates of gene duplication and loss often will be misled by such errors and that rates of gene family evolution will be consistently overestimated. Here, we present a method that takes these errors into account, allowing one to accurately infer rates of gene gain and loss among genomes even with low assembly and annotation quality. The method is implemented in the newest version of the software package CAFE, along with several other novel features. We demonstrate the accuracy of the method with extensive simulations and reanalyze several previously published data sets. Our results show that errors in genome annotation do lead to higher inferred rates of gene gain and loss but that CAFE 3 sufficiently accounts for these errors to provide accurate estimates of important evolutionary parameters.

572 citations

Journal Article•DOI•

Hierarchical and Spatially Explicit Clustering of DNA Sequences with BAPS Software

Lu Cheng¹, Thomas R. Connor²,³, Jukka Sirén¹, David M. Aanensen⁴, Jukka Corander¹•Institutions (4)

University of Helsinki¹, Cardiff University², Wellcome Trust Sanger Institute³, Imperial College London⁴

TL;DR: Two upgrades to the Bayesian Analysis of Population Structure (BAPS) software are introduced, which enable 1) spatially explicit modeling of variation in DNA sequences and 2) hierarchical clustering of DNA sequence data to reveal nested genetic population structures.

Abstract: Phylogeographical analyses have become commonplace for a myriad of organisms with the advent of cheap DNA sequencing technologies. Bayesian model-based clustering is a powerful tool for detecting important patterns in such data and can be used to decipher even quite subtle signals of systematic differences in molecular variation. Here, we introduce two upgrades to the Bayesian Analysis of Population Structure (BAPS) software, which enable 1) spatially explicit modeling of variation in DNA sequences and 2) hierarchical clustering of DNA sequence data to reveal nested genetic population structures. We provide a direct interface to map the results from spatial clustering with Google Maps using the portal http://www.spatialepidemiology.net/ and illustrate this approach using sequence data from Borrelia burgdorferi. The usefulness of hierarchical clustering is demonstrated through an analysis of the metapopulation structure within a bacterial population experiencing a high level of local horizontal gene transfer. The tools that are introduced are freely available at http://www.helsinki.fi/bsg/software/BAPS/.

507 citations

Journal Article•DOI•

Improving Bayesian population dynamics inference: a coalescent-based model for multiple loci

Mandev S. Gill¹, Philippe Lemey², Nuno R. Faria², Andrew Rambaut³, Beth Shapiro⁴, Marc A. Suchard¹•Institutions (4)

University of California, Los Angeles¹, Katholieke Universiteit Leuven², University of Edinburgh³, University of California, Santa Cruz⁴

01 Mar 2013-Molecular Biology and Evolution

TL;DR: This work generalizes the Gaussian Markov random field (GMRF) model, previously limited to a single gene locus, to multilocus sequence data, and shows on simulated data that it better recovers true population trajectories and the time to the most recent common ancestor (TMRCA).

Abstract: Effective population size is fundamental in population genetics and characterizes genetic diversity. To infer past population dynamics from molecular sequence data, coalescent-based models have been developed for Bayesian nonparametric estimation of effective population size over time. Among the most successful is a Gaussian Markov random field (GMRF) model for a single gene locus. Here, we present a generalization of the GMRF model that allows for the analysis of multilocus sequence data. Using simulated data, we demonstrate the improved performance of our method to recover true population trajectories and the time to the most recent common ancestor (TMRCA). We analyze a multilocus alignment of HIV-1 CRF02_AG gene sequences sampled from Cameroon. Our results are consistent with HIV prevalence data and uncover some aspects of the population history that go undetected in Bayesian parametric estimation. Finally, we recover an older and more reconcilable TMRCA for a classic ancient DNA data set.

Journal Article•DOI•

SweeD: Likelihood-Based Detection of Selective Sweeps in Thousands of Genomes

Pavlos Pavlidis¹, Daniel Živković², Alexandros Stamatakis¹, Nikolaos Alachiotis¹•Institutions (2)

Heidelberg Institute for Theoretical Studies¹, Ludwig Maximilian University of Munich²

01 Sep 2013-Molecular Biology and Evolution

TL;DR: It is shown that an increase of sample size results in more precise detection of positive selection and the ability to analyze substantially larger sample sizes by using SweeD leads to more accurate sweep detection.

Abstract: The advent of modern DNA sequencing technology is the driving force in obtaining complete intra-specific genomes that can be used to detect loci that have been subject to positive selection in the recent past. Based on selective sweep theory, beneficial loci can be detected by examining the single nucleotide polymorphism patterns in intraspecific genome alignments. In the last decade, a plethora of algorithms for identifying selective sweeps have been developed. However, the majority of these algorithms have not been designed for analyzing whole-genome data. We present SweeD (Sweep Detector), an open-source tool for the rapid detection of selective sweeps in whole genomes. It analyzes site frequency spectra and represents a substantial extension of the widely used SweepFinder program. The sequential version of SweeD is up to 22 times faster than SweepFinder and, more importantly, is able to analyze thousands of sequences. We also provide a parallel implementation of SweeD for multi-core processors. Furthermore, we implemented a checkpointing mechanism that allows to deploy SweeD on cluster systems with queue execution time restrictions, as well as to resume long-running analyses after processor failures. In addition, the user can specify various demographic models via the command-line to calculate their theoretically expected site frequency spectra. Therefore, (in contrast to SweepFinder) the neutral site frequencies can optionally be directly calculated from a given demographic model. We show that an increase of sample size results in more precise detection of positive selection. Thus, the ability to analyze substantially larger sample sizes by using SweeD leads to more accurate sweep detection. We validate SweeD via simulations and by scanning the first chromosome from the 1000 human Genomes project for selective sweeps. We compare SweeD results with results from a linkage-disequilibrium-based approach and identify common outliers.
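
The abstract notes that SweeD analyzes site frequency spectra. As a minimal sketch of that kind of input (not of SweeD's likelihood computation, file formats, or command line), the following counts how many SNPs carry the derived allele in exactly k of n sampled sequences. The 0/1 matrix encoding and the function name are assumptions made for illustration.

```python
def unfolded_sfs(snp_matrix):
    """Return the unfolded site frequency spectrum.

    snp_matrix: list of n sequences, each a list of 0/1 values per site,
    where 1 marks the derived allele (an assumed encoding).
    Result: sfs[k] = number of sites whose derived allele occurs in exactly
    k sequences, for k = 1..n-1. Index 0 is unused and stays 0.
    """
    n = len(snp_matrix)
    n_sites = len(snp_matrix[0])
    sfs = [0] * n
    for site in range(n_sites):
        k = sum(seq[site] for seq in snp_matrix)
        if 0 < k < n:  # skip sites fixed for either allele
            sfs[k] += 1
    return sfs

# Toy data (hypothetical): 4 sequences, 5 sites.
snps = [
    [1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1],
]
sfs = unfolded_sfs(snps)
print(sfs)  # [0, 2, 0, 1]: two singletons and one site at frequency 3/4
```

Sites 4 and 5 are not segregating in this sample (all ancestral and all derived, respectively), so they do not contribute to the spectrum.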

Journal Article•DOI•

Impact of Missing Data on Phylogenies Inferred from Empirical Phylogenomic Data Sets

Béatrice Roure¹, Denis Baurain², Hervé Philippe¹•Institutions (2)

Université de Montréal¹, University of Liège²

01 Jan 2013-Molecular Biology and Evolution

TL;DR: These analyses demonstrate that missing data perturb phylogenetic inference slightly beyond the expected decrease in resolving power, and confirm that including incomplete yet short-branch taxa can help to eschew artifacts, as predicted by simulations.

Abstract: Progress in sequencing technology allows researchers to assemble ever-larger supermatrices for phylogenomic inference. However, current phylogenomic studies often rest on patchy data sets, with some having 80% missing (or ambiguous) data or more. Though early simulations had suggested that missing data per se do not harm phylogenetic inference when using sufficiently large data sets, Lemmon et al. (Lemmon AR, Brown JM, Stanger-Hall K, Lemmon EM. 2009. The effect of ambiguous data on phylogenetic estimates obtained by maximum likelihood and Bayesian inference. Syst Biol. 58:130-145.) have recently cast doubt on this consensus in a study based on the introduction of parsimony-uninformative incomplete characters. In this work, we empirically reassess the issue of missing data in phylogenomics while exploring possible interactions with the model of sequence evolution. First, we note that parsimony-uninformative incomplete characters are actually informative in a probabilistic framework. A reanalysis of Lemmon's data set with this in mind gives a very different interpretation of their results and shows that some of their conclusions may be unfounded. Second, we investigate the effect of the progressive introduction of missing data in a complete supermatrix (126 genes × 39 species) capable of resolving animal relationships. These analyses demonstrate that missing data perturb phylogenetic inference slightly beyond the expected decrease in resolving power. In particular, they exacerbate systematic errors by reducing the number of species effectively available for the detection of multiple substitutions. Consequently, large sparse supermatrices are more sensitive to phylogenetic artifacts than smaller but less incomplete data sets, which argue for experimental designs aimed at collecting a modest number (~50) of highly covered genes. 
Our results further confirm that including incomplete yet short-branch taxa (i.e., slowly evolving species or close outgroups) can help to eschew artifacts, as predicted by simulations. Finally, it appears that selecting an adequate model of sequence evolution (e.g., the site-heterogeneous CAT model instead of the site-homogeneous WAG model) is more beneficial to phylogenetic accuracy than reducing the level of missing data.
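
The level of missing data discussed in this abstract is straightforward to quantify for a given supermatrix. A minimal sketch, assuming the alignment is held as a dict of equal-length strings and that `?`, `-`, and `X` mark missing or ambiguous characters; these symbols, the function name, and the toy sequences are assumptions for illustration, not taken from the paper.

```python
def missing_fraction(alignment, missing="?-X"):
    """Return (overall_fraction, per_taxon_fractions) of missing characters.

    alignment: dict mapping taxon name -> aligned sequence string.
    missing: characters treated as missing or ambiguous (an assumed set;
    adjust it for the data type at hand).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()

    per_taxon = {}
    total_missing = 0
    for taxon, seq in alignment.items():
        n_missing = sum(1 for ch in seq if ch in missing)
        per_taxon[taxon] = n_missing / n_sites
        total_missing += n_missing
    overall = total_missing / (n_sites * len(alignment))
    return overall, per_taxon

# Toy supermatrix (hypothetical): 3 taxa, 8 aligned positions.
alignment = {
    "taxon_A": "MKV?LL--",
    "taxon_B": "MKVALLTT",
    "taxon_C": "????LL??",
}
overall, per_taxon = missing_fraction(alignment)
print(f"overall: {overall:.1%}")
for taxon, frac in per_taxon.items():
    print(f"{taxon}: {frac:.1%}")
```

Reporting the per-taxon values alongside the overall figure makes it easier to see whether sparseness is spread evenly or concentrated in a few incomplete taxa.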

Journal Article•DOI•

pamlX: A Graphical User Interface for PAML

Bo Xu¹, Ziheng Yang¹,²,³•Institutions (3)

Beijing Institute of Genomics¹, Chinese Academy of Sciences², University College London³

01 Dec 2013-Molecular Biology and Evolution

TL;DR: This note announces pamlX, a graphical user interface/front end for the paml (Phylogenetic Analysis by Maximum Likelihood) program package.

Abstract: This note announces pamlX, a graphical user interface/front end for the paml (for Phylogenetic Analysis by Maximum Likelihood) program package (Yang Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555-556; Yang Z. 2007. PAML 4: Phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586-1591). pamlX is written in C++ using the Qt library and communicates with paml programs through files. It can be used to create, edit, and print control files for paml programs and to launch paml runs. The interface is available for free download at http://abacus.gene.ucl.ac.uk/software/paml.html.

Journal Article•DOI•

Detecting sequence homology at the gene cluster level with MultiGeneBlast.

Marnix H. Medema¹, Eriko Takano¹,², Rainer Breitling¹,²,³•Institutions (3)

University of Groningen¹, University of Manchester², University of Glasgow³

TL;DR: This work provides a user-friendly and effective tool to perform homology searches with operons or gene clusters as basic units, instead of single genes, to get a better understanding of the function, evolutionary history, and practical applications of such genomic regions.

Abstract: The genes encoding many biomolecular systems and pathways are genomically organized in operons or gene clusters. With MultiGeneBlast, we provide a user-friendly and effective tool to perform homology searches with operons or gene clusters as basic units, instead of single genes. The contextualization offered by MultiGeneBlast allows users to get a better understanding of the function, evolutionary history, and practical applications of such genomic regions. The tool is fully equipped with applications to generate search databases from GenBank or from the user's own sequence data. Finally, an architecture search mode allows searching for gene clusters with novel configurations, by detecting genomic regions with any user-specified combination of genes. Sources, precompiled binaries, and a graphical tutorial of MultiGeneBlast are freely available from http://multigeneblast.sourceforge.net/.

Journal Article•DOI•

Phylotranscriptomics to Bring the Understudied into the Fold: Monophyletic Ostracoda, Fossil Placement, and Pancrustacean Phylogeny

Todd H. Oakley¹, Joanna M. Wolfe², Annie R. Lindgren¹,³, Alexander K. Zaharoff¹•Institutions (3)

University of California, Santa Barbara¹, Yale University², Portland State University³

01 Jan 2013-Molecular Biology and Evolution

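TL;DR: This work collects 9 new 454 transcriptome data sets from Ostracoda, an ancient and diverse group with a dense fossil record that is often undersampled in broader studies, and combines them with morphological and existing molecular data to gain new insights into ostracod and pancrustacean phylogeny.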

Abstract: An ambitious, yet fundamental goal for comparative biology is to understand the evolutionary relationships for all of life. However, many important taxonomic groups have remained recalcitrant to inclusion into broader scale studies. Here, we focus on collection of 9 new 454 transcriptome data sets from Ostracoda, an ancient and diverse group with a dense fossil record, which is often undersampled in broader studies. We combine the new transcriptomes with a new morphological matrix (including fossils) and existing expressed sequence tag, mitochondrial genome, nuclear genome, and ribosomal DNA data. Our analyses lead to new insights into ostracod and pancrustacean phylogeny. We obtained support for three epic pancrustacean clades that likely originated in the Cambrian: Oligostraca (Ostracoda, Mystacocarida, Branchiura, and Pentastomida); Multicrustacea (Copepoda, Malacostraca, and Thecostraca); and a clade we refer to as Allotriocarida (Hexapoda, Remipedia, Cephalocarida, and Branchiopoda). Within the Oligostraca clade, our results support the unresolved question of ostracod monophyly. Within Multicrustacea, we find support for Thecostraca plus Copepoda, for which we suggest the name Hexanauplia. Within Allotriocarida, some analyses support the hypothesis that Remipedia is the sister taxon to Hexapoda, but others support Branchiopoda+ Cephalocarida as the sister group of hexapods. In multiple different analyses, we see better support for equivocal nodes using slow-evolving genes or when excluding distant outgroups, highlighting the increased importance of conditional data combination in this age of abundant, often anonymous data. However, when we analyze the same set of species and ignore rate of gene evolution, we find higher support when including all data, more in line with a “total evidence” philosophy. 
By concatenating molecular and morphological data, we place pancrustacean fossils in the phylogeny, which can be used for studies of divergence times in Pancrustacea, Arthropoda, or Metazoa. Our results and new data will allow for attributes of Ostracoda, such as its amazing fossil record and diverse biology, to be leveraged in broader scale comparative studies. Further, we illustrate how adding extensive next-generation sequence data from understudied groups can yield important new phylogenetic insights into long-standing questions, especially when carefully analyzed in combination with other data.

Journal Article•DOI•

Pig Domestication and Human-Mediated Dispersal in Western Eurasia Revealed through Ancient DNA and Geometric Morphometrics

Claudio Ottoni¹, Linus Girdland Flink²,³, Allowen Evin⁴,⁵, Christina Geörg⁶,⁷, Bea De Cupere⁸, Wim Van Neer¹,⁸, László Bartosiewicz⁹, Anna Linderholm³, Ross Barnett³, Joris Peters¹⁰, Ronny Decorte¹, Marc Waelkens¹, Nancy Vanderheyden¹, François-Xavier Ricaut¹¹, Canan Çakirlar⁸,¹², Özlem Çevik, A. Rus Hoelzel³, Marjan Mashkour⁵, Azadeh Fatemeh Mohaseb Karimlu⁵, Shiva Sheikhi Seno⁵, Julie Daujat⁴,⁵, Fiona Brock¹³, Ron Pinhasi¹⁴, Hitomi Hongo¹⁵, Miguel Pérez-Enciso¹⁶, Morten Arendt Rasmussen, Laurent A. F. Frantz¹⁷, Hendrik-Jan Megens¹⁷, Richard P. M. A. Crooijmans¹⁷, Martien A. M. Groenen¹⁷, Benjamin S. Arbuckle¹⁸, Norbert Benecke⁷, Una Strand Vidarsdottir³, Joachim Burger⁶, Thomas Cucchi⁴,⁵, Keith Dobney⁴, Greger Larson³•Institutions (18)

Katholieke Universiteit Leuven¹, Natural History Museum², Durham University³, University of Aberdeen⁴, Centre national de la recherche scientifique⁵, University of Mainz⁶, Deutsches Archäologisches Institut⁷, Royal Belgian Institute of Natural Sciences⁸, Eötvös Loránd University⁹, Ludwig Maximilian University of Munich¹⁰, University of Toulouse¹¹, University of Groningen¹², University of Oxford¹³, Trinity College, Dublin¹⁴, Graduate University for Advanced Studies¹⁵, Catalan Institution for Research and Advanced Studies¹⁶, Wageningen University and Research Centre¹⁷, Baylor University¹⁸

TL;DR: The first genetic signatures of early domestic pigs in the Near Eastern Neolithic core zone are revealed and it is demonstrated that these early pigs differed genetically from those in western Anatolia that were introduced to Europe during the Neolithic expansion.

Abstract: Zooarcheological evidence suggests that pigs were domesticated in Southwest Asia ~8,500 BC. They then spread across the Middle and Near East and westward into Europe alongside early agriculturalists. European pigs were either domesticated independently or more likely appeared so as a result of admixture between introduced pigs and European wild boar. As a result, European wild boar mtDNA lineages replaced Near Eastern/Anatolian mtDNA signatures in Europe and subsequently replaced indigenous domestic pig lineages in Anatolia. The specific details of these processes, however, remain unknown. To address questions related to early pig domestication, dispersal, and turnover in the Near East, we analyzed ancient mitochondrial DNA and dental geometric morphometric variation in 393 ancient pig specimens representing 48 archeological sites (from the Pre-Pottery Neolithic to the Medieval period) from Armenia, Cyprus, Georgia, Iran, Syria, and Turkey. Our results reveal the first genetic signatures of early domestic pigs in the Near Eastern Neolithic core zone. We also demonstrate that these early pigs differed genetically from those in western Anatolia that were introduced to Europe during the Neolithic expansion. In addition, we present a significantly more refined chronology for the introduction of European domestic pigs into Asia Minor that took place during the Bronze Age, at least 900 years earlier than previously detected. By the 5th century AD, European signatures completely replaced the endemic lineages possibly coinciding with the widespread demographic and societal changes that occurred during the Anatolian Bronze and Iron Ages.

Journal Article•DOI•

Evidence for Polygenic Adaptation to Pathogens in the Human Genome

Josephine T. Daub¹, Tamara Hofer¹,², Emilie Cutivet¹, Isabelle Dupanloup²,¹, Lluis Quintana-Murci³,⁴, Marc Robinson-Rechavi²,⁵, Laurent Excoffier¹,²•Institutions (5)

University of Bern¹, Swiss Institute of Bioinformatics², Pasteur Institute³, Centre national de la recherche scientifique⁴, University of Lausanne⁵

26 Apr 2013-Molecular Biology and Evolution

TL;DR: The results show that past interactions with pathogens have elicited widespread and coordinated genomic responses, and suggest that adaptation to pathogens can be considered as a primary example of polygenic selection.

Abstract: Most approaches aiming at finding genes involved in adaptive events have focused on the detection of outlier loci, which resulted in the discovery of individually “significant” genes with strong effects. However, a collection of small effect mutations could have a large effect on a given biological pathway that includes many genes, and such a polygenic mode of adaptation has not been systematically investigated in humans. We propose here to evidence polygenic selection by detecting signals of adaptation at the pathway or gene set level instead of analyzing single independent genes. Using a gene-set enrichment test to identify genome-wide signals of adaptation among human populations, we find that most pathways globally enriched for signals of positive selection are either directly or indirectly involved in immune response. We also find evidence for long-distance genotypic linkage disequilibrium, suggesting functional epistatic interactions between members of the same pathway. Our results show that past interactions with pathogens have elicited widespread and coordinated genomic responses, and suggest that adaptation to pathogens can be considered as a primary example of polygenic selection.

Journal Article•DOI•

A Recent Evolutionary Change Affects a Regulatory Element in the Human FOXP2 Gene

Tomislav Maricic¹, Viola Günther², Oleg Georgiev², Sabine Gehre¹, Marija Ćurlin³, Christiane Schreiweis¹, Ronald Naumann¹, Hernán A. Burbano¹, Matthias Meyer¹, Carles Lalueza-Fox⁴, Marco de la Rasilla⁵, Antonio Rosas⁶, Srećko Gajović³, Janet Kelso¹, Wolfgang Enard¹, Walter Schaffner², Svante Pääbo¹•Institutions (6)

Max Planck Society¹, University of Zurich², University of Zagreb³, Pompeu Fabra University⁴, University of Oviedo⁵, Spanish National Research Council⁶

TL;DR: It is found that the derived allele of this site is less efficient than the ancestral allele in activating transcription from a reporter construct, and is a plausible candidate for having caused a recent selective sweep in the FOXP2 gene.

Abstract: The FOXP2 gene is required for normal development of speech and language. By isolating and sequencing FOXP2 genomic DNA fragments from a 49,000-year-old Iberian Neandertal and 50 present-day humans, we have identified substitutions in the gene shared by all or nearly all present-day humans but absent or polymorphic in Neandertals. One such substitution is localized in intron 8 and affects a binding site for the transcription factor POU3F2, which is highly conserved among vertebrates. We find that the derived allele of this site is less efficient than the ancestral allele in activating transcription from a reporter construct. The derived allele also binds less POU3F2 dimers than POU3F2 monomers compared with the ancestral allele. Because the substitution in the POU3F2 binding site is likely to alter the regulation of FOXP2 expression, and because it is localized in a region of the gene associated with a previously described signal of positive selection, it is a plausible candidate for having caused a recent selective sweep in the FOXP2 gene.

Journal Article•DOI•

Evolution of conjugation and type IV secretion systems

Julien Guglielmini, Fernando de la Cruz, Eduardo P. C. Rocha

01 Feb 2013-Molecular Biology and Evolution

TL;DR: The authors analyze the phylogeny of key conjugation proteins to infer the evolutionary history of conjugation and type IV secretion systems (T4SS), and show that single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA) conjugation, while both based on a key AAA+ ATPase, diverged before the last common ancestor of bacteria.

Abstract: Genetic exchange by conjugation is responsible for the spread of resistance, virulence, and social traits among prokaryotes. Recent works unraveled the functioning of the underlying type IV secretion systems (T4SS) and its distribution and recruitment for other biological processes (exaptation), notably pathogenesis. We analyzed the phylogeny of key conjugation proteins to infer the evolutionary history of conjugation and T4SS. We show that single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA) conjugation, while both based on a key AAA+ ATPase, diverged before the last common ancestor of bacteria. The two key ATPases of ssDNA conjugation are monophyletic, having diverged at an early stage from dsDNA translocases. Our data suggest that ssDNA conjugation arose first in diderm bacteria, possibly Proteobacteria, and then spread to other bacterial phyla, including bacterial monoderms and Archaea. Identifiable T4SS fall within the eight monophyletic groups, determined by both taxonomy and structure of the cell envelope. Transfer to monoderms might have occurred only once, but followed diverse adaptive paths. Remarkably, some Firmicutes developed a new conjugation system based on an atypical relaxase and an ATPase derived from a dsDNA translocase. The observed evolutionary rates and patterns of presence/absence of specific T4SS proteins show that conjugation systems are often and independently exapted for other functions. This work brings a natural basis for the classification of all kinds of conjugative systems, thus tackling a problem that is growing as fast as genomic databases. Our analysis provides the first global picture of the evolution of conjugation and shows how a self-transferrable complex multiprotein system has adapted to different taxa and often been recruited by the host.
As conjugation systems became specific to certain clades and cell envelopes, they may have biased the rate and direction of gene transfer by conjugation within prokaryotes.

Journal Article•DOI•

Genetic Signatures Reveal High-Altitude Adaptation in a Set of Ethiopian Populations

Emilia Huerta-Sanchez¹, Michael DeGiorgio¹, Luca Pagani²,³, Ayele Tarekegn⁴, Rosemary Ekong⁵, Tiago Antao², Alexia Cardona², Hugh Montgomery⁵, Gianpiero L. Cavalleri⁶, Peter A. Robbins⁷, Michael E. Weale⁸, Neil Bradman⁵, Endashaw Bekele⁴, Toomas Kivisild², Chris Tyler-Smith³, Rasmus Nielsen•Institutions (8)

University of California, Berkeley¹, University of Cambridge², Wellcome Trust Sanger Institute³, Addis Ababa University⁴, University College London⁵, Royal College of Surgeons in Ireland⁶, University of Oxford⁷, King's College London⁸

TL;DR: The view that Ethiopian, Andean, and Tibetan populations living at high altitude have adapted to hypoxia differently is supported, with convergent evolution affecting different genes from the same pathway.

Abstract: The Tibetan and Andean Plateaus and Ethiopian highlands are the largest regions to have long-term high-altitude residents. Such populations are exposed to lower barometric pressures and hence atmospheric partial pressures of oxygen. Such "hypobaric hypoxia" may limit physical functional capacity, reproductive health, and even survival. As such, selection of genetic variants advantageous to hypoxic adaptation is likely to have occurred. Identifying signatures of such selection is likely to help understanding of hypoxic adaptive processes. Here, we seek evidence of such positive selection using five Ethiopian populations, three of which are from high-altitude areas in Ethiopia. As these populations may have been recipients of Eurasian gene flow, we correct for this admixture. Using single-nucleotide polymorphism genotype data from multiple populations, we find the strongest signal of selection in BHLHE41 (also known as DEC2 or SHARP1). Remarkably, a major role of this gene is regulation of the same hypoxia response pathway on which selection has most strikingly been observed in both Tibetan and Andean populations. Because it is also an important player in the circadian rhythm pathway, BHLHE41 might also provide insights into the mechanisms underlying the recognized impacts of hypoxia on the circadian clock. These results support the view that Ethiopian, Andean, and Tibetan populations living at high altitude have adapted to hypoxia differently, with convergent evolution affecting different genes from the same pathway.

Journal Article•DOI•

Efficient Sequencing of Anuran mtDNAs and a Mitogenomic Exploration of the Phylogeny and Evolution of Frogs

Peng Zhang¹, Dan Liang¹, Rong-Li Mao¹, David M. Hillis², David B. Wake³, David C. Cannatella²•Institutions (3)

Sun Yat-sen University¹, University of Texas at Austin², Museum of Vertebrate Zoology³

TL;DR: An efficient method for sequencing anuran mitochondrial DNAs by amplifying the mitochondrial genome in 12 overlapping fragments using frog-specific universal primer sets is developed and mtDNA performs well for both phylogenetic and divergence time inferences and will provide important reference hypotheses for the phylogeny and evolution of frogs.

Abstract: Anura (frogs and toads) constitute over 88% of living amphibian diversity but many important questions about their phylogeny and evolution remain unresolved. For this study, we developed an efficient method for sequencing anuran mitochondrial DNAs (mtDNAs) by amplifying the mitochondrial genome in 12 overlapping fragments using frog-specific universal primer sets. Based on this method, we generated 47 nearly complete, new anuran mitochondrial genomes and discovered nine novel gene arrangements. By combining the new data and published anuran mitochondrial genomes, we assembled a large mitogenomic data set (11,007 nt) including 90 frog species, representing 39 of 53 recognized anuran families, to investigate their phylogenetic relationships and evolutionary history. The resulting tree strongly supported a paraphyletic arrangement of archaeobatrachian (=nonneobatrachian) frogs, with Leiopelmatoidea branching first, followed by Discoglossoidea, Pipoidea, and Pelobatoidea. Within Neobatrachia, the South African Heleophrynidae is the sister-taxon to all other neobatrachian frogs and the Seychelles-endemic Sooglossidae is recovered as the sister-taxon to Ranoidea. These phylogenetic relationships agree with many nuclear gene studies. The chronogram derived from two Bayesian relaxed clock methods (MultiDivTime and BEAST) suggests that modern frogs (Anura) originated in the early Triassic about 244 Ma and the appearance of Neobatrachia took place in the late Jurassic about 163 Ma. The initial diversifications of two species-rich superfamilies Hyloidea and Ranoidea commenced 110 and 133 Ma, respectively. These times are older than some other estimates by approximately 30‐40 My. Compared with nuclear data, mtDNA produces compatible time estimates for deep nodes (>150 Ma), but apparently older estimates for more shallow nodes.
Our study shows that, although it evolves relatively rapidly and behaves much as a single locus, mtDNA performs well for both phylogenetic and divergence time inferences and will provide important reference hypotheses for the phylogeny and evolution of frogs.

Journal Article•DOI•

Evolutionary and Population Genomics of the Cavity Causing Bacteria Streptococcus mutans

Omar E. Cornejo¹, Tristan Lefébure², Paulina D. Pavinski Bitar², Ping Lang², Vincent P. Richards², Kirsten Eilertson², Thuy Do³, David Beighton³, Lin Zeng⁴, Sang-Joon Ahn⁴, Robert A. Burne⁴, Adam Siepel², Carlos Bustamante¹, Michael J. Stanhope²•Institutions (4)

Stanford University¹, Cornell University², King's College London³, University of Florida⁴

TL;DR: Analysis of the core genome suggested that among 73 genes present in all isolates of S. mutans but absent in other species of the mutans taxonomic group, the majority can be associated with metabolic processes that could have contributed to the successful adaptation of the species to its new niche, the human mouth, and with the dietary changes that accompanied the onset of human agriculture.

Abstract: Streptococcus mutans is widely recognized as one of the key etiological agents of human dental caries. Despite its role in this important disease, our present knowledge of gene content variability across the species and its relationship to adaptation is minimal. Estimates of its demographic history are not available. In this study, we generated genome sequences of 57 S. mutans isolates, as well as representative strains of the most closely related species to S. mutans (S. ratti, S. macaccae, and S. criceti), to identify the overall structure and potential adaptive features of the dispensable and core components of the genome. We also performed population genetic analyses on the core genome of the species aimed at understanding the demographic history, and impact of selection shaping its genetic variation. The maximum gene content divergence among strains was approximately 23%, with the majority of strains diverging by 5–15%. The core genome consisted of 1,490 genes and the pan-genome approximately 3,296. Maximum likelihood analysis of the synonymous site frequency spectrum (SFS) suggested that the S. mutans population started expanding exponentially approximately 10,000 years ago (95% confidence interval [CI]: 3,268–14,344 years ago), coincidental with the onset of human agriculture. Analysis of the replacement SFS indicated that a majority of these substitutions are under strong negative selection, and the remainder evolved neutrally. A set of 14 genes was identified as being under positive selection, most of which were involved in either sugar metabolism or acid tolerance. Analysis of the core genome suggested that among 73 genes present in all isolates of S. mutans but absent in other species of the mutans taxonomic group, the majority can be associated with metabolic processes that could have contributed to the successful adaptation of S. mutans to its new niche, the human mouth, and with the dietary changes that accompanied the origin of agriculture.

Journal Article•DOI•

Evolution of the ARF gene family in land plants: old domains, new tricks.

Cédric Finet¹, Annick Berne-Dedieu¹, Charles P. Scutt¹, Ferdinand Marlétaz²•Institutions (2)

École normale supérieure de Lyon¹, University of Oxford²

01 Jan 2013-Molecular Biology and Evolution

TL;DR: Gene duplications, domain rearrangement, and post-transcriptional regulation have enabled a subtle control of auxin signaling through ARF proteins that may have contributed to the critical importance of these regulators in plant development and evolution.

Abstract: Auxin response factors (ARF) are key players in plant development. They mediate the cellular response to the plant hormone auxin by activating or repressing the expression of downstream developmental genes. The pivotal activation function of ARF proteins is enabled by their four-domain architecture, which includes both DNA-binding and protein dimerization motifs. To determine the evolutionary origin of this characteristic architecture, we built a comprehensive data set of 224 ARF-related protein sequences that represents all major living divisions of land plants, except hornworts. We found that ARFs are split into three subfamilies that could be traced back to the origin of the land plants. We also show that repeated events of extensive gene duplication contributed to the expansion of those three original subfamilies. Further examination of our data set uncovered a broad diversity in the structure of ARF transcripts and allowed us to identify an additional conserved motif in ARF proteins. We found that additional structural diversity in ARF proteins is mainly generated by two mechanisms: genomic truncation and alternative splicing. We propose that the loss of domains from the canonical, four-domain ARF structure has promoted functional shifts within the ARF family by disrupting either dimerization or DNA-binding capabilities. For instance, the loss of dimerization domains in some ARFs from moss and spikemoss genomes leads to proteins that are reminiscent of Aux/IAA proteins, possibly providing a clue on the evolution of these modulators of ARF function. We also assessed the functional impact of alternative splicing in the case of ARF4, for which we have identified a novel isoform in Arabidopsis thaliana. Genetic analysis showed that these two transcripts exhibit markedly different developmental roles in A. thaliana. 
Gene duplications, domain rearrangement, and post-transcriptional regulation have thus enabled a subtle control of auxin signaling through ARF proteins that may have contributed to the critical importance of these regulators in plant development and evolution.

Journal Article•DOI•

Bio++: Efficient Extensible Libraries and Tools for Computational Molecular Evolution

Laurent Guéguen¹, Sylvain Gaillard²,³,⁴, Bastien Boussau⁵,¹, Manolo Gouy¹, Mathieu Groussin¹, Nicolas C. Rochette¹, Thomas Bigot¹, David Fournier⁶, Fanny Pouyet¹, Vincent Cahais⁷, Aurélien Bernard⁷, Celine Scornavacca⁷, Benoit Nabholz⁷, Annabelle Haudry¹, Loïc Dachary, Nicolas Galtier⁷, Khalid Belkhir⁷, Julien Y. Dutheil⁷,⁸•Institutions (8)

University of Lyon¹, Agrocampus Ouest², University of Angers³, Institut national de la recherche agronomique⁴, University of California, Berkeley⁵, Max Delbrück Center for Molecular Medicine⁶, University of Montpellier⁷, Max Planck Society⁸

TL;DR: The second major release of the Bio++ libraries is presented, which provides an extended set of classes and methods that notably provide built-in access to sequence databases and new data structures for handling and manipulating sequences from the omics era, such as multiple genome alignments and sequencing reads libraries.

Abstract: Efficient algorithms and programs for the analysis of the ever-growing amount of biological sequence data are strongly needed in the genomics era. The pace at which new data and methodologies are generated calls for the use of pre-existing, optimized yet extensible code, typically distributed as libraries or packages. This motivated the Bio++ project, aiming at developing a set of C++ libraries for sequence analysis, phylogenetics, population genetics, and molecular evolution. The main attractiveness of Bio++ is the extensibility and reusability of its components through its object-oriented design, without compromising the computer-efficiency of the underlying methods. We present here the second major release of the libraries, which provides an extended set of classes and methods. These extensions notably provide built-in access to sequence databases and new data structures for handling and manipulating sequences from the omics era, such as multiple genome alignments and sequencing reads libraries. More complex models of sequence evolution, such as mixture models and generic n-tuples alphabets, are also included.

Journal Article•DOI•

Crossing the species barrier: genomic hotspots of introgression between two highly divergent Ciona intestinalis species

Camille Roux¹, Georgia Tsagkogeorga²,¹, Nicolas Bierne¹,³, Nicolas Galtier¹•Institutions (3)

University of Montpellier¹, Queen Mary University of London², Centre national de la recherche scientifique³

01 Jul 2013-Molecular Biology and Evolution

TL;DR: The history and degree of isolation of two cryptic and partially sympatric model species are clarified and a methodological framework to investigate genome-wide heterogeneity (GWH) at various stages of speciation process is provided.

Abstract: Inferring a realistic demographic model from genetic data is an important challenge to gain insights into the historical events during the speciation process and to detect molecular signatures of selection along genomes. Recent advances in divergence population genetics have reported that speciation in face of gene flow occurred more frequently than theoretically expected, but the approaches used did not account for genome-wide heterogeneity (GWH) in introgression rates. Here, we investigate the impact of GWH on the inference of divergence with gene flow between two cryptic species of the marine model Ciona intestinalis by analyzing polymorphism and divergence patterns in 852 protein-coding sequence loci. These morphologically similar entities are highly diverged molecular-wise, but evidence of hybridization has been reported in both laboratory and field studies. We compare various speciation models and test for GWH under the approximate Bayesian computation framework. Our results demonstrate the presence of significant extents of gene flow resulting from a recent secondary contact after >3 My of divergence in isolation. The inferred rates of introgression are relatively low, highly variable across loci and mostly unidirectional, which is consistent with the idea that numerous genetic incompatibilities have accumulated over time throughout the genomes of these highly diverged species. A genomic map of the level of gene flow identified two hotspots of introgression, that is, large genome regions of unidirectional introgression. This study clarifies the history and degree of isolation of two cryptic and partially sympatric model species and provides a methodological framework to investigate GWH at various stages of speciation process.

Journal Article•DOI•

Genetic Evidence of Paleolithic Colonization and Neolithic Expansion of Modern Humans on the Tibetan Plateau

Xuebin Qi¹, Chaoying Cui, Yi Peng¹,², Xiaoming Zhang²,¹, Zhaohui Yang²,¹, Hua Zhong¹, Hui Zhang¹, Kun Xiang¹,², Xiangyu Cao¹,², Yi Wang¹,², Ouzhuluobu, Basang, Ciwangsangbu, Bianba, Gonggalanzi, Tianyi Wu, Hua Chen³, Hong Shi¹, Bing Su¹•Institutions (3)

Kunming Institute of Zoology¹, Chinese Academy of Sciences², Harvard University³