Author

Nathan V. Whelan

Bio: Nathan V. Whelan is an academic researcher at the United States Fish and Wildlife Service. The author has contributed to research on topics including genetic diversity and threatened species. The author has an h-index of 14 and has co-authored 41 publications receiving 1003 citations. Previous affiliations of Nathan V. Whelan include the University of Alabama and Auburn University.

Papers
Journal Article
TL;DR: By expanding taxon sampling with eight novel transcriptomes, strictly enforcing orthology inference criteria, and progressively examining potential causes of systematic error under both maximum likelihood with robust data partitioning and Bayesian inference with a site-heterogeneous model, this study finds strong support for placing ctenophores as sister to all other animals.
Abstract: Elucidating relationships among early animal lineages has been difficult, and recent phylogenomic analyses place Ctenophora sister to all other extant animals, contrary to the traditional view of Porifera as the earliest-branching animal lineage. To date, phylogenetic support for either ctenophores or sponges as sister to other animals has been limited and inconsistent among studies. Lack of agreement among phylogenomic analyses using different data and methods obscures how complex traits, such as epithelia, neurons, and muscles evolved. A consensus view of animal evolution will not be accepted until datasets and methods converge on a single hypothesis of early metazoan relationships and putative sources of systematic error (e.g., long-branch attraction, compositional bias, poor model choice) are assessed. Here, we investigate possible causes of systematic error by expanding taxon sampling with eight novel transcriptomes, strictly enforcing orthology inference criteria, and progressively examining potential causes of systematic error while using both maximum-likelihood with robust data partitioning and Bayesian inference with a site-heterogeneous model. We identified ribosomal protein genes as possessing a conflicting signal compared with other genes, which caused some past studies to infer ctenophores and cnidarians as sister. Importantly, biases resulting from elevated compositional heterogeneity or elevated substitution rates are ruled out. Placement of ctenophores as sister to all other animals, and sponge monophyly, are strongly supported under multiple analyses, herein.

296 citations

Journal Article
TL;DR: Newly sequenced transcriptomes are combined with existing data to establish Ctenophora as the sister group to all other animals and suggest a radiation around 350 Ma as well as multiple transitions from a pelagic to benthic lifestyle within ctenophores.
Abstract: Ctenophora, comprising approximately 200 described species, is an important lineage for understanding metazoan evolution and is of great ecological and economic importance. Ctenophore diversity includes species with unique colloblasts used for prey capture, smooth and striated muscles, benthic and pelagic lifestyles, and locomotion with ciliated paddles or muscular propulsion. However, the ancestral states of traits are debated and relationships among many lineages are unresolved. Here, using 27 newly sequenced ctenophore transcriptomes, publicly available data and methods to control systematic error, we establish the placement of Ctenophora as the sister group to all other animals and refine the phylogenetic relationships within ctenophores. Molecular clock analyses suggest modern ctenophore diversity originated approximately 350 million years ago ± 88 million years, conflicting with previous hypotheses, which suggest it originated approximately 65 million years ago. We recover Euplokamis dunlapae, a species with striated muscles, as the sister lineage to other sampled ctenophores. Ancestral state reconstruction shows that the most recent common ancestor of extant ctenophores was pelagic, possessed tentacles, was bioluminescent and did not have separate sexes. Our results imply at least two transitions from a pelagic to benthic lifestyle within Ctenophora, suggesting that such transitions were more common in animal diversification than previously thought.

183 citations

Journal Article
TL;DR: Comparison of modern to background extinction rates reveals that gastropods have the highest modern extinction rate yet observed, 9,539 times greater than background rates.
Abstract: This is the first American Fisheries Society conservation assessment of freshwater gastropods (snails) from Canada and the United States by the Gastropod Subcommittee (Endangered Species Committee). This review covers 703 species representing 16 families and 93 genera, of which 67 species are considered extinct, or possibly extinct, 278 are endangered, 102 are threatened, 73 are vulnerable, 157 are currently stable, and 26 species have uncertain taxonomic status. Of the entire fauna, 74% of gastropods are imperiled (vulnerable, threatened, endangered) or extinct, which exceeds imperilment levels in fishes (39%) and crayfishes (48%) but is similar to that of mussels (72%). Comparison of modern to background extinction rates reveals that gastropods have the highest modern extinction rate yet observed, 9,539 times greater than background rates. Gastropods are highly susceptible to habitat loss and degradation, particularly narrow endemics restricted to a single spring or short stream reaches. Compil...
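The 74% figure can be checked directly from the category counts given in the abstract. The following is a minimal sketch of that arithmetic; the dictionary keys and variable names are illustrative labels chosen here, not terms from the assessment itself.

```python
# Species counts per conservation category, as stated in the abstract.
# The key names are illustrative labels, not official category codes.
counts = {
    "extinct_or_possibly_extinct": 67,
    "endangered": 278,
    "threatened": 102,
    "vulnerable": 73,
    "stable": 157,
    "uncertain_taxonomy": 26,
}

# Total number of species reviewed.
total = sum(counts.values())

# "Imperiled" covers vulnerable, threatened, and endangered species;
# the abstract's percentage also includes extinct or possibly extinct species.
imperiled_or_extinct = sum(
    counts[key]
    for key in (
        "extinct_or_possibly_extinct",
        "endangered",
        "threatened",
        "vulnerable",
    )
)

share = imperiled_or_extinct / total
print(f"{imperiled_or_extinct} of {total} species ({share:.0%}) are imperiled or extinct")
```

The six categories sum to the 703 species covered by the review, and 520 of them fall in the imperiled or extinct groups, which rounds to the reported 74%.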

172 citations

Journal Article
TL;DR: Partitioning and CAT-GTR perform similarly in recovering accurate branching patterns; however, computation time can be orders of magnitude less for data partitioning, with commonly used implementations of CAT-GTR often failing to reach completion in a reasonable time frame.
Abstract: As phylogenetic datasets have increased in size, site-heterogeneous substitution models such as CAT-F81 and CAT-GTR have been advocated in favor of other models because they purportedly suppress long-branch attraction (LBA). These models are two of the most commonly used models in phylogenomics, and they have been applied to a variety of taxa, ranging from Drosophila to land plants. However, many arguments in favor of CAT models have been based on tenuous assumptions about the true phylogeny, rather than rigorous testing with known trees via simulation. Moreover, CAT models have not been compared to other approaches for handling substitutional heterogeneity such as data partitioning with site-homogeneous substitution models. We simulated amino acid sequence datasets with substitutional heterogeneity on a variety of tree shapes including those susceptible to LBA. Data were analyzed with both CAT models and partitioning to explore model performance; in total over 670,000 CPU hours were used, of which over 97% was spent running analyses with CAT models. In many cases, all models recovered branching patterns that were identical to the known tree. However, CAT-F81 consistently performed worse than other models in inferring the correct branching patterns, and both CAT models often overestimated substitutional heterogeneity. Additionally, reanalysis of two empirical metazoan datasets supports the notion that CAT-F81 tends to recover less accurate trees than data partitioning and CAT-GTR. Given these results, we conclude that partitioning and CAT-GTR perform similarly in recovering accurate branching patterns. However, computation time can be orders of magnitude less for data partitioning, with commonly used implementations of CAT-GTR often failing to reach completion in a reasonable time frame (i.e., for Bayesian analyses to converge). 
Practices such as removing constant sites and parsimony uninformative characters, or using CAT-F81 when CAT-GTR is deemed too computationally expensive, cannot be logically justified. Given clear problems with CAT-F81, phylogenies previously inferred with this model should be reassessed. [Data partitioning; phylogenomics; simulation; site-heterogeneity; substitution models.]

76 citations

Journal Article
TL;DR: Phylogenetic analyses robustly supported a monophyletic Unionidae, with Coelatura recovered as part of a well-supported Africa-India clade (=Parreysiinae), and the implications are discussed in the context of Afrotropical freshwater mussel evolution and the classification of the family Unionidae.

60 citations


Cited by
Journal Article
Fumio Tajima
30 Oct 1989-Genetics
TL;DR: It is suggested that natural selection against large insertions/deletions is so weak that a large amount of variation is maintained in a population.

11,521 citations

Journal Article
TL;DR: Some notable features of IQ-TREE version 2 are described and the key advantages over other software are highlighted.
Abstract: IQ-TREE (http://www.iqtree.org, last accessed February 6, 2020) is a user-friendly and widely used software package for phylogenetic inference using maximum likelihood. Since the release of version 1 in 2014, we have continuously expanded IQ-TREE to integrate a plethora of new models of sequence evolution and efficient computational approaches of phylogenetic inference to deal with genomic data. Here, we describe notable features of IQ-TREE version 2 and highlight the key advantages over other software.

4,337 citations

Journal Article
TL;DR: FastTree uses sequence profiles of internal nodes in the tree to implement neighbor-joining, uses heuristics to quickly identify candidate joins, and then uses nearest-neighbor interchanges to reduce the length of the tree.
Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement neighbor-joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest-neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and an alphabet of size a, a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O(NLa + N sqrt(N)) memory and O(N sqrt(N) log(N) L a) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 hours and 2.4 gigabytes of memory. Just computing pairwise Jukes-Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 hours and 50 gigabytes of memory. In simulations, FastTree was slightly more accurate than neighbor joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.
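To give a feel for the space bounds quoted in the abstract, the sketch below evaluates the two growth terms, N^2 for a distance matrix and NLa + N sqrt(N) for FastTree's profiles, at the 158,022-sequence example. The function names are hypothetical, and the alignment length of 1,500 sites and alphabet size of 4 are assumptions made here for illustration; the abstract does not state them. Constant factors are ignored, so the numbers are unitless term counts, not bytes.

```python
import math


def distance_matrix_entries(n_seqs: int) -> int:
    """Growth term for a full distance matrix: N^2 entries."""
    return n_seqs * n_seqs


def fasttree_memory_term(n_seqs: int, n_sites: int, alphabet_size: int) -> float:
    """Growth term from the abstract's bound: N*L*a + N*sqrt(N)."""
    return n_seqs * n_sites * alphabet_size + n_seqs * math.sqrt(n_seqs)


# Number of sequences comes from the abstract's 16S rRNA example.
n_seqs = 158_022
# ASSUMPTION: alignment length and alphabet size are illustrative values only.
assumed_sites = 1_500
assumed_alphabet = 4

matrix_term = distance_matrix_entries(n_seqs)
profile_term = fasttree_memory_term(n_seqs, assumed_sites, assumed_alphabet)

print(f"distance matrix term: {matrix_term:.3e}")
print(f"FastTree profile term: {profile_term:.3e}")
print(f"ratio (matrix / FastTree): {matrix_term / profile_term:.1f}")
```

Because asymptotic bounds hide constants, this ratio should be read only as an indication of how the two terms scale, not as a prediction of the 50 GB versus 2.4 GB figures reported in the abstract.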

2,436 citations

Journal Article
30 May 2014-Science
TL;DR: This review covers the biodiversity of eukaryote species and their extinction rates, distributions, and protection, and considers what future rates of species extinction will be, how well protected areas will slow extinction rates, and how the remaining gaps in knowledge might be filled.
Abstract: Background: A principal function of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) is to “perform regular and timely assessments of knowledge on biodiversity.” In December 2013, its second plenary session approved a program to begin a global assessment in 2015. The Convention on Biological Diversity (CBD) and five other biodiversity-related conventions have adopted IPBES as their science-policy interface, so these assessments will be important in evaluating progress toward the CBD’s Aichi Targets of the Strategic Plan for Biodiversity 2011–2020. As a contribution toward such assessment, we review the biodiversity of eukaryote species and their extinction rates, distributions, and protection. We document what we know, how it likely differs from what we do not, and how these differences affect biodiversity statistics. Interestingly, several targets explicitly mention “known species”, a strong, if implicit, statement of incomplete knowledge. We start by asking how many species are known and how many remain undescribed. We then consider by how much human actions inflate extinction rates. Much depends on where species are, because different biomes contain different numbers of species of different susceptibilities. Biomes also suffer different levels of damage and have unequal levels of protection. How extinction rates will change depends on how and where threats expand and whether greater protection counters them. Advances: Recent studies have clarified where the most vulnerable species live, where and how humanity changes the planet, and how this drives extinctions. These data are increasingly accessible, bringing greater transparency to science and governance. Taxonomic catalogs of plants, terrestrial vertebrates, freshwater fish, and some marine taxa are sufficient to assess their status and the limitations of our knowledge. Most species are undescribed, however. The species we know best have large geographical ranges and are often common within them. Most known species have small ranges, however, and such species are typically newer discoveries. The numbers of known species with very small ranges are increasing quickly, even in well-known taxa. They are geographically concentrated and are disproportionately likely to be threatened or already extinct. We expect unknown species to share these characteristics. Current rates of extinction are about 1000 times the background rate of extinction. These are higher than previously estimated and likely still underestimated. Future rates will depend on many factors and are poised to increase. Finally, although there has been rapid progress in developing protected areas, such efforts are not ecologically representative, nor do they optimally protect biodiversity. Outlook: Progress on assessing biodiversity will emerge from continued expansion of the many recently created online databases, combining them with new global data sources on changing land and ocean use and with increasingly crowdsourced data on species’ distributions. Examples of practical conservation that follow from using combined data in Colombia and Brazil can be found at www.savingspecies.org and www.youtube.com/watch?v=R3zjeJW2NVk.
Figure: Different visualizations of species biodiversity. (A) The distributions of 9927 bird species. (B) The 4964 species with smaller than the median geographical range size. (C) The 1308 species assessed as threatened with a high risk of extinction by BirdLife International for the Red List of Threatened Species of the International Union for Conservation of Nature. (D) The 1080 threatened species with less than the median range size. (D) provides a strong geographical focus on where local conservation actions can have the greatest global impact. Additional biodiversity maps are available at www.biodiversitymapping.org.

2,360 citations

Journal Article
TL;DR: RAxML-NG is presented, a from-scratch re-implementation of the established greedy tree search algorithm of RAxML/ExaML, which offers improved accuracy, flexibility, speed, scalability, and usability compared with RAxML/ExaML.
Abstract: Motivation: Phylogenies are important for fundamental biological research, but also have numerous applications in biotechnology, agriculture and medicine. Finding the optimal tree under the popular maximum likelihood (ML) criterion is known to be NP-hard. Thus, highly optimized and scalable codes are needed to analyze constantly growing empirical datasets. Results: We present RAxML-NG, a from-scratch re-implementation of the established greedy tree search algorithm of RAxML/ExaML. RAxML-NG offers improved accuracy, flexibility, speed, scalability, and usability compared with RAxML/ExaML. On taxon-rich datasets, RAxML-NG typically finds higher-scoring trees than IQ-Tree, an increasingly popular recent tool for ML-based phylogenetic inference (although IQ-Tree shows better stability). Finally, RAxML-NG introduces several new features, such as the detection of terraces in tree space and the recently introduced transfer bootstrap support metric. Availability and implementation: The code is available under GNU GPL at https://github.com/amkozlov/raxml-ng. RAxML-NG web service (maintained by Vital-IT) is available at https://raxml-ng.vital-it.ch/. Supplementary information: Supplementary data are available at Bioinformatics online.

1,765 citations