Journal Article

How to fail at species delimitation.

01 Sep 2013 - Molecular Ecology (Mol Ecol) - Vol. 22, Iss. 17, pp. 4369-4383
TL;DR: Researchers should apply a wide range of species delimitation analyses to their data and place their trust in delimitations that are congruent across methods, for in most contexts it is better to fail to delimit species than it is to falsely delimit entities that do not represent actual evolutionary lineages.
Abstract: Species delimitation is the act of identifying species-level biological diversity. In recent years, the field has witnessed a dramatic increase in the number of methods available for delimiting species. However, most recent investigations only utilize a handful (i.e. 2–3) of the available methods, often for unstated reasons. Because the parameter space that is potentially relevant to species delimitation far exceeds the parameterization of any existing method, a given method necessarily makes a number of simplifying assumptions, any one of which could be violated in a particular system. We suggest that researchers should apply a wide range of species delimitation analyses to their data and place their trust in delimitations that are congruent across methods. Incongruence across the results from different methods is evidence of either a difference in the power to detect cryptic lineages across one or more of the approaches used to delimit species or a violation of the assumptions of one or more of the methods. In either case, the inferences drawn from species delimitation studies should be conservative, for in most contexts it is better to fail to delimit species than it is to falsely delimit entities that do not represent actual evolutionary lineages.
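The recommendation to trust only delimitations that are congruent across methods can be illustrated with a minimal Python sketch. It is not taken from the paper: the method names, sample IDs, and labels are hypothetical, and "congruent" is read here in the strictest sense, meaning a group of samples counts only if every method recovers exactly that group.

```python
def clusters(assignment):
    """Turn {sample: label} into a set of frozensets, one per putative species."""
    groups = {}
    for sample, label in assignment.items():
        groups.setdefault(label, set()).add(sample)
    return {frozenset(members) for members in groups.values()}


def congruent_groups(results):
    """Return the groups of samples that every method recovers identically."""
    partitions = [clusters(assignment) for assignment in results.values()]
    return set.intersection(*partitions)


# Hypothetical output of three delimitation methods on four samples.
# Labels are arbitrary per method; only the grouping matters.
results = {
    "method_A": {"s1": "A", "s2": "A", "s3": "B", "s4": "C"},
    "method_B": {"s1": 1, "s2": 1, "s3": 2, "s4": 2},
    "method_C": {"s1": "x", "s2": "x", "s3": "y", "s4": "z"},
}

supported = congruent_groups(results)
for group in supported:
    print(sorted(group))
```

In this toy input only the group containing s1 and s2 is recovered by all three methods; s3 and s4 are split by two methods and lumped by the third, so under a conservative reading they would remain undelimited.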
Citations
Journal Article
TL;DR: It is shown that the multispecies coalescent diagnoses genetic structure, not species, and that it does not statistically distinguish structure associated with population isolation vs. species boundaries.
Abstract: The multispecies coalescent model underlies many approaches used for species delimitation. In previous work assessing the performance of species delimitation under this model, speciation was treated as an instantaneous event rather than as an extended process involving distinct phases of speciation initiation (structuring) and completion. Here, we use data under simulations that explicitly model speciation as an extended process rather than an instantaneous event and carry out species delimitation inference on these data under the multispecies coalescent. We show that the multispecies coalescent diagnoses genetic structure, not species, and that it does not statistically distinguish structure associated with population isolation vs. species boundaries. Because of the misidentification of population structure as putative species, our work raises questions about the practice of genome-based species discovery, with cascading consequences in other fields. Specifically, all fields that rely on species as units of analysis, from conservation biology to studies of macroevolutionary dynamics, will be impacted by inflated estimates of the number of species, especially as genomic resources provide unprecedented power for detecting increasingly finer-scaled genetic structure under the multispecies coalescent. As such, our work also represents a general call for systematic study to reconsider a reliance on genomic data alone. Until new methods are developed that can discriminate between structure due to population-level processes and that due to species boundaries, genomic-based results should only be considered a hypothesis that requires validation of delimited species with multiple data types, such as phenotypic and ecological information.

614 citations


Cites background from "How to fail at species delimitation..."

  • ...Misidentification of population structure as putative species is therefore emerging as a key issue (8) that has received insufficient attention, especially with respect to methodologies for delimiting taxa based on genetic data alone....


Journal Article
TL;DR: An overview and a tutorial of the BPP program, which is a Bayesian MCMC program for analyzing multi-locus genomic sequence data under the multispecies coalescent model, is provided.
Abstract: This paper provides an overview and a tutorial of the BPP program, which is a Bayesian MCMC program for analyzing multi-locus genomic sequence data under the multispecies coalescent model. An example dataset of five nuclear loci from the East Asian brown frogs is used to illustrate four different analyses, including estimation of species divergence time and population size parameters under the multispecies coalescent model on a fixed species phylogeny (A00), species tree estimation when the assignment and species delimitation are fixed (A01), species delimitation using a fixed guide tree (A10), and joint species delimitation and species-tree estimation or unguided species delimitation (A11). For the joint analysis (A11), two new priors are introduced, which assign uniform probabilities for the different numbers of delimited species, which may be useful when assignment, species delimitation, and species phylogeny are all inferred in one joint analysis. The paper ends with a discussion of the assumptions, the strengths and weaknesses of the BPP analysis [Current Zoology 61 (5) : – , 2015 ].

548 citations


Cites methods from "How to fail at species delimitation..."

  • ...See Fujita et al. (2012) and Carstens et al. (2013) for reviews on species delimitation methods using genetic sequence data....


Journal Article
TL;DR: MALDI-TOF mass spectrometry readily distinguishes the newly recognized species, which differ in aspects of pathogenicity, prevalence for patient groups, as well as biochemical and physiological aspects, such as susceptibility to antifungals.

543 citations

Journal Article
TL;DR: It is demonstrated that ASAP has the potential to become a major tool for taxonomists as it proposes rapidly in a full graphical exploratory interface relevant species hypothesis as a first step of the integrative taxonomy process.
Abstract: Here, we describe Assemble Species by Automatic Partitioning (ASAP), a new method to build species partitions from single locus sequence alignments (i.e., barcode data sets). ASAP is efficient enough to split data sets as large as 10^4 sequences into putative species in several minutes. Although grounded in evolutionary theory, ASAP is the implementation of a hierarchical clustering algorithm that only uses pairwise genetic distances, avoiding the computational burden of phylogenetic reconstruction. Importantly, ASAP proposes species partitions ranked by a new scoring system that uses no biological prior insight of intraspecific diversity. ASAP is a stand-alone program that can be used either through a graphical web-interface or that can be downloaded and compiled for local usage. We have assessed its power along with three other programs (ABGD, PTP and GMYC) on 10 real COI barcode data sets representing various degrees of challenge (from small and easy cases to large and complicated data sets). We also used Monte-Carlo simulations of a multispecies coalescent framework to assess the strengths and weaknesses of ASAP and the other programs. Through these analyses, we demonstrate that ASAP has the potential to become a major tool for taxonomists as it proposes rapidly in a full graphical exploratory interface relevant species hypotheses as a first step of the integrative taxonomy process.

393 citations


Cites methods from "How to fail at species delimitation..."

  • ...In this category, the most popular methods are SpedeSTEM (Ence & Carstens, 2011), BPP (Yang & Rannala, 2014) and BFD (Leaché et al., 2014), reviewed (with other methods) in several articles (Camargo & Sites, 2013; Carstens et al., 2013; Fujita et al., 2012; Leavitt et al., 2015; Rannala, 2015)....


Journal Article
TL;DR: A recently introduced dynamic programming algorithm for estimating species trees that bypasses MCMC integration over gene trees with sophisticated methods for estimating marginal likelihoods, needed for Bayesian model selection, are combined to provide a rigorous and computationally tractable technique for genome-wide species delimitation.
Abstract: The multispecies coalescent has provided important progress for evolutionary inferences, including increasing the statistical rigor and objectivity of comparisons among competing species delimitation models. However, Bayesian species delimitation methods typically require brute force integration over gene trees via Markov chain Monte Carlo (MCMC), which introduces a large computation burden and precludes their application to genomic-scale data. Here we combine a recently introduced dynamic programming algorithm for estimating species trees that bypasses MCMC integration over gene trees with sophisticated methods for estimating marginal likelihoods, needed for Bayesian model selection, to provide a rigorous and computationally tractable technique for genome-wide species delimitation. We provide a critical yet simple correction that brings the likelihoods of different species trees, and more importantly their corresponding marginal likelihoods, to the same common denominator, which enables direct and accurate comparisons of competing species delimitation models using Bayes factors. We test this approach, which we call Bayes factor delimitation (*with genomic data; BFD*), using common species delimitation scenarios with computer simulations. Varying the numbers of loci and the number of samples suggest that the approach can distinguish the true model even with few loci and limited samples per species. Misspecification of the prior for population size θ has little impact on support for the true model. We apply the approach to West African forest geckos (Hemidactylus fasciatus complex) using genome-wide SNP data. This new Bayesian method for species delimitation builds on a growing trend for objective species delimitation methods with explicit model assumptions that are easily tested. [Bayes factor; model testing; phylogeography; RADseq; simulation; speciation.].
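The comparison of competing delimitation models by Bayes factors described in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the BFD* implementation: the model names and log marginal likelihood values are hypothetical, and the statistic is written as 2 ln BF = 2 × (ln ML_best − ln ML_other), so larger values mean stronger support for the best model over the other.

```python
def two_ln_bf(ln_ml_other, ln_ml_best):
    """2 ln Bayes factor in favour of the best model over another model."""
    return 2.0 * (ln_ml_best - ln_ml_other)


def rank_models(marginal_lnls):
    """Rank models by log marginal likelihood.

    marginal_lnls: {model_name: log marginal likelihood}
    Returns (name, 2 ln BF against that model) pairs, best model first.
    """
    best = max(marginal_lnls.values())
    rows = [(name, two_ln_bf(lnl, best)) for name, lnl in marginal_lnls.items()]
    return sorted(rows, key=lambda row: row[1])


# Hypothetical log marginal likelihoods for three delimitation models.
marginal_lnls = {
    "lump_to_2_species": -5230.4,
    "current_taxonomy_3_species": -5201.9,
    "split_to_4_species": -5188.2,
}

ranking = rank_models(marginal_lnls)
for name, stat in ranking:
    print(f"{name}: 2lnBF = {stat:.1f}")
```

The best-supported model always appears first with a value of 0; how large a value counts as decisive depends on the interpretive scale the analyst adopts.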

376 citations


Cites background from "How to fail at species delimitation..."

  • ...Thus far, species delimitation studies using gene trees have been limited to approximately 20 loci (Carstens et al. 2013), but as many studies trend toward large phylogenomic datasets exceeding 100s of loci (O’Neill et al. 2013; Wagner et al. 2013; Smith et al. 2013) there is a real need for…...

  • ...Indeed, recent progress in statistical species delimitation has largely focused on genetic data (Fujita et al. 2012; Carstens et al. 2013), likely due to the ease with which they can be abundantly collected even for nonmodel organisms....

References
Book
19 Jun 2013
TL;DR: The second edition of this book is unique in that it focuses on methods for making formal statistical inference from all the models in an a priori set (Multi-Model Inference).
Abstract: Introduction * Information and Likelihood Theory: A Basis for Model Selection and Inference * Basic Use of the Information-Theoretic Approach * Formal Inference From More Than One Model: Multi-Model Inference (MMI) * Monte Carlo Insights and Extended Examples * Statistical Theory and Numerical Results * Summary

36,993 citations


"How to fail at species delimitation..." refers methods in this paper

  • ...…coalescent is spedeSTEM (Ence & Carstens 2011), a maximum-likelihood approach that uses STEM (Kubatko et al. 2009) to calculate the probability of models that contain differing numbers of evolutionary lineages and then uses information theory (see Burnham & Anderson 2002) to rank these models....


  • ...The putative lineages are then sequentially collapsed on the basis of which samples are most closely related, the probability of the species tree is recalculated, and information theory (Burnham & Anderson 2002) is used to identify the optimal model of lineage composition....
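The information-theoretic ranking mentioned in these excerpts can be sketched with Akaike weights in the sense of Burnham & Anderson. This is a generic illustration, not spedeSTEM's own code: the model names, log-likelihoods, and parameter counts are hypothetical, and AIC is computed as 2k − 2 ln L.

```python
import math


def akaike_weights(models):
    """Compute Akaike weights.

    models: {name: (log_likelihood, n_parameters)}
    Returns {name: weight}, with weights summing to 1.
    """
    aic = {name: 2 * k - 2 * lnl for name, (lnl, k) in models.items()}
    best = min(aic.values())
    # Relative likelihood of each model given its AIC difference from the best.
    rel = {name: math.exp(-0.5 * (value - best)) for name, value in aic.items()}
    total = sum(rel.values())
    return {name: r / total for name, r in rel.items()}


# Hypothetical models with differing numbers of lineages.
models = {
    "1 lineage": (-1050.0, 1),
    "2 lineages": (-1040.0, 2),
    "3 lineages": (-1039.5, 3),
}

weights = akaike_weights(models)
for name, w in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{name}: weight = {w:.3f}")
```

With these made-up numbers the two-lineage model ranks first: the three-lineage model fits slightly better but not by enough to offset its extra parameter.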


Journal Article
01 Jun 2000 - Genetics
TL;DR: Pritchard et al. propose a model-based clustering method that uses multilocus genotype data to infer population structure and assign individuals to populations; it can be applied to most of the commonly used genetic markers, provided that they are not closely linked.
Abstract: We describe a model-based clustering method for using multilocus genotype data to infer population structure and assign individuals to populations. We assume a model in which there are K populations (where K may be unknown), each of which is characterized by a set of allele frequencies at each locus. Individuals in the sample are assigned (probabilistically) to populations, or jointly to two or more populations if their genotypes indicate that they are admixed. Our model does not assume a particular mutation process, and it can be applied to most of the commonly used genetic markers, provided that they are not closely linked. Applications of our method include demonstrating the presence of population structure, assigning individuals to populations, studying hybrid zones, and identifying migrants and admixed individuals. We show that the method can produce highly accurate assignments using modest numbers of loci— e.g. , seven microsatellite loci in an example using genotype data from an endangered bird species. The software used for this article is available from http://www.stats.ox.ac.uk/~pritch/home.html.

27,454 citations

Book
03 Sep 2009
TL;DR: This "Penguin Classics" edition of "On the Origin of Species" contains an introduction and notes by William Bynum, and features a cover designed by Damien Hirst.
Abstract: Charles Darwin's seminal formulation of the theory of evolution, "On the Origin of Species" continues to be as controversial today as when it was first published. This "Penguin Classics" edition contains an introduction and notes by William Bynum, and features a cover designed by Damien Hirst. Written for a general readership, "On the Origin of Species" sold out on the day of its publication and has remained in print ever since. Instantly and persistently controversial, the concept of natural selection transformed scientific analysis about all life on Earth. Before the "Origin of Species", accepted thinking held that life was the static and perfect creation of God. By a single, systematic argument Darwin called this view into question. His ideas have affected public perception of everything from religion to economics. William Bynum's introduction discusses Darwin's life, the publication and reception of the themes of "On the Origin of Species", and the subsequent development of its major themes. The new edition also includes brief biographies of some of the most important scientific thinkers leading up to and surrounding the "Origin of Species", suggested further reading, notes and a chronology. Charles Darwin (1809-82), a Victorian scientist and naturalist, has become one of the most famous figures of science to date. The advent of "On the Origin of Species" by means of natural selection in 1859 challenged and contradicted all contemporary biological and religious beliefs. If you enjoyed "On the Origin of Species", you might like Darwin's "The Descent of Man", also available in "Penguin Classics".

7,487 citations

Book
03 Jan 2000
TL;DR: This book covers the history and purview of phylogeography, empirical intraspecific phylogeography, and genealogical concordance, with applications to speciation processes.
Abstract: Preface * I. History and Conceptual Background: 1. The History and Purview of Phylogeography; 2. Demography-Phylogeny Connections * II. Empirical Intraspecific Phylogeography: 3. Lessons from Human Analyses; 4. Intraspecific Patterns in other Animals * III. Genealogical Concordance: Toward Speciation and Beyond: 5. Genealogical Concordance; 6. Speciation Processes and Extended Genealogy * Works Cited * Index

4,753 citations

Journal Article
01 Feb 1872 - Nature
TL;DR: A man is unworthy of the name of a man of science who, whatever may be his special branch of study, has not materially altered his views on some important points within the last twelve years.
Abstract: FEW are the writers, scientific or otherwise, who can afford, in every successive edition of their works, to place side by side the passages which they have seen reason to alter, from a change of view or any other cause. And yet to this point we find especial attention called in each succeeding edition of Mr. Darwin's “Origin of Species.” And herein lies the true humility of the man of science. Science is often charged with being arrogant. But the true student of Nature cannot be otherwise than humble-minded. That man is unworthy of the name of a man of science who, whatever may be his special branch of study, has not materially altered his views on some important points within the last twelve years.* The means at our command for obtaining correct views of the laws which govern Nature are ever increasing, and if we only... The Origin of Species by means of Natural Selection; or the Preservation of Favoured Races in the Struggle for Life. By Charles Darwin, M.A., F.R.S. Sixth edition, with additions and corrections. (London: J. Murray, 1872.)

3,808 citations