A general species delimitation method with applications to phylogenetic placements
TLDR: The Poisson tree processes (PTP) model is introduced to infer putative species boundaries on a given phylogenetic input tree; combined with the evolutionary placement algorithm (EPA-PTP), it yields more accurate results than de novo species delimitation methods.
Abstract:
Motivation: Sequence-based methods to delimit species are central to DNA taxonomy, microbial community surveys and DNA metabarcoding studies. Current approaches rely either on simple sequence similarity thresholds (OTU-picking) or on complex and compute-intensive evolutionary models. The OTU-picking methods scale well to large datasets, but their results are highly sensitive to the similarity threshold. Coalescent-based species delimitation approaches often rely on Bayesian statistics and Markov Chain Monte Carlo sampling, and can therefore only be applied to small datasets.
Results: We introduce the Poisson tree processes (PTP) model to infer putative species boundaries on a given phylogenetic input tree. We also integrate PTP with our evolutionary placement algorithm (EPA-PTP) to count the number of species in phylogenetic placements. We compare our approaches with popular OTU-picking methods and the General Mixed Yule Coalescent (GMYC) model. For de novo species delimitation, the stand-alone PTP model generally outperforms GMYC as well as OTU-picking methods when evolutionary distances between species are small. PTP requires neither an ultrametric input tree nor a sequence similarity threshold as input. In the open reference species delimitation approach, EPA-PTP yields more accurate results than de novo species delimitation methods. Finally, EPA-PTP scales to large datasets because it relies on the parallel implementations of the EPA and RAxML, which makes it possible to delimit species in high-throughput sequencing data.
Availability and implementation: The code is freely available at www…
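The threshold sensitivity of OTU-picking noted in the abstract can be illustrated with a minimal sketch. This is not the authors' code, nor any specific OTU-picking tool: the function names `identity` and `greedy_otus`, the greedy centroid scheme, and the toy sequences are all assumptions made for illustration only.

```python
from __future__ import annotations


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs: list[str], threshold: float) -> list[list[str]]:
    """Hypothetical greedy centroid clustering.

    Each sequence joins the first OTU whose centroid (its first member)
    it matches at >= threshold; otherwise it founds a new OTU.
    """
    otus: list[list[str]] = []
    for s in seqs:
        for otu in otus:
            if identity(s, otu[0]) >= threshold:
                otu.append(s)
                break
        else:
            otus.append([s])
    return otus


# Toy aligned sequences, invented for this example (not data from the paper).
toy = [
    "AAAAAAAAAA",
    "AAAAAAAAAT",  # 90% identical to the first sequence
    "AAAAAAAATT",  # 80% identical to the first sequence
    "CCCCCCCCCC",  # shares no position with the first sequence
]

# The number of inferred "species" changes with every threshold tried.
for t in (0.95, 0.85, 0.75):
    print(f"threshold={t:.2f} -> {len(greedy_otus(toy, t))} OTUs")
```

On this toy input the same four sequences yield four, three, or two clusters depending on the chosen cutoff, which is the kind of dependence a threshold-free method such as PTP is meant to avoid.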