Open Access Journal Article

Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2’s q2-feature-classifier plugin

TLDR
The results illustrate the importance of parameter tuning for optimizing classifier performance, and recommendations are made regarding parameter choices for these classifiers under a range of standard operating conditions.
Abstract
Taxonomic classification of marker-gene sequences is an important step in microbiome analysis. We present q2-feature-classifier (https://github.com/qiime2/q2-feature-classifier), a QIIME 2 plugin containing several novel machine-learning and alignment-based methods for taxonomy classification. We evaluated and optimized several commonly used classification methods implemented in QIIME 1 (RDP, BLAST, UCLUST, and SortMeRNA) and several new methods implemented in QIIME 2 (a scikit-learn naive Bayes machine-learning classifier, and alignment-based taxonomy consensus methods based on VSEARCH and BLAST+) for classification of bacterial 16S rRNA and fungal ITS marker-gene amplicon sequence data. The naive Bayes, BLAST+-based, and VSEARCH-based classifiers implemented in QIIME 2 meet or exceed the species-level accuracy of the other commonly used marker-gene classification methods evaluated in this work. These evaluations, based on 19 mock communities and error-free sequence simulations, including classification of simulated “novel” marker-gene sequences, are available in our extensible benchmarking framework, tax-credit (https://github.com/caporaso-lab/tax-credit-data). Our results illustrate the importance of parameter tuning for optimizing classifier performance, and we make recommendations regarding parameter choices for these classifiers under a range of standard operating conditions. q2-feature-classifier and tax-credit are both free, open-source, BSD-licensed packages available on GitHub.
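
The idea behind an alignment-based taxonomy consensus method, and why its parameters matter, can be illustrated with a minimal sketch. This is not the q2-feature-classifier implementation: the function name `consensus_taxonomy`, the `min_consensus` parameter with its 0.51 default, the semicolon-delimited lineage format, and the example lineages are all assumptions made for illustration. Given the taxonomy strings of the top alignment hits for one query sequence, the sketch keeps the deepest lineage prefix that a sufficient fraction of hits agree on.

```python
from collections import Counter


def consensus_taxonomy(hit_taxonomies, min_consensus=0.51, unassigned="Unassigned"):
    """Return the deepest lineage prefix shared by at least `min_consensus` of the hits.

    Hypothetical helper for illustration only; lineages are assumed to be
    semicolon-delimited strings, one per alignment hit.
    """
    if not 0.5 < min_consensus <= 1.0:
        # A threshold above 0.5 guarantees at most one prefix can win per rank.
        raise ValueError("min_consensus must be in the interval (0.5, 1.0]")
    if not hit_taxonomies:
        return unassigned

    # Split each lineage into its ranks, e.g. ["k__Bacteria", "p__Firmicutes"].
    lineages = [[rank.strip() for rank in tax.split(";")] for tax in hit_taxonomies]
    total = len(lineages)
    max_depth = max(len(lineage) for lineage in lineages)

    consensus = []
    for depth in range(1, max_depth + 1):
        # Count lineage prefixes of this depth; shorter lineages cast no vote.
        prefixes = Counter(
            tuple(lineage[:depth]) for lineage in lineages if len(lineage) >= depth
        )
        prefix, count = prefixes.most_common(1)[0]
        if count / total < min_consensus:
            break  # Agreement is too weak at this rank; stop at the previous one.
        consensus = list(prefix)

    return "; ".join(consensus) if consensus else unassigned


# Made-up lineages standing in for the top three hits of one query sequence.
hits = [
    "k__Bacteria; p__Firmicutes; c__Bacilli; g__Lactobacillus",
    "k__Bacteria; p__Firmicutes; c__Bacilli; g__Streptococcus",
    "k__Bacteria; p__Firmicutes; c__Clostridia; g__Clostridium",
]
print(consensus_taxonomy(hits))                     # k__Bacteria; p__Firmicutes; c__Bacilli
print(consensus_taxonomy(hits, min_consensus=0.9))  # k__Bacteria; p__Firmicutes
```

Raising the threshold from 0.51 to 0.9 truncates the assignment one rank earlier in this toy example, which is the kind of trade-off between classification depth and confidence that parameter tuning has to balance.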

Citations
Journal Article

Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2

Evan Bolyen and 123 more authors, 01 Aug 2019
TL;DR: QIIME 2 development was primarily funded by NSF Awards 1565100 to J.G.C. and R.K.P.; partial support was also provided by NIH grants U54CA143925 and U54MD012388.

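Evaluación de la diversidad taxonómica y funcional de la comunidad microbiana relacionada con el ciclo del nitrógeno en suelos de cultivo de arroz con diferentes manejos del tamo [Evaluation of the taxonomic and functional diversity of the microbial community related to the nitrogen cycle in rice-cultivation soils under different straw management practices]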

TL;DR: The impact of burning rice residue on the soil microorganisms involved in nutrient availability and cycling is poorly understood, which is why returning plant residues to the soil has been proposed as an efficient alternative for managing post-harvest residues.
Journal Article

Long-term benefit of Microbiota Transfer Therapy on autism symptoms and gut microbiota

TL;DR: The observations demonstrate the long-term safety and efficacy of MTT as a potential therapy to treat children with ASD who have GI problems, and warrant a double-blind, placebo-controlled trial in the future.
References
Journal Article

Basic Local Alignment Search Tool

TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.
Journal Article

Scikit-learn: Machine Learning in Python

TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Journal Article

Search and clustering orders of magnitude faster than BLAST

Robert C. Edgar, 01 Oct 2010
TL;DR: UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters and offers several advantages over the widely used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets.