UCHIME improves sensitivity and speed of chimera detection

doi:10.1093/BIOINFORMATICS/BTR381

Robert C. Edgar, +4 more

- 01 Aug 2011 -

Bioinformatics

- Vol. 27, Iss: 16, pp 2194-2200

TL;DR

UCHIME has better sensitivity than ChimeraSlayer (previously the most sensitive database method), especially with short, noisy sequences, and in testing on artificial bacterial communities with known composition, UCHIME de novo sensitivity is shown to be comparable to Perseus.

Abstract:

Motivation: Chimeric DNA sequences often form during polymerase chain reaction amplification, especially when sequencing single regions (e.g. 16S rRNA or fungal Internal Transcribed Spacer) to assess diversity or compare populations. Undetected chimeras may be misinterpreted as novel species, causing inflated estimates of diversity and spurious inferences of differences between populations. Detection and removal of chimeras is therefore of critical importance in such experiments.

Results: We describe UCHIME, a new program that detects chimeric sequences with two or more segments. UCHIME either uses a database of chimera-free sequences or detects chimeras de novo by exploiting abundance data. UCHIME has better sensitivity than ChimeraSlayer (previously the most sensitive database method), especially with short, noisy sequences. In testing on artificial bacterial communities with known composition, UCHIME de novo sensitivity is shown to be comparable to Perseus. UCHIME is >100× faster than Perseus and >1000× faster than ChimeraSlayer.

Contact: [email protected]

Availability: Source, binaries and data: http://drive5.com/uchime.

Supplementary information: Supplementary data are available at Bioinformatics online.
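
To make the idea of a chimera with two segments concrete, the toy sketch below tests whether a query read is better explained as the left part of one parent joined to the right part of another than by either parent alone. It is not UCHIME's algorithm or scoring scheme: it assumes the three sequences are already aligned and of equal length, and the function names and the `min_gain` threshold are hypothetical choices made only for illustration.

```python
def mismatches(a, b):
    """Count positions where two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def best_two_parent_model(query, parent_a, parent_b):
    """Return (mismatches, breakpoint, order) for the best two-segment model.

    Assumption: all three sequences are pre-aligned and the same length.
    Every breakpoint and both parent orders ("A|B" and "B|A") are tried.
    """
    if not (len(query) == len(parent_a) == len(parent_b)):
        raise ValueError("sequences must be pre-aligned to equal length")
    best = None
    for left, right, order in ((parent_a, parent_b, "A|B"),
                               (parent_b, parent_a, "B|A")):
        for bp in range(1, len(query)):
            d = (mismatches(query[:bp], left[:bp])
                 + mismatches(query[bp:], right[bp:]))
            if best is None or d < best[0]:
                best = (d, bp, order)
    return best


def looks_chimeric(query, parent_a, parent_b, min_gain=3):
    """Flag the query if the two-segment model beats the closest single
    parent by at least `min_gain` mismatches (a hypothetical threshold)."""
    model_d, bp, order = best_two_parent_model(query, parent_a, parent_b)
    single_d = min(mismatches(query, parent_a), mismatches(query, parent_b))
    return (single_d - model_d) >= min_gain, bp, order


if __name__ == "__main__":
    a = "AAAAAAAAAA"
    b = "CCCCCCCCCC"
    print(looks_chimeric("AAAAACCCCC", a, b))
    print(looks_chimeric(a, a, b))
```

In a database-style setting the two parents would come from a trusted reference set, while in a de novo setting they would be drawn from more abundant reads in the same sample; the sketch leaves parent selection out entirely.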

Citations

Journal Article

DADA2: High-resolution sample inference from Illumina amplicon data

Benjamin J. Callahan, +5 more

- 01 Jul 2016 -

Nature Methods

TL;DR: The open-source software package DADA2 for modeling and correcting Illumina-sequenced amplicon errors is presented, revealing a diversity of previously undetected Lactobacillus crispatus variants.

Journal Article

UPARSE: highly accurate OTU sequences from microbial amplicon reads

Robert C. Edgar

- 01 Oct 2013 -

Nature Methods

TL;DR: The UPARSE pipeline reports operational taxonomic unit (OTU) sequences with ≤1% incorrect bases in artificial microbial community tests, compared with >3% incorrect bases commonly reported by other methods.

Journal Article

Structure, function and diversity of the healthy human microbiome

Curtis Huttenhower, +247 more

- 01 Jun 2012 -

PubMed Central

TL;DR: The Human Microbiome Project has analysed the largest cohort and set of distinct, clinically relevant body habitats so far, finding the diversity and abundance of each habitat’s signature microbes to vary widely even among healthy subjects, with strong niche specialization both within and among individuals.

Journal Article

VSEARCH: a versatile open source tool for metagenomics

Torbjørn Rognes, +7 more

- 18 Oct 2016 -

PeerJ

TL;DR: VSEARCH is here shown to be more accurate than USEARCH when performing searching, clustering, chimera detection and subsampling, while on a par with USEARCH for paired-ends read merging and dereplication.

Journal Article

Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform.

James J. Kozich, +4 more

- 01 Sep 2013 -

Applied and Environmental Microbiology

TL;DR: This work presents an improved method for sequencing variable regions within the 16S rRNA gene using Illumina's MiSeq platform, which is currently capable of producing paired 250-nucleotide reads, and demonstrates that it can provide data that are at least as good as those generated by the 454 platform while providing considerably higher sequencing coverage for a fraction of the cost.

References

Journal Article

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Stephen F. Altschul, +6 more

- 01 Sep 1997 -

Nucleic Acids Research

TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.

Journal Article

Search and clustering orders of magnitude faster than BLAST

Robert C. Edgar

- 01 Oct 2010 -

Bioinformatics

TL;DR: UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters and offers several advantages over the widely used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets.

Journal Article

Taxonomic Note: A Place for DNA-DNA Reassociation and 16S rRNA Sequence Analysis in the Present Species Definition in Bacteriology

Erko Stackebrandt, +1 more

- 01 Oct 1994 -

International Journal of Systematic and ...

TL;DR: This taxonomic note discusses the respective roles of DNA-DNA reassociation and 16S rRNA sequence analysis in the bacterial species definition.

Journal Article

The Ribosomal Database Project: improved alignments and new tools for rRNA analysis

James R. Cole, +10 more

- 01 Jan 2009 -

Nucleic Acids Research

TL;DR: An improved alignment strategy uses the Infernal secondary structure aware aligner to provide a more consistent higher quality alignment and faster processing of user sequences, and a new Pyrosequencing Pipeline that provides tools to support analysis of ultra high-throughput rRNA sequencing data.

Journal Article

Recent developments in the MAFFT multiple sequence alignment program

Kazutaka Katoh, +1 more

- 01 Jul 2008 -

Briefings in Bioinformatics

TL;DR: The initial version of the MAFFT program was developed in 2002 and was updated in 2007 with two new techniques: the PartTree algorithm and the Four-way consistency objective function, which improved the scalability of progressive alignment and the accuracy of ncRNA alignment.
