A high-throughput DNA sequence aligner for microbial ecology studies
TLDR
The aligner described in this study will enable scientists to rapidly generate robust multiple sequence alignments that are implicitly based upon the predicted secondary structure of the 16S rRNA molecule.
Abstract:
As the scope of microbial surveys expands with the parallel growth in sequencing capacity, a significant bottleneck in data analysis is the ability to generate a biologically meaningful multiple sequence alignment. The most commonly used aligners have varying alignment quality and speed, tend to depend on a specific reference alignment, or lack a complete description of the underlying algorithm. The purpose of this study was to create and validate an aligner with the goal of quickly generating a high-quality alignment and having the flexibility to use any reference alignment. Using the simple nearest alignment space termination algorithm, the resulting aligner operates in linear time, requires a small memory footprint, and generates a high-quality alignment. In addition, the alignments generated for variable regions were of as high a quality as the alignment of full-length sequences. As implemented, the method was able to align 18 full-length 16S rRNA gene sequences and 58 V2 region sequences per second to the 50,000-column SILVA reference alignment. Most importantly, the resulting alignments were of a quality equal to SILVA-generated alignments. The aligner described in this study will enable scientists to rapidly generate robust multiple sequence alignments that are implicitly based upon the predicted secondary structure of the 16S rRNA molecule. Furthermore, because the implementation is not connected to a specific database, it is easy to generalize the method to reference alignments for any DNA sequence.
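The abstract does not spell out the steps of the algorithm, so the following is only a rough illustration of what it means to place a candidate sequence into a fixed-width reference alignment. The sketch assumes that a closest reference sequence (the "template") has already been chosen and pairwise-aligned to the candidate by some other means, and it simply copies the candidate's bases into the columns the template occupies. The function name and its inputs are hypothetical, and the sketch deliberately rejects candidate insertions relative to the template, which a full aligner has to resolve.

```python
GAP_CHARS = "-."  # assumption: both characters mark gap columns in the reference


def project_onto_reference(template_row: str,
                           aligned_template: str,
                           aligned_query: str) -> str:
    """Place query bases into the columns used by the template.

    template_row      -- the template as it appears in the reference
                         alignment, gaps included
    aligned_template  -- the ungapped template after pairwise alignment
    aligned_query     -- the query after the same pairwise alignment
    """
    if len(aligned_template) != len(aligned_query):
        raise ValueError("pairwise alignment rows must have equal length")
    if any(ch in GAP_CHARS for ch in aligned_template):
        # A gap in the template row means the query has an insertion.
        # Resolving that is outside the scope of this sketch.
        raise ValueError("query insertions are not handled in this sketch")

    ungapped = "".join(ch for ch in template_row if ch not in GAP_CHARS)
    if ungapped != aligned_template:
        raise ValueError("pairwise template does not match the reference row")

    query_chars = iter(aligned_query)
    out = []
    for ch in template_row:
        if ch in GAP_CHARS:
            out.append(ch)                 # keep the reference gap column
        else:
            out.append(next(query_chars))  # base or deletion from the query
    return "".join(out)


if __name__ == "__main__":
    # Toy 8-column "reference" row; the query lacks the template's G.
    print(project_onto_reference("AC--GT-A", "ACGTA", "AC-TA"))  # AC---T-A
```

Because every output row has exactly as many columns as the reference row, sequences projected this way can be stacked with the reference alignment without realigning anything else.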
Citations
Metagenomic biomarker discovery and explanation
Nicola Segata, Jacques Izard, Levi Waldron, Dirk Gevers, Larisa Miropolsky, Wendy S. Garrett, Curtis Huttenhower
TL;DR: A new method for metagenomic biomarker discovery is described and validated by way of class comparison, tests of biological consistency and effect size estimation to address the challenge of finding organisms, genes, or pathways that consistently explain the differences between two or more microbial communities.
Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform.
TL;DR: This work presents an improved method for sequencing variable regions within the 16S rRNA gene using Illumina's MiSeq platform, which is currently capable of producing paired 250-nucleotide reads and demonstrates that it can provide data that are at least as good as that generated by the 454 platform while providing considerably higher sequencing coverage for a fraction of the cost.
SINA: accurate high throughput multiple sequence alignment of ribosomal RNA genes
TL;DR: The SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project was evaluated and was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks.
Reducing the Effects of PCR Amplification and Sequencing Artifacts on 16S rRNA-Based Studies
TL;DR: An improved quality-filtering pipeline was applied to several benchmarking studies; even with the stringent data curation pipeline, biases in the data generation pipeline and batch effects were observed that could potentially confound the interpretation of microbial community data.
Intestinal Domination and the Risk of Bacteremia in Patients Undergoing Allogeneic Hematopoietic Stem Cell Transplantation
Ying Taur, Joao B. Xavier, Lauren Lipuma, Carles Ubeda, Jenna D. Goldberg, Asia Gobourne, Yeon Joo Lee, Krista Dubin, Nicholas D. Socci, Agnes Viale, Miguel-Angel Perales, Robert R. Jenq, Marcel R.M. van den Brink, Eric G. Pamer
TL;DR: During allo-HSCT, the diversity and stability of the intestinal flora are disrupted, resulting in domination by bacteria associated with subsequent bacteremia, and assessment of fecal microbiota identifies patients at highest risk for bloodstream infection during allo
References
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, David J. Lipman
TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Introducing mothur: Open-Source, Platform-Independent, Community-Supported Software for Describing and Comparing Microbial Communities
Patrick D. Schloss, Sarah L. Westcott, Thomas Ryabin, Justine R. Hall, Martin Hartmann, Emily B. Hollister, Ryan A. Lesniewski, Brian B. Oakley, Donovan H. Parks, Courtney J. Robinson, Jason W. Sahl, Blaz Stres, Gerhard G. Thallinger, David J. Van Horn, Carolyn F. Weber
TL;DR: mothur is used as a case study to trim, screen, and align sequences; calculate distances; assign sequences to operational taxonomic units; and describe the α and β diversity of eight marine samples previously characterized by pyrosequencing of 16S rRNA gene fragments.
A general method applicable to the search for similarities in the amino acid sequence of two proteins
TL;DR: A computer adaptable method for finding similarities in the amino acid sequences of two proteins has been developed and it is possible to determine whether significant homology exists between the proteins to trace their possible evolutionary development.
Greengenes, a Chimera-Checked 16S rRNA Gene Database and Workbench Compatible with ARB
Todd Z. DeSantis, Philip Hugenholtz, Neils Larsen, Mark Rojas, Eoin L. Brodie, Keith Keller, Thomas Huber, Daniel Dalevi, Ping Hu, Gary L. Andersen
TL;DR: A 16S rRNA gene database (http://greengenes.lbl.gov) was used to provide chimera screening, standard alignment, and taxonomic classification using multiple published taxonomies.
Related Papers
Introducing mothur: Open-Source, Platform-Independent, Community-Supported Software for Describing and Comparing Microbial Communities
Patrick D. Schloss, Sarah L. Westcott, Thomas Ryabin, Justine R. Hall, Martin Hartmann, Emily B. Hollister, Ryan A. Lesniewski, Brian B. Oakley, Donovan H. Parks, Courtney J. Robinson, Jason W. Sahl, Blaz Stres, Gerhard G. Thallinger, David J. Van Horn, Carolyn F. Weber
QIIME allows analysis of high-throughput community sequencing data.
J. Gregory Caporaso, Justin Kuczynski, Jesse Stombaugh, Kyle Bittinger, Frederic D. Bushman, Elizabeth K. Costello, Noah Fierer, Antonio Gonzalez Peña, Julia K. Goodrich, Jeffrey I. Gordon, Gavin A. Huttley, Scott T. Kelley, Dan Knights, Jeremy E. Koenig, Ruth E. Ley, Catherine A. Lozupone, Daniel McDonald, Brian D. Muegge, Meg Pirrung, Jens Reeder, Joel Sevinsky, Peter J. Turnbaugh, William A. Walters, Jeremy Widmann, Tanya Yatsunenko, Jesse R. Zaneveld, Rob Knight