scispace - formally typeset
Open AccessJournal ArticleDOI

A high-throughput DNA sequence aligner for microbial ecology studies

TLDR
The aligner described in this study will enable scientists to rapidly generate robust multiple sequences alignments that are implicitly based upon the predicted secondary structure of the 16S rRNA molecule.
Abstract
As the scope of microbial surveys expands with the parallel growth in sequencing capacity, a significant bottleneck in data analysis is the ability to generate a biologically meaningful multiple sequence alignment. The most commonly used aligners have varying alignment quality and speed, tend to depend on a specific reference alignment, or lack a complete description of the underlying algorithm. The purpose of this study was to create and validate an aligner with the goal of quickly generating a high quality alignment and having the flexibility to use any reference alignment. Using the simple nearest alignment space termination algorithm, the resulting aligner operates in linear time, requires a small memory footprint, and generates a high quality alignment. In addition, the alignments generated for variable regions were of as high a quality as the alignment of full-length sequences. As implemented, the method was able to align 18 full-length 16S rRNA gene sequences and 58 V2 region sequences per second to the 50,000-column SILVA reference alignment. Most importantly, the resulting alignments were of a quality equal to SILVA-generated alignments. The aligner described in this study will enable scientists to rapidly generate robust multiple sequences alignments that are implicitly based upon the predicted secondary structure of the 16S rRNA molecule. Furthermore, because the implementation is not connected to a specific database it is easy to generalize the method to reference alignments for any DNA sequence.

read more

Citations
More filters
Journal ArticleDOI

Metagenomic biomarker discovery and explanation

TL;DR: A new method for metagenomic biomarker discovery is described and validates by way of class comparison, tests of biological consistency and effect size estimation to address the challenge of finding organisms, genes, or pathways that consistently explain the differences between two or more microbial communities.
Journal ArticleDOI

Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform.

TL;DR: This work presents an improved method for sequencing variable regions within the 16S rRNA gene using Illumina's MiSeq platform, which is currently capable of producing paired 250-nucleotide reads and demonstrates that it can provide data that are at least as good as that generated by the 454 platform while providing considerably higher sequencing coverage for a fraction of the cost.
Journal ArticleDOI

SINA: accurate high throughput multiple sequence alignment of ribosomal RNA genes

TL;DR: The SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project was evaluated and was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks.
Journal ArticleDOI

Reducing the Effects of PCR Amplification and Sequencing Artifacts on 16S rRNA-Based Studies

TL;DR: Improved quality-filtering pipeline was applied to several benchmarking studies and observed that even with the stringent data curation pipeline, biases in the data generation pipeline and batch effects were observed that could potentially confound the interpretation of microbial community data.
Journal ArticleDOI

Intestinal Domination and the Risk of Bacteremia in Patients Undergoing Allogeneic Hematopoietic Stem Cell Transplantation

TL;DR: During allo-HSCT, the diversity and stability of the intestinal flora are disrupted, resulting in domination by bacteria associated with subsequent bacteremia, and assessment of fecal microbiota identifies patients at highest risk for bloodstream infection during allo
References
More filters
Journal ArticleDOI

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Journal ArticleDOI

A general method applicable to the search for similarities in the amino acid sequence of two proteins

TL;DR: A computer adaptable method for finding similarities in the amino acid sequences of two proteins has been developed and it is possible to determine whether significant homology exists between the proteins to trace their possible evolutionary development.

16S/23S rRNA sequencing

D. J. Lane
Related Papers (5)