Journal Article

Structure, function and diversity of the healthy human microbiome

Curtis Huttenhower, Dirk Gevers, Rob Knight, +250 more (42 institutions)
14 Jun 2012-Nature (Nature Publishing Group)-Vol. 486, Iss: 7402, pp 207-214
TL;DR: The Human Microbiome Project Consortium reports the first results of its analysis of microbial communities from distinct, clinically relevant body habitats in a healthy human cohort, laying foundations for future exploration of the epidemiology, ecology and translational applications of the human microbiome.
Abstract: The Human Microbiome Project Consortium reports the first results of their analysis of microbial communities from distinct, clinically relevant body habitats in a human cohort; the insights into the microbial communities of a healthy population lay foundations for future exploration of the epidemiology, ecology and translational applications of the human microbiome.


Citations
Journal Article
TL;DR: DADA2, an open-source software package for modeling and correcting Illumina-sequenced amplicon errors, is presented; applied to vaginal samples from a cohort of pregnant women, it revealed a diversity of previously undetected Lactobacillus crispatus variants.
Abstract: We present the open-source software package DADA2 for modeling and correcting Illumina-sequenced amplicon errors (https://github.com/benjjneb/dada2). DADA2 infers sample sequences exactly and resolves differences of as little as 1 nucleotide. In several mock communities, DADA2 identified more real variants and output fewer spurious sequences than other methods. We applied DADA2 to vaginal samples from a cohort of pregnant women, revealing a diversity of previously undetected Lactobacillus crispatus variants.

14,505 citations

Journal Article
TL;DR: The UPARSE pipeline reports operational taxonomic unit (OTU) sequences with ≤1% incorrect bases in artificial microbial community tests, compared with >3% incorrect bases commonly reported by other methods.
Abstract: Amplified marker-gene sequences can be used to understand microbial community structure, but they suffer from a high level of sequencing and amplification artifacts. The UPARSE pipeline reports operational taxonomic unit (OTU) sequences with ≤1% incorrect bases in artificial microbial community tests, compared with >3% incorrect bases commonly reported by other methods. The improved accuracy results in far fewer OTUs, consistently closer to the expected number of species in a community.

11,329 citations

Journal Article
22 Apr 2013-PLOS ONE
TL;DR: The phyloseq project for R is a new open-source software package dedicated to the object-oriented representation and analysis of microbiome census data in R, which supports importing data from a variety of common formats, as well as many analysis techniques.
Abstract: Background: The analysis of microbial communities through DNA sequencing brings many challenges: the integration of different types of data with methods from ecology, genetics, phylogenetics, multivariate statistics, visualization and testing. With the increased breadth of experimental designs now being pursued, project-specific statistical analyses are often needed, and these analyses are often difficult (or impossible) for peer researchers to independently reproduce. The vast majority of the requisite tools for performing these analyses reproducibly are already implemented in R and its extensions (packages), but with limited support for high throughput microbiome census data. Results: Here we describe a software project, phyloseq, dedicated to the object-oriented representation and analysis of microbiome census data in R. It supports importing data from a variety of common formats, as well as many analysis techniques. These include calibration, filtering, subsetting, agglomeration, multi-table comparisons, diversity analysis, parallelized Fast UniFrac, ordination methods, and production of publication-quality graphics; all in a manner that is easy to document, share, and modify. We show how to apply functions from other R packages to phyloseq-represented data, illustrating the availability of a large number of open source analysis techniques. We discuss the use of phyloseq with tools for reproducible research, a practice common in other fields but still rare in the analysis of highly parallel microbiome census data. We have made available all of the materials necessary to completely reproduce the analysis and figures included in this article, an example of best practices for reproducible research. Conclusions: The phyloseq project for R is a new open-source software package, freely available on the web from both GitHub and Bioconductor.

11,272 citations

Journal Article
TL;DR: The results demonstrate that phylogeny and function are sufficiently linked that this 'predictive metagenomic' approach should provide useful insights into the thousands of uncultivated microbial communities for which only marker gene surveys are currently available.
Abstract: Profiling phylogenetic marker genes, such as the 16S rRNA gene, is a key tool for studies of microbial communities but does not provide direct evidence of a community's functional capabilities. Here we describe PICRUSt (phylogenetic investigation of communities by reconstruction of unobserved states), a computational approach to predict the functional composition of a metagenome using marker gene data and a database of reference genomes. PICRUSt uses an extended ancestral-state reconstruction algorithm to predict which gene families are present and then combines gene families to estimate the composite metagenome. Using 16S information, PICRUSt recaptures key findings from the Human Microbiome Project and accurately predicts the abundance of gene families in host-associated and environmental communities, with quantifiable uncertainty. Our results demonstrate that phylogeny and function are sufficiently linked that this 'predictive metagenomic' approach should provide useful insights into the thousands of uncultivated microbial communities for which only marker gene surveys are currently available.

6,860 citations

Journal Article
18 Oct 2016-PeerJ
TL;DR: VSEARCH is shown to be more accurate than USEARCH when performing searching, clustering, chimera detection and subsampling, while on a par with USEARCH for paired-end read merging.
Abstract: Background: VSEARCH is an open source and free of charge multithreaded 64-bit tool for processing and preparing metagenomics, genomics and population genomics nucleotide sequence data. It is designed as an alternative to the widely used USEARCH tool (Edgar, 2010) for which the source code is not publicly available, algorithm details are only rudimentarily described, and only a memory-confined 32-bit version is freely available for academic use. Methods: When searching nucleotide sequences, VSEARCH uses a fast heuristic based on words shared by the query and target sequences in order to quickly identify similar sequences, a similar strategy is probably used in USEARCH. VSEARCH then performs optimal global sequence alignment of the query against potential target sequences, using full dynamic programming instead of the seed-and-extend heuristic used by USEARCH. Pairwise alignments are computed in parallel using vectorisation and multiple threads. Results: VSEARCH includes most commands for analysing nucleotide sequences available in USEARCH version 7 and several of those available in USEARCH version 8, including searching (exact or based on global alignment), clustering by similarity (using length pre-sorting, abundance pre-sorting or a user-defined order), chimera detection (reference-based or de novo), dereplication (full length or prefix), pairwise alignment, reverse complementation, sorting, and subsampling. VSEARCH also includes commands for FASTQ file processing, i.e., format detection, filtering, read quality statistics, and merging of paired reads. Furthermore, VSEARCH extends functionality with several new commands and improvements, including shuffling, rereplication, masking of low-complexity sequences with the well-known DUST algorithm, a choice among different similarity definitions, and FASTQ file format conversion. 
VSEARCH is here shown to be more accurate than USEARCH when performing searching, clustering, chimera detection and subsampling, while on a par with USEARCH for paired-ends read merging. VSEARCH is slower than USEARCH when performing clustering and chimera detection, but significantly faster when performing paired-end reads merging and dereplication. VSEARCH is available at https://github.com/torognes/vsearch under either the BSD 2-clause license or the GNU General Public License version 3.0. Discussion: VSEARCH has been shown to be a fast, accurate and full-fledged alternative to USEARCH. A free and open-source versatile tool for sequence analysis is now available to the metagenomics community.

5,850 citations
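
The two-stage search described in the VSEARCH abstract above (a word-based heuristic to shortlist targets, then optimal global alignment by full dynamic programming) can be illustrated with a minimal sketch. This is not VSEARCH's implementation: the function names, the word length `k`, the shortlist size `top`, and the scoring values are all assumptions chosen for illustration.

```python
def kmers(seq, k=8):
    """Return the set of words (k-mers) of length k in seq. k=8 is an assumed default."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_targets(query, targets, k=8, top=2):
    """Shortlist the `top` target names sharing the most words with the query."""
    q = kmers(query, k)
    ranked = sorted(
        targets,
        key=lambda name: len(q & kmers(targets[name], k)),
        reverse=True,
    )
    return ranked[:top]


def global_align_score(a, b, match=2, mismatch=-4, gap=-2):
    """Optimal global alignment score (Needleman-Wunsch, linear gap penalty).

    The scoring values are illustrative assumptions, not VSEARCH defaults.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def search(query, targets, k=8, top=2):
    """Prefilter by shared words, then pick the best shortlisted target by alignment score."""
    candidates = rank_targets(query, targets, k, top)
    return max(candidates, key=lambda name: global_align_score(query, targets[name]))
```

The prefilter keeps the expensive quadratic alignment to a handful of candidates, which is the trade-off the abstract describes: heuristic speed for shortlisting, exact alignment for the final decision.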

References
Journal Article
TL;DR: Many persons in the United States are colonized with S. aureus; prevalence rates differ demographically, and MRSA colonization prevalence, although low nationally in 2001-2002, may vary with demographic and organism characteristics.
Abstract: confidence interval [CI], 30.7%–34.1%) and 0.8% (95% CI, 0.4%–1.4%), respectively, and population estimates were 89.4 million persons (95% CI, 84.8–94.1 million persons) and 2.3 million persons (95% CI, 1.2–3.8 million persons), respectively. S. aureus colonization prevalence was highest in participants 6–11 years old. MRSA colonization was associated with age ≥60 years and being female but not with recent health-care exposure. In unweighted analyses, the SCCmec type IV gene was more frequent in isolates from participants of younger age and of non-Hispanic black race/ethnicity; the PVL gene was present in 9 (2.4%) of 372 isolates tested. Conclusions. Many persons in the United States are colonized with S. aureus; prevalence rates differ demographically. MRSA colonization prevalence, although low nationally in 2001–2002, may vary with demographic and organism characteristics.

703 citations

Journal Article
TL;DR: This work used massively parallel sequencing to monitor the relative abundance of tens of thousands of transposon mutants of a saccharolytic human gut bacterium, Bacteroides thetaiotaomicron, as they established themselves in wild-type and immunodeficient gnotobiotic mice, in the presence or absence of other human gut commensals.

630 citations

Journal Article
13 Jun 2012-PLOS ONE
TL;DR: It is found that commonalities between samples based on taxonomy could sometimes belie variability at the sub-genus OTU level, and even OTUs present in nearly every subject, or that dominate in some samples, showed orders of magnitude variation in relative abundance, emphasizing the highly variable nature across individuals.
Abstract: We explore the microbiota of 18 body sites in over 200 individuals using sequences amplified from the V1–V3 and V3–V5 small subunit ribosomal RNA (16S) hypervariable regions as part of the NIH Common Fund Human Microbiome Project. The body sites with the greatest number of core OTUs, defined as OTUs shared amongst 95% or more of the individuals, were the oral sites (saliva, tongue, cheek, gums, and throat) followed by the nose, stool, and skin, while the vaginal sites had the fewest OTUs shared across subjects. We found that commonalities between samples based on taxonomy could sometimes belie variability at the sub-genus OTU level. This was particularly apparent in the mouth where a given genus can be present in many different oral sites, but the sub-genus OTUs show very distinct site selection, and in the vaginal sites, which are consistently dominated by the Lactobacillus genus but have distinctly different sub-genus V1–V3 OTU populations across subjects. Different body sites show approximately a ten-fold difference in estimated microbial richness, with stool samples having the highest estimated richness, followed by the mouth, throat and gums, then by the skin, nasal and vaginal sites. Richness as measured by the V1–V3 primers was consistently higher than richness measured by V3–V5. We also show that when such a large cohort is analyzed at the genus level, most subjects fit the stool “enterotype” profile, but other subjects are intermediate, blurring the distinction between the enterotypes. When analyzed at the finer-scale OTU level, there was little or no segregation into stool enterotypes, but in the vagina distinct biotypes were apparent. Finally, we note that even OTUs present in nearly every subject, or that dominate in some samples, showed orders of magnitude variation in relative abundance, emphasizing the highly variable nature across individuals.

480 citations
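
The abstract above defines core OTUs as those shared amongst 95% or more of the individuals. A minimal sketch of that definition follows; the function name `core_otus` and the input layout (a dict mapping each subject to the set of OTUs observed in that subject) are assumptions for illustration, not the authors' code.

```python
def core_otus(presence, threshold=0.95):
    """Return OTUs observed in at least `threshold` of subjects.

    presence: dict mapping a subject id to the set of OTU ids seen in that
    subject (a hypothetical input layout). The 0.95 default follows the
    definition quoted in the abstract.
    """
    n_subjects = len(presence)
    if n_subjects == 0:
        return set()
    counts = {}
    for otus in presence.values():
        for otu in otus:
            counts[otu] = counts.get(otu, 0) + 1
    return {otu for otu, c in counts.items() if c / n_subjects >= threshold}
```

In practice the function would be applied separately per body site, since the abstract compares the number of core OTUs between sites.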

Journal Article
TL;DR: IslandViewer is a web-accessible application that provides the first user-friendly interface for obtaining precomputed GI predictions, or predictions from user-inputted sequence, using the most accurate methods for genomic island prediction: IslandPick, IslandPath-DIMOB and SIGI-HMM.
Abstract: Summary: Genomic islands (clusters of genes of probable horizontal origin; GIs) play a critical role in medically important adaptations of bacteria. Recently, several computational methods have been developed to predict GIs that utilize either sequence composition bias or comparative genomics approaches. IslandViewer is a web accessible application that provides the first user-friendly interface for obtaining precomputed GI predictions, or predictions from user-inputted sequence, using the most accurate methods for genomic island prediction: IslandPick, IslandPath-DIMOB and SIGI-HMM. The graphical interface allows easy viewing and downloading of island data in multiple formats, at both the chromosome and gene level, for method-specific, or overlapping, GI predictions. Availability: The IslandViewer web service is available at http://www.pathogenomics.sfu.ca/islandviewer and the source code is freely available under the GNU GPL license. Contact: brinkman@sfu.ca

392 citations

Journal Article
TL;DR: The evolving field of bacterial typing and the genomic technologies that enable comparative analysis of multiple genomes and the metagenomes of complex microbial environments are reviewed, and the implications of the genomic era for the future of microbiology are addressed.
Abstract: Genomics has revolutionized every aspect of microbiology. Now, 13 years after the first bacterial genome was sequenced, it is important to pause and consider what has changed in microbiology research as a consequence of genomics. In this article, we review the evolving field of bacterial typing and the genomic technologies that enable comparative analysis of multiple genomes and the metagenomes of complex microbial environments, and address the implications of the genomic era for the future of microbiology.

347 citations
