Journal Article

phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data.

22 Apr 2013-PLOS ONE (Public Library of Science)-Vol. 8, Iss: 4
TL;DR: The phyloseq project for R is a new open-source software package dedicated to the object-oriented representation and analysis of microbiome census data in R, which supports importing data from a variety of common formats, as well as many analysis techniques.
Abstract:

Background: The analysis of microbial communities through DNA sequencing brings many challenges: the integration of different types of data with methods from ecology, genetics, phylogenetics, multivariate statistics, visualization and testing. With the increased breadth of experimental designs now being pursued, project-specific statistical analyses are often needed, and these analyses are often difficult (or impossible) for peer researchers to independently reproduce. The vast majority of the requisite tools for performing these analyses reproducibly are already implemented in R and its extensions (packages), but with limited support for high throughput microbiome census data.

Results: Here we describe a software project, phyloseq, dedicated to the object-oriented representation and analysis of microbiome census data in R. It supports importing data from a variety of common formats, as well as many analysis techniques. These include calibration, filtering, subsetting, agglomeration, multi-table comparisons, diversity analysis, parallelized Fast UniFrac, ordination methods, and production of publication-quality graphics; all in a manner that is easy to document, share, and modify. We show how to apply functions from other R packages to phyloseq-represented data, illustrating the availability of a large number of open source analysis techniques. We discuss the use of phyloseq with tools for reproducible research, a practice common in other fields but still rare in the analysis of highly parallel microbiome census data. We have made available all of the materials necessary to completely reproduce the analysis and figures included in this article, an example of best practices for reproducible research.

Conclusions: The phyloseq project for R is a new open-source software package, freely available on the web from both GitHub and Bioconductor.
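To make a few of the operations named in the abstract concrete (filtering, diversity analysis, and a between-sample comparison), here is a minimal sketch in plain Python. It does not use phyloseq or reproduce its API; the OTU table, the function names `filter_otus`, `shannon`, and `bray_curtis`, and the threshold are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical OTU count table: sample name -> {OTU id -> read count}.
otu_counts = {
    "sample_A": {"OTU1": 50, "OTU2": 30, "OTU3": 20},
    "sample_B": {"OTU1": 10, "OTU2": 0, "OTU3": 90},
}


def filter_otus(table, min_total):
    """Drop OTUs whose summed count across all samples is below min_total."""
    totals = {}
    for counts in table.values():
        for otu, n in counts.items():
            totals[otu] = totals.get(otu, 0) + n
    keep = {otu for otu, total in totals.items() if total >= min_total}
    return {
        sample: {otu: n for otu, n in counts.items() if otu in keep}
        for sample, counts in table.items()
    }


def shannon(counts):
    """Shannon index (natural log) of one sample; a common alpha-diversity measure."""
    total = sum(counts.values())
    return -sum(
        (n / total) * math.log(n / total) for n in counts.values() if n > 0
    )


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples; a common beta-diversity measure."""
    otus = set(a) | set(b)
    numerator = sum(abs(a.get(o, 0) - b.get(o, 0)) for o in otus)
    denominator = sum(a.get(o, 0) + b.get(o, 0) for o in otus)
    return numerator / denominator if denominator else 0.0
```

In this sketch the count table is a plain dictionary; the abstract's point is that phyloseq instead bundles such tables with sample data and other components in a single R object, which the example above does not attempt to model.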


Citations
Journal Article
16 Jun 2016-Cell
TL;DR: It is reported that maternal high-fat diet induces a shift in microbial ecology that negatively impacts offspring social behavior, and a single commensal strain is identified that corrects oxytocin levels, LTP, and social deficits in MHFD offspring.

773 citations


Cites methods from "phyloseq: an R package for reproduc..."

  • ...org/ [2014]), utilizing the phyloseq package (McMurdie and Holmes, 2013) to import sample data and calculate alpha- and beta-diversity metrics....

    [...]

Journal Article
TL;DR: There was some correlation between observed dramatic fluctuations in the gut microbiome and intensified medication due to a flare of the disease, and these results will help guide therapies that will redirect the gut microbiome towards a healthy state and maintain remission in IBD.
Abstract: Inflammatory bowel disease (IBD) is characterized by flares of inflammation with a periodic need for increased medication and sometimes even surgery. The aetiology of IBD is partly attributed to a deregulated immune response to gut microbiome dysbiosis. Cross-sectional studies have revealed microbial signatures for different IBD subtypes, including ulcerative colitis, colonic Crohn's disease and ileal Crohn's disease. Although IBD is dynamic, microbiome studies have primarily focused on single time points or a few individuals. Here, we dissect the long-term dynamic behaviour of the gut microbiome in IBD and differentiate this from normal variation. Microbiomes of IBD subjects fluctuate more than those of healthy individuals, based on deviation from a newly defined healthy plane (HP). Ileal Crohn's disease subjects deviated most from the HP, especially subjects with surgical resection. Intriguingly, the microbiomes of some IBD subjects periodically visited the HP then deviated away from it. Inflammation was not directly correlated with distance to the healthy plane, but there was some correlation between observed dramatic fluctuations in the gut microbiome and intensified medication due to a flare of the disease. These results will help guide therapies that will redirect the gut microbiome towards a healthy state and maintain remission in IBD.

750 citations

Journal Article
TL;DR: It is found that, when maintained under germ-free conditions, mice do not display an age-related increase in circulating pro-inflammatory cytokine levels, suggesting that aging-associated microbiota promote inflammation and that reversing these age-related microbiota changes represents a potential strategy for reducing age-associated inflammation and the accompanying morbidity.

715 citations

Journal Article
15 Nov 2017-Nature
TL;DR: Quantitative profiling bypasses compositionality effects in the reconstruction of gut microbiota interaction networks and reveals that the taxonomic trade-off between Bacteroides and Prevotella is an artefact of relative microbiome analyses.
Abstract: Current sequencing-based analyses of faecal microbiota quantify microbial taxa and metabolic pathways as fractions of the sample sequence library generated by each analysis. Although these relative approaches permit detection of disease-associated microbiome variation, they are limited in their ability to reveal the interplay between microbiota and host health. Comparative analyses of relative microbiome data cannot provide information about the extent or directionality of changes in taxa abundance or metabolic potential. If microbial load varies substantially between samples, relative profiling will hamper attempts to link microbiome features to quantitative data such as physiological parameters or metabolite concentrations. Saliently, relative approaches ignore the possibility that altered overall microbiota abundance itself could be a key identifier of a disease-associated ecosystem configuration. To enable genuine characterization of host-microbiota interactions, microbiome research must exchange ratios for counts. Here we build a workflow for the quantitative microbiome profiling of faecal material, through parallelization of amplicon sequencing and flow cytometric enumeration of microbial cells. We observe up to tenfold differences in the microbial loads of healthy individuals and relate this variation to enterotype differentiation. We show how microbial abundances underpin both microbiota variation between individuals and covariation with host phenotype. Quantitative profiling bypasses compositionality effects in the reconstruction of gut microbiota interaction networks and reveals that the taxonomic trade-off between Bacteroides and Prevotella is an artefact of relative microbiome analyses. Finally, we identify microbial load as a key driver of observed microbiota alterations in a cohort of patients with Crohn's disease, here associated with a low-cell-count Bacteroides enterotype (as defined through relative profiling).
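The contrast between relative and quantitative profiling can be shown with a small arithmetic sketch. This is not the authors' workflow, which combines amplicon sequencing with flow cytometric cell counts; it only scales relative abundances by an assumed microbial load. The read counts, the load values, and the function names `relative_abundance` and `quantitative_profile` are hypothetical.

```python
def relative_abundance(counts):
    """Convert read counts to fractions of the sample's sequence library."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def quantitative_profile(counts, cells_per_gram):
    """Scale relative abundances by a measured microbial load (cells per gram)."""
    return {
        taxon: fraction * cells_per_gram
        for taxon, fraction in relative_abundance(counts).items()
    }


# Hypothetical read counts for two samples.
sample_low = {"Bacteroides": 600, "Prevotella": 400}
sample_high = {"Bacteroides": 300, "Prevotella": 700}

# Hypothetical microbial loads differing tenfold, the range the abstract reports.
load_low = 1e10
load_high = 1e11

rel_low = relative_abundance(sample_low)
rel_high = relative_abundance(sample_high)
abs_low = quantitative_profile(sample_low, load_low)
abs_high = quantitative_profile(sample_high, load_high)
```

With these made-up numbers, the Bacteroides fraction falls from 0.6 to 0.3 between the two samples, while its estimated absolute abundance rises fivefold. That is the kind of directionality information the abstract says relative data alone cannot provide.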

702 citations

References
Journal Article
TL;DR: Copyright (©) 1999–2012 R Foundation for Statistical Computing; permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and permission notice are preserved on all copies.
Abstract: Copyright (©) 1999–2012 R Foundation for Statistical Computing. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the R Core Team.

272,030 citations

Book
13 Aug 2009
TL;DR: This book describes ggplot2, a new data visualization package for R that uses the insights from Leland Wilkinson's Grammar of Graphics to create a powerful and flexible system for creating data graphics.
Abstract: This book describes ggplot2, a new data visualization package for R that uses the insights from Leland Wilkinson's Grammar of Graphics to create a powerful and flexible system for creating data graphics. With ggplot2, it's easy to:

  • produce handsome, publication-quality plots, with automatic legends created from the plot specification

  • superpose multiple layers (points, lines, maps, tiles, box plots to name a few) from different data sources, with automatically adjusted common scales

  • add customisable smoothers that use the powerful modelling capabilities of R, such as loess, linear models, generalised additive models and robust regression

  • save any ggplot2 plot (or part thereof) for later modification or reuse

  • create custom themes that capture in-house or journal style requirements, and that can easily be applied to multiple plots

  • approach your graph from a visual perspective, thinking about how each component of the data is represented on the final plot.

This book will be useful to everyone who has struggled with displaying their data in an informative and attractive way. You will need some basic knowledge of R (i.e. you should be able to get your data into R), but ggplot2 is a mini-language specifically tailored for producing graphics, and you'll learn everything you need in the book. After reading this book you'll be able to produce graphics customized precisely for your problems, and you'll find it easy to get graphics out of your head and on to the screen or page.

29,504 citations

Journal Article
TL;DR: An overview of the analysis pipeline and links to raw data and processed output from the runs with and without denoising are provided.
Abstract: Supplementary Figure 1 Overview of the analysis pipeline. Supplementary Table 1 Details of conventionally raised and conventionalized mouse samples. Supplementary Discussion Expanded discussion of QIIME analyses presented in the main text; Sequencing of 16S rRNA gene amplicons; QIIME analysis notes; Expanded Figure 1 legend; Links to raw data and processed output from the runs with and without denoising.

28,911 citations


"phyloseq: an R package for reproduc..." refers background or methods in this paper

  • ...clustering output formats like QIIME [11], mothur [12], the RDP-...

    [...]

  • ...packages/pipelines, including QIIME [11], mothur [12], and...

    [...]

  • ...We would also like to thank the developers of the open source packages on which phyloseq depends, in particular Rob Knight and his lab for QIIME [11], Hadley Wickham for the ggplot2 [57], reshape [89], and plyr [90] packages, as well as the Bioconductor and R teams [24,34]....

    [...]

  • ...Virtual machine image and cloud-deployed ‘‘pipeline’’ analyses [11,15,19] can further increase accessibility of analyses...

    [...]

  • ...Instead, phyloseq provides tools to read the output files of the most common OTU-clustering applications [7,11,12,14], and represents this data in R as an instance of the main data class....

    [...]

Journal Article
TL;DR: mothur is used as a case study to trim, screen, and align sequences; calculate distances; assign sequences to operational taxonomic units; and describe the α and β diversity of eight marine samples previously characterized by pyrosequencing of 16S rRNA gene fragments.
Abstract: mothur aims to be a comprehensive software package that allows users to use a single piece of software to analyze community sequence data. It builds upon previous tools to provide a flexible and powerful software package for analyzing sequencing data. As a case study, we used mothur to trim, screen, and align sequences; calculate distances; assign sequences to operational taxonomic units; and describe the alpha and beta diversity of eight marine samples previously characterized by pyrosequencing of 16S rRNA gene fragments. This analysis of more than 222,000 sequences was completed in less than 2 h with a laptop computer.

17,350 citations


"phyloseq: an R package for reproduc..." refers methods in this paper

  • ...Instead, phyloseq provides tools to read the output files of the most common OTU-clustering applications [7,11,12,14], and represents this data in R as an instance of the main data class....

    [...]

  • ...clustering output formats like QIIME [11], mothur [12], the RDP-...

    [...]

  • ...packages/pipelines, including QIIME [11], mothur [12], and...

    [...]

  • ...This PDF file contains a table summarizing a comparison of supported capabilities between phyloseq and QIIME [11], mothur [12], and the pair of packages OTUbase [35] and mcaGUI [88]....

    [...]
