Journal ArticleDOI

The sponge microbiome project

01 Oct 2017-GigaScience (Oxford University Press (OUP))-Vol. 6, Iss: 10, pp 1-7
TL;DR: This dataset represents a comprehensive resource of sponge-associated microbial communities based on 16S rRNA gene sequences that can be used to address overarching hypotheses regarding host-associated prokaryotes, including host specificity, convergent evolution, environmental drivers of microbiome structure, and the sponge-associated rare biosphere.
Abstract: Marine sponges (phylum Porifera) are a diverse, phylogenetically deep-branching clade known for forming intimate partnerships with complex communities of microorganisms. To date, 16S rRNA gene sequencing studies have largely utilised different extraction and amplification methodologies to target the microbial communities of a limited number of sponge species, severely limiting comparative analyses of sponge microbial diversity and structure. Here, we provide an extensive and standardised dataset that will facilitate sponge microbiome comparisons across large spatial, temporal, and environmental scales. Samples from marine sponges (n = 3569 specimens), seawater (n = 370), marine sediments (n = 65) and other environments (n = 29) were collected from different locations across the globe. This dataset incorporates at least 268 different sponge species, including several yet unidentified taxa. The V4 region of the 16S rRNA gene was amplified and sequenced from extracted DNA using standardised procedures. Raw sequences (total of 1.1 billion sequences) were processed and clustered with (i) a standard protocol using QIIME closed-reference picking resulting in 39 543 operational taxonomic units (OTU) at 97% sequence identity, (ii) a de novo clustering using Mothur resulting in 518 246 OTUs, and (iii) a new high-resolution Deblur protocol resulting in 83 908 unique bacterial sequences. Abundance tables, representative sequences, taxonomic classifications, and metadata are provided. This dataset represents a comprehensive resource of sponge-associated microbial communities based on 16S rRNA gene sequences that can be used to address overarching hypotheses regarding host-associated prokaryotes, including host specificity, convergent evolution, environmental drivers of microbiome structure, and the sponge-associated rare biosphere.
Citations
Journal ArticleDOI
TL;DR: This review discusses holobionts as dynamic ecosystems that interact at multiple scales and respond to environmental change, and examines the link between environmental perturbations, dysbiosis, and sponge diseases.
Abstract: The recognition that all macroorganisms live in symbiotic association with microbial communities has opened up a new field in biology. Animals, plants, and algae are now considered holobionts, complex ecosystems consisting of the host, the microbiota, and the interactions among them. Accordingly, ecological concepts can be applied to understand the host-derived and microbial processes that govern the dynamics of the interactive networks within the holobiont. In marine systems, holobionts are further integrated into larger and more complex communities and ecosystems, a concept referred to as “nested ecosystems.” In this review, we discuss the concept of holobionts as dynamic ecosystems that interact at multiple scales and respond to environmental change. We focus on the symbiosis of sponges with their microbial communities—a symbiosis that has resulted in one of the most diverse and complex holobionts in the marine environment. In recent years, the field of sponge microbiology has remarkably advanced in terms of curated databases, standardized protocols, and information on the functions of the microbiota. Like a Russian doll, these microbial processes are translated into sponge holobiont functions that impact the surrounding ecosystem. For example, the sponge-associated microbial metabolisms, fueled by the high filtering capacity of the sponge host, substantially affect the biogeochemical cycling of key nutrients like carbon, nitrogen, and phosphorus. Since sponge holobionts are increasingly threatened by anthropogenic stressors that jeopardize the stability of the holobiont ecosystem, we discuss the link between environmental perturbations, dysbiosis, and sponge diseases. Experimental studies suggest that the microbial community composition is tightly linked to holobiont health, but whether dysbiosis is a cause or a consequence of holobiont collapse remains unresolved. Moreover, the potential role of the microbiome in mediating the capacity for holobionts to acclimate and adapt to environmental change is unknown. Future studies should aim to identify the mechanisms underlying holobiont dynamics at multiple scales, from the microbiome to the ecosystem, and develop management strategies to preserve the key functions provided by the sponge holobiont in our present and future oceans.

333 citations


Cites background from "The sponge microbiome project"

  • ...Marine sponges (phylum Porifera) perfectly illustrate the idea of holobionts as ecosystems, given the exceptionally diverse microbial communities housed within them [23, 24]....

    [...]

  • ...The most dominant bacterial symbiont groups belong to the phyla Proteobacteria (mainly Gamma- and Alphaproteobacteria), Actinobacteria, Chloroflexi, Nitrospirae, Cyanobacteria, and candidatus phylum Poribacteria, while Thaumarchaea represents the dominant archaeal group [23, 24]....

    [...]

  • ..., the Global Sponge Microbiome Project) [23, 24]....

    [...]

  • ...A recent publication made use of the Global Sponge Microbiome Project data to further investigate the microbial diversity features of HMA and LMA sponges at large scale by way of a machine learning [227]....

    [...]

  • ...The Global Sponge Microbiome Project, under the umbrella of the Earth Microbiome Project, is a recent collaborative initiative to assess the microbial diversity in sponges from around the world, following standardized protocols [23, 24]....

    [...]

Journal ArticleDOI
21 Mar 2019-eLife
TL;DR: A new analysis based on the UK Biobank, a large, independent dataset, finds that the signals of selection using UKB effect estimates are strongly attenuated or absent and the conclusion of strong polygenic adaptation now lacks support.
Abstract: Several recent papers have reported strong signals of selection on European polygenic height scores. These analyses used height effect estimates from the GIANT consortium and replication studies. Here, we describe a new analysis based on the UK Biobank (UKB), a large, independent dataset. We find that the signals of selection using UKB effect estimates are strongly attenuated or absent. We also provide evidence that previous analyses were confounded by population stratification. Therefore, the conclusion of strong polygenic adaptation now lacks support. Moreover, these discrepancies highlight (1) that methods for correcting for population stratification in GWAS may not always be sufficient for polygenic trait analyses, and (2) that claims of differences in polygenic scores between populations should be treated with caution until these issues are better understood. Editorial note: This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that all the issues have been addressed (see decision letter).

314 citations

Journal ArticleDOI
12 Feb 2019
TL;DR: This work proposes a compositional beta diversity metric rooted in a centered log-ratio transformation and matrix completion called robust Aitchison PCA and demonstrates the benefits of compositional transformations upstream of beta diversity calculations through simulations.
Abstract: The central aims of many host or environmental microbiome studies are to elucidate factors associated with microbial community compositions and to relate microbial features to outcomes. However, these aims are often complicated by difficulties stemming from high-dimensionality, non-normality, sparsity, and the compositional nature of microbiome data sets. A key tool in microbiome analysis is beta diversity, defined by the distances between microbial samples. Many different distance metrics have been proposed, all with varying discriminatory power on data with differing characteristics. Here, we propose a compositional beta diversity metric rooted in a centered log-ratio transformation and matrix completion called robust Aitchison PCA. We demonstrate the benefits of compositional transformations upstream of beta diversity calculations through simulations. Additionally, we demonstrate improved effect size, classification accuracy, and robustness to sequencing depth over the current methods on several decreased sample subsets of real microbiome data sets. Finally, we highlight the ability of this new beta diversity metric to retain the feature loadings linked to sample ordinations revealing salient intercommunity niche feature importance. IMPORTANCE By accounting for the sparse compositional nature of microbiome data sets, robust Aitchison PCA can yield high discriminatory power and salient feature ranking between microbial niches. The software to perform this analysis is available under an open-source license and can be obtained at https://github.com/biocore/DEICODE; additionally, a QIIME 2 plugin is provided to perform this analysis at https://library.qiime2.org/plugins/deicode/.
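The centered log-ratio step named in the abstract above can be illustrated with a minimal sketch. It assumes the "robust" variant takes logs of the nonzero counts only and centers them on their own mean, leaving zeros as missing values; the function name `robust_clr` and the example counts are hypothetical, and the matrix completion step that follows in robust Aitchison PCA is not shown.

```python
import math
from typing import List, Optional


def robust_clr(counts: List[float]) -> List[Optional[float]]:
    """Centered log-ratio over the nonzero entries of one sample.

    Zeros are returned as None (treated as missing) rather than
    being replaced by a pseudocount. This is an illustrative
    assumption, not the DEICODE implementation.
    """
    logs = [math.log(c) for c in counts if c > 0]
    if not logs:
        return [None] * len(counts)
    # Mean of the logs equals the log of the geometric mean of observed counts.
    center = sum(logs) / len(logs)
    return [math.log(c) - center if c > 0 else None for c in counts]


# Hypothetical counts for one sample across four features.
sample = [10, 0, 100, 1000]
transformed = robust_clr(sample)
```

Because the values are centered on the log geometric mean, the observed entries of each transformed sample sum to zero, which is what makes the result independent of sequencing depth.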

259 citations


Cites background from "The sponge microbiome project"

  • ...The first data set is a subset of the Sponge Microbiome Project (sponges) (21), where...

    [...]

Journal ArticleDOI
TL;DR: stLFR represents an easily automatable solution that enables high-quality sequencing, phasing, SV detection, scaffolding, cost-effective diploid de novo genome assembly, and other long DNA sequencing applications.
Abstract: Here, we describe single-tube long fragment read (stLFR), a technology that enables sequencing of data from long DNA molecules using economical second-generation sequencing technology. It is based on adding the same barcode sequence to subfragments of the original long DNA molecule (DNA cobarcoding). To achieve this efficiently, stLFR uses the surface of microbeads to create millions of miniaturized barcoding reactions in a single tube. Using a combinatorial process, up to 3.6 billion unique barcode sequences were generated on beads, enabling practically nonredundant cobarcoding with 50 million barcodes per sample. Using stLFR, we demonstrate efficient unique cobarcoding of more than 8 million 20- to 300-kb genomic DNA fragments. Analysis of the human genome NA12878 with stLFR demonstrated high-quality variant calling and phase block lengths up to N50 34 Mb. We also demonstrate detection of complex structural variants and complete diploid de novo assembly of NA12878. These analyses were all performed using single stLFR libraries, and their construction did not significantly add to the time or cost of whole-genome sequencing (WGS) library preparation. stLFR represents an easily automatable solution that enables high-quality sequencing, phasing, SV detection, scaffolding, cost-effective diploid de novo genome assembly, and other long DNA sequencing applications.

166 citations

Journal ArticleDOI
TL;DR: It is demonstrated how SVs may help in finding causative variants in genome-wide association analysis and the prospect of identifying novel agronomically important alleles that can be utilized to improve cultivated rice is identified.
Abstract: Investigation of large structural variants (SVs) is a challenging yet important task in understanding trait differences in highly repetitive genomes. Combining different bioinformatic approaches for SV detection, we analyzed whole-genome sequencing data from 3000 rice genomes and identified 63 million individual SV calls that grouped into 1.5 million allelic variants. We found enrichment of long SVs in promoters and an excess of shorter variants in 5' UTRs. Across the rice genomes, we identified regions of high SV frequency enriched in stress response genes. We demonstrated how SVs may help in finding causative variants in genome-wide association analysis. These new insights into rice genome biology are valuable for understanding the effects SVs have on gene function, with the prospect of identifying novel agronomically important alleles that can be utilized to improve cultivated rice.

103 citations

References
Journal ArticleDOI
TL;DR: An overview of the analysis pipeline and links to raw data and processed output from the runs with and without denoising are provided.
Abstract: Supplementary Figure 1 Overview of the analysis pipeline. Supplementary Table 1 Details of conventionally raised and conventionalized mouse samples. Supplementary Discussion Expanded discussion of QIIME analyses presented in the main text; Sequencing of 16S rRNA gene amplicons; QIIME analysis notes; Expanded Figure 1 legend; Links to raw data and processed output from the runs with and without denoising.

28,911 citations


"The sponge microbiome project" refers methods in this paper

  • ...Taxonomy was added to the resulting biom table using QIIME [28], RDP classifier [29], and Greengenes v. 13.8 [21]....

    [...]

  • ...Clustering using the EMP standard protocols in QIIME Raw sequences were demultiplexed and quality controlled following the recommendations of Bokulich et al. [16]....

    [...]

  • ...Project name: The Sponge Microbiome Project Project home page: www.spongeemp.com; https://github.com/amnona/SpongeEMP Operating system(s): Unix Programming language: Python and R Other requirements: Python v. 2.7, Biopython v. 1.65, Python 3.5, R v. 3.2.2, Mothur v. 1.37.6, QIIME v. 1.9.1, Deblur License: MIT Any restrictions to use by non-academics: none...

    [...]

  • ...Raw sequences (total of 1.1 billion sequences) were processed and clustered with (i) a standard protocol using QIIME closed-reference picking resulting in 39 543 operational taxonomic units (OTU) at 97% sequence identity, (ii) a de novo clustering using Mothur resulting in 518 246 OTUs, and (iii) a new high-resolution Deblur protocol resulting in 83 908 unique bacterial sequences....

    [...]

  • ...The additional datasets that support the results of this article are available in the GigaScience repository, GigaDB [32] and include an OTU abundance matrix (the output “.shared” file from Mothur, which is tab delimited), an OTU taxonomic classification table (tab delimited text file), an OTU representative sequence FASTA file, a table of samples’ metadata, the biom files from QIIME and Deblur analyses, and the QIIME-generated tree file....

    [...]
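As a rough illustration of working with the tab-delimited OTU abundance matrix mentioned in the last excerpt, the sketch below parses a Mothur-style ".shared" table into a per-sample dictionary. It assumes the usual layout of three leading columns (`label`, `Group`, `numOtus`) followed by one column per OTU; the function name `read_shared`, the sample names, and the counts are hypothetical and are not taken from the dataset.

```python
import csv
import io
from typing import Dict, TextIO


def read_shared(handle: TextIO) -> Dict[str, Dict[str, int]]:
    """Parse a tab-delimited .shared table into {sample: {otu: count}}.

    Assumes columns: label, Group, numOtus, then one column per OTU.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    otu_names = header[3:]
    table: Dict[str, Dict[str, int]] = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        sample = row[1]
        # Ignore empty trailing fields left by a trailing tab.
        table[sample] = {
            otu: int(n) for otu, n in zip(otu_names, row[3:]) if otu and n
        }
    return table


# Hypothetical two-sample table, used only to exercise the parser.
example = (
    "label\tGroup\tnumOtus\tOtu0001\tOtu0002\tOtu0003\n"
    "0.03\tsponge_A\t3\t12\t0\t7\n"
    "0.03\tseawater_B\t3\t1\t40\t0\n"
)
table = read_shared(io.StringIO(example))
```

For the full dataset, with hundreds of thousands of OTU columns, a sparse or streaming representation would likely be a better fit than nested dictionaries.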

Journal ArticleDOI
TL;DR: The extensively curated SILVA taxonomy and the new non-redundant SILVA datasets provide an ideal reference for high-throughput classification of data from next-generation sequencing approaches.
Abstract: SILVA (from Latin silva, forest, http://www.arb-silva.de) is a comprehensive web resource for up to date, quality-controlled databases of aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains and supplementary online services. The referred database release 111 (July 2012) contains 3 194 778 small subunit and 288 717 large subunit rRNA gene sequences. Since the initial description of the project, substantial new features have been introduced, including advanced quality control procedures, an improved rRNA gene aligner, online tools for probe and primer evaluation and optimized browsing, searching and downloading on the website. Furthermore, the extensively curated SILVA taxonomy and the new non-redundant SILVA datasets provide an ideal reference for high-throughput classification of data from next-generation sequencing approaches.

18,256 citations


Additional excerpts

  • ...123 database (SILVA, RRID:SCR_006423) [20]....

    [...]

Journal ArticleDOI
TL;DR: mothur is used as a case study to trim, screen, and align sequences; calculate distances; assign sequences to operational taxonomic units; and describe the α and β diversity of eight marine samples previously characterized by pyrosequencing of 16S rRNA gene fragments.
Abstract: mothur aims to be a comprehensive software package that allows users to use a single piece of software to analyze community sequence data. It builds upon previous tools to provide a flexible and powerful software package for analyzing sequencing data. As a case study, we used mothur to trim, screen, and align sequences; calculate distances; assign sequences to operational taxonomic units; and describe the alpha and beta diversity of eight marine samples previously characterized by pyrosequencing of 16S rRNA gene fragments. This analysis of more than 222,000 sequences was completed in less than 2 h with a laptop computer.

17,350 citations


Additional excerpts

  • ...6 (Mothur, RRID:SCR_011947) [18] and Python v....

    [...]

Journal ArticleDOI
TL;DR: The RDP Classifier can rapidly and accurately classify bacterial 16S rRNA sequences into the new higher-order taxonomy proposed in Bergey's Taxonomic Outline of the Prokaryotes, and the majority of the classification errors appear to be due to anomalies in the current taxonomies.
Abstract: The Ribosomal Database Project (RDP) Classifier, a naive Bayesian classifier, can rapidly and accurately classify bacterial 16S rRNA sequences into the new higher-order taxonomy proposed in Bergey's Taxonomic Outline of the Prokaryotes (2nd ed., release 5.0, Springer-Verlag, New York, NY, 2004). It provides taxonomic assignments from domain to genus, with confidence estimates for each assignment. The majority of classifications (98%) were of high estimated confidence (≥95%) and high accuracy (98%). In addition to being tested with the corpus of 5,014 type strain sequences from Bergey's outline, the RDP Classifier was tested with a corpus of 23,095 rRNA sequences as assigned by the NCBI into their alternative higher-order taxonomy. The results from leave-one-out testing on both corpora show that the overall accuracies at all levels of confidence for near-full-length and 400-base segments were 89% or above down to the genus level, and the majority of the classification errors appear to be due to anomalies in the current taxonomies. For shorter rRNA segments, such as those that might be generated by pyrosequencing, the error rate varied greatly over the length of the 16S rRNA gene, with segments around the V2 and V4 variable regions giving the lowest error rates. The RDP Classifier is suitable both for the analysis of single rRNA sequences and for the analysis of libraries of thousands of sequences. Another related tool, RDP Library Compare, was developed to facilitate microbial-community comparison based on 16S rRNA gene sequence libraries. It combines the RDP Classifier with a statistical test to flag taxa differentially represented between samples. The RDP Classifier and RDP Library Compare are available online at http://rdp.cme.msu.edu/.

16,048 citations


"The sponge microbiome project" refers methods in this paper

  • ...Taxonomy was added to the resulting biom table using QIIME [28], RDP classifier [29], and Greengenes v....

    [...]

Journal ArticleDOI
TL;DR: UCHIME has better sensitivity than ChimeraSlayer (previously the most sensitive database method), especially with short, noisy sequences, and in testing on artificial bacterial communities with known composition, UCHIME de novo sensitivity is shown to be comparable to Perseus.
Abstract: Motivation: Chimeric DNA sequences often form during polymerase chain reaction amplification, especially when sequencing single regions (e.g. 16S rRNA or fungal Internal Transcribed Spacer) to assess diversity or compare populations. Undetected chimeras may be misinterpreted as novel species, causing inflated estimates of diversity and spurious inferences of differences between populations. Detection and removal of chimeras is therefore of critical importance in such experiments. Results: We describe UCHIME, a new program that detects chimeric sequences with two or more segments. UCHIME either uses a database of chimera-free sequences or detects chimeras de novo by exploiting abundance data. UCHIME has better sensitivity than ChimeraSlayer (previously the most sensitive database method), especially with short, noisy sequences. In testing on artificial bacterial communities with known composition, UCHIME de novo sensitivity is shown to be comparable to Perseus. UCHIME is >100× faster than Perseus and >1000× faster than ChimeraSlayer. Availability: Source, binaries and data: http://drive5.com/uchime. Supplementary information: Supplementary data are available at Bioinformatics online.

11,904 citations


"The sponge microbiome project" refers methods in this paper

  • ...Chimeras were identified with UCHIME (UCHIME, RRID:SCR_008057) [23] and removed....

    [...]
