Open Access Journal Article

Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life

TLDR
The recovery of 7,903 bacterial and archaeal metagenome-assembled genomes increases the phylogenetic diversity represented by public genome repositories and provides the first representatives from 20 candidate phyla.
Abstract
Challenges in cultivating microorganisms have limited the phylogenetic diversity of currently available microbial genomes. This is being addressed by advances in sequencing throughput and computational techniques that allow for the cultivation-independent recovery of genomes from metagenomes. Here, we report the reconstruction of 7,903 bacterial and archaeal genomes from >1,500 public metagenomes. All genomes are estimated to be ≥50% complete and nearly half are ≥90% complete with ≤5% contamination. These genomes increase the phylogenetic diversity of bacterial and archaeal genome trees by >30% and provide the first representatives of 17 bacterial and three archaeal candidate phyla. We also recovered 245 genomes from the Patescibacteria superphylum (also known as the Candidate Phyla Radiation) and find that the relative diversity of this group varies substantially with different protein marker sets. The scale and quality of this data set demonstrate that recovering genomes from metagenomes provides an expedient path forward to exploring microbial dark matter.
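The quality thresholds quoted in the abstract (≥50% completeness for inclusion; ≥90% completeness with ≤5% contamination for the higher-quality subset) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `GenomeEstimate` type, the `quality_tier` function, the tier labels, and the example values are all hypothetical and serve only to show how such cutoffs partition a set of genomes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeEstimate:
    """Hypothetical record holding quality estimates for one recovered genome."""
    name: str
    completeness: float   # estimated completeness, in percent
    contamination: float  # estimated contamination, in percent


def quality_tier(genome: GenomeEstimate) -> str:
    """Assign an illustrative tier using the cutoffs quoted in the abstract.

    The tier names are assumptions made for this sketch, not terms
    taken from the paper.
    """
    if genome.completeness >= 90.0 and genome.contamination <= 5.0:
        return "near-complete"
    if genome.completeness >= 50.0:
        return "partial"
    return "below-threshold"


def tier_counts(genomes: list[GenomeEstimate]) -> Counter:
    """Count how many genomes fall into each illustrative tier."""
    return Counter(quality_tier(g) for g in genomes)


if __name__ == "__main__":
    # Made-up example values, not data from the study.
    examples = [
        GenomeEstimate("bin_001", 96.2, 1.4),
        GenomeEstimate("bin_002", 91.0, 7.5),
        GenomeEstimate("bin_003", 62.3, 3.0),
        GenomeEstimate("bin_004", 41.8, 0.9),
    ]
    for genome in examples:
        print(genome.name, quality_tier(genome))
    print(dict(tier_counts(examples)))
```

In this sketch a genome that is ≥90% complete but exceeds 5% contamination falls back to the "partial" tier, which is one possible reading of the thresholds; the abstract does not state an upper contamination limit for the ≥50% group.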


Citations
Journal Article

Rich Repertoire of Quorum Sensing Protein Coding Sequences in CPR and DPANN Associated with Interspecies and Interkingdom Communication.

TL;DR: An in silico analysis on 2,597 CPR/DPANN genomes was conducted to test whether these ultrasmall microorganisms might encode homologs of reference proteins involved in the synthesis and/or the detection of 26 different types of communication molecules (quorum sensing [QS] signals), since QS signals are well-known mediators of intra- and interorganismic relationships.
Journal Article

Amino acid based de Bruijn graph algorithm for identifying complete coding genes from metagenomic and metatranscriptomic short reads

TL;DR: Application of MetaPA on metatranscriptomic data successfully identifies the majority of actively transcribed genes validated in related studies and suggests that MetaPA has good potential in both metagenomic and metatranscriptomic studies to characterize the composition and abundance of microbiota.
Posted Content

Microbial diversity in tropical marine sediments assessed using culture-dependent and culture-independent techniques

TL;DR: In this paper, the authors describe microbial diversity in marine sediments using both culture-dependent and culture-independent approaches, showing that rare taxa play an important role in distinguishing microbial communities at different sites.
Posted Content

Analysis procedures for assessing recovery of high quality, complete, closed genomes from Nanopore long read metagenome sequencing

TL;DR: The findings further establish the feasibility of long-read metagenome-assembled genome recovery, and demonstrate the utility of parallel sampling of moderately complex enrichment communities for recovery of genomes of key functional species relevant to the study of complex wastewater treatment bioprocesses.
Posted Content

Machine-learning classification suggests that many alphaproteobacterial prophages may instead be gene transfer agents

TL;DR: A ‘support vector machine’ classifier is reported that quickly and accurately distinguishes RcGTA-like genes from their viral homologs by capturing the differences in the amino acid composition of the encoded proteins.
References
Journal Article

Fast and accurate short read alignment with Burrows–Wheeler transform

TL;DR: Burrows-Wheeler Alignment tool (BWA) is implemented, a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Journal Article

BLAST+: architecture and applications.

TL;DR: The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome length database sequences.
Journal Article

tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence.

TL;DR: A program is described, tRNAscan-SE, which identifies 99-100% of transfer RNA genes in DNA sequence while giving less than one false positive per 15 gigabases.
Journal Article

Database resources of the National Center for Biotechnology Information

TL;DR: In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI’s website.