Author

Michael J. Wilkins

Bio: Michael J. Wilkins is an academic researcher at Colorado State University. The author has contributed to research on topics including Geobacter and metagenomics, has an h-index of 39, and has co-authored 105 publications that have received 5,924 citations. Previous affiliations include Pacific Northwest National Laboratory and Ohio State University.


Papers
Journal Article
09 Jul 2015-Nature
TL;DR: This work reconstructed 8 complete and 789 draft genomes from bacteria representing >35 phyla, documented features that consistently distinguish these organisms from other bacteria, inferred that this group, which may comprise >15% of the bacterial domain, has a shared evolutionary history, and described it as the candidate phyla radiation (CPR).
Abstract: A prominent feature of the bacterial domain is a radiation of major lineages that are defined as candidate phyla because they lack isolated representatives. Bacteria from these phyla occur in diverse environments and are thought to mediate carbon and hydrogen cycles. Genomic analyses of a few representatives suggested that metabolic limitations have prevented their cultivation. Here we reconstructed 8 complete and 789 draft genomes from bacteria representing >35 phyla and documented features that consistently distinguish these organisms from other bacteria. We infer that this group, which may comprise >15% of the bacterial domain, has shared evolutionary history, and describe it as the candidate phyla radiation (CPR). All CPR genomes are small and most lack numerous biosynthetic pathways. Owing to divergent 16S ribosomal RNA (rRNA) gene sequences, 50-100% of organisms sampled from specific phyla would evade detection in typical cultivation-independent surveys. CPR organisms often have self-splicing introns and proteins encoded within their rRNA genes, a feature rarely reported in bacteria. Furthermore, they have unusual ribosome compositions. All are missing a ribosomal protein often absent in symbionts, and specific lineages are missing ribosomal proteins and biogenesis factors considered universal in bacteria. This implies different ribosome structures and biogenesis mechanisms, and underlines unusual biology across a large part of the bacterial domain.

923 citations

Journal Article
TL;DR: This work applies terabase-scale cultivation-independent metagenomics to aquifer sediments and groundwater, reconstructs 2,540 draft-quality, near-complete and complete strain-resolved genomes, and finds that few organisms within the community can conduct multiple sequential redox transformations.
Abstract: The subterranean world hosts up to one-fifth of all biomass, including microbial communities that drive transformations central to Earth’s biogeochemical cycles. However, little is known about how complex microbial communities in such environments are structured, and how inter-organism interactions shape ecosystem function. Here we apply terabase-scale cultivation-independent metagenomics to aquifer sediments and groundwater, and reconstruct 2,540 draft-quality, near-complete and complete strain-resolved genomes that represent the majority of known bacterial phyla as well as 47 newly discovered phylum-level lineages. Metabolic analyses spanning this vast phylogenetic diversity and representing up to 36% of organisms detected in the system are used to document the distribution of pathways in coexisting organisms. Consistent with prior findings indicating metabolic handoffs in simple consortia, we find that few organisms within the community can conduct multiple sequential redox transformations. As environmental conditions change, different assemblages of organisms are selected for, altering linkages among the major biogeochemical cycles. Microorganisms from the terrestrial subsurface are understudied. Here, Anantharaman et al. analyse aquifer sediments and groundwater by genome-resolved metagenomics and reconstruct 2,540 genomes representing the majority of known bacterial phyla as well as 47 new phylum-level lineages.

845 citations

Journal Article
28 Sep 2012-Science
TL;DR: This article uncovered metabolic characteristics for members of the BD1-5, OP11, and OD1 phyla, and a new lineage, PER, via cultivation-independent recovery of 49 partial to near-complete genomes from an acetate-amended aquifer.
Abstract: BD1-5, OP11, and OD1 bacteria have been widely detected in anaerobic environments, but their metabolisms remain unclear owing to lack of cultivated representatives and minimal genomic sampling. We uncovered metabolic characteristics for members of these phyla, and a new lineage, PER, via cultivation-independent recovery of 49 partial to near-complete genomes from an acetate-amended aquifer. All organisms were nonrespiring anaerobes predicted to ferment. Three augment fermentation with archaeal-like hybrid type II/III ribulose-1,5-bisphosphate carboxylase-oxygenase (RuBisCO) that couples adenosine monophosphate salvage with CO2 fixation, a pathway not previously described in Bacteria. Members of OD1 reduce sulfur and may pump protons using archaeal-type hydrogenases. For six organisms, the UGA stop codon is translated as tryptophan. All bacteria studied here may play previously unrecognized roles in hydrogen production, sulfur cycling, and fermentation of refractory sedimentary carbon.

578 citations

Journal Article
TL;DR: This study sequenced DNA from complex sediment and planktonic consortia from an aquifer adjacent to the Colorado River and reconstructed the first complete genomes for Archaea using cultivation-independent methods, which dramatically expand genomic sampling of the domain Archaea and clarify taxonomic designations within a major superphylum.

463 citations

Journal Article
TL;DR: In this paper, the coupling among groundwater-surface water mixing, microbial communities and biogeochemistry in the hyporheic zone was investigated using ecological theory, aqueous biogeochemistry, DNA sequencing and ultra-high-resolution organic carbon profiling.
Abstract: Environmental transitions often result in resource mixtures that overcome limitations to microbial metabolism, resulting in biogeochemical hotspots and moments. Riverine systems, where groundwater mixes with surface water (the hyporheic zone), are spatially complex and temporally dynamic, making development of predictive models challenging. Spatial and temporal variations in hyporheic zone microbial communities are a key, but understudied, component of riverine biogeochemical function. Here, to investigate the coupling among groundwater–surface water mixing, microbial communities and biogeochemistry, we apply ecological theory, aqueous biogeochemistry, DNA sequencing and ultra-high-resolution organic carbon profiling to field samples collected across times and locations representing a broad range of mixing conditions. Our results indicate that groundwater–surface water mixing in the hyporheic zone stimulates heterotrophic respiration, alters organic carbon composition, causes ecological processes to shift from stochastic to deterministic and is associated with elevated abundances of microbial taxa that may degrade a broad suite of organic compounds. Groundwater-surface water mixing zones link critical ecosystem domains, but attendant microbe-biogeochemistry-hydrology interactions are poorly known. Here, the authors show that groundwater-surface water mixing stimulates respiration, alters carbon composition, and shifts the ecology from stochastic to deterministic.

249 citations


Cited by
01 Jun 2012
TL;DR: This paper describes SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrates that it improves on the recently released E+V-SC assembler and on the popular assemblers Velvet and SoapDeNovo (for multicell data).
Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online ( http://bioinf.spbau.ru/spades ). It is distributed as open source software.

10,124 citations

Journal Article
TL;DR: An objective measure of genome quality is proposed that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities and is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches.
Abstract: Large-scale recovery of genomes from isolates, single cells, and metagenomic data has been made possible by advances in computational methods and substantial reductions in sequencing costs. Although this increasing breadth of draft genomes is providing key information regarding the evolutionary and functional diversity of microbial life, it has become impractical to finish all available reference genomes. Making robust biological inferences from draft genomes requires accurate estimates of their completeness and contamination. Current methods for assessing genome quality are ad hoc and generally make use of a limited number of “marker” genes conserved across all bacterial or archaeal genomes. Here we introduce CheckM, an automated method for assessing the quality of a genome using a broader set of marker genes specific to the position of a genome within a reference genome tree and information about the collocation of these genes. We demonstrate the effectiveness of CheckM using synthetic data and a wide range of isolate-, single-cell-, and metagenome-derived genomes. CheckM is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches. Using CheckM, we identify a diverse range of errors currently impacting publicly available isolate genomes and demonstrate that genomes obtained from single cells and metagenomic data vary substantially in quality. In order to facilitate the use of draft genomes, we propose an objective measure of genome quality that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities.

5,788 citations

Journal Article
TL;DR: FastTree stores sequence profiles of internal nodes in the tree, uses them to implement neighbor-joining with heuristics that quickly identify candidate joins, and then uses nearest-neighbor interchanges to reduce the length of the tree.
Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement neighbor-joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest-neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and an alphabet of a different characters, a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O(NLa + N sqrt(N)) memory and O(N sqrt(N) log(N) L a) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 hours and 2.4 gigabytes of memory. Just computing pairwise Jukes-Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 hours and 50 gigabytes of memory. In simulations, FastTree was slightly more accurate than neighbor joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.

2,436 citations
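
The memory figures quoted in the FastTree abstract can be sanity-checked with a little arithmetic. The sketch below is only illustrative: it assumes a distance matrix stored as 4-byte floats with one entry per unordered pair of sequences, neither of which is stated in the abstract, and the function name is hypothetical. Under those assumptions, 158,022 sequences come out at roughly the 50 gigabytes the abstract mentions.

```python
import math


def distance_matrix_bytes(n_seqs: int, bytes_per_entry: int = 4) -> int:
    """Bytes needed to store one distance per unordered pair of sequences.

    bytes_per_entry=4 (single-precision float) is an assumption made for
    this illustration, not a detail taken from the FastTree abstract.
    """
    n_pairs = n_seqs * (n_seqs - 1) // 2
    return n_pairs * bytes_per_entry


n = 158_022  # number of distinct 16S rRNA sequences quoted in the abstract
matrix_gb = distance_matrix_bytes(n) / 1e9

# Ratio between the O(N^2) matrix term and FastTree's N*sqrt(N) term;
# algebraically this is just sqrt(N).
growth_ratio = (n ** 2) / (n * math.sqrt(n))

print(f"pairwise distance matrix: about {matrix_gb:.1f} GB")
print(f"N^2 / (N sqrt(N)) for N = {n}: about {growth_ratio:.0f}")
```

The N L a term of FastTree's memory bound is left out because the alignment length L for this data set is not given in the abstract.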

Journal Article
TL;DR: This work used a concatenated protein phylogeny as the basis for a bacterial taxonomy that conservatively removes polyphyletic groups and normalizes taxonomic ranks on the basis of relative evolutionary divergence.
Abstract: Taxonomy is an organizing principle of biology and is ideally based on evolutionary relationships among organisms. Development of a robust bacterial taxonomy has been hindered by an inability to obtain most bacteria in pure culture and, to a lesser extent, by the historical use of phenotypes to guide classification. Culture-independent sequencing technologies have matured sufficiently that a comprehensive genome-based taxonomy is now possible. We used a concatenated protein phylogeny as the basis for a bacterial taxonomy that conservatively removes polyphyletic groups and normalizes taxonomic ranks on the basis of relative evolutionary divergence. Under this approach, 58% of the 94,759 genomes comprising the Genome Taxonomy Database had changes to their existing taxonomy. This result includes the description of 99 phyla, including six major monophyletic units from the subdivision of the Proteobacteria, and amalgamation of the Candidate Phyla Radiation into a single phylum. Our taxonomy should enable improved classification of uncultured bacteria and provide a sound basis for ecological and evolutionary studies.

2,098 citations

Journal Article
TL;DR: The accuracy of the GTDB-Tk taxonomic assignments is demonstrated by evaluating its performance on a phylogenetically diverse set of 10,156 bacterial and archaeal metagenome-assembled genomes.
Abstract: Summary: The Genome Taxonomy Database Toolkit (GTDB-Tk) provides objective taxonomic assignments for bacterial and archaeal genomes based on the GTDB. GTDB-Tk is computationally efficient and able to classify thousands of draft genomes in parallel. Here we demonstrate the accuracy of the GTDB-Tk taxonomic assignments by evaluating its performance on a phylogenetically diverse set of 10,156 bacterial and archaeal metagenome-assembled genomes.

2,053 citations