MetaPalette: a k-mer Painting Approach for Metagenomic Taxonomic Profiling and Quantification of Novel Strain Variation.
David Koslicki, Daniel Falush
Vol. 1, Iss. 3
TLDR
The algorithm MetaPalette is presented, which uses long k-mer sizes to fit a k-mer “palette” of a given sample to the k-mer palette of reference organisms, and returns a traditional, fixed-rank taxonomic profile which is shown on independently simulated data to be one of the most accurate to date.
Abstract:
Metagenomic profiling is challenging in part because of the highly uneven sampling of the tree of life by genome sequencing projects and the limitations imposed by performing phylogenetic inference at fixed taxonomic ranks. We present the algorithm MetaPalette, which uses long k-mer sizes (k = 30, 50) to fit a k-mer “palette” of a given sample to the k-mer palette of reference organisms. By modeling the k-mer palettes of unknown organisms, the method also gives an indication of the presence, abundance, and evolutionary relatedness of novel organisms present in the sample. The method returns a traditional, fixed-rank taxonomic profile which is shown on independently simulated data to be one of the most accurate to date. Tree figures are also returned that quantify the relatedness of novel organisms to reference sequences, and the accuracy of such figures is demonstrated on simulated spike-ins and a metagenomic soil sample. The software implementing MetaPalette is available at: https://github.com/dkoslicki/MetaPalette. Pretrained databases are included for Archaea, Bacteria, Eukaryota, and viruses.
IMPORTANCE Taxonomic profiling is a challenging first step when analyzing a metagenomic sample. This work presents a method that facilitates fine-scale characterization of the presence, abundance, and evolutionary relatedness of organisms present in a given sample but absent from the training database. We calculate a “k-mer palette” which summarizes the information from all reads, not just those in conserved genes or containing taxon-specific markers. The compositions of palettes are easy to model, allowing rapid inference of community composition. In addition to providing strain-level information where applicable, our approach provides taxonomic profiles that are more accurate than those of competing methods.
Author Video: An author video summary of this article is available.
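The general idea of fitting a sample's k-mer palette to reference palettes can be illustrated with a toy sketch. This is not MetaPalette's actual model: the function names `kmer_profile` and `fit_abundances`, the two made-up sequences, the short k = 6, and the use of plain projected-gradient non-negative least squares are all assumptions made for illustration, and the modeling of unknown organisms is not attempted.

```python
from collections import Counter

def kmer_profile(sequences, k):
    """Return the normalized k-mer frequencies pooled over the given sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

def fit_abundances(sample, references, iters=500):
    """Fit sample ~ sum_j x_j * references[j] with x_j >= 0 (projected gradient)."""
    names = list(references)
    kmers = sorted(set(sample).union(*(references[n] for n in names)))
    # Rows are k-mers, columns are reference organisms.
    A = [[references[n].get(km, 0.0) for n in names] for km in kmers]
    y = [sample.get(km, 0.0) for km in kmers]
    # 1 / trace(A^T A) is a safe step size: the trace bounds the largest eigenvalue.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [1.0 / len(names)] * len(names)
    for _ in range(iters):
        residual = [sum(a * xj for a, xj in zip(row, x)) - yi
                    for row, yi in zip(A, y)]
        grad = [sum(row[j] * r for row, r in zip(A, residual))
                for j in range(len(names))]
        # Gradient step followed by projection onto x >= 0.
        x = [max(0.0, xj - step * g) for xj, g in zip(x, grad)]
    total = sum(x)
    return {n: (xj / total if total > 0 else 0.0) for n, xj in zip(names, x)}

# Hypothetical toy data: two made-up "genomes" of equal length.
K = 6  # far shorter than the k = 30, 50 used by MetaPalette
genomes = {
    "refA": "ACGTTGCATGCCGATAGCTAAGCTTGACCA",
    "refB": "TTGACGGATCCGTATTACGGCAGTCAGGTC",
}
references = {name: kmer_profile([seq], K) for name, seq in genomes.items()}

# Simulated sample: three copies of refA for every copy of refB.
sample_reads = [genomes["refA"]] * 3 + [genomes["refB"]]
estimate = fit_abundances(kmer_profile(sample_reads, K), references)
print({name: round(value, 3) for name, value in estimate.items()})
```

Because the simulated sample is an exact 3:1 mixture of the two references, the fitted weights should recover that ratio. Real reads are noisy, references share k-mers, and organisms absent from the database contribute k-mers that no reference explains, which is the situation the abstract's modeling of unknown organisms addresses.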