Genome sequence-based species delimitation with confidence intervals and improved distance functions
Abstract:
For the last 25 years, species delimitation in prokaryotes (Archaea and Bacteria) was to a large extent based on DNA-DNA hybridization (DDH), a tedious lab procedure designed in the early 1970s that served its purpose astonishingly well in the absence of deciphered genome sequences. With the rapid progress in genome sequencing, the time has come to directly use the now available and easy-to-generate genome sequences for the delimitation of species. GBDP (Genome Blast Distance Phylogeny) infers genome-to-genome distances between pairs of entirely or partially sequenced genomes, providing a digital, highly reliable estimator for the relatedness of genomes. Its application as an in-silico replacement for DDH was recently introduced. The main challenge in implementing such an application is to produce digital DDH values that mimic the wet-lab DDH values as closely as possible, to ensure consistency in the prokaryotic species concept. Correlation and regression analyses were used to determine the best-performing methods and the most influential parameters. GBDP was further enriched with a set of new features: confidence intervals for intergenomic distances, obtained either via resampling or via the statistical models for DDH prediction, and an additional family of distance functions. As in previous analyses, GBDP obtained the highest agreement with wet-lab DDH among all tested methods, and improved models led to a further increase in the accuracy of DDH prediction. Confidence intervals yielded stable results when inferred from the statistical models, whereas those obtained via resampling showed marked differences between the underlying distance functions. Despite the high accuracy of GBDP-based DDH prediction, inferences from limited empirical data are always associated with a certain degree of uncertainty. It is thus crucial to enrich in-silico DDH replacements with confidence-interval estimation, enabling the user to statistically evaluate the outcomes.
Such methodological advancements, easily accessible through the web service at http://ggdc.dsmz.de, are crucial steps towards a consistent and truly genome sequence-based classification of microorganisms.
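
The resampling idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the GBDP implementation: the distance used here (one minus identical positions over aligned length), the choice of alignment segments as the resampling unit, the percentile interval, and the function names `hsp_distance` and `bootstrap_ci` are all assumptions made for illustration, and the input values are made up.

```python
import random


def hsp_distance(hsps):
    """Simplified distance: 1 - identical positions / aligned length.

    `hsps` is a list of (identities, length) pairs, one per aligned segment.
    This formula is an illustrative assumption, not a GBDP distance function.
    """
    total_len = sum(length for _, length in hsps)
    if total_len == 0:
        raise ValueError("no aligned positions")
    total_id = sum(ident for ident, _ in hsps)
    return 1.0 - total_id / total_len


def bootstrap_ci(hsps, n_replicates=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the distance, resampling segments."""
    rng = random.Random(seed)
    n = len(hsps)
    # Each replicate draws n segments with replacement and recomputes the distance.
    dists = sorted(
        hsp_distance([hsps[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_replicates)
    )
    lower = dists[int((alpha / 2) * n_replicates)]
    upper = dists[min(n_replicates - 1, int((1 - alpha / 2) * n_replicates))]
    return lower, upper


# Hypothetical (identities, length) pairs; not real genome data.
hsps = [(950, 1000), (1800, 2000), (470, 500), (2700, 3000), (880, 1000)]
point = hsp_distance(hsps)
low, high = bootstrap_ci(hsps)
print(f"distance = {point:.4f}, 95% interval = [{low:.4f}, {high:.4f}]")
```

The width of the interval depends on how much the per-segment identities vary, which is one plausible reason why resampling-based intervals could differ between distance functions.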