Author

Jan P. Meier-Kolthoff

Other affiliations: Leibniz Institute for Neurobiology, Deutsche Sammlung von Mikroorganismen und Zellkulturen, University of Tübingen

Bio: Jan P. Meier-Kolthoff is an academic researcher from the Leibniz Association. The author has contributed to research on the topics Genome and Phylogenetic tree, has an h-index of 26, and has co-authored 64 publications that have received 7,576 citations. Previous affiliations of Jan P. Meier-Kolthoff include the Leibniz Institute for Neurobiology and the Deutsche Sammlung von Mikroorganismen und Zellkulturen.

Topics: Genome, Phylogenetic tree, Monophyly, Genomics, Phylogenetics

Papers published on a yearly basis (chart not reproduced; years shown: 2007, 2009, 2013–2023)

Papers

Journal Article

Genome sequence-based species delimitation with confidence intervals and improved distance functions

Jan P. Meier-Kolthoff¹, Alexander F. Auch², Hans-Peter Klenk¹, Markus Göker¹

Leibniz Association¹, University of Tübingen²

21 Feb 2013-BMC Bioinformatics

TL;DR: Despite the high accuracy of GBDP-based DDH prediction, inferences from limited empirical data are always associated with a certain degree of uncertainty, so it is crucial to enrich in-silico DDH replacements with confidence-interval estimation, enabling the user to statistically evaluate the outcomes.

Abstract: For the last 25 years, species delimitation in prokaryotes (Archaea and Bacteria) was to a large extent based on DNA-DNA hybridization (DDH), a tedious lab procedure designed in the early 1970s that served its purpose astonishingly well in the absence of deciphered genome sequences. With the rapid progress in genome sequencing, the time has come to directly use the now available and easy-to-generate genome sequences for the delimitation of species. GBDP (Genome Blast Distance Phylogeny) infers genome-to-genome distances between pairs of entirely or partially sequenced genomes, a digital, highly reliable estimator for the relatedness of genomes. Its application as an in-silico replacement for DDH was recently introduced. The main challenge in the implementation of such an application is to produce digital DDH values that mimic the wet-lab DDH values as closely as possible to ensure consistency in the prokaryotic species concept. Correlation and regression analyses were used to determine the best-performing methods and the most influential parameters. GBDP was further enriched with a set of new features such as confidence intervals for intergenomic distances obtained via resampling or via the statistical models for DDH prediction and an additional family of distance functions. As in previous analyses, GBDP obtained the highest agreement with wet-lab DDH among all tested methods, but improved models led to a further increase in the accuracy of DDH prediction. Confidence intervals yielded stable results when inferred from the statistical models, whereas those obtained via resampling showed marked differences between the underlying distance functions. Despite the high accuracy of GBDP-based DDH prediction, inferences from limited empirical data are always associated with a certain degree of uncertainty. It is thus crucial to enrich in-silico DDH replacements with confidence-interval estimation, enabling the user to statistically evaluate the outcomes. Such methodological advancements, easily accessible through the web service at http://ggdc.dsmz.de, are crucial steps towards a consistent and truly genome sequence-based classification of microorganisms.

4,411 citations

Journal Article

TYGS is an automated high-throughput platform for state-of-the-art genome-based taxonomy

Jan P. Meier-Kolthoff¹, Markus Göker¹

Leibniz Association¹

16 May 2019-Nature Communications

TL;DR: TYGS, the Type (Strain) Genome Server, a user-friendly high-throughput web server for genome-based prokaryote taxonomy and analysis connected to a large, continuously growing database of genomic, taxonomic and nomenclatural information.

Abstract: Microbial taxonomy is increasingly influenced by genome-based computational methods. Yet such analyses can be complex and require expert knowledge. Here we introduce TYGS, the Type (Strain) Genome Server, a user-friendly high-throughput web server for genome-based prokaryote taxonomy, connected to a large, continuously growing database of genomic, taxonomic and nomenclatural information. It infers genome-scale phylogenies and state-of-the-art estimates for species and subspecies boundaries from user-defined and automatically determined closest type genome sequences. TYGS also provides comprehensive access to nomenclature, synonymy and associated taxonomic literature. Clinically important examples demonstrate how TYGS can yield new insights into microbial classification, such as evidence for a species-level separation of previously proposed subspecies of Salmonella enterica. TYGS is an integrated approach for the classification of microbes that unlocks novel scientific approaches to microbiologists worldwide and is particularly helpful for the rapidly expanding field of genome-based taxonomic descriptions of new genera, species or subspecies.

1,202 citations

Journal Article

List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ

Aidan Parte¹, Joaquim Sardà Carbasse¹, Jan P. Meier-Kolthoff¹, L.C. Reimer¹, Markus Göker¹

Leibniz Association¹

23 Jul 2020-International Journal of Systematic and Evolutionary Microbiology

TL;DR: The LPSN was acquired in November 2019 by the DSMZ and was relaunched using an entirely new production system in February 2020, with new features described in detail.

Abstract: The List of Prokaryotic names with Standing in Nomenclature (LPSN) was acquired in November 2019 by the DSMZ and was relaunched using an entirely new production system in February 2020. This article describes in detail the structure of the new site, navigation, page layout, search facilities and new features.

715 citations

Journal Article

Taxonomic use of DNA G+C content and DNA-DNA hybridization in the genomic age

Jan P. Meier-Kolthoff¹, Hans-Peter Klenk¹, Markus Göker¹

Leibniz Association¹

01 Feb 2014-International Journal of Systematic and Evolutionary Microbiology

TL;DR: It is suggested that discrepancies of ≥1 % between G+C content data provided in species descriptions and those recalculated after genome sequencing are due to significant inaccuracies of the applied conventional methods and accordingly call for emendations of species descriptions.

Abstract: The G+C content of a genome is frequently used in taxonomic descriptions of species and genera. In the past it has been determined using conventional, indirect methods, but it is nowadays reasonable to calculate the DNA G+C content directly from the increasingly available and affordable genome sequences. The expected increase in accuracy, however, might alter the way in which the G+C content is used for drawing taxonomic conclusions. We here re-estimate the literature assumption that the G+C content can vary up to 3–5 % within species using genomic datasets. The resulting G+C content differences are compared with DNA–DNA hybridization (DDH) similarities calculated in silico using the GGDC web server, with 70 % similarity as the gold standard threshold for species boundaries. The results indicate that the G+C content, if computed from genome sequences, varies no more than 1 % within species. Statistical models based on larger differences alone can reject the hypothesis that two strains belong to the same species. Because DDH similarities between two non-type strains occur in the genomic datasets, we also examine to what extent and under which conditions such a similarity could be <70 % even though the similarity of either strain to a type strain was ≥70 %. In theory, their similarity could be as low as 50 %, whereas empirical data suggest a boundary closer (but not identical) to 70 %. However, it is shown that using a 50 % boundary would not affect the conclusions regarding the DNA G+C content. Hence, we suggest that discrepancies between G+C content data provided in species descriptions on the one hand and those recalculated after genome sequencing on the other hand ≥1 % are due to significant inaccuracies of the applied conventional methods and accordingly call for emendations of species descriptions.
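The direct calculation of G+C content from a genome sequence that the abstract refers to can be illustrated with a minimal Python sketch. The function names, the choice to ignore ambiguous bases such as N, and the simple cutoff comparison are assumptions made for illustration; the statistical models used in the paper are not reproduced here. Only the 1 % figure is taken from the abstract.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C content of a DNA sequence in percent.

    Assumption: only unambiguous bases (A, C, G, T) are counted;
    ambiguity codes such as N are ignored.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / total


def gc_difference_exceeds(seq_a: str, seq_b: str, threshold: float = 1.0) -> bool:
    """Check whether two sequences differ in G+C content by more than
    `threshold` percentage points (1 % is the within-species bound
    reported in the abstract; the plain cutoff is a simplification)."""
    return abs(gc_content(seq_a) - gc_content(seq_b)) > threshold


if __name__ == "__main__":
    # Toy sequences, not real genomes.
    print(gc_content("GGCCAATT"))
    print(gc_difference_exceeds("GGCC", "AATT"))
```

For real genomes, the sequences of all contigs or replicons would be concatenated (or their base counts summed) before the percentage is computed.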

440 citations

Journal Article

When should a DDH experiment be mandatory in microbial taxonomy?

Jan P. Meier-Kolthoff¹, Markus Göker¹, Cathrin Spröer¹, Hans-Peter Klenk¹

Leibniz Association¹

17 Apr 2013-Archives of Microbiology

TL;DR: The study re-investigates whether, and in which situations, the 97 % 16S rRNA gene sequence similarity threshold for conducting DDH might be too conservative; with a threshold between 98.2 and 99.0 %, up to half of the currently conducted DDH experiments could safely be omitted without a significant risk of wrongly differentiated species.

Abstract: DNA–DNA hybridizations (DDH) play a key role in microbial species discrimination in cases when 16S rRNA gene sequence similarities are 97 % or higher. Using real-world 16S rRNA gene sequences and DDH data, we here re-investigate whether or not, and in which situations, this threshold value might be too conservative. Statistical estimates of these thresholds are calculated in general as well as more specifically for a number of phyla that are frequently subjected to DDH. Among several methods to infer 16S gene sequence similarities investigated, most of those routinely applied by taxonomists appear well suited for the task. The effects of using distinct DDH methods also seem to be insignificant. Depending on the investigated taxonomic group, a threshold between 98.2 and 99.0 % appears reasonable. In that way, up to half of the currently conducted DDH experiments could safely be omitted without a significant risk for wrongly differentiated species.
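The threshold rule described in the abstract can be written down as a small decision function. This is only a sketch: the function name and the default value are hypothetical, and the default of 98.2 % is simply the lower end of the 98.2–99.0 % range given in the abstract, which the authors say depends on the taxonomic group.

```python
def ddh_recommended(similarity_16s_percent: float,
                    threshold_percent: float = 98.2) -> bool:
    """Return True if the 16S rRNA gene sequence similarity is at or
    above the threshold, i.e. a DDH experiment would still be called for.

    Assumption: 98.2 is the lower end of the range from the abstract;
    pass 97.0 to reproduce the older, more conservative rule.
    """
    if not 0.0 <= similarity_16s_percent <= 100.0:
        raise ValueError("similarity must be a percentage between 0 and 100")
    return similarity_16s_percent >= threshold_percent


if __name__ == "__main__":
    # A pair at 97.5 % would need DDH under the 97 % rule,
    # but not under a 98.2 % threshold.
    print(ddh_recommended(97.5, threshold_percent=97.0))
    print(ddh_recommended(97.5))
```

Pairs that fall between the old and the new threshold are the ones for which, according to the abstract, the experiment could be omitted.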

424 citations

Cited by

SPAdes, a new genome assembly algorithm and its applications to single-cell sequencing (7th Annual SFAF Meeting, 2012)

Glenn Tesler

01 Jun 2012

TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly that improves on the recently released E+V-SC assembler (specialized for single-cell data) and on the popular assemblers Velvet and SoapDeNovo (for multicell data).

Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online (http://bioinf.spbau.ru/spades). It is distributed as open source software.

10,124 citations

Journal Article

Towards a taxonomic coherence between average nucleotide identity and 16S rRNA gene sequence similarity for species demarcation of prokaryotes

Mincheol Kim¹, Hyunseok Oh¹, Sang-Cheol Park¹, Jongsik Chun¹

Seoul National University¹

01 Feb 2014-International Journal of Systematic and Evolutionary Microbiology

TL;DR: The overall distribution of ANI values generated by pairwise comparison of 6787 genomes of prokaryotes belonging to 22 phyla was investigated, finding an apparent distinction in the overall ANI distribution between intra- and interspecies relationships at around 95-96% ANI.

Abstract: Among available genome relatedness indices, average nucleotide identity (ANI) is one of the most robust measurements of genomic relatedness between strains, and has great potential in the taxonomy of bacteria and archaea as a substitute for the labour-intensive DNA–DNA hybridization (DDH) technique. An ANI threshold range (95–96 %) for species demarcation had previously been suggested based on comparative investigation between DDH and ANI values, albeit with rather limited datasets. Furthermore, its generality was not tested on all lineages of prokaryotes. Here, we investigated the overall distribution of ANI values generated by pairwise comparison of 6787 genomes of prokaryotes belonging to 22 phyla to see whether the suggested range can be applied to all species. There was an apparent distinction in the overall ANI distribution between intra- and interspecies relationships at around 95–96 % ANI. We went on to determine which level of 16S rRNA gene sequence similarity corresponds to the currently accepted ANI threshold for species demarcation using over one million comparisons. A twofold cross-validation statistical test revealed that 98.65 % 16S rRNA gene sequence similarity can be used as the threshold for differentiating two species, which is consistent with previous suggestions (98.2–99.0 %) derived from comparative studies between DDH and 16S rRNA gene sequence similarity. Our findings should be useful in accelerating the use of genomic sequence data in the taxonomy of bacteria and archaea.

2,227 citations

Integrative Genomics Viewer

James T. Robinson¹, Helga Thorvaldsdottir¹, Wendy Winckler¹, Mitchell Guttman¹, Eric S. Lander¹,², Gad Getz¹, Jill P. Mesirov¹

Massachusetts Institute of Technology¹, Harvard University²

01 Jan 2011

TL;DR: The sheer volume and scope of genome-wide data pose a significant challenge to the development of efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data.

Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

2,187 citations

Journal Article

High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries.

Chirag Jain¹, Luis M. Rodriguez-R¹, Adam M. Phillippy², Konstantinos T. Konstantinidis¹, Srinivas Aluru¹

Georgia Institute of Technology¹, National Institutes of Health²

30 Nov 2018-Nature Communications

TL;DR: FastANI is developed, a method to compute ANI using alignment-free approximate sequence mapping, and it is shown that 95% ANI is an accurate threshold for demarcating prokaryotic species by analyzing about 90,000 prokaryotic genomes.

Abstract: A fundamental question in microbiology is whether there is a continuum of genetic diversity among genomes, or clear species boundaries prevail instead. Whole-genome similarity metrics such as Average Nucleotide Identity (ANI) help address this question by facilitating high resolution taxonomic analysis of thousands of genomes from diverse phylogenetic lineages. To scale to available genomes and beyond, we present FastANI, a new method to estimate ANI using alignment-free approximate sequence mapping. FastANI is accurate for both finished and draft genomes, and is up to three orders of magnitude faster compared to alignment-based approaches. We leverage FastANI to compute pairwise ANI values among all prokaryotic genomes available in the NCBI database. Our results reveal clear genetic discontinuity, with 99.8% of the total 8 billion genome pairs analyzed conforming to >95% intra-species and <83% inter-species ANI values. This discontinuity is manifested with or without the most frequently sequenced species, and is robust to historic additions in the genome databases. Average Nucleotide Identity (ANI) is a robust and useful measure to gauge genetic relatedness between two genomes. Here, the authors develop FastANI, a method to compute ANI using alignment-free approximate sequence mapping, and show 95% ANI is an accurate threshold for demarcating prokaryotic species by analyzing about 90,000 prokaryotic genomes.

2,176 citations

Journal Article

A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life

Donovan H. Parks¹, Maria Chuvochina¹, David W. Waite¹, Christian Rinke¹, Adam Skarshewski¹, Pierre-Alain Chaumeil¹, Philip Hugenholtz¹

University of Queensland¹

27 Aug 2018-Nature Biotechnology

TL;DR: This work used a concatenated protein phylogeny as the basis for a bacterial taxonomy that conservatively removes polyphyletic groups and normalizes taxonomic ranks on the basis of relative evolutionary divergence.

Abstract: Taxonomy is an organizing principle of biology and is ideally based on evolutionary relationships among organisms. Development of a robust bacterial taxonomy has been hindered by an inability to obtain most bacteria in pure culture and, to a lesser extent, by the historical use of phenotypes to guide classification. Culture-independent sequencing technologies have matured sufficiently that a comprehensive genome-based taxonomy is now possible. We used a concatenated protein phylogeny as the basis for a bacterial taxonomy that conservatively removes polyphyletic groups and normalizes taxonomic ranks on the basis of relative evolutionary divergence. Under this approach, 58% of the 94,759 genomes comprising the Genome Taxonomy Database had changes to their existing taxonomy. This result includes the description of 99 phyla, including six major monophyletic units from the subdivision of the Proteobacteria, and amalgamation of the Candidate Phyla Radiation into a single phylum. Our taxonomy should enable improved classification of uncultured bacteria and provide a sound basis for ecological and evolutionary studies.

2,098 citations
