Hans-Peter Klenk

Other affiliations: Leibniz Institute for Neurobiology, Max Planck Society, Deutsche Sammlung von Mikroorganismen und Zellkulturen, ...

Bio: Hans-Peter Klenk is an academic researcher at Newcastle University. He has contributed to research on the topics Genome and Whole genome sequencing. He has an h-index of 67 and has co-authored 564 publications that have received 31,086 citations. His previous affiliations include the Leibniz Institute for Neurobiology and the Max Planck Society.

Topics: Genome, Whole genome sequencing, Gene, Type species, Phylogenetic tree, ...

Papers published on a yearly basis (chart; the per-year counts are not recoverable, and the years shown range from 1983 to 2023)

Papers

Journal Article•DOI•

Genome sequence-based species delimitation with confidence intervals and improved distance functions

Jan P. Meier-Kolthoff¹, Alexander F. Auch², Hans-Peter Klenk¹, Markus Göker¹•Institutions (2)

Leibniz Association¹, University of Tübingen²

21 Feb 2013-BMC Bioinformatics

TL;DR: Despite the high accuracy of GBDP-based DDH prediction, inferences from limited empirical data are always associated with a certain degree of uncertainty, so it is crucial to enrich in-silico DDH replacements with confidence-interval estimation, enabling the user to statistically evaluate the outcomes.

Abstract: For the last 25 years species delimitation in prokaryotes (Archaea and Bacteria) was to a large extent based on DNA-DNA hybridization (DDH), a tedious lab procedure designed in the early 1970s that served its purpose astonishingly well in the absence of deciphered genome sequences. With the rapid progress in genome sequencing, the time has come to directly use the now available and easy-to-generate genome sequences for delimitation of species. GBDP (Genome Blast Distance Phylogeny) infers genome-to-genome distances between pairs of entirely or partially sequenced genomes, a digital, highly reliable estimator for the relatedness of genomes. Its application as an in-silico replacement for DDH was recently introduced. The main challenge in the implementation of such an application is to produce digital DDH values that mimic the wet-lab DDH values as closely as possible to ensure consistency in the prokaryotic species concept. Correlation and regression analyses were used to determine the best-performing methods and the most influential parameters. GBDP was further enriched with a set of new features such as confidence intervals for intergenomic distances obtained via resampling or via the statistical models for DDH prediction and an additional family of distance functions. As in previous analyses, GBDP obtained the highest agreement with wet-lab DDH among all tested methods, but improved models led to a further increase in the accuracy of DDH prediction. Confidence intervals yielded stable results when inferred from the statistical models, whereas those obtained via resampling showed marked differences between the underlying distance functions. Despite the high accuracy of GBDP-based DDH prediction, inferences from limited empirical data are always associated with a certain degree of uncertainty. It is thus crucial to enrich in-silico DDH replacements with confidence-interval estimation, enabling the user to statistically evaluate the outcomes. Such methodological advancements, easily accessible through the web service at http://ggdc.dsmz.de, are crucial steps towards a consistent and truly genome sequence-based classification of microorganisms.

4,411 citations

Journal Article•DOI•

The complete genome sequence of the gastric pathogen Helicobacter pylori

Jean-F. Tomb, Owen White, Anthony R. Kerlavage, Rebecca A. Clayton, Granger G. Sutton, Robert D. Fleischmann, Karen A. Ketchum, Hans-Peter Klenk, Steven R. Gill, Brian Dougherty, Karen E. Nelson, John Quackenbush, Lixin Zhou, Ewen F. Kirkness, Scott N. Peterson, Brendan J. Loftus, Delwood Richardson, Robert J. Dodson, Hanif Khalak, Anna Glodek, Keith McKenney, Lisa M. Fitzegerald, Norman H. Lee, Mark Raymond Adams, Erin Hickey, Douglas E. Berg¹, Jeanine D. Gocayne, Teresa Utterback, Jeremy Peterson, Jenny M. Kelley, Matthew D. Cotton, J. Weidman, Claire Fujii, Cheryl Bowman, Larry Watthey, Erik Wallin², William S. Hayes, Mark Borodovsky, Peter D. Karp³, Hamilton O. Smith⁴, Claire M. Fraser, J. Craig Venter•Institutions (4)

Washington University in St. Louis¹, Stockholm University², Artificial Intelligence Center³, Johns Hopkins University⁴

07 Aug 1997-Nature

TL;DR: Sequence analysis indicates that H. pylori has well-developed systems for motility, for scavenging iron, and for DNA restriction and modification, and consistent with its restricted niche, it has a few regulatory networks, and a limited metabolic repertoire and biosynthetic capacity.

Abstract: Helicobacter pylori, strain 26695, has a circular genome of 1,667,867 base pairs and 1,590 predicted coding sequences. Sequence analysis indicates that H. pylori has well-developed systems for motility, for scavenging iron, and for DNA restriction and modification. Many putative adhesins, lipoproteins and other outer membrane proteins were identified, underscoring the potential complexity of host-pathogen interaction. Based on the large number of sequence-related genes encoding outer membrane proteins and the presence of homopolymeric tracts and dinucleotide repeats in coding sequences, H. pylori, like several other mucosal pathogens, probably uses recombination and slipped-strand mispairing within repeats as mechanisms for antigenic variation and adaptive evolution. Consistent with its restricted niche, H. pylori has a few regulatory networks, and a limited metabolic repertoire and biosynthetic capacity. Its survival in acid conditions depends, in part, on its ability to establish a positive inside-membrane potential in low pH.

3,577 citations

Journal Article•DOI•

The complete genome sequence of the hyperthermophilic, sulphate-reducing archaeon Archaeoglobus fulgidus

Hans-Peter Klenk¹, Rebecca A. Clayton¹, Jean-Francois Tomb¹, Owen White¹, Karen E. Nelson¹, Karen A. Ketchum¹, Robert J. Dodson¹, Michelle Gwinn¹, Erin Hickey¹, Jeremy Peterson¹, Delwood Richardson¹, Anthony R. Kerlavage¹, David E. Graham², Nikos C. Kyrpides², Robert D. Fleischmann¹, John Quackenbush¹, Norman H. Lee¹, Granger G. Sutton¹, Steven R. Gill¹, Ewen F. Kirkness¹, Brian Dougherty¹, Keith McKenney¹, Mark Raymond Adams¹, Brendan J. Loftus¹, Scott N. Peterson¹, Claudia I. Reich², Leslie Klis McNeil², Jonathan H. Badger², Anna Glodek¹, Lixin Zhou¹, Ross Overbeek³, Jeannine D. Gocayne¹, Janice Weidman¹, Lisa McDonald¹, Teresa Utterback¹, Matthew D. Cotton¹, Tracy Spriggs¹, Patricia Artiach¹, Brian P. Kaine², Sean M. Sykes¹, Paul W. Sadow¹, Kurt P. D'Andrea¹, Cheryl Bowman¹, Claire Fujii¹, Stacey Garland¹, Tanya Mason¹, Gary J. Olsen², Claire M. Fraser¹, Hamilton O. Smith¹, Carl R. Woese², J. Craig Venter¹•Institutions (3)

The Institute for Genomic Research¹, University of Illinois at Urbana–Champaign², Argonne National Laboratory³

27 Nov 1997-Nature

TL;DR: A quarter (651 ORFs) of the A. fulgidus genome encodes functionally uncharacterized yet conserved proteins, two-thirds of which are shared with M. jannaschii (428 ORFs); another quarter encodes new proteins, indicating substantial archaeal gene diversity.

Abstract: Archaeoglobus fulgidus is the first sulphur-metabolizing organism to have its genome sequence determined. Its genome of 2,178,400 base pairs contains 2,436 open reading frames (ORFs). The information processing systems and the biosynthetic pathways for essential components (nucleotides, amino acids and cofactors) have extensive correlation with their counterparts in the archaeon Methanococcus jannaschii. The genomes of these two Archaea indicate dramatic differences in the way these organisms sense their environment, perform regulatory and transport functions, and gain energy. In contrast to M. jannaschii, A. fulgidus has fewer restriction-modification systems, and none of its genes appears to contain inteins. A quarter (651 ORFs) of the A. fulgidus genome encodes functionally uncharacterized yet conserved proteins, two-thirds of which are shared with M. jannaschii (428 ORFs). Another quarter of the genome encodes new proteins indicating substantial archaeal gene diversity.

1,394 citations

Journal Article•DOI•

Digital DNA-DNA hybridization for microbial species delineation by means of genome-to-genome sequence comparison

Alexander F. Auch¹, Mathias von Jan², Hans-Peter Klenk², Markus Göker²•Institutions (2)

University of Tübingen¹, Deutsche Sammlung von Mikroorganismen und Zellkulturen (DSMZ)²

28 Feb 2010-Standards in Genomic Sciences

TL;DR: This work investigates state-of-the-art methods for inferring whole-genome distances in their ability to mimic DDH and finds that some distance formulas are very robust against missing fractions of genomic information.

Abstract: The pragmatic species concept for Bacteria and Archaea is ultimately based on DNA-DNA hybridization (DDH). While enabling the taxonomist, in principle, to obtain an estimate of the overall similarity between the genomes of two strains, this technique is tedious and error-prone and cannot be used to incrementally build up a comparative database. Recent technological progress in the area of genome sequencing calls for bioinformatics methods to replace the wet-lab DDH by in-silico genome-to-genome comparison. Here we investigate state-of-the-art methods for inferring whole-genome distances in their ability to mimic DDH. Algorithms to efficiently determine high-scoring segment pairs or maximally unique matches perform well as a basis of inferring intergenomic distances. The examined distance functions, which are able to cope with heavily reduced genomes and repetitive sequence regions, outperform previously described ones regarding the correlation with and error ratios in emulating DDH. Simulation of incompletely sequenced genomes indicates that some distance formulas are very robust against missing fractions of genomic information. Digitally derived genome-to-genome distances show a better correlation with 16S rRNA gene sequence distances than DDH values. The future perspectives of genome-informed taxonomy are discussed, and the investigated methods are made available as a web service for genome-based species delineation.

1,256 citations

Journal Article•DOI•

Taxonomy, Physiology, and Natural Products of Actinobacteria

Essaid Ait Barka¹, Parul Vatsa¹, Lisa Sanchez¹, Nathalie Gaveau-Vaillant¹, Cédric Jacquard¹, Hans-Peter Klenk², Christophe Clément¹, Yder Ouhdouch, Gilles P. van Wezel³•Institutions (3)

University of Reims Champagne-Ardenne¹, Newcastle University², Leiden University³

01 Mar 2016-Microbiology and Molecular Biology Reviews

TL;DR: Actinobacteria are Gram-positive bacteria with high G+C DNA content that constitute one of the largest bacterial phyla, and they are ubiquitously distributed in both aquatic and terrestrial ecosystems.

Abstract: Actinobacteria are Gram-positive bacteria with high G+C DNA content that constitute one of the largest bacterial phyla, and they are ubiquitously distributed in both aquatic and terrestrial ecosystems. Many Actinobacteria have a mycelial lifestyle and undergo complex morphological differentiation. They also have an extensive secondary metabolism and produce about two-thirds of all naturally derived antibiotics in current clinical use, as well as many anticancer, anthelmintic, and antifungal compounds. Consequently, these bacteria are of major importance for biotechnology, medicine, and agriculture. Actinobacteria play diverse roles in their associations with various higher organisms, since their members have adopted different lifestyles, and the phylum includes pathogens (notably, species of Corynebacterium, Mycobacterium, Nocardia, Propionibacterium, and Tropheryma), soil inhabitants (e.g., Micromonospora and Streptomyces species), plant commensals (e.g., Frankia spp.), and gastrointestinal commensals (Bifidobacterium spp.). Actinobacteria also play an important role as symbionts and as pathogens in plant-associated microbial communities. This review presents an update on the biology of this important bacterial phylum.

1,199 citations

Cited by

Journal Article•DOI•

The sequence of the human genome.

J. Craig Venter¹, Mark Raymond Adams¹, Eugene W. Myers¹, Peter W. Li¹ +269 more•Institutions (12)

16 Feb 2001-Science

TL;DR: Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems.

Abstract: A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies (a whole-genome assembly and a regional chromosome assembly) were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

12,098 citations

SPAdes, a new genome assembly algorithm and its applications to single-cell sequencing (7th Annual SFAF Meeting, 2012)

Glenn Tesler

01 Jun 2012

TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly that improves on the recently released E+V-SC assembler and on the popular assemblers Velvet and SoapDeNovo (for multicell data).

Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online (http://bioinf.spbau.ru/spades). It is distributed as open source software.

10,124 citations

Journal Article•DOI•

ModelFinder: fast model selection for accurate phylogenetic estimates

Subha Kalyaanamoorthy¹,², Bui Quang Minh³, Thomas K. F. Wong¹,⁴, Arndt von Haeseler⁵,⁶, Lars S. Jermiin¹,⁴•Institutions (6)

Commonwealth Scientific and Industrial Research Organisation¹, University of Alberta², Max F. Perutz Laboratories³, Australian National University⁴, University of Vienna⁵, Medical University of Vienna⁶

01 Jun 2017-Nature Methods

TL;DR: ModelFinder is presented, a fast model-selection method that greatly improves the accuracy of phylogenetic estimates by incorporating a model of rate heterogeneity across sites not previously considered in this context and by allowing concurrent searches of model space and tree space.

Abstract: Model-based molecular phylogenetics plays an important role in comparisons of genomic data, and model selection is a key step in all such analyses. We present ModelFinder, a fast model-selection method that greatly improves the accuracy of phylogenetic estimates by incorporating a model of rate heterogeneity across sites not previously considered in this context and by allowing concurrent searches of model space and tree space.

7,425 citations

Journal Article•DOI•

Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms

J. Gregory Caporaso¹, Christian L. Lauber², William A. Walters³, Donna Berg-Lyons², James Huntley³, Noah Fierer²,³, Sarah M. Owens⁴, Jason Betley⁵, Louise Fraser⁵, Markus J. Bauer⁵, Niall Anthony Gormley⁵, Jack A. Gilbert⁴,⁶, Geoff Smith⁵, Rob Knight•Institutions (6)

Northern Arizona University¹, Cooperative Institute for Research in Environmental Sciences², University of Colorado Boulder³, Argonne National Laboratory⁴, Illumina⁵, University of Chicago⁶

01 Aug 2012-The ISME Journal

TL;DR: It is shown that the protocol developed for these instruments successfully recaptures known biological results, and additionally that biological conclusions are consistent across sequencing platforms (the HiSeq2000 versus the MiSeq) and across the sequenced regions of amplicons.

Abstract: DNA sequencing continues to decrease in cost with the Illumina HiSeq2000 generating up to 600 Gb of paired-end 100 base reads in a ten-day run. Here we present a protocol for community amplicon sequencing on the HiSeq2000 and MiSeq Illumina platforms, and apply that protocol to sequence 24 microbial communities from host-associated and free-living environments. A critical question as more sequencing platforms become available is whether biological conclusions derived on one platform are consistent with what would be derived on a different platform. We show that the protocol developed for these instruments successfully recaptures known biological results, and additionally that biological conclusions are consistent across sequencing platforms (the HiSeq2000 versus the MiSeq) and across the sequenced regions of amplicons.

6,840 citations

Journal Article•DOI•

CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes

Donovan H. Parks¹, Michael Imelfort¹, Connor T. Skennerton¹, Philip Hugenholtz¹, Gene W. Tyson¹•Institutions (1)

University of Queensland¹

01 Jul 2015-Genome Research

TL;DR: An objective measure of genome quality is proposed that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities and is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches.

Abstract: Large-scale recovery of genomes from isolates, single cells, and metagenomic data has been made possible by advances in computational methods and substantial reductions in sequencing costs. Although this increasing breadth of draft genomes is providing key information regarding the evolutionary and functional diversity of microbial life, it has become impractical to finish all available reference genomes. Making robust biological inferences from draft genomes requires accurate estimates of their completeness and contamination. Current methods for assessing genome quality are ad hoc and generally make use of a limited number of “marker” genes conserved across all bacterial or archaeal genomes. Here we introduce CheckM, an automated method for assessing the quality of a genome using a broader set of marker genes specific to the position of a genome within a reference genome tree and information about the collocation of these genes. We demonstrate the effectiveness of CheckM using synthetic data and a wide range of isolate-, single-cell-, and metagenome-derived genomes. CheckM is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches. Using CheckM, we identify a diverse range of errors currently impacting publicly available isolate genomes and demonstrate that genomes obtained from single cells and metagenomic data vary substantially in quality. In order to facilitate the use of draft genomes, we propose an objective measure of genome quality that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities.

5,788 citations
