Author

Katharina J. Hoff

Other affiliations: University of Göttingen

Bio: Katharina J. Hoff is an academic researcher from the University of Greifswald. The author has contributed to research on the topics of gene prediction and genomes. The author has an h-index of 16 and has co-authored 35 publications receiving 3321 citations. Previous affiliations of Katharina J. Hoff include the University of Göttingen.

Topics: Gene prediction, Genome, Genome project, Medicine, Biology ...

Papers

Journal Article

Butterfly genome reveals promiscuous exchange of mimicry adaptations among species

Kanchon K. Dasmahapatra¹, James R. Walters², Adriana D. Briscoe³, John W. Davey, Annabel Whibley, Nicola J. Nadeau², Aleksey V. Zimin⁴, Daniel S.T. Hughes⁵, Laura Ferguson⁵, Simon H. Martin², Camilo Salazar²,⁶, James J. Lewis³, Sebastian Adler⁷, Seung-Joon Ahn⁸, Dean A. Baker⁹, Simon W. Baxter², Nicola Chamberlain¹⁰, Ritika Chauhan¹¹, Brian A. Counterman¹², Tamas Dalmay¹¹, Lawrence E. Gilbert¹³, Karl H.J. Gordon¹⁴, David G. Heckel⁸, Heather M. Hines⁵, Katharina J. Hoff⁷, Peter W. H. Holland⁵, Emmanuelle Jacquin-Joly¹⁵, Francis M. Jiggins, Robert T. Jones, Durrell D. Kapan¹⁶,¹⁷, Paul J. Kersey, Gerardo Lamas, Daniel Lawson, Daniel Mapleson¹¹, Luana S. Maroja¹⁸, Arnaud Martin³, Simon Moxon¹⁹, William J. Palmer², Riccardo Papa²⁰, Alexie Papanicolaou¹⁴, Yannick Pauchet⁸, David A. Ray¹², Neil Rosser¹, Steven L. Salzberg²¹, Megan A. Supple²², Alison K. Surridge², Ayşe Tenger-Trolander¹⁰, Heiko Vogel⁸, Paul A. Wilkinson²³, Derek Wilson, James A. Yorke⁴, Furong Yuan³, Alexi Balmuth²⁴, Cathlene Eland, Karim Gharbi, Marian Thomson, Richard A. Gibbs²⁵, Yi Han²⁵, Joy Jayaseelan²⁵, Christie Kovar²⁵, Tittu Mathew²⁵, Donna M. Muzny²⁵, Fiona Ongeri²⁵, Ling-Ling Pu²⁵, Jiaxin Qu²⁵, Rebecca Thornton²⁵, Kim C. Worley²⁵, Yuanqing Wu²⁵, Mauricio Linares²⁶, Mark Blaxter, Richard H. ffrench-Constant²⁷, Mathieu Joron, Marcus R. Kronforst¹⁰, Sean P. Mullen²⁸, Robert D. Reed³, Steven E. Scherer²⁵, Stephen Richards²⁵, James Mallet¹,¹⁰, W. Owen McMillan, Chris D. Jiggins²,⁶

University College London¹, University of Cambridge², University of California, Irvine³, University of Maryland, College Park⁴, University of Oxford⁵, Smithsonian Institution⁶, University of Greifswald⁷, Max Planck Society⁸, Imperial College London⁹, Harvard University¹⁰, University of East Anglia¹¹, Mississippi State University¹², University of Texas at Austin¹³, Commonwealth Scientific and Industrial Research Organisation¹⁴, University of Paris¹⁵, California Academy of Sciences¹⁶, University of Hawaii¹⁷, Williams College¹⁸, Yale University¹⁹, University of Puerto Rico²⁰, Johns Hopkins University²¹, North Carolina State University²², University of Bristol²³, University of Edinburgh²⁴, Baylor College of Medicine²⁵, Del Rosario University²⁶, University of Exeter²⁷, Boston University²⁸

05 Jul 2012-Nature

TL;DR: It is inferred that closely related Heliconius species exchange protective colour-pattern genes promiscuously, implying that hybridization has an important role in adaptive radiation.

Abstract: Sequencing of the genome of the butterfly Heliconius melpomene shows that closely related Heliconius species exchange protective colour-pattern genes promiscuously.

1,103 citations

Journal Article

BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS

Katharina J. Hoff¹, Simone Lange¹, Alexandre Lomsadze², Mark Borodovsky², Mario Stanke¹

University of Greifswald¹, Georgia Institute of Technology²

01 Mar 2016-Bioinformatics

TL;DR: BRAKER1 is presented, a pipeline for unsupervised RNA-Seq-based genome annotation that combines the advantages of GeneMark-ET and AUGUSTUS; in the authors' experiments it was more accurate than MAKER2 when RNA-Seq is the sole source for training and prediction.

Abstract: MOTIVATION: Gene finding in eukaryotic genomes is notoriously difficult to automate. The task is to design a work flow with a minimal set of tools that would reach state-of-the-art performance across a wide range of species. GeneMark-ET is a gene prediction tool that incorporates RNA-Seq data into unsupervised training and subsequently generates ab initio gene predictions. AUGUSTUS is a gene finder that usually requires supervised training and uses information from RNA-Seq reads in the prediction step. Complementary strengths of GeneMark-ET and AUGUSTUS provided motivation for designing a new combined tool for automatic gene prediction. RESULTS: We present BRAKER1, a pipeline for unsupervised RNA-Seq-based genome annotation that combines the advantages of GeneMark-ET and AUGUSTUS. As input, BRAKER1 requires a genome assembly file and a file in bam-format with spliced alignments of RNA-Seq reads to the genome. First, GeneMark-ET performs iterative training and generates initial gene structures. Second, AUGUSTUS uses predicted genes for training and then integrates RNA-Seq read information into final gene predictions. In our experiments, we observed that BRAKER1 was more accurate than MAKER2 when it is using RNA-Seq as sole source for training and prediction. BRAKER1 does not require pre-trained parameters or a separate expert-prepared training step. AVAILABILITY AND IMPLEMENTATION: BRAKER1 is available for download at http://bioinf.uni-greifswald.de/bioinf/braker/ and http://exon.gatech.edu/GeneMark/. CONTACT: katharina.hoff@uni-greifswald.de or borodovsky@gatech.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

809 citations
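
The BRAKER1 abstract above names two inputs: a genome assembly file and a BAM file of spliced RNA-Seq alignments. As a minimal sketch, the snippet below only assembles a command line for such a run and does not execute it. The file names and the species label are hypothetical placeholders, and the `--genome`, `--bam` and `--species` options are assumed to match the installed BRAKER release, so they should be checked against that version's own documentation before use.

```python
from pathlib import Path
from typing import List


def build_braker_command(genome: Path, bam: Path, species: str) -> List[str]:
    """Assemble (but do not run) a command line for an RNA-Seq-only BRAKER run.

    Assumption: the option names below are those accepted by the installed
    braker.pl; verify them against the version you actually use.
    """
    return [
        "braker.pl",
        f"--genome={genome}",     # genome assembly file
        f"--bam={bam}",           # spliced RNA-Seq alignments in BAM format
        f"--species={species}",   # label under which trained parameters are stored
    ]


# Hypothetical inputs, for illustration only.
cmd = build_braker_command(Path("genome.fa"), Path("rnaseq.bam"), "my_species")
print(" ".join(cmd))
```

Building the command as a list rather than a single string keeps each argument separate, so it can later be handed to `subprocess.run` without shell quoting problems.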

Journal Article

BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database

Tomáš Brůna¹, Katharina J. Hoff², Alexandre Lomsadze¹, Mario Stanke², Mark Borodovsky¹

Georgia Institute of Technology¹, University of Greifswald²

06 Jan 2021

TL;DR: The BRAKER2 pipeline generates and integrates external protein support into the iterative process of training and gene prediction by GeneMark-EP+ and AUGUSTUS, and it compares favorably with other pipelines, e.g. MAKER2, in terms of accuracy and performance.

Abstract: The task of eukaryotic genome annotation remains challenging. Only a few genomes could serve as standards of annotation achieved through a tremendous investment of human curation efforts. Still, the correctness of all alternative isoforms, even in the best-annotated genomes, could be a good subject for further investigation. The new BRAKER2 pipeline generates and integrates external protein support into the iterative process of training and gene prediction by GeneMark-EP+ and AUGUSTUS. BRAKER2 continues the line started by BRAKER1 where self-training GeneMark-ET and AUGUSTUS made gene predictions supported by transcriptomic data. Among the challenges addressed by the new pipeline was a generation of reliable hints to protein-coding exon boundaries from likely homologous but evolutionarily distant proteins. In comparison with other pipelines for eukaryotic genome annotation, BRAKER2 is fully automatic. It is favorably compared under equal conditions with other pipelines, e.g. MAKER2, in terms of accuracy and performance. Development of BRAKER2 should facilitate solving the task of harmonization of annotation of protein-coding genes in genomes of different eukaryotic species. However, we fully understand that several more innovations are needed in transcriptomic and proteomic technologies as well as in algorithmic development to reach the goal of highly accurate annotation of eukaryotic genomes.

455 citations

Book Chapter

Whole-Genome Annotation with BRAKER.

Katharina J. Hoff¹, Alexandre Lomsadze², Mark Borodovsky²,³, Mario Stanke¹

University of Greifswald¹, Georgia Institute of Technology², Moscow Institute of Physics and Technology³

01 Jan 2019-Methods in Molecular Biology

TL;DR: This book chapter describes how to apply BRAKER in environments characterized by various combinations of external evidence, both RNA-Seq and protein alignments.

Abstract: BRAKER is a pipeline for highly accurate and fully automated gene prediction in novel eukaryotic genomes. It combines two major tools: GeneMark-ES/ET and AUGUSTUS. GeneMark-ES/ET learns its parameters from a novel genomic sequence in a fully automated fashion; if available, it uses extrinsic evidence for model refinement. From the protein-coding genes predicted by GeneMark-ES/ET, we select a set for training AUGUSTUS, one of the most accurate gene finding tools that, in contrast to GeneMark-ES/ET, integrates extrinsic evidence already into the gene prediction step. The first published version, BRAKER1, integrated genomic footprints of unassembled RNA-Seq reads into the training as well as into the prediction steps. The pipeline has since been extended to the integration of data on mapped cross-species proteins, and to the usage of heterogeneous extrinsic evidence, both RNA-Seq and protein alignments. In this book chapter, we briefly summarize the pipeline methodology and describe how to apply BRAKER in environments characterized by various combinations of external evidence.

382 citations

Journal Article

Finding the missing honey bee genes: Lessons learned from a genome upgrade

Christine G. Elsik¹,², Kim C. Worley³, Anna K. Bennett², Martin Beye⁴, Francisco Camara⁵, Christopher P. Childers¹,², Dirk C. de Graaf⁶, Griet Debyser⁶, Jixin Deng³, Bart Devreese⁶, Eran Elhaik⁷, Jay D. Evans⁸, Leonard J. Foster⁹, Dan Graur¹⁰, Roderic Guigó⁵, Katharina J. Hoff, Michael Holder³, Matthew E. Hudson¹¹, Greg J. Hunt¹², Huaiyang Jiang¹³, Vandita Joshi³, Radhika S. Khetani¹¹, Peter Kosarev, Christie Kovar³, Jian Ma¹¹, Ryszard Maleszka¹⁴, Robin F. A. Moritz¹⁵, Monica Munoz-Torres²,¹⁶, Terence Murphy¹⁷, Donna M. Muzny³, Irene Newsham³, Justin T. Reese¹,², Hugh M. Robertson¹¹, Gene E. Robinson¹¹, Olav Rueppell¹⁸, Victor V. Solovyev¹⁹, Mario Stanke, Eckart Stolle¹⁵, Jennifer M. Tsuruda²⁰, Matthias Van Vaerenbergh⁶, Robert M. Waterhouse²¹, Daniel B. Weaver, Charles W. Whitfield¹¹, Yuanqing Wu³, Evgeny M. Zdobnov²¹, Lan Zhang³, Dianhui Zhu³, Richard A. Gibbs³

University of Missouri¹, University of Washington², Baylor College of Medicine³, University of Düsseldorf⁴, Pompeu Fabra University⁵, Ghent University⁶, Johns Hopkins University⁷, Agricultural Research Service⁸, University of British Columbia⁹, University of Houston¹⁰, University of Illinois at Urbana–Champaign¹¹, Purdue University¹², University of Pittsburgh¹³, Australian National University¹⁴, Martin Luther University of Halle-Wittenberg¹⁵, Lawrence Berkeley National Laboratory¹⁶, National Institutes of Health¹⁷, University of North Carolina at Greensboro¹⁸, King Abdullah University of Science and Technology¹⁹, Clemson University²⁰, Swiss Institute of Bioinformatics²¹

30 Jan 2014-BMC Genomics

TL;DR: An improved honey bee genome assembly with a new gene annotation set is reported, showing that the honey bee genome contains a number of genes similar to that of other insect genomes, contrary to what was suggested in OGSv1.0.

Abstract: The first generation of genome sequence assemblies and annotations have had a significant impact upon our understanding of the biology of the sequenced species, the phylogenetic relationships among species, the study of populations within and across species, and have informed the biology of humans. As only a few Metazoan genomes are approaching finished quality (human, mouse, fly and worm), there is room for improvement of most genome assemblies. The honey bee (Apis mellifera) genome, published in 2006, was noted for its bimodal GC content distribution that affected the quality of the assembly in some regions and for fewer genes in the initial gene set (OGSv1.0) compared to what would be expected based on other sequenced insect genomes. Here, we report an improved honey bee genome assembly (Amel_4.5) with a new gene annotation set (OGSv3.2), and show that the honey bee genome contains a number of genes similar to that of other insect genomes, contrary to what was suggested in OGSv1.0. The new genome assembly is more contiguous and complete and the new gene set includes ~5000 more protein-coding genes, 50% more than previously reported. About 1/6 of the additional genes were due to improvements to the assembly, and the remaining were inferred based on new RNAseq and protein data. Lessons learned from this genome upgrade have important implications for future genome sequencing projects. Furthermore, the improvements significantly enhance genomic resources for the honey bee, a key model for social behavior and essential to global ecology through pollination.

370 citations

Cited by

Journal Article

BlastKOALA and GhostKOALA: KEGG Tools for Functional Characterization of Genome and Metagenome Sequences.

Minoru Kanehisa¹, Yoko Sato², Kanae Morishima¹

Kyoto University¹, Fujitsu²

22 Feb 2016-Journal of Molecular Biology

TL;DR: BlastKOALA and GhostKOALA are automatic annotation servers for genome and metagenome sequences. They perform KO (KEGG Orthology) assignments to characterize individual gene functions and reconstruct KEGG pathways, BRITE hierarchies and KEGG modules to infer high-level functions of the organism or the ecosystem.

2,247 citations

Journal Article

Ancient Admixture in Human History

Nick Patterson¹, Priya Moorjani², Yontao Luo³, Swapan Mallick², Nadin Rohland², Yiping Zhan³, Teri Genschoreck³, Teresa Webster³, David Reich¹,²

Massachusetts Institute of Technology¹, Harvard University², Affymetrix³

01 Nov 2012-Genetics

TL;DR: A suite of methods for learning about population mixtures are presented, implemented in a software package called ADMIXTOOLS, that support formal tests for whether mixture occurred and make it possible to infer proportions and dates of mixture.

Abstract: Population mixture is an important process in biology. We present a suite of methods for learning about population mixtures, implemented in a software package called ADMIXTOOLS, that support formal tests for whether mixture occurred and make it possible to infer proportions and dates of mixture. We also describe the development of a new single nucleotide polymorphism (SNP) array consisting of 629,433 sites with clearly documented ascertainment that was specifically designed for population genetic analyses and that we genotyped in 934 individuals from 53 diverse populations. To illustrate the methods, we give a number of examples that provide new insights about the history of human admixture. The most striking finding is a clear signal of admixture into northern Europe, with one ancestral population related to present-day Basques and Sardinians and the other related to present-day populations of northeast Asia and the Americas. This likely reflects a history of admixture between Neolithic migrants and the indigenous Mesolithic population of Europe, consistent with recent analyses of ancient bones from Sweden and the sequencing of the genome of the Tyrolean "Iceman."

1,877 citations

Journal Article

Hybridization and speciation

Richard J. Abbott¹, Dirk C. Albach², Stephen W. Ansell³, Jan W. Arntzen⁴, Stuart J. E. Baird, Nicolas Bierne⁵, Janette W. Boughman⁶, Alan Brelsford⁷, C. A. Buerkle⁸, Richard J. A. Buggs⁹, Roger K. Butlin¹⁰, Ulf Dieckmann¹¹, Fabrice Eroukhmanoff¹², Andrea Grill¹³, Sara Helms Cahan¹⁴, Jo S. Hermansen¹², Godfrey M. Hewitt¹⁵, Alan G. Hudson¹⁶, Chris D. Jiggins¹⁷, Julia C. Jones¹⁸, Barbara Keller¹⁹, T. Marczewski²⁰, James Mallet²¹, Paloma Martínez-Rodríguez²², Markus Möst²³, Sean P. Mullen²⁴, Richard A. Nichols⁹, Arne W. Nolte²⁵, Christian Parisod²⁶, Karin S. Pfennig²⁷, Amber M. Rice²⁸, Michael G. Ritchie¹, Burkhardt Seifert²⁹, Carole M. Smadja³⁰, Rike B. Stelkens³¹, Jacek M. Szymura³², Risto Väinölä²⁹, Jochen B. W. Wolf³³, Dietmar Zinner³⁴

University of St Andrews¹, University of Oldenburg², Natural History Museum³, Naturalis⁴, Centre national de la recherche scientifique⁵, Michigan State University⁶, University of Lausanne⁷, University of Wyoming⁸, Queen Mary University of London⁹, University of Sheffield¹⁰, International Institute for Applied Systems Analysis¹¹, University of Oslo¹², University of Vienna¹³, University of Vermont¹⁴, University of East Anglia¹⁵, Spanish National Research Council¹⁶, University of Cambridge¹⁷, University of Konstanz¹⁸, University of Zurich¹⁹, Royal Botanic Garden Edinburgh²⁰, Harvard University²¹, Autonomous University of Madrid²², Swiss Federal Institute of Aquatic Science and Technology²³, Boston University²⁴, Max Planck Society²⁵, University of Neuchâtel²⁶, University of North Carolina at Chapel Hill²⁷, Lehigh University²⁸, American Museum of Natural History²⁹, University of Montpellier³⁰, University of Liverpool³¹, Jagiellonian University³², Uppsala University³³, German Primate Center³⁴

01 Feb 2013-Journal of Evolutionary Biology

TL;DR: A perspective on the context and evolutionary significance of hybridization during speciation is offered, highlighting issues of current interest and debate and suggesting that the Dobzhansky–Muller model of hybrid incompatibilities requires a broader interpretation.

Abstract: Hybridization has many and varied impacts on the process of speciation. Hybridization may slow or reverse differentiation by allowing gene flow and recombination. It may accelerate speciation via adaptive introgression or cause near-instantaneous speciation by allopolyploidization. It may have multiple effects at different stages and in different spatial contexts within a single speciation event. We offer a perspective on the context and evolutionary significance of hybridization during speciation, highlighting issues of current interest and debate. In secondary contact zones, it is uncertain if barriers to gene flow will be strengthened or broken down due to recombination and gene flow. Theory and empirical evidence suggest the latter is more likely, except within and around strongly selected genomic regions. Hybridization may contribute to speciation through the formation of new hybrid taxa, whereas introgression of a few loci may promote adaptive divergence and so facilitate speciation. Gene regulatory networks, epigenetic effects and the evolution of selfish genetic material in the genome suggest that the Dobzhansky-Muller model of hybrid incompatibilities requires a broader interpretation. Finally, although the incidence of reinforcement remains uncertain, this and other interactions in areas of sympatry may have knock-on effects on speciation both within and outside regions of hybridization.

1,715 citations

Journal Article

BUSCO Applications from Quality Assessments to Gene Prediction and Phylogenomics.

Robert M. Waterhouse¹, Mathieu Seppey¹, Felipe A. Simão¹, Mosè Manni¹, Panagiotis Ioannidis¹, Guennadi Klioutchnikov¹, Evgenia V. Kriventseva¹, Evgeny M. Zdobnov¹

Swiss Institute of Bioinformatics¹

01 Mar 2018-Molecular Biology and Evolution

TL;DR: This work presents BUSCO v3 with example analyses that highlight the wide-ranging utility of BUSCO assessments, which extend beyond quality control of genomics data sets to applications in comparative genomics analyses, gene predictor training, metagenomics, and phylogenomics.

Abstract: Genomics promises comprehensive surveying of genomes and metagenomes, but rapidly changing technologies and expanding data volumes make evaluation of completeness a challenging task. Technical sequencing quality metrics can be complemented by quantifying completeness of genomic data sets in terms of the expected gene content of Benchmarking Universal Single-Copy Orthologs (BUSCO, http://busco.ezlab.org). The latest software release implements a complete refactoring of the code to make it more flexible and extendable to facilitate high-throughput assessments. The original six lineage assessment data sets have been updated with improved species sampling, 34 new subsets have been built for vertebrates, arthropods, fungi, and prokaryotes that greatly enhance resolution, and data sets are now also available for nematodes, protists, and plants. Here, we present BUSCO v3 with example analyses that highlight the wide-ranging utility of BUSCO assessments, which extend beyond quality control of genomics data sets to applications in comparative genomics analyses, gene predictor training, metagenomics, and phylogenomics.

1,575 citations

Automated Eukaryotic Gene Structure Annotation Using EVidenceModeler and the Program to Assemble Spliced Alignments

Brian J. Haas, Steven L. Salzberg, Wei Zhu, Mihaela Pertea, Jonathan E. Allen, Joshua Orvis, Owen White, C R Buell, Jennifer R. Wortman

10 Dec 2007

TL;DR: The experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.

Abstract: EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.

1,528 citations
