Author

Burkhard Morgenstern

Bio: Burkhard Morgenstern is an academic researcher at the University of Göttingen. The author's research topics include multiple sequence alignment and alignment-free sequence analysis. The author has an h-index of 44 and has co-authored 115 publications, which have received 14,218 citations. Previous affiliations of Burkhard Morgenstern include the University of Évry Val d'Essonne and the Max Planck Society.


Papers
Journal Article
TL;DR: To the authors' knowledge, the extended version of AUGUSTUS is the first ab initio gene finder that can predict multiple transcripts; its web server also offers a motif-searching facility, where user-defined regular expressions can be searched against putative proteins encoded by the predicted genes.
Abstract: AUGUSTUS is a software tool for gene prediction in eukaryotes based on a Generalized Hidden Markov Model, a probabilistic model of a sequence and its gene structure. Like most existing gene finders, the first version of AUGUSTUS returned one transcript per predicted gene and ignored the phenomenon of alternative splicing. Herein, we present a WWW server for an extended version of AUGUSTUS that is able to predict multiple splice variants. To our knowledge, this is the first ab initio gene finder that can predict multiple transcripts. In addition, we offer a motif searching facility, where user-defined regular expressions can be searched against putative proteins encoded by the predicted genes. The AUGUSTUS web interface and the downloadable open-source stand-alone program are freely available from http://augustus.gobics.de.

1,557 citations
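
The motif search mentioned in the abstract above (user-defined regular expressions matched against putative proteins) can be illustrated with a minimal sketch. This is not AUGUSTUS code: the function name `find_motif_hits`, the protein identifiers, and the sequences are hypothetical, and the pattern is only an example of what a user might supply.

```python
import re


def find_motif_hits(proteins, pattern):
    """Return (protein_id, start, end, matched_text) for each match.

    `proteins` maps an identifier to an amino acid sequence (one-letter
    codes). Matches are non-overlapping, as produced by re.finditer.
    """
    motif = re.compile(pattern)
    hits = []
    for protein_id, sequence in proteins.items():
        for match in motif.finditer(sequence):
            hits.append((protein_id, match.start(), match.end(), match.group()))
    return hits


# Hypothetical putative proteins for two predicted transcripts of one gene.
predicted = {
    "g1.t1": "MKTNGSAAC",
    "g1.t2": "MKTAAC",
}

# Example pattern: N, then any residue except P, then S or T.
hits = find_motif_hits(predicted, r"N[^P][ST]")
for protein_id, start, end, text in hits:
    print(protein_id, start, end, text)
```

Reporting start and end offsets alongside the matched text lets a caller map each hit back to the transcript that encodes it, which matters when one gene has several predicted splice variants.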

Journal Article
TL;DR: This report describes the systematic and up-to-date analysis of genomes (PEDANT), a comprehensive database of the yeast genome (MYGD), a database reflecting the progress in sequencing the Arabidopsis thaliana genome (MATD), the database of assembled, annotated human EST clusters (MEST), and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume).
Abstract: The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz-Netzwerk Bioinformatik) networks. The Arabidopsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91-93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155-158; Barker et al. (2001) Nucleic Acids Res., 29, 29-32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de).

1,314 citations

Journal Article
24 Apr 2008, Nature
TL;DR: Tribolium castaneum is a member of the most species-rich eukaryotic order, a powerful model organism for the study of generalized insect development, and an important pest of stored agricultural products.
Abstract: Tribolium castaneum is a member of the most species-rich eukaryotic order, a powerful model organism for the study of generalized insect development, and an important pest of stored agricultural products. We describe its genome sequence here. This omnivorous beetle has evolved the ability to interact with a diverse chemical environment, as shown by large expansions in odorant and gustatory receptors, as well as P450 and other detoxification enzymes. Development in Tribolium is more representative of other insects than is Drosophila, a fact reflected in gene content and function. For example, Tribolium has retained more ancestral genes involved in cell-cell communication than Drosophila, some being expressed in the growth zone crucial for axial elongation in short-germ development. Systemic RNA interference in T. castaneum functions differently from that in Caenorhabditis elegans, but nevertheless offers similar power for the elucidation of gene function and identification of targets for selective insect control.

1,248 citations

Journal Article
TL;DR: A WWW server for AUGUSTUS, a software for gene prediction in eukaryotic genomic sequences that is based on a generalized hidden Markov model, a probabilistic model of a sequence and its gene structure, is presented.
Abstract: We present a WWW server for AUGUSTUS, a software for gene prediction in eukaryotic genomic sequences that is based on a generalized hidden Markov model, a probabilistic model of a sequence and its gene structure. The web server allows the user to impose constraints on the predicted gene structure. A constraint can specify the position of a splice site, a translation initiation site or a stop codon. Furthermore, it is possible to specify the position of known exons and intervals that are known to be exonic or intronic sequence. The number of constraints is arbitrary and constraints can be combined in order to pin down larger parts of the predicted gene structure. The result then is the most likely gene structure that complies with all given user constraints, if such a gene structure exists. The specification of constraints is useful when part of the gene structure is known, e.g. by expressed sequence tag or protein sequence alignments, or if the user wants to change the default prediction. The web interface and the downloadable stand-alone program are available free of charge at http://augustus.gobics.de/submission.

1,127 citations

Stephen Richards, R. A. Gibbs, George M. Weinstock, Susan J. Brown, R. E. Denell, Richard W. Beeman, Richard A. Gibbs, Gregor Bucher, Markus Friedrich, Cornelis J. P. Grimmelikhuijzen, Martin Klingler, Marcé D. Lorenzen, Siegfried Roth, Reinhard Schröder, Diethard Tautz, Evgeny M. Zdobnov, Donna M. Muzny, Tony Attaway, Stephanie Bell, Christian J. Buhay, Mimi N. Chandrabose, Dean Chavez, KP Clerk-Blankenburg, Andy Cree, Marvin Diep Dao, Clay Davis, Joseph Chacko, Huyen Dinh, Shannon Dugan-Rocha, Gerald R. Fowler, Toni T. Garner, Jeffrey Garnes, Andreas Gnirke, Alicia Hawes, Judith Hernandez, Sandra Hines, M. Holder, Jennifer Hume, Shalini N. Jhangiani, Joshi, Ziad Khan, LaRonda Jackson, Christie Kovar, A Kowis, Sandra L. Lee, Lora Lewis, Jonathan Margolis, Michael J. Morgan, Lynne V. Nazareth, Ngoc Nguyen, Geoffrey Okwuonu, David Parker, San Juana Ruiz, Jireh Santibanez, Joël Savard, Steve Scherer, Brian W. Schneider, Erica Sodergren, S Vattahil, Donna Villasana, Courtney Sherell White, Rita A. Wright, Yoonseong Park, Joanne Lord, Brenda Oppert, Stephen Brown, Liangjiang Wang, G Weinstock, Yue Liu, Kim C. Worley, Christine G. Elsik, Justin T. Reese, Eran Elhaik, Giddy Landan, Dan Graur, Peter Arensburger, Peter W. Atkinson, J Beidler, Jeffery P. Demuth, Douglas W. Drury, YZ Du, Haruhiko Fujiwara, Maselli, Mizuko Osanai, Hugh M. Robertson, Zhijian Tu, Jianjun Wang, Suzhi Wang, Henry Song, Lan Zhang, Doreen Werner, Mario Stanke, Burkhard Morgenstern, Solovyev, Peter Kosarev, Garth Brown, Hsiu Chuan Chen, Olga Ermolaeva, Wratko Hlavina, Yuri Kapustin 
01 Jan 2008
TL;DR: Tribolium castaneum is a member of the most species-rich eukaryotic order, a powerful model organism for the study of generalized insect development, and an important pest of stored agricultural products.
Abstract: Tribolium castaneum is a member of the most species-rich eukaryotic order, a powerful model organism for the study of generalized insect development, and an important pest of stored agricultural products. We describe its genome sequence here. This omnivorous beetle has evolved the ability to interact with a diverse chemical environment, as shown by large expansions in odorant and gustatory receptors, as well as P450 and other detoxification enzymes. Development in Tribolium is more representative of other insects than is Drosophila, a fact reflected in gene content and function. For example, Tribolium has retained more ancestral genes involved in cell-cell communication than Drosophila, some being expressed in the growth zone crucial for axial elongation in short-germ development. Systemic RNA interference in T. castaneum functions differently from that in Caenorhabditis elegans, but nevertheless offers similar power for the elucidation of gene function and identification of targets for selective insect control.

1,081 citations


Cited by
Journal Article
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Abstract: Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

35,225 citations

Journal Article
TL;DR: An overview of the analysis pipeline and links to raw data and processed output from the runs with and without denoising are provided.
Abstract: Supplementary Figure 1: Overview of the analysis pipeline. Supplementary Table 1: Details of conventionally raised and conventionalized mouse samples. Supplementary Discussion: Expanded discussion of QIIME analyses presented in the main text; sequencing of 16S rRNA gene amplicons; QIIME analysis notes; expanded Figure 1 legend; links to raw data and processed output from the runs with and without denoising.

28,911 citations

Journal Article
TL;DR: A new program called Clustal Omega is described, which can align virtually any number of protein sequences quickly and that delivers accurate alignments, and which outperforms other packages in terms of execution time and quality.
Abstract: Multiple sequence alignments are fundamental to many sequence analysis methods. Most alignments are computed using the progressive alignment heuristic. These methods are starting to become a bottleneck in some analysis pipelines when faced with data sets of the size of many thousands of sequences. Some methods allow computation of larger data sets while sacrificing quality, and others produce high-quality alignments, but scale badly with the number of sequences. In this paper, we describe a new program called Clustal Omega, which can align virtually any number of protein sequences quickly and that delivers accurate alignments. The accuracy of the package on smaller test cases is similar to that of the high-quality aligners. On larger data sets, Clustal Omega outperforms other packages in terms of execution time and quality. Clustal Omega also has powerful features for adding sequences to and exploiting information in existing alignments, making use of the vast amount of precomputed information in public databases like Pfam.

12,489 citations

Journal Article
TL;DR: A simplified scoring system is proposed that performs well for reducing CPU time and increasing the accuracy of alignments even for sequences having large insertions or extensions as well as distantly related sequences of similar length.
Abstract: A multiple sequence alignment program, MAFFT, has been developed. The CPU time is drastically reduced as compared with existing methods. MAFFT includes two novel techniques. (i) Homologous regions are rapidly identified by the fast Fourier transform (FFT), in which an amino acid sequence is converted to a sequence composed of volume and polarity values of each amino acid residue. (ii) We propose a simplified scoring system that performs well for reducing CPU time and increasing the accuracy of alignments even for sequences having large insertions or extensions as well as distantly related sequences of similar length. Two different heuristics, the progressive method (FFT-NS-2) and the iterative refinement method (FFT-NS-i), are implemented in MAFFT. The performances of FFT-NS-2 and FFT-NS-i were compared with other methods by computer simulations and benchmark tests; the CPU time of FFT-NS-2 is drastically reduced as compared with CLUSTALW with comparable accuracy. FFT-NS-i is over 100 times faster than T-COFFEE, when the number of input sequences exceeds 60, without sacrificing the accuracy.

12,003 citations
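
The first technique in the MAFFT abstract above, locating homologous regions by correlating numerically encoded sequences with the fast Fourier transform, can be sketched as follows. This is an illustrative sketch and not MAFFT's implementation: the function names are hypothetical, and the input numbers are arbitrary placeholders standing in for the per-residue volume or polarity values that MAFFT uses. The sketch computes the cross-correlation of two numeric signals through a textbook radix-2 FFT and reports the lag with the highest score, which is the offset at which the two signals align best.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.

    With inverse=True the result is NOT divided by the length.
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


def cross_correlation(a, b):
    """Return {lag: sum over n of a[n] * b[n + lag]} for real-valued a, b."""
    size = 1
    while size < len(a) + len(b):  # zero-pad so circular wrap-around is harmless
        size *= 2
    fa = fft([complex(x) for x in a] + [0j] * (size - len(a)))
    fb = fft([complex(x) for x in b] + [0j] * (size - len(b)))
    product = [x.conjugate() * y for x, y in zip(fa, fb)]
    # Normalise the unscaled inverse transform by the padded length.
    corr = [v.real / size for v in fft(product, inverse=True)]
    # Negative lags are stored at the end of the circular result.
    return {lag: corr[lag % size] for lag in range(-(len(a) - 1), len(b))}


def best_lag(a, b):
    """Lag at which a, shifted along b, gives the largest correlation."""
    scores = cross_correlation(a, b)
    return max(scores, key=scores.get)


# Placeholder signals: b contains a copy of a shifted right by two positions.
signal_a = [1, 5, 2, 8, 3]
signal_b = [0, 0, 1, 5, 2, 8, 3]
print(best_lag(signal_a, signal_b))
```

The FFT route costs O(N log N) for all lags at once, compared with O(N^2) for evaluating each lag directly, which is the source of the speed-up the abstract refers to.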

Journal Article
TL;DR: The major concepts and results recently achieved in the study of the structure and dynamics of complex networks are reviewed, and the relevant applications of these ideas in many different disciplines are summarized, ranging from nonlinear science to biology, from statistical mechanics to medicine and engineering.

9,441 citations