scispace - formally typeset
Search or ask a question

Showing papers on "Munich Information Center for Protein Sequences published in 2007"


Journal ArticleDOI
TL;DR: The Munich Information Center for Protein Sequences combines automatic processing of large amounts of sequences with manual annotation of selected model genomes with the compilation of manually curated databases for protein interactions based on scrutinized information from the literature.
Abstract: The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) combines automatic processing of large amounts of sequences with manual annotation of selected model genomes. Due to the massive growth of the available data, the depth of annotation varies widely between independent databases. Also, the criteria for the transfer of information from known to orthologous sequences are diverse. To cope with the task of global in-depth genome annotation has become unfeasible. Therefore, our efforts are dedicated to three levels of annotation: (i) the curation of selected genomes, in particular from fungal and plant taxa (e.g. CYGD, MNCDB, MatDB), (ii) the comprehensive, consistent, automatic annotation employing exhaustive methods for the computation of sequence similarities and sequence-related attributes as well as the classification of individual sequences (SIMAP, PEDANT and FunCat) and (iii) the compilation of manually curated databases for protein interactions based on scrutinized information from the literature to serve as an accepted set of reliable annotated interaction data (MPACT, MPPI, CORUM). All databases and tools described as well as the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).

162 citations


Book ChapterDOI
TL;DR: This chapter introduces the basic statistical results that underlie mapping of nucleotide sequences to genomes and briefly surveys the common programs and algorithms that are used to perform genome mapping, all available via public hosted web sites.
Abstract: The unprecedented availability of genome sequences, coupled with user-friendly, web-enabled search and analysis tools allows practitioners to locate interesting genome features or sequence tracts with relative ease. Although many public model organism- and genome-mapping resources offer pre-mapped genome browsing, biologists also still need to perform de novo mapping analyses. Correct interpretation of the results in genome annotation databases or the results of one's individual analyses requires at least a conceptual understanding of the statistics and mechanics of genome searches, the expected results from statistical considerations, as well as the algorithms used by different search tools. This chapter introduces the basic statistical results that underlie mapping of nucleotide sequences to genomes and briefly surveys the common programs and algorithms that are used to perform genome mapping, all available via public hosted web sites. Selection of the appropriate sequence search and mapping tool will often demand tradeoffs in sensitivity and specificity relating to the statistics of the search.

9 citations


Journal ArticleDOI
TL;DR: An open-source rapid genome re-annotation software system, Restauro-G, specialized for bacterial genomes, which re-annotates a genome by similarity searches utilizing the BLAST-Like Alignment Tool and referring to protein databases such as UniProt KB,NCBI nr, NCBI COGs, Pfam, and PSORTb.

8 citations


Journal ArticleDOI
01 Jul 2007
TL;DR: In this paper, a Web-service-enabled component-based architecture is proposed to support the large-scale comparative analysis of complete microbial genome sequences and the subsequent identification of orthologues and protein families (Microbase).
Abstract: The analysis of microbial genome sequences can identify protein families that provide potential drug targets for new antibiotics. With the rapid accumulation of newly sequenced genomes, this analysis has become a computationally intensive and data-intensive problem. This paper describes the development of a Web-service-enabled, component-based, architecture to support the large-scale comparative analysis of complete microbial genome sequences and the subsequent identification of orthologues and protein families (Microbase). The system is coordinated through the use of Web-service-based notifications and integrates distributed computing resources together with genomic databases to realize all-against-all comparisons for a large volume of genome sequences and to present the data in a computationally amenable format through a Web service interface. We demonstrate the use of the system in searching for orthologues and candidate protein families, which ultimately could lead to the identification of potential therapeutic targets.

4 citations