Showing papers on "Munich Information Center for Protein Sequences" published in 2011


Journal Article
03 Oct 2011 - Mycology
TL;DR: Tools and methods for eukaryotic genome annotation are described, with a focus on fungal nuclear and mitochondrial genomes, highlighting how the latest technologies and tools improve the quality of predicted gene sets.
Abstract: Fungal genome annotation is the starting point for analysis of genome content. This generally involves the application of diverse methods to identify features on a genome assembly such as protein-coding and non-coding genes, repeats and transposable elements, and pseudogenes. Here we describe tools and methods leveraged for eukaryotic genome annotation with a focus on the annotation of fungal nuclear and mitochondrial genomes. We highlight the application of the latest technologies and tools to improve the quality of predicted gene sets. The Broad Institute eukaryotic genome annotation pipeline is described as one example of how such methods and tools are integrated into a sequencing center's production genome annotation environment.

118 citations


Journal Article
TL;DR: The Munich Information Center for Protein Sequences (MIPS) has many years of experience in providing annotated collections of biological data. Its novel literature-mining tool, EXCERBT, gives access to structured information on classified relations between genes, proteins, phenotypes and diseases extracted from Medline abstracts by semantic analysis.
Abstract: The Munich Information Center for Protein Sequences (MIPS at the Helmholtz Center for Environmental Health, Neuherberg, Germany) has many years of experience in providing annotated collections of biological data. Selected data sets of high relevance, such as model genomes, are subjected to careful manual curation, while the bulk of high-throughput data is annotated by automatic means. High-quality reference resources developed in the past and still actively maintained include Saccharomyces cerevisiae, Neurospora crassa and Arabidopsis thaliana genome databases as well as several protein interaction data sets (MPACT, MPPI and CORUM). More recent projects are PhenomiR, the database on microRNA-related phenotypes, and MIPS PlantsDB for integrative and comparative plant genome research. The interlinked resources SIMAP and PEDANT provide homology relationships as well as up-to-date and consistent annotation for 38,000,000 protein sequences. PPLIPS and CCancer are versatile tools for proteomics and functional genomics interfacing to a database of compilations from gene lists extracted from literature. A novel literature-mining tool, EXCERBT, gives access to structured information on classified relations between genes, proteins, phenotypes and diseases extracted from Medline abstracts by semantic analysis. All databases described here, as well as the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.helmholtz-muenchen.de).

85 citations


Proceedings Article
Peipei Li, Lyong Heo, Meijing Li, Keun Ho Ryu, Gouchol Pok
26 Jul 2011
TL;DR: A novel protein function prediction approach based on frequent pattern mining in graph data is proposed and applied to a core set of protein-protein interaction data from DIP, with function annotation data from FunCat of MIPS.
Abstract: Protein function prediction is one of the most challenging problems in the post-genomic era. Previous prediction methods using protein-protein interaction networks relied on the neighborhoods of, or the connected paths to, known proteins. Still, new algorithms are required to increase accuracy. In this paper, we propose a novel protein function prediction approach on the basis of frequent pattern mining in graph data. A protein-protein interaction network is represented as an unweighted, undirected graph with nodes denoting proteins and edges denoting interactions between proteins. Each node is labeled with a set of corresponding protein functions. The function prediction method proceeds in three steps: neighbor finding, pattern finding and function annotation. Using our approach, we predict protein functions on a core set of protein-protein interaction data from DIP (Database of Interacting Proteins) and function annotation data from FunCat of MIPS (the Munich Information Center for Protein Sequences). The experimental results show better prediction accuracy than existing neighbor counting methods.
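
The graph representation in this abstract, and the neighbor counting baseline it compares against, can be illustrated with a minimal Python sketch. This is not the authors' frequent pattern mining method, whose details the abstract does not give; it only shows the simpler baseline idea of ranking candidate functions by how many direct neighbors carry them. The helper names (`build_graph`, `predict_by_neighbor_counting`), the protein identifiers and the function labels are all hypothetical and chosen for illustration.

```python
from collections import Counter


def build_graph(edges):
    """Build an unweighted, undirected adjacency map from (a, b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def predict_by_neighbor_counting(graph, labels, protein, top_k=1):
    """Rank functions by the number of direct neighbors annotated with them."""
    counts = Counter()
    for neighbor in graph.get(protein, ()):
        # Unannotated neighbors contribute nothing to the vote.
        counts.update(labels.get(neighbor, ()))
    # Sort by descending count, then by name so ties resolve deterministically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [function for function, _ in ranked[:top_k]]


# Toy data (assumption): identifiers and labels are invented, not from DIP or FunCat.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P3", "P4")]
labels = {
    "P2": {"metabolism"},
    "P3": {"metabolism", "transport"},
    "P4": {"transport", "cell cycle"},
}

graph = build_graph(edges)
prediction = predict_by_neighbor_counting(graph, labels, "P1", top_k=2)
print(prediction)  # ['metabolism', 'transport']
```

Here the unannotated protein `P1` has three neighbors; "metabolism" and "transport" each appear on two of them, so they outrank "cell cycle". A pattern-based approach such as the one proposed would presumably look beyond these raw counts, but how it does so is not specified in the abstract.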

4 citations