Topic

Munich Information Center for Protein Sequences

About: Munich Information Center for Protein Sequences is a research topic. Over its lifetime, 79 publications have appeared within this topic, receiving 6,967 citations.


Papers
Journal Article
TL;DR: In this paper, the theory of Markov random fields is employed to infer a protein's functions from protein-protein interaction data and the functional annotations of the protein's interaction partners, and Bayesian approaches are used to estimate the probability that the protein has each function.
Abstract: Assigning functions to novel proteins is one of the most important problems in the postgenomic era. Several approaches have been applied to this problem, including the analysis of gene expression patterns, phylogenetic profiles, protein fusions, and protein-protein interactions. In this paper, we develop a novel approach that employs the theory of Markov random fields to infer a protein's functions using protein-protein interaction data and the functional annotations of the protein's interaction partners. For each function of interest and each protein, we predict the probability that the protein has that function using Bayesian approaches. Unlike other available approaches for protein annotation in which a protein has or does not have a function of interest, we give a probability for having the function. This probability indicates how confident we are about the prediction. We employ our method to predict protein functions based on "biochemical function," "subcellular location," and "cellular role" for yeast proteins defined in the Yeast Proteome Database (YPD, www.incyte.com), using the protein-protein interaction data from the Munich Information Center for Protein Sequences (MIPS, mips.gsf.de). We show that our approach outperforms other available methods for function prediction based on protein interaction data. The supplementary data is available at www-hto.usc.edu/~msms/ProteinFunction.

302 citations
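
As a rough illustration of the idea in the abstract above, the following minimal Python sketch estimates the probability that an unannotated protein has a given function from the labels of its interaction partners. It is not the authors' implementation: the logistic form of the conditional probability, the parameter names (`alpha`, `w_with`, `w_without`) and their default values, and the toy protein names are all assumptions made for illustration. Unannotated proteins are resampled by Gibbs sampling, and the conditional probabilities are averaged after a burn-in period.

```python
import math
import random


def conditional_prob(n_with, n_without, alpha, w_with, w_without):
    """Probability that a protein has the function given its neighbours' labels.

    Assumed logistic form: the log-odds are linear in the number of
    interaction partners with and without the function.
    """
    logit = alpha + w_with * n_with + w_without * n_without
    return 1.0 / (1.0 + math.exp(-logit))


def gibbs_function_probs(edges, known, alpha=-1.0, w_with=1.0,
                         w_without=-0.5, sweeps=500, burn_in=100, seed=0):
    """Estimate P(function) for every unannotated protein in the network.

    edges: iterable of (protein_a, protein_b) interaction pairs.
    known: dict mapping annotated proteins to True/False for one function.
    All parameter values are hypothetical, not fitted to any data set.
    """
    rng = random.Random(seed)

    # Build an undirected adjacency map from the interaction pairs.
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    for protein in known:
        neighbours.setdefault(protein, set())

    unknown = sorted(p for p in neighbours if p not in known)

    # Annotated proteins stay fixed; unannotated ones start at random.
    state = dict(known)
    for protein in unknown:
        state[protein] = rng.random() < 0.5

    totals = {protein: 0.0 for protein in unknown}
    for sweep in range(sweeps):
        for protein in unknown:
            n_with = sum(1 for q in neighbours[protein] if state[q])
            n_without = len(neighbours[protein]) - n_with
            prob = conditional_prob(n_with, n_without,
                                    alpha, w_with, w_without)
            # Resample this protein's label from its conditional distribution.
            state[protein] = rng.random() < prob
            if sweep >= burn_in:
                # Average the conditional probability over retained sweeps.
                totals[protein] += prob

    kept = sweeps - burn_in
    return {protein: totals[protein] / kept for protein in unknown}


if __name__ == "__main__":
    # Toy network: protein "A" interacts with three annotated proteins,
    # protein "E" with two proteins that lack the function.
    toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("E", "F"), ("E", "G")]
    toy_known = {"B": True, "C": True, "D": True, "F": False, "G": False}
    for name, prob in gibbs_function_probs(toy_edges, toy_known).items():
        print(name, round(prob, 3))
```

The output is a probability per protein rather than a yes/no call, which matches the abstract's point that the value expresses confidence in the prediction. In a real application the parameters would have to be estimated from annotated proteins rather than fixed by hand.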

Journal Article
TL;DR: This "poor man's genome" resource forms the core foundations for various genome-scale experiments within the as yet unsequenceable plant genomes.

301 citations

Proceedings Article
14 Aug 2002
TL;DR: A novel approach is developed that applies the theory of Markov random fields to infer a protein's functions using protein-protein interaction data and the functional annotations of its interaction partners, and it is shown to outperform other available methods for function prediction based on protein interaction data.
Abstract: Assigning functions to novel proteins is one of the most important problems in the post-genomic era. We develop a novel approach that applies the theory of Markov random fields to infer a protein's functions using protein-protein interaction data and the functional annotations of its interaction protein partners. For each function of interest and a protein, we predict the probability that the protein has that function using Bayesian approaches. Unlike in other available approaches for protein annotation where a protein has or does not have a function of interest, we give a probability for having the function. This probability indicates how confident we are about the prediction. We apply our method to predict cellular functions (43 categories including a category "others") for yeast proteins defined in the Yeast Proteome Database, using the protein-protein interaction data from the Munich Information Center for Protein Sequences. We show that our approach outperforms other available methods for function prediction based on protein interaction data.

270 citations

Journal Article
TL;DR: A bibliography submission system is developed for scientists to submit, categorize and retrieve literature information, and a non-redundant reference protein database, PIR-NREF is introduced.
Abstract: The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases).

223 citations

Journal Article
TL;DR: The Protein Information Resource (PIR) produces the largest, most comprehensive, annotated protein sequence database in the public domain, the PIR-International Protein Sequence Database, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Sequence Database (JIPID).
Abstract: The Protein Information Resource (PIR) produces the largest, most comprehensive, annotated protein sequence database in the public domain, the PIR-International Protein Sequence Database, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Sequence Database (JIPID). The expanded PIR WWW site allows sequence similarity and text searching of the Protein Sequence Database and auxiliary databases. Several new web-based search engines combine searches of sequence similarity and database annotation to facilitate the analysis and functional identification of proteins. New capabilities for searching the PIR sequence databases include annotation-sorted search, domain search, combined global and domain search, and interactive text searches. The PIR-International databases and search tools are accessible on the PIR WWW site at http://pir.georgetown.edu and at the MIPS WWW site at http://www.mips.biochem.mpg.de. The PIR-International Protein Sequence Database and other files are also available by FTP.

198 citations


Network Information
Related Topics (5)
Genomics: 15.4K papers, 1M citations, 80% related
Genome: 74.2K papers, 3.8M citations, 80% related
Human genome: 11.5K papers, 1M citations, 78% related
Conserved sequence: 12.4K papers, 887K citations, 76% related
Phylogenetic tree: 26.6K papers, 1.3M citations, 73% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2017    1
2016    1
2015    1
2014    4
2013    4
2012    1