Showing papers on "Munich Information Center for Protein Sequences" published in 2001


Journal ArticleDOI
TL;DR: The Protein Information Resource, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the most comprehensive and expertly annotated protein sequence database in the public domain, the PIR-International Protein Sequence Database.
Abstract: The Protein Information Resource, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the most comprehensive and expertly annotated protein sequence database in the public domain, the PIR-International Protein Sequence Database. To provide timely and high quality annotation and promote database interoperability, the PIR-International employs rule-based and classification-driven procedures based on controlled vocabulary and standard nomenclature and includes status tags to distinguish experimentally determined from predicted protein features. The database contains about 200 000 non-redundant protein sequences, which are classified into families and superfamilies and their domains and motifs identified. Entries are extensively cross-referenced to other sequence, classification, genome, structure and activity databases. The PIR web site features search engines that use sequence similarity and database annotation to facilitate the analysis and functional identification of proteins. The PIR-International databases and search tools are accessible on the PIR web site at http://pir.georgetown.edu/ and at the MIPS web site at http://www.mips.biochem.mpg.de. The PIR-International Protein Sequence Database and other files are also available by FTP. The Protein Information Resource (PIR) for over three decades has been a community resource that provides protein databases and analysis tools to support research on molecular evolution, functional genomics and computational biology. The PIR, along with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), maintains and distributes the PIR-International Protein Sequence Database, the most comprehensive, well-annotated and non-redundant protein sequence database in the public domain.
To further support genomic and proteomic research, we have greatly improved our bioinformatics infrastructure in the last 2 years, which allows us: (i) to continue to provide high quality protein sequence data and annotation, while keeping pace with the large influx of data being generated by genome sequencing projects; (ii) to develop an integrated system of protein databases and analytical tools for expert annotation and knowledge discovery; and (iii) to improve accessibility of our resource and interoperability of our databases. Some key developments include: highly automated protein sequence classification and annotation, an enhanced web site with many new search engines and functionality for protein data mining and analysis, a new integrated classification database that provides comprehensive descriptions of family relationships and functional/structural annotations, migration of the database into the Oracle 8i object-relational database system, and database distribution in XML format.

86 citations


Journal ArticleDOI
TL;DR: The most important advances in the field of genome annotation over the past two years involve the use of cDNA sequences, protein structures and gene expression data to predict genes.

32 citations


Reference EntryDOI
19 Apr 2001
TL;DR: The comprehensive, annotated and curated protein sequence databases – the PIR-International Protein Sequence Database and SWISS-PROT – support genome annotation, protein functional and structural analysis, proteomics, and phylogenetic studies.
Abstract: The comprehensive, annotated and curated protein sequence databases – the PIR-International Protein Sequence Database and SWISS-PROT – support genome annotation, protein functional and structural analysis, proteomics, and phylogenetic studies. The databases are widely available, either free or by subscription, and are augmented by a variety of useful tools and related databases. Keywords: protein sequences; sequence databases; bioinformatics resources; protein superfamilies

13 citations


01 Jan 2001
TL;DR: This chapter gives an overview of available resources on rice bioinformatics and their role in elucidating and propagating biological and genomic information on rice as well as proposed logistics for interlinking these resources.
Abstract: As rice genomics data continue to accumulate at a rapid rate, databases are becoming more valuable for storing and providing access to large and rigorous data sets. This chapter gives an overview of available resources on rice bioinformatics and their role in elucidating and propagating biological and genomic information on rice. Of particular focus here is the informatics infrastructure developed at the Rice Genome Research Program (RGP) following an extensive rice genome analysis. The database named INE (INtegrated Rice Genome Explorer) integrates genetic and physical mapping information with the genome sequence being generated in collaboration with the International Rice Genome Sequencing Project (IRGSP). Database links are initially evaluated using a query tool to explore and compare data across the rice and maize genome databases and for potential application to multiple-crop database querying. A proposed logistics for interlinking these resources is presented to integrate, manipulate, and analyze information on the rice genome. One of the biggest challenges of rice bioinformatics lies in the emerging role of rice as a model system among grass crop species. In view of the importance of comparative genomics in the formulation of new knowledge on plant genome structure and function, bioinformatics remains an essential strategy for gaining new insights into the needs and expectations of rice genomics. Bioinformatics is a new field that emerged in parallel with the advances achieved in genomic analysis. Improved techniques in molecular biology played a key role in catalyzing large-scale sequencing of expressed sequence tags (ESTs), construction of whole genetic maps with specified markers, physical mapping with large insert-size libraries, whole genome sequencing, and transcriptional profiling (Benton 1996).
This scenario of rapid technology development combined with mass production of genomic data led to a vital need to transform massive information into more manageable forms by way of bioinformatics. Advances in computer technology including the emergence of the World Wide Web and the Internet, now dominating every aspect of

1 citation