
Showing papers on "Munich Information Center for Protein Sequences" published in 2008


Journal Article
TL;DR: Several ways in which high-throughput tandem mass spectrometry-based proteomics can improve the quality of genome annotations are highlighted and it is suggested that it could be efficiently applied during the gene calling process so that the improvements are propagated through the subsequent functional annotation process.
Abstract: While genome sequencing efforts reveal the basic building blocks of life, a genome sequence alone is insufficient for elucidating biological function. Genome annotation, the process of identifying genes and assigning function to each gene in a genome sequence, provides the means to elucidate biological function from sequence. Current state-of-the-art high-throughput genome annotation uses a combination of comparative (sequence similarity data) and non-comparative (ab initio gene prediction algorithms) methods to identify protein-coding genes in genome sequences. Because approaches used to validate the presence of predicted protein-coding genes are typically based on expressed RNA sequences, they cannot independently and unequivocally determine whether a predicted protein-coding gene is translated into a protein. With the ability to directly measure peptides arising from expressed proteins, high-throughput liquid chromatography-tandem mass spectrometry-based proteomics approaches can be used to verify coding regions of a genomic sequence. Here, we highlight several ways in which high-throughput tandem mass spectrometry-based proteomics can improve the quality of genome annotations and suggest that it could be efficiently applied during the gene calling process so that the improvements are propagated through the subsequent functional annotation process.

139 citations
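
The idea of verifying coding regions with observed peptides can be illustrated with a small sketch. It is not the authors' pipeline: it assumes the peptide sequences have already been identified from the tandem mass spectra, uses the standard genetic code, and matches peptides by exact string search against a six-frame translation. The function names (`reverse_complement`, `translate`, `map_peptide`) and the toy genome are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption:
# organisms with alternative codes would need a different table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def map_peptide(genome, peptide):
    """Find exact matches of a peptide in all six reading frames.

    Returns (strand, frame_offset, start, end) tuples, where start and end
    are 0-based, end-exclusive coordinates on the forward strand.
    """
    hits = []
    n = len(genome)
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for offset in range(3):
            protein = translate(seq[offset:])
            pos = protein.find(peptide)
            while pos != -1:
                start = offset + 3 * pos
                end = start + 3 * len(peptide)
                if strand == "-":
                    # Convert reverse-strand coordinates to forward-strand ones.
                    start, end = n - end, n - start
                hits.append((strand, offset, start, end))
                pos = protein.find(peptide, pos + 1)
    return hits


# Toy example: the peptide MAK supports a coding region at positions 0-9.
print(map_peptide("ATGGCCAAGTAA", "MAK"))  # [('+', 0, 0, 9)]
```

A peptide that maps to a region with no predicted gene would, under these assumptions, flag a candidate missed gene, while a peptide inside a predicted gene supports that the gene is actually translated.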


Journal Article
01 Feb 2008
TL;DR: This paper describes how seven genomic features and four experimental interaction data sets were combined using a Bayesian-networks-based data integration approach to infer PPI networks in yeast, indicating that a better clustering result was obtained in terms of both statistical measures and biological relevance.
Abstract: Protein-protein interactions (PPIs) play crucial roles in virtually every aspect of cellular function within an organism. One important objective of modern biology is the extraction of functional modules, such as protein complexes from global protein interaction networks. This paper describes how seven genomic features and four experimental interaction data sets were combined using a Bayesian-networks-based data integration approach to infer PPI networks in yeast. Greater coverage and higher accuracy were achieved than in previous high-throughput studies of PPI networks in yeast. A Markov clustering algorithm was then used to extract protein complexes from the inferred protein interaction networks. The quality of the computed complexes was evaluated using the hand-curated complexes from the Munich Information Center for Protein Sequences database and gene-ontology-driven semantic similarity. The results indicated that, by integrating multiple genomic information sources, a better clustering result was obtained in terms of both statistical measures and biological relevance.

37 citations
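
The Markov clustering step mentioned in the abstract above can be sketched as follows. This is a minimal, pure-Python illustration of the general Markov clustering procedure (alternating expansion and inflation of a column-stochastic matrix), not the implementation or parameters used in the paper. The inflation value, tolerance, and the toy network are assumptions, and the function names are hypothetical.

```python
def normalize_columns(m):
    """Scale each column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected graph given as a square adjacency matrix.

    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(adjacency)
    # Add self-loops, then convert edge weights to transition probabilities.
    m = [
        [float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    m = normalize_columns(m)
    for _ in range(max_iter):
        # Expansion: square the matrix (two-step random walks).
        expanded = [
            [sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        # Inflation: raise entries to a power and renormalize columns,
        # which strengthens strong flows and weakens weak ones.
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded]
        )
        delta = max(
            abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n)
        )
        m = inflated
        if delta < tol:
            break
    # Each nonzero row lists the nodes attracted to that row's node.
    clusters = {frozenset(j for j, v in enumerate(row) if v > 1e-6) for row in m}
    return sorted(sorted(c) for c in clusters if c)


# Toy network: two disconnected triangles of interacting proteins.
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(markov_cluster(adjacency))  # [[0, 1, 2], [3, 4, 5]]
```

In a setting like the one described, the adjacency weights would come from the integrated interaction confidence scores, and the resulting clusters would be compared against curated reference complexes. On real networks the clusters read from the final matrix can overlap, which this sketch does not resolve.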


Journal Article
TL;DR: The Seoul National University Genome Browser (SNUGB) integrates various types of genomic information derived from 98 fungal/oomycete and 34 plant and animal species, graphically presents germane features and properties of each genome, and supports comparison between genomes.
Abstract: Background: Since the full genome sequence of Saccharomyces cerevisiae was released in 1996, genome sequences of over 90 fungal species have become publicly available. The heterogeneous formats of genome sequences archived in different sequencing centers hampered the integration of the data for efficient and comprehensive comparative analyses. The Comparative Fungal Genomics Platform (CFGP) was developed to archive these data via a single standardized format that can support multifaceted and integrated analyses of the data. To facilitate efficient data visualization and utilization within and across species based on the architecture of CFGP and associated databases, a new genome browser was needed.

20 citations


Book Chapter
05 Dec 2008
TL;DR: An accurate description of current scientific developments in the field of bioinformatics and computational implementation is presented by research of the BioSapiens Network of Excellence.
Abstract: An accurate description of current scientific developments in the field of bioinformatics and computational implementation is presented by research of the BioSapiens Network of Excellence. Bioinformatics is essential for annotating the structure and function of genes and proteins, for analyzing complete genomes, and for molecular biology and biochemistry. Included is an overview of bioinformatics and the full spectrum of genome annotation approaches, including: genome analysis and gene prediction; gene regulation analysis and expression; genome variation and QTL analysis; large-scale protein annotation of function and structure; annotation and prediction of protein interactions; and the organization and annotation of molecular networks and biochemical pathways. Also covered are a technical framework to organize and represent genome data using the DAS technology and work on the annotation of two large genomic sets: HIV/HCV viral genomes and splicing alternatives potentially encoded in 1% of the human genome.

19 citations