scispace - formally typeset
Open AccessJournal ArticleDOI

The Universal Protein Resource (UniProt)

Reads0
Chats0
TLDR
During 2004, tens of thousands of Knowledgebase records got manually annotated or updated; the UniProt keyword list got augmented by additional keywords; the documentation of the keywords and are continuously overhauling and standardizing the annotation of post-translational modifications.
Abstract
The Universal Protein Resource (UniProt) provides the scientific community with a single, centralized, authoritative resource for protein sequences and functional information. Formed by uniting the Swiss-Prot, TrEMBL and PIR protein database activities, the UniProt consortium produces three layers of protein sequence databases: the UniProt Archive (UniParc), the UniProt Knowledgebase (UniProt) and the UniProt Reference (UniRef) databases. The UniProt Knowledgebase is a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase with extensive cross-references. This centrepiece consists of two sections: UniProt/Swiss-Prot, with fully, manually curated entries; and UniProt/TrEMBL, enriched with automated classification and annotation. During 2004, tens of thousands of Knowledgebase records got manually annotated or updated; we introduced a new comment line topic: TOXIC DOSE to store information on the acute toxicity of a toxin; the UniProt keyword list got augmented by additional keywords; we improved the documentation of the keywords and are continuously overhauling and standardizing the annotation of post-translational modifications. Furthermore, we introduced a new documentation file of the strains and their synonyms. Many new database cross-references were introduced and we started to make use of Digital Object Identifiers. We also achieved in collaboration with the Macromolecular Structure Database group at EBI an improved integration with structural databases by residue level mapping of sequences from the Protein Data Bank entries onto corresponding UniProt entries. For convenient sequence searches we provide the UniRef non-redundant sequence databases. The comprehensive UniParc database stores the complete body of publicly available protein sequence data. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). New releases are published every two weeks.

read more

Citations
More filters
Journal ArticleDOI

Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists

TL;DR: The survey will help tool designers/developers and experienced end users understand the underlying algorithms and pertinent details of particular tool categories/tools, enabling them to make the best choices for their particular research interests.
Journal ArticleDOI

The SWISS-MODEL workspace: a web-based environment for protein structure homology modelling

TL;DR: The SWISS-MODEL workspace is a web-based integrated service dedicated to protein structure homology modelling that assists and guides the user in building protein homology models at different levels of complexity.
Journal ArticleDOI

I-TASSER: a unified platform for automated protein structure and function prediction

TL;DR: The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence- to-structure-to-function paradigm.
Journal ArticleDOI

Biopython: freely available Python tools for computational molecular biology and bioinformatics

TL;DR: Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments, dealing with 3D macro molecular structures, interacting with common tools such as BLAST, ClustalW and EMBOSS, accessing key online databases, as well as providing numerical methods for statistical learning.
Journal ArticleDOI

Lysine Acetylation Targets Protein Complexes and Co-Regulates Major Cellular Functions

TL;DR: A proteomic-scale analysis of protein acetylation suggests that it is an important biological regulatory mechanism and the regulatory scope of lysine acetylations is broad and comparable with that of other major posttranslational modifications.
References
More filters
Journal ArticleDOI

The Pfam protein families database

TL;DR: The definition and use of family-specific, manually curated gathering thresholds are explained and some of the features of domains of unknown function (also known as DUFs) are discussed, which constitute a rapidly growing class of families within Pfam.
Journal ArticleDOI

UniProt: the Universal Protein knowledgebase

TL;DR: The Swiss-Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt), which is to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and query interfaces.
Journal ArticleDOI

The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003

TL;DR: The SWISS-PROT protein knowledgebase connects amino acid sequences with the current knowledge in the Life Sciences by providing an interdisciplinary overview of relevant information by bringing together experimental results, computed features and sometimes even contradictory conclusions.
Journal ArticleDOI

Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure.

TL;DR: A new procedure is described for detecting and correcting those errors that arise at the model-building stage of the procedure and a good procedure for creating HMMs for sequences of proteins of known structure are determined.
Related Papers (5)