UniProt: the Universal Protein knowledgebase

doi:10.1093/NAR/GKH131

Rolf Apweiler, +14 more

- 01 Jan 2004 -

Nucleic Acids Research

- Vol. 32, Database issue, pp D115-D119

TLDR

The Swiss-Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt), which aims to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase with extensive cross-references and query interfaces.

Abstract:

To provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information, the Swiss-Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt) consortium. Our mission is to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and query interfaces. The central database will have two sections, corresponding to the familiar Swiss-Prot (fully manually curated entries) and TrEMBL (enriched with automated classification, annotation and extensive cross-references). For convenient sequence searches, UniProt also provides several non-redundant sequence databases. The UniProt NREF (UniRef) databases provide representative subsets of the knowledgebase suitable for efficient searching. The comprehensive UniProt Archive (UniParc) is updated daily from many public source databases. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). The scientific community is encouraged to submit data for inclusion in UniProt.
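
To illustrate what "non-redundant" means in the abstract above, here is a minimal Python sketch that collapses records with exactly identical sequences into a single entry while remembering every source identifier. It is only an illustration under stated assumptions: the FASTA-style input, the identifiers (`entryA`, `entryB`, `entryC`) and the function names are hypothetical, and exact-match grouping is a simplification chosen for this example, not a description of how the UniProt databases are actually built.

```python
def parse_fasta(text):
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    identifier, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if identifier is not None:
                yield identifier, "".join(chunks)
            fields = line[1:].split()
            identifier, chunks = (fields[0] if fields else ""), []
        else:
            chunks.append(line)
    if identifier is not None:
        yield identifier, "".join(chunks)


def collapse_identical(records):
    """Map each distinct sequence to the identifiers that carry it."""
    groups = {}
    for identifier, sequence in records:
        # Exact string identity only; real clustering is an assumption away.
        groups.setdefault(sequence.upper(), []).append(identifier)
    return groups


# Hypothetical input: two sources report the same sequence, one is unique.
EXAMPLE = """>entryA hypothetical protein
MKTAYIAKQR
>entryB same sequence from another source
MKTAYIAKQR
>entryC
MSLLTEVETP
"""

nonredundant = collapse_identical(parse_fasta(EXAMPLE))
for sequence, identifiers in nonredundant.items():
    print(sequence, identifiers)
```

Three input records reduce to two distinct sequences here, and the first sequence keeps both of its source identifiers, so a search would need to scan fewer sequences without losing track of where each one came from.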

Citations

STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets.

Damian Szklarczyk, +11 more

- 08 Jan 2019 -

Nucleic Acids Research

TL;DR: The latest version of STRING more than doubles the number of organisms it covers, and offers an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input.

Prokka: Rapid Prokaryotic Genome Annotation

Torsten Seemann

- 15 Jul 2014 -

Bioinformatics

TL;DR: Prokka is a command-line software tool that fully annotates a draft bacterial genome in about 10 min on a typical desktop computer and produces standards-compliant output files for further analysis or viewing in genome browsers.

Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences

Weizhong Li, +1 more

- 01 Jul 2006 -

Bioinformatics

TL;DR: Cd-hit-2d compares two protein datasets and reports similar matches between them; cd-hit-est clusters a DNA/RNA sequence database, and cd-hit-est-2d compares two nucleotide datasets.

Metascape provides a biologist-oriented resource for the analysis of systems-level datasets.

Yingyao Zhou, +7 more

- 03 Apr 2019 -

Nature Communications

TL;DR: A biologist-oriented portal that provides a gene list annotation, enrichment and interactome resource and enables integrated analysis of multi-OMICs datasets, Metascape is an effective and efficient tool for experimental biologists to comprehensively analyze and interpret OMICs-based studies in the big data era.
