UniProt: the Universal Protein knowledgebase
Rolf Apweiler,Amos Marc Bairoch,Cathy H. Wu,Winona C. Barker,Brigitte Boeckmann,Serenella Ferro,Elisabeth Gasteiger,Hongzhan Huang,Rodrigo Lopez,Michele Magrane,Maria Jesus Martin,Darren A. Natale,Claire O'Donovan,Nicole Redaschi,Lai-Su L. Yeh +14 more
TL;DR: The Swiss-Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt), which is to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and query interfaces.
Abstract:
To provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information, the Swiss-Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt) consortium. Our mission is to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and query interfaces. The central database will have two sections, corresponding to the familiar Swiss-Prot (fully manually curated entries) and TrEMBL (enriched with automated classification, annotation and extensive cross-references). For convenient sequence searches, UniProt also provides several non-redundant sequence databases. The UniProt NREF (UniRef) databases provide representative subsets of the knowledgebase suitable for efficient searching. The comprehensive UniProt Archive (UniParc) is updated daily from many public source databases. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). The scientific community is encouraged to submit data for inclusion in UniProt.
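The idea of a non-redundant sequence database mentioned in the abstract can be illustrated with a minimal sketch. It assumes FASTA-formatted input and collapses only exactly identical sequences; the record names (`entryA`, `entryB`, `entryC`) and the sequences are hypothetical, and this is not the procedure UniProt itself uses to build UniRef or UniParc.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def collapse_identical(records):
    """Map each distinct sequence to the identifiers that share it."""
    groups = {}
    for header, seq in records:
        identifier = header.split()[0]  # first token of the header line
        groups.setdefault(seq.upper(), []).append(identifier)
    return groups


# Hypothetical input: two records with the same sequence, one distinct.
EXAMPLE = """>entryA hypothetical record from source 1
MKTAYIAKQR
QISFVKSHFS
>entryB same sequence from source 2
MKTAYIAKQRQISFVKSHFS
>entryC a different sequence
MSDNELLK
"""

nonredundant = collapse_identical(parse_fasta(EXAMPLE))
for seq, ids in nonredundant.items():
    print(len(seq), ids)
```

Here three input records reduce to two distinct sequences, each keeping the list of source identifiers it came from. Clustering at lower identity thresholds would need a real sequence-comparison tool and is outside this sketch.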