Showing papers by "Rodrigo Lopez published in 2004"

PDF

Open Access

Journal Article•DOI•

UniProt: the Universal Protein knowledgebase

[...]

Rolf Apweiler¹, Amos Marc Bairoch, Cathy H. Wu, Winona C. Barker, Brigitte Boeckmann, Serenella Ferro, Elisabeth Gasteiger, Hongzhan Huang, Rodrigo Lopez, Michele Magrane, Maria Jesus Martin, Darren A. Natale, Claire O'Donovan, Nicole Redaschi, Lai-Su L. Yeh - Show less +11 more•Institutions (1)

European Bioinformatics Institute¹

01 Jan 2004-Nucleic Acids Research

TL;DR: The Swiss-Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt), which is to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and query interfaces.

...read moreread less

Abstract: To provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information, the Swiss-Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt) consortium. Our mission is to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and query interfaces. The central database will have two sections, corresponding to the familiar Swiss-Prot (fully manually curated entries) and TrEMBL (enriched with automated classification, annotation and extensive cross-references). For convenient sequence searches, UniProt also provides several non-redundant sequence databases. The UniProt NREF (UniRef) databases provide representative subsets of the knowledgebase suitable for efficient searching. The comprehensive UniProt Archive (UniParc) is updated daily from many public source databases. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). The scientific community is encouraged to submit data for inclusion in UniProt.

...read moreread less

7,298 citations

Journal Article•DOI•

The Universal Protein Resource (UniProt)

[...]

Amos Marc Bairoch¹, Rolf Apweiler, Cathy H. Wu, Winona C. Barker, Brigitte Boeckmann, Serenella Ferro, Elisabeth Gasteiger, Hongzhan Huang, Rodrigo Lopez, Michele Magrane, Maria Jesus Martin, Darren A. Natale, Claire O'Donovan, Nicole Redaschi, Lai-Su L. Yeh - Show less +11 more•Institutions (1)

Swiss Institute of Bioinformatics¹

17 Dec 2004-Nucleic Acids Research

TL;DR: During 2004, tens of thousands of Knowledgebase records got manually annotated or updated; the UniProt keyword list got augmented by additional keywords; the documentation of the keywords and are continuously overhauling and standardizing the annotation of post-translational modifications.

...read moreread less

Abstract: The Universal Protein Resource (UniProt) provides the scientific community with a single, centralized, authoritative resource for protein sequences and functional information. Formed by uniting the Swiss-Prot, TrEMBL and PIR protein database activities, the UniProt consortium produces three layers of protein sequence databases: the UniProt Archive (UniParc), the UniProt Knowledgebase (UniProt) and the UniProt Reference (UniRef) databases. The UniProt Knowledgebase is a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase with extensive cross-references. This centrepiece consists of two sections: UniProt/Swiss-Prot, with fully, manually curated entries; and UniProt/TrEMBL, enriched with automated classification and annotation. During 2004, tens of thousands of Knowledgebase records got manually annotated or updated; we introduced a new comment line topic: TOXIC DOSE to store information on the acute toxicity of a toxin; the UniProt keyword list got augmented by additional keywords; we improved the documentation of the keywords and are continuously overhauling and standardizing the annotation of post-translational modifications. Furthermore, we introduced a new documentation file of the strains and their synonyms. Many new database cross-references were introduced and we started to make use of Digital Object Identifiers. We also achieved in collaboration with the Macromolecular Structure Database group at EBI an improved integration with structural databases by residue level mapping of sequences from the Protein Data Bank entries onto corresponding UniProt entries. For convenient sequence searches we provide the UniRef non-redundant sequence databases. The comprehensive UniParc database stores the complete body of publicly available protein sequence data. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). New releases are published every two weeks.

...read moreread less

4,074 citations

Journal Article•DOI•

The EMBL Nucleotide Sequence Database

[...]

Tamara Kulikova¹, Philippe Aldebert¹, Nicola Althorpe¹, Wendy Baker¹, Kirsty Bates¹, Paul Browne¹, Alexandra van den Broek¹, Guy Cochrane¹, Karyn Duggan¹, Ruth Y. Eberhardt¹, Nadeem Faruque¹, Maria Garcia-Pastor¹, Nicola Harte¹, Carola Kanz¹, Rasko Leinonen¹, Quan Lin¹, Vincent Lombard¹, Rodrigo Lopez¹, Renato Mancuso¹, Michelle McHale¹, Francesco Nardone¹, Ville Silventoinen¹, Peter Stoehr¹, Guenter Stoesser¹, Mary Ann Tuli¹, Katerina Tzouvara¹, Robert Vaughan¹, Dan Wu¹, Weimin Zhu¹, Rolf Apweiler¹ - Show less +26 more•Institutions (1)

European Bioinformatics Institute¹

01 Jan 2004-Nucleic Acids Research

TL;DR: Changes over the past year include the removal of the sequence length limit, the launch of the EMBLCDSs dataset, extension of the Sequence Version Archive functionality and the revision of quality rules for TPA data.

...read moreread less

Abstract: The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl.html) constitutes Europe's primary nucleotide sequence resource. Main sources for DNA and RNA sequences are direct submissions from individual researchers, genome sequencing projects and patent applications. While automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO), the preferred submission tool for individual submitters is Webin (WWW). Through all stages, dataflow is monitored by EBI biologists communicating with the sequencing groups. In collaboration with DDBJ and GenBank the database is produced, maintained and distributed at the European Bioinformatics Institute (EBI). Database releases are produced quarterly and are distributed on CD-ROM. Network services allow access to the most up-to-date data collection via Internet and World Wide Web interface. EBI's Sequence Retrieval System (SRS) is a Network Browser for Databanks in Molecular Biology, integrating and linking the main nucleotide and protein databases, plus many specialised databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, Blast etc) are available for external users to compare their own sequences against the most currently available data in the EMBL Nucleotide Sequence Database and SWISS-PROT.

...read moreread less

1,187 citations

Journal Article•DOI•

The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology

[...]

Evelyn Camon¹, Michele Magrane¹, Daniel Barrell¹, Vivian Lee¹, Emily Dimmer¹, John Maslen¹, David Binns¹, Nicola Harte¹, Rodrigo Lopez¹, Rolf Apweiler¹ - Show less +6 more•Institutions (1)

European Bioinformatics Institute¹

01 Jan 2004-Nucleic Acids Research

TL;DR: The Gene Ontology Annotation database aims to provide high-quality electronic and manual annotations to the UniProt Knowledgebase (Swiss-Prot, TrEMBL and PIR-PSD) using the standardized vocabulary of theGene Ontology (GO).

...read moreread less

Abstract: The Gene Ontology Annotation (GOA) database (http://www.ebi.ac.uk/GOA) aims to provide high-quality electronic and manual annotations to the UniProt Knowledgebase (Swiss-Prot, TrEMBL and PIR-PSD) using the standardized vocabulary of the Gene Ontology (GO). As a supplementary archive of GO annotation, GOA promotes a high level of integration of the knowledge represented in UniProt with other databases. This is achieved by converting UniProt annotation into a recognized computational format. GOA provides annotated entries for nearly 60,000 species (GOA-SPTr) and is the largest and most comprehensive open-source contributor of annotations to the GO Consortium annotation effort. By integrating GO annotations from other model organism groups, GOA consolidates specialized knowledge and expertise to ensure the data remain a key reference for up-to-date biological information. Furthermore, the GOA database fully endorses the Human Proteomics Initiative by prioritizing the annotation of proteins likely to benefit human health and disease. In addition to a non-redundant set of annotations to the human proteome (GOA-Human) and monthly releases of its GO annotation for all species (GOA-SPTr), a series of GO mapping files and specific cross-references in other databases are also regularly distributed. GOA can be queried through a simple user-friendly web interface or downloaded in a parsable format via the EBI and GO FTP websites. The GOA data set can be used to enhance the annotation of particular model organism or gene expression data sets, although increasingly it has been used to evaluate GO predictions generated from text mining or protein interaction experiments. In 2004, the GOA team will build on its success and will continue to supplement the functional annotation of UniProt and work towards enhancing the ability of scientists to access all available biological information. Researchers wishing to query or contribute to the GOA project are encouraged to email: goa@ebi.ac.uk.

...read moreread less

917 citations

Journal Article•DOI•

InterPro, progress and status in 2005

[...]

17 Dec 2004-Nucleic Acids Research

TL;DR: InterPro release 8.0 contains 11 007 entries, representing 2573 domains, 8166 families, 201 repeats, 26 active sites, 21 binding sites and 20 post-translational modification sites.

...read moreread less

Abstract: InterPro, an integrated documentation resource of protein families, domains and functional sites, was created to integrate the major protein signature databases. Currently, it includes PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF and SUPERFAMILY. Signatures are manually integrated into InterPro entries that are curated to provide biological and functional information. Annotation is provided in an abstract, Gene Ontology mapping and links to specialized databases. New features of InterPro include extended protein match views, taxonomic range information and protein 3D structure data. One of the new match views is the InterPro Domain Architecture view, which shows the domain composition of protein matches. Two new entry types were introduced to better describe InterPro entries: these are active site and binding site. PIRSF and the structure-based SUPERFAMILY are the latest member databases to join InterPro, and CATH and PANTHER are soon to be integrated. InterPro release 8.0 contains 11 007 entries, representing 2573 domains, 8166 families, 201 repeats, 26 active sites, 21 binding sites and 20 post-translational modification sites. InterPro covers over 78% of all proteins in the Swiss-Prot and TrEMBL components of UniProt. The database is available for text- and sequence-based searches via a webserver (http://www.ebi.ac.uk/interpro), and for download by anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro).

...read moreread less

575 citations

Journal Article•DOI•

IPD--the Immuno Polymorphism Database.

[...]

James Robinson¹, Kavita Mistry², Hamish McWilliam², Rodrigo Lopez², Steven G.E. Marsh² - Show less +1 more•Institutions (2)

Royal Free Hospital¹, European Bioinformatics Institute²

17 Dec 2004-Nucleic Acids Research

TL;DR: The Immuno Polymorphism Database (IPD) is a set of specialist databases related to the study of polymorphic genes in the immune system and provides access to the European Searchable Tumour cell-line database.

...read moreread less

Abstract: The Immuno Polymorphism Database (IPD), http://www.ebi.ac.uk/ipd/ is a set of specialist databases related to the study of polymorphic genes in the immune system. The IPD project works with specialist groups or nomenclature committees who provide and curate individual sections before they are submitted to IPD for online publication. The IPD project stores all the data in a set of related databases. IPD currently consists of four databases: IPD-KIR, contains the allelic sequences of killer-cell immunoglobulin-like receptors, IPD-MHC, a database of sequences of the major histocompatibility complex of different species; IPD-HPA, alloantigens expressed only on platelets; and IPD-ESTDAB, which provides access to the European Searchable Tumour Cell-Line Database, a cell bank of immunologically characterized melanoma cell lines. The data is currently available online from the website and FTP directory. This article describes the latest updates and additional tools added to the IPD project.

...read moreread less

464 citations

Journal Article•DOI•

UniProt archive

[...]

Rasko Leinonen¹, Federico Garcia Diez¹, David Binns¹, Wolfgang Fleischmann¹, Rodrigo Lopez¹, Rolf Apweiler¹ - Show less +2 more•Institutions (1)

European Bioinformatics Institute¹

22 Nov 2004-Bioinformatics

TL;DR: UniProt Archive (UniParc) as discussed by the authors is the most comprehensive, non-redundant protein sequence database available, containing only protein sequences and database cross-references; all other information must be retrieved from the source databases.

...read moreread less

Abstract: Summary: UniProt Archive (UniParc) is the most comprehensive, non-redundant protein sequence database available. Its protein sequences are retrieved from predominant, publicly accessible resources. All new and updated protein sequences are collected and loaded daily into UniParc for full coverage. To avoid redundancy, each unique sequence is stored only once with a stable protein identifier, which can be used later in UniParc to identify the same protein in all source databases. When proteins are loaded into the database, database cross-references are created to link them to the origins of the sequences. As a result, performing a sequence search against UniParc is equivalent to performing the same search against all databases cross-referenced by UniParc. UniParc contains only protein sequences and database cross-references; all other information must be retrieved from the source databases. Availability: http://www.ebi.ac.uk/uniparc/

...read moreread less

174 citations

Journal Article•DOI•

Public web-based services from the European Bioinformatics Institute.

[...]

Nicola Harte¹, Ville Silventoinen¹, Emmanuel Quevillon¹, Stephen J. Robinson¹, Kimmo Kallio¹, Xavier Fustero¹, Pravin C. Patel¹, Petteri Jokinen¹, Rodrigo Lopez¹ - Show less +5 more•Institutions (1)

European Bioinformatics Institute¹

01 Jul 2004-Nucleic Acids Research

TL;DR: A detailed introduction to the interactive analysis systems that are available from the EBI are provided and a brief introduction to other, related services are provided.

...read moreread less

Abstract: The mission of the European Bioinformatics Institute (EBI), an outstation of the European Molecular Biology Laboratory (EMBL) in Heidelberg, is to ensure that the growing body of information from molecular biology and genome research is placed in the public domain and is accessible freely to all parts of the scientific community in ways that promote scientific progress. To fulfil this mission, the EBI provides a wide variety of free, publicly available bioinformatics services. These can be divided into data submissions processing; access to query, analysis and retrieval systems and tools; ftp downloads of software and databases; training and education and user support. All of these services are available at the EBI website: http://www.ebi.ac.uk/services. This paper provides a detailed introduction to the interactive analysis systems that are available from the EBI and a brief introduction to other, related services.

...read moreread less

41 citations