CDD/SPARCLE: the conserved domain database in 2020

doi:10.1093/NAR/GKZ991

Shennan Lu, +16 more

- 08 Jan 2020 -

Nucleic Acids Research

- Vol. 48

Abstract:

As NLM's Conserved Domain Database (CDD) enters its 20th year of operations as a publicly available resource, CDD curation staff continues to develop hierarchical classifications of widely distributed protein domain families, and to record conserved sites associated with molecular function, so that they can be mapped onto user queries in support of hypothesis-driven biomolecular research. CDD offers an archive of pre-computed domain annotations as well as live search services for both single protein or nucleotide queries and larger sets of protein query sequences. CDD staff has continued to characterize protein families via conserved domain architectures and has built up a significant corpus of curated domain architectures in support of naming bacterial proteins in RefSeq. These architecture definitions are available via SPARCLE, the Subfamily Protein Architecture Labeling Engine. CDD can be accessed at https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.

Citations

The InterPro protein families and domains database: 20 years on.

Matthias Blum, +32 more

- 08 Jan 2021 -

Nucleic Acids Research

TL;DR: The status of InterPro (version 81.0) in its 20th year of operation, and its associated software, is reported, including updates to database content, the release of a new website and REST API, and performance improvements in InterProScan.

Protein sequence analysis using the MPI Bioinformatics Toolkit

Felix Gabler, +8 more

- 01 Dec 2020 -

Current protocols in human genetics

TL;DR: Detailed information is provided on utilizing the three most widely accessed tools within the MPI Bioinformatics Toolkit: HHpred for the detection of homologs, HHpred in conjunction with MODELLER for structure prediction and homology modeling, and CLANS for the visualization of relationships in large sequence datasets.

COG database update: focus on microbial diversity, model organisms, and widespread pathogens.

Michael Y. Galperin, +5 more

- 08 Jan 2021 -

Nucleic Acids Research

TL;DR: The Clusters of Orthologous Genes (COG) database, created in 1997, has gone through several rounds of updates, most recently in 2014; the current update substantially expands the scope of the database to include complete genomes of 1187 bacteria and 122 archaea, typically with a single genome per genus.

References

The Pfam protein families database in 2019.

Sara El-Gebali, +16 more

- 08 Jan 2019 -

Nucleic Acids Research

TL;DR: A significant comparison to the structural classification database was carried out, leading to the creation of 825 new families based on that database's set of uncharacterized families (EUFs), and Pfam entries were connected to the Sequence Ontology (SO) through mapping of the Pfam type definitions to SO terms.

CD-Search: protein domain annotations on the fly

Aron Marchler-Bauer, +1 more

- 01 Jul 2004 -

Nucleic Acids Research

TL;DR: The Conserved Domain Search service (CD-Search), a web-based tool for the detection of structural and functional domains in protein sequences, uses BLAST(R) heuristics to provide a fast, interactive service, and searches a comprehensive collection of domain models.

The COG database: new developments in phylogenetic classification of proteins from complete genomes

Roman L. Tatusov, +9 more

- 01 Jan 2001 -

Nucleic Acids Research

TL;DR: The new features added to the COG database include information pages with structural and functional details on each COG and literature references, improvements of the COGNITOR program that is used to fit new proteins into the COGs, and classification of genomes and COGs constructed by using principal component analysis.

20 years of the SMART protein domain annotation resource.

Ivica Letunic, +1 more

- 01 Jan 2018 -

Nucleic Acids Research

TL;DR: In its 20th year, the SMART analysis results pages have been streamlined again and its information sources have been updated, and the internal full text search engine has been redesigned and updated, resulting in greatly increased search speed.
