Author

Renata C. Geer

Bio: Renata C. Geer is an academic researcher at the National Institutes of Health. The author has contributed to research on the topics Entrez and Conserved Domain Database. The author has an h-index of 13 and has co-authored 17 publications receiving 10,410 citations.

Papers

Journal Article•DOI•

CDD: a Conserved Domain Database for the functional annotation of proteins

Aron Marchler-Bauer¹, Shennan Lu¹, John B. Anderson¹, Farideh Chitsaz¹, Myra K. Derbyshire¹, Carol DeWeese-Scott¹, Jessica H. Fong¹, Lewis Y. Geer¹, Renata C. Geer¹, Noreen R. Gonzales¹, Marc Gwadz¹, David I. Hurwitz¹, John D. Jackson¹, Zhaoxi Ke¹, Christopher J. Lanczycki¹, Fu-Ping Lu¹, Gabriele H. Marchler¹, Mikhail Mullokandov¹, Marina V. Omelchenko¹, Cynthia L. Robertson¹, James S. Song¹, Narmada Thanki¹, Roxanne A. Yamashita¹, Dachuan Zhang¹, Naigong Zhang¹, Chanjuan Zheng¹, Stephen H. Bryant¹

National Institutes of Health¹

01 Jan 2011-Nucleic Acids Research

TL;DR: NCBI’s Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints.

Abstract: NCBI's Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints. CDD includes manually curated domain models that make use of protein 3D structure to refine domain models and provide insights into sequence/structure/function relationships. Manually curated models are organized hierarchically if they describe domain families that are clearly related by common descent. As CDD also imports domain family models from a variety of external sources, it is a partially redundant collection. To simplify protein annotation, redundant models and models describing homologous families are clustered into superfamilies. By default, domain footprints are annotated with the corresponding superfamily designation, on top of which specific annotation may indicate high-confidence assignment of family membership. Pre-computed domain annotation is available for proteins in the Entrez/Protein dataset, and a novel interface, Batch CD-Search, allows the computation and download of annotation for large sets of protein queries. CDD can be accessed via http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.

2,934 citations

Journal Article•DOI•

CDD: NCBI's conserved domain database

Aron Marchler-Bauer¹, Myra K. Derbyshire¹, Noreen R. Gonzales¹, Shennan Lu¹, Farideh Chitsaz¹, Lewis Y. Geer¹, Renata C. Geer¹, Jane He¹, Marc Gwadz¹, David I. Hurwitz¹, Christopher J. Lanczycki¹, Fu Lu¹, Gabriele H. Marchler¹, James S. Song¹, Narmada Thanki¹, Zhouxi Wang¹, Roxanne A. Yamashita¹, Dachuan Zhang¹, Chanjuan Zheng¹, Stephen H. Bryant¹

National Institutes of Health¹

28 Jan 2015-Nucleic Acids Research

TL;DR: NCBI's CDD, the Conserved Domain Database, enters its 15th year as a public resource for the annotation of proteins with the location of conserved domain footprints and aims at increasing coverage and providing finer-grained classifications of common protein domains.

Abstract: NCBI's CDD, the Conserved Domain Database, enters its 15th year as a public resource for the annotation of proteins with the location of conserved domain footprints. Going forward, we strive to improve the coverage and consistency of domain annotation provided by CDD. We maintain a live search system as well as an archive of pre-computed domain annotation for sequences tracked in NCBI's Entrez protein database, which can be retrieved for single sequences or in bulk. We also maintain import procedures so that CDD contains domain models and domain definitions provided by several collections available in the public domain, as well as those produced by an in-house curation effort. The curation effort aims at increasing coverage and providing finer-grained classifications of common protein domains, for which a wealth of functional and structural data has become available. CDD curation generates alignment models of representative sequence fragments, which are in agreement with domain boundaries as observed in protein 3D structure, and which model the structurally conserved cores of domain families as well as annotate conserved features. CDD can be accessed at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.

2,821 citations

Journal Article•DOI•

CDD/SPARCLE: functional classification of proteins via subfamily domain architectures.

Aron Marchler-Bauer¹, Yu Bo¹, Lianyi Han¹, Jane He¹, Christopher J. Lanczycki¹, Shennan Lu¹, Farideh Chitsaz¹, Myra K. Derbyshire¹, Renata C. Geer¹, Noreen R. Gonzales¹, Marc Gwadz¹, David I. Hurwitz¹, Fu Lu¹, Gabriele H. Marchler¹, James S. Song¹, Narmada Thanki¹, Zhouxi Wang¹, Roxanne A. Yamashita¹, Dachuan Zhang¹, Chanjuan Zheng¹, Lewis Y. Geer¹, Stephen H. Bryant¹

National Institutes of Health¹

04 Jan 2017-Nucleic Acids Research

TL;DR: NCBI's Conserved Domain Database (CDD) aims at annotating biomolecular sequences with the location of evolutionarily conserved protein domain footprints, and functional sites inferred from such footprints.

Abstract: NCBI's Conserved Domain Database (CDD) aims at annotating biomolecular sequences with the location of evolutionarily conserved protein domain footprints, and functional sites inferred from such footprints. An archive of pre-computed domain annotation is maintained for proteins tracked by NCBI's Entrez database, and live search services are offered as well. CDD curation staff supplements a comprehensive collection of protein domain and protein family models, which have been imported from external providers, with representations of selected domain families that are curated in-house and organized into hierarchical classifications of functionally distinct families and sub-families. CDD also supports comparative analyses of protein families via conserved domain architectures, and a recent curation effort focuses on providing functional characterizations of distinct subfamily architectures using SPARCLE: Subfamily Protein Architecture Labeling Engine. CDD can be accessed at https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.

2,052 citations

Journal Article•DOI•

CDD/SPARCLE: the conserved domain database in 2020

Shennan Lu¹, Jiyao Wang¹, Farideh Chitsaz¹, Myra K. Derbyshire¹, Renata C. Geer¹, Noreen R. Gonzales¹, Marc Gwadz¹, David I. Hurwitz¹, Gabriele H. Marchler¹, James S. Song¹, Narmada Thanki¹, Roxanne A. Yamashita¹, Mingzhang Yang¹, Dachuan Zhang¹, Chanjuan Zheng¹, Christopher J. Lanczycki¹, Aron Marchler-Bauer¹

National Institutes of Health¹

08 Jan 2020-Nucleic Acids Research

TL;DR: As NLM's Conserved Domain Database (CDD) enters its 20th year of operations as a publicly available resource, curation staff continues to develop hierarchical classifications of widely distributed protein domain families, and to record conserved sites associated with molecular function, so that they can be mapped onto user queries in support of hypothesis-driven biomolecular research.

Abstract: As NLM's Conserved Domain Database (CDD) enters its 20th year of operations as a publicly available resource, CDD curation staff continues to develop hierarchical classifications of widely distributed protein domain families, and to record conserved sites associated with molecular function, so that they can be mapped onto user queries in support of hypothesis-driven biomolecular research. CDD offers both an archive of pre-computed domain annotations as well as live search services for both single protein or nucleotide queries and larger sets of protein query sequences. CDD staff has continued to characterize protein families via conserved domain architectures and has built up a significant corpus of curated domain architectures in support of naming bacterial proteins in RefSeq. These architecture definitions are available via SPARCLE, the Subfamily Protein Architecture Labeling Engine. CDD can be accessed at https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.

1,515 citations

Journal Article•DOI•

CDD: specific functional annotation with the Conserved Domain Database.

Aron Marchler-Bauer¹, John B. Anderson¹, Farideh Chitsaz¹, Myra K. Derbyshire¹, Carol DeWeese-Scott¹, Jessica H. Fong¹, Lewis Y. Geer¹, Renata C. Geer¹, Noreen R. Gonzales¹, Marc Gwadz¹, Siqian He¹, David I. Hurwitz¹, John D. Jackson¹, Zhaoxi Ke¹, Christopher J. Lanczycki¹, Cynthia A. Liebert¹, Chunlei Liu¹, Fu-er Lu¹, Shennan Lu¹, Gabriele H. Marchler¹, Mikhail Mullokandov¹, James S. Song¹, Asba Tasneem¹, Narmada Thanki¹, Roxanne A. Yamashita¹, Dachuan Zhang¹, Naigong Zhang¹, Stephen H. Bryant¹

National Institutes of Health¹

01 Jan 2009-Nucleic Acids Research

TL;DR: NCBI's Conserved Domain Database is a collection of multiple sequence alignments and derived database search models, which represent protein domains conserved in molecular evolution, and provides annotation of domain footprints and conserved functional sites on protein sequences.

Abstract: NCBI's Conserved Domain Database (CDD) is a collection of multiple sequence alignments and derived database search models, which represent protein domains conserved in molecular evolution. The collection can be accessed at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml, and is also part of NCBI's Entrez query and retrieval system, cross-linked to numerous other resources. CDD provides annotation of domain footprints and conserved functional sites on protein sequences. Precalculated domain annotation can be retrieved for protein sequences tracked in NCBI's Entrez system, and CDD's collection of models can be queried with novel protein sequences via the CD-Search service at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. Starting with the latest version of CDD, v2.14, information from redundant and homologous domain models is summarized at a superfamily level, and domain annotation on proteins is flagged as either ‘specific’ (identifying molecular function with high confidence) or as ‘non-specific’ (identifying superfamily membership only).

1,115 citations

Cited by

Journal Article•DOI•

Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding.

Roujian Lu¹, Xiang Zhao¹, Juan Li², Peihua Niu¹, Bo Yang³, Honglong Wu, Wenling Wang¹, Hao Song⁴, Baoying Huang¹, Na Zhu¹, Yuhai Bi⁴, Xuejun Ma¹, Faxian Zhan³, Liang Wang⁴, Tao Hu², Hong Zhou², Zhenhong Hu, Weimin Zhou¹, Li Zhao¹, Jing Chen⁵, Yao Meng¹, Ji Wang¹, Yang Lin, Jianying Yuan, Zhihao Xie, Jinmin Ma, William J. Liu¹, Dayan Wang¹, Wenbo Xu¹, Edward C. Holmes⁶, George F. Gao¹, George F. Gao⁴, Guizhen Wu¹, Weijun Chen, Weifeng Shi², Wenjie Tan¹, Wenjie Tan⁴

Chinese Center for Disease Control and Prevention¹, Peking Union Medical College², Centers for Disease Control and Prevention³, Chinese Academy of Sciences⁴, Wenzhou Medical College⁵, University of Sydney⁶

22 Feb 2020-The Lancet

TL;DR: The phylogenetic analysis suggests that bats might be the original host of this virus, and that an animal sold at the seafood market in Wuhan might represent an intermediate host facilitating the emergence of the virus in humans.

9,474 citations

Journal Article•DOI•

The Phyre2 web portal for protein modeling, prediction and analysis

Lawrence A. Kelley¹, Stefans Mezulis¹, Christopher M. Yates¹, Christopher M. Yates², Mark N. Wass¹, Mark N. Wass³, Michael J.E. Sternberg¹

Imperial College London¹, University College London², University of Kent³

07 May 2015-Nature Protocols

TL;DR: An updated protocol for Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites and analyze the effect of amino acid variants for a user's protein sequence.

Abstract: Phyre2 is a web-based tool for predicting and analyzing protein structure and function. Phyre2 uses advanced remote homology detection methods to build 3D models, predict ligand binding sites, and analyze amino acid variants in a protein sequence. Phyre2 is a suite of tools available on the web to predict and analyze protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a paper in Nature Protocols. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites and analyze the effect of amino acid variants (e.g., nonsynonymous SNPs (nsSNPs)) for a user's protein sequence. Users are guided through results by a simple interface at a level of detail they determine. This protocol will guide users from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional available tools is described to find a protein structure in a genome, to submit large number of sequences at once and to automatically run weekly searches for proteins that are difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2 . A typical structure prediction will be returned between 30 min and 2 h after submission.

7,941 citations

Journal Article•DOI•

NCBI GEO: archive for functional genomics data sets—update

Tanya Barrett¹, Stephen E. Wilhite¹, Pierre Ledoux¹, Carlos Evangelista¹, Irene F. Kim¹, Maxim Tomashevsky¹, Kimberly A. Marshall¹, Katherine Phillippy¹, Patti M. Sherman¹, Michelle Holko¹, Andrey Yefanov¹, Hye Seung Lee¹, Naigong Zhang¹, Cynthia L. Robertson¹, Nadezhda Serova¹, Sean Davis¹, Alexandra Soboleva¹

National Institutes of Health¹

27 Nov 2012-Nucleic Acids Research

TL;DR: The Gene Expression Omnibus is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community and supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable.

Abstract: The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data.

6,683 citations

Journal Article•DOI•

TBtools: An Integrative Toolkit Developed for Interactive Analyses of Big Biological Data.

Chengjie Chen¹, Hao Chen², Yi Zhang, Hannah R. Thomas³, Margaret H. Frank³, Yehua He¹, Rui Xia

South China Agricultural University¹, Hunan Agricultural University², Cornell University³

03 Aug 2020-Molecular Plant

TL;DR: The toolkit incorporates over 130 functions, which are designed to meet the increasing demand for big-data analyses, ranging from bulk sequence processing to interactive data visualization, and a new plotting engine developed to maximize their interactive ability.

5,173 citations

Journal Article•DOI•

The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)

Ross Overbeek¹, Robert Olson¹, Gordon D. Pusch¹, Gary J. Olsen¹, James J. Davis¹, Terry Disz¹, Robert Edwards², Svetlana Gerdes¹, Bruce Parrello¹, Maulik Shukla³, Veronika Vonstein¹, Alice R. Wattam³, Fangfang Xia¹, Rick Stevens¹

University of Illinois at Urbana–Champaign¹, San Diego State University², Virginia Tech³

01 Jan 2014-Nucleic Acids Research

TL;DR: The interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources are described.

Abstract: In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources.

3,415 citations
