Home
/
Authors
/
Jody Clements

Author

Jody Clements

Other affiliations: Institute for Systems Biology, Wellcome Trust Sanger Institute, University of Houston ...read more

Bio: Jody Clements is an academic researcher from Howard Hughes Medical Institute. The author has contributed to research in topics: Connectome & Mutation. The author has an hindex of 25, co-authored 35 publications receiving 34001 citations. Previous affiliations of Jody Clements include Institute for Systems Biology & Wellcome Trust Sanger Institute.

Topics: Connectome, Mutation, Germline mutation, Biology, Cancer ...read more

Papers

PDF

Open Access

More filters

Journal Article•DOI•

The Pfam protein families database

[...]

Marco Punta¹, Penny Coggill¹, Ruth Y. Eberhardt¹, Jaina Mistry¹, John Tate¹, Chris Boursnell¹, Ningze Pang¹, Kristoffer Forslund¹, Goran Ceric¹, Jody Clements¹, Andreas Heger¹, Liisa Holm¹, Erik L. L. Sonnhammer¹, Sean R. Eddy¹, Alex Bateman¹, Robert D. Finn¹ - Show less +12 more•Institutions (1)

Wellcome Trust Sanger Institute¹

01 Jan 2000-Nucleic Acids Research

TL;DR: The definition and use of family-specific, manually curated gathering thresholds are explained and some of the features of domains of unknown function (also known as DUFs) are discussed, which constitute a rapidly growing class of families within Pfam.

...read moreread less

Abstract: Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have implemented in the latest release (version 24.0). The most important change is that we now use HMMER3, the latest version of the popular profile hidden Markov model package. This software is approximately 100 times faster than HMMER2 and is more sensitive due to the routine use of the forward algorithm. The move to HMMER3 has necessitated numerous changes to Pfam that are described in detail. Pfam release 24.0 contains 11,912 families, of which a large number have been significantly updated during the past two years. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/).

...read moreread less

14,075 citations

Journal Article•DOI•

Pfam: the protein families database.

[...]

Robert D. Finn¹, Alex Bateman², Jody Clements¹, Penelope Coggill², Ruth Y. Eberhardt², Sean R. Eddy¹, Andreas Heger, Kirstie Hetherington³, Liisa Holm, Jaina Mistry², Erik L. L. Sonnhammer⁴, John Tate², Marco Punta² - Show less +9 more•Institutions (4)

Howard Hughes Medical Institute¹, European Bioinformatics Institute², Wellcome Trust Sanger Institute³, Stockholm University⁴

01 Jan 2014-Nucleic Acids Research

TL;DR: Pfam as discussed by the authors is a widely used database of protein families, containing 14 831 manually curated entries in the current version, version 27.0, and has been updated several times since 2012.

...read moreread less

Abstract: Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0. Since the last update article 2 years ago, we have generated 1182 new families and maintained sequence coverage of the UniProt Knowledgebase (UniProtKB) at nearly 80%, despite a 50% increase in the size of the underlying sequence database. Since our 2012 article describing Pfam, we have also undertaken a comprehensive review of the features that are provided by Pfam over and above the basic family data. For each feature, we determined the relevance, computational burden, usage statistics and the functionality of the feature in a website context. As a consequence of this review, we have removed some features, enhanced others and developed new ones to meet the changing demands of computational biology. Here, we describe the changes to Pfam content. Notably, we now provide family alignments based on four different representative proteome sequence data sets and a new interactive DNA search interface. We also discuss the mapping between Pfam and known 3D structures.

...read moreread less

9,415 citations

Journal Article•DOI•

HMMER web server: interactive sequence similarity searching

[...]

Robert D. Finn¹, Jody Clements¹, Sean R. Eddy¹•Institutions (1)

Janelia Farm Research Campus¹

01 Jul 2011-Nucleic Acids Research

TL;DR: This work has focused on minimizing search times and the ability to rapidly display tabular results, regardless of the number of matches found, developing graphical summaries of the search results to provide quick, intuitive appraisement of them.

...read moreread less

Abstract: HMMER is a software suite for protein sequence similarity searches using probabilistic methods Previously, HMMER has mainly been available only as a computationally intensive UNIX command-line tool, restricting its use Recent advances in the software, HMMER3, have resulted in a 100-fold speed gain relative to previous versions It is now feasible to make efficient profile hidden Markov model (profile HMM) searches via the web A HMMER web server (http://hmmerjaneliaorg) has been designed and implemented such that most protein database searches return within a few seconds Methods are available for searching either a single protein sequence, multiple protein sequence alignment or profile HMM against a target sequence database, and for searching a protein sequence against Pfam The web server is designed to cater to a range of different user expertise and accepts batch uploading of multiple queries at once All search methods are also available as RESTful web services, thereby allowing them to be readily integrated as remotely executed tasks in locally scripted workflows We have focused on minimizing search times and the ability to rapidly display tabular results, regardless of the number of matches found, developing graphical summaries of the search results to provide quick, intuitive appraisement of them

...read moreread less

4,159 citations

Journal Article•DOI•

Patterns of somatic mutation in human cancer genomes

[...]

Christopher Greenman¹, Philip J. Stephens¹, Raffaella Smith¹, Gillian L. Dalgliesh¹, Christopher I. Hunter¹, Graham R. Bignell¹, Helen Davies¹, Jon W. Teague¹, Adam Butler¹, Claire Stevens¹, Sarah Edkins¹, Sarah O’Meara¹, Imre Vastrik², Esther Schmidt², Tim Avis¹, Syd Barthorpe¹, Gurpreet Bhamra¹, Gemma Buck¹, Bhudipa Choudhury¹, Jody Clements¹, Jennifer Cole¹, Ed Dicks¹, Simon A. Forbes¹, Kris Gray¹, Kelly Halliday¹, Rachel Harrison¹, Katy Hills¹, Jon Hinton¹, Andy Jenkinson¹, David T. Jones¹, Andy Menzies¹, Tatiana Mironenko¹, Janet Perry¹, Keiran Raine¹, Dave Richardson¹, Rebecca Shepherd¹, Alexandra Small¹, Calli Tofts¹, Jennifer Varian¹, Tony Webb¹, Sofie West¹, Sara Widaa¹, Andrew D. Yates¹, Daniel P. Cahill³, David N. Louis³, Peter Goldstraw, Andrew G. Nicholson, Francis Brasseur⁴, Leendert H. J. Looijenga⁵, Barbara L. Weber⁶, Yoke Eng Chiew⁷, Anna deFazio⁷, Mel Greaves⁸, Anthony R. Green⁹, Peter J. Campbell¹, Ewan Birney², Douglas F. Easton⁹, Georgia Chenevix-Trench¹⁰, Min-Han Tan¹¹, Sok Kean Khoo¹¹, Bin Tean Teh¹¹, Siu Tsan Yuen¹², Suet Yi Leung¹², Richard Wooster¹, P. Andrew Futreal¹, Michael R. Stratton¹, Michael R. Stratton⁸ - Show less +63 more•Institutions (12)

Wellcome Trust Sanger Institute¹, European Bioinformatics Institute², Harvard University³, Ludwig Institute for Cancer Research⁴, Erasmus University Rotterdam⁵, University of Pennsylvania⁶, University of Sydney⁷, Institute of Cancer Research⁸, University of Cambridge⁹, QIMR Berghofer Medical Research Institute¹⁰, Van Andel Institute¹¹, University of Hong Kong¹²

08 Mar 2007-Nature

TL;DR: More than 1,000 somatic mutations found in 274 megabases of DNA corresponding to the coding exons of 518 protein kinase genes in 210 diverse human cancers reveal the evolutionary diversity of cancers and implicates a larger repertoire of cancer genes than previously anticipated.

...read moreread less

Abstract: Cancers arise owing to mutations in a subset of genes that confer growth advantage. The availability of the human genome sequence led us to propose that systematic resequencing of cancer genomes for mutations would lead to the discovery of many additional cancer genes. Here we report more than 1,000 somatic mutations found in 274 megabases (Mb) of DNA corresponding to the coding exons of 518 protein kinase genes in 210 diverse human cancers. There was substantial variation in the number and pattern of mutations in individual cancers reflecting different exposures, DNA repair defects and cellular origins. Most somatic mutations are likely to be 'passengers' that do not contribute to oncogenesis. However, there was evidence for 'driver' mutations contributing to the development of the cancers studied in approximately 120 genes. Systematic sequencing of cancer genomes therefore reveals the evolutionary diversity of cancers and implicates a larger repertoire of cancer genes than previously anticipated.

...read moreread less

2,732 citations

Journal Article•DOI•

The COSMIC (Catalogue of Somatic Mutations in Cancer) database and website

[...]

Sally Bamford¹, Elisabeth Dawson², Simon A. Forbes², Jody Clements², Roger Pettett², Ahmet Dogan³, Adrienne M. Flanagan³, Jon W. Teague², P A Futreal, Michael R. Stratton², Richard Wooster² - Show less +7 more•Institutions (3)

Wellcome Trust Sanger Institute¹, Wellcome Trust², University College London³

13 Jul 2004-British Journal of Cancer

TL;DR: The COSMIC (Catalogue of Somatic Mutations in Cancer) database and website have been developed to store somatic mutation data in a single location and display the data and other information related to human cancer.

...read moreread less

Abstract: The discovery of mutations in cancer genes has advanced our understanding of cancer. These results are dispersed across the scientific literature and with the availability of the human genome sequence will continue to accrue. The COSMIC (Catalogue of Somatic Mutations in Cancer) database and website have been developed to store somatic mutation data in a single location and display the data and other information related to human cancer. To populate this resource, data has currently been extracted from reports in the scientific literature for somatic mutations in four genes, BRAF, HRAS, KRAS2 and NRAS. At present, the database holds information on 66 634 samples and reports a total of 10 647 mutations. Through the web pages, these data can be queried, displayed as figures or tables and exported in a number of formats. COSMIC is an ongoing project that will continue to curate somatic mutation data and release it through the website.

...read moreread less

1,163 citations

1
2
3
4
…
5
6
7
8
9

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

Gene Ontology: tool for the unification of biology

[...]

M Ashburner¹, Catherine A. Ball, Judith A. Blake, David Botstein, Heather Butler, J. M. Cherry, Allan Peter Davis, Kara Dolinski, Selina S. Dwight, J.T. Eppig, Midori A. Harris, David P. Hill, Laurie Issel-Tarver, Andrew Kasarskis, Suzanna E. Lewis, John C. Matese, Joel E. Richardson, M. Ringwald, Gerald M. Rubin, Gavin Sherlock - Show less +16 more•Institutions (1)

Stanford University¹

01 May 2000-Nature Genetics

TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.

...read moreread less

Abstract: Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

...read moreread less

35,225 citations

Journal Article•DOI•

MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability

[...]

Kazutaka Katoh¹, Daron M. Standley¹•Institutions (1)

Osaka University¹

01 Apr 2013-Molecular Biology and Evolution

TL;DR: This version of MAFFT has several new features, including options for adding unaligned sequences into an existing alignment, adjustment of direction in nucleotide alignment, constrained alignment and parallel processing, which were implemented after the previous major update.

...read moreread less

Abstract: We report a major update of the MAFFT multiple sequence alignment program. This version has several new features, including options for adding unaligned sequences into an existing alignment, adjustment of direction in nucleotide alignment, constrained alignment and parallel processing, which were implemented after the previous major update. This report shows actual examples to explain how these features work, alone and in combination. Some examples incorrectly aligned by MAFFT are also shown to clarify its limitations. We discuss how to avoid misalignments, and our ongoing efforts to overcome such limitations.

...read moreread less

27,771 citations

Journal Article•DOI•

Initial sequencing and analysis of the human genome.

[...]

Eric S. Lander¹, Lauren Linton¹, Bruce W. Birren¹, Chad Nusbaum¹ +245 more•Institutions (29)

15 Feb 2001-Nature

TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.

...read moreread less

Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

...read moreread less

22,269 citations

Journal Article•DOI•

Search and clustering orders of magnitude faster than BLAST

[...]

Robert C. Edgar

01 Oct 2010-Bioinformatics

TL;DR: UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters and offers several advantages over the widely used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets.

...read moreread less

Abstract: Motivation: Biological sequence data is accumulating rapidly, motivating the development of improved high-throughput methods for sequence classification. Results: UBLAST and USEARCH are new algorithms enabling sensitive local and global search of large sequence databases at exceptionally high speeds. They are often orders of magnitude faster than BLAST in practical applications, though sensitivity to distant protein relationships is lower. UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters. UCLUST offers several advantages over the widely used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets. Availability: Binaries are available at no charge for non-commercial use at http://www.drive5.com/usearch Contact: [email protected] Supplementary information:Supplementary data are available at Bioinformatics online.

...read moreread less

17,301 citations

Journal Article•DOI•

The Pfam protein families database

[...]

Wellcome Trust Sanger Institute¹

01 Jan 2000-Nucleic Acids Research

...read moreread less

14,075 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse