Author

Tony Burdett

Other affiliations include City University of New York, Wellcome Trust, and University of Pennsylvania.

Bio: Tony Burdett is an academic researcher at the European Bioinformatics Institute. The author has contributed to research on the topics Ontology (information science) and Medicine. The author has an h-index of 24 and has co-authored 51 publications receiving 9472 citations. Previous affiliations of Tony Burdett include City University of New York and Wellcome Trust.

Papers published on a yearly basis (chart axis: 2000, 2009–2023)

Papers

Journal Article

The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019.

Annalisa Buniello¹, Jacqueline A. L. MacArthur¹, Maria Cerezo¹, Laura W. Harris¹, James D. Hayhurst¹, Cinzia Malangone¹, Aoife McMahon¹, Joannella Morales¹, Edward Mountjoy²,³, Elliot Sollis¹, Daniel Suveges¹, Olga Vrousgou¹, Patricia L. Whetzel¹, M. Ridwan Amode¹, Jose A. Guillen¹, Harpreet Singh Riat¹, Stephen J. Trevanion¹, Peggy Hall⁴, Heather Junkins⁴, Paul Flicek¹, Tony Burdett¹, Lucia A. Hindorff⁴, Fiona Cunningham¹, Helen Parkinson¹

European Bioinformatics Institute¹, Wellcome Trust Sanger Institute², University of Oxford³, National Institutes of Health⁴

08 Jan 2019-Nucleic Acids Research

TL;DR: Data access is improved with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database.

Abstract: The GWAS Catalog delivers a high-quality curated collection of all published genome-wide association studies enabling investigations to identify causal variants, understand disease mechanisms, and establish targets for novel therapies. The scope of the Catalog has also expanded to targeted and exome arrays with 1000 new associations added for these technologies. As of September 2018, the Catalog contains 5687 GWAS comprising 71673 variant-trait associations from 3567 publications. New content includes 284 full P-value summary statistics datasets for genome-wide and new targeted array studies, representing 6 × 10⁹ individual variant-trait statistics. In the last 12 months, the Catalog's user interface was accessed by ∼90000 unique users who viewed >1 million pages. We have improved data access with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database. Summary statistics provision is supported by a new format proposed as a community standard for summary statistics data representation. This format was derived from our experience in standardizing heterogeneous submissions, mapping formats and in harmonizing content. Availability: https://www.ebi.ac.uk/gwas/.

2,878 citations

Journal Article

The NHGRI GWAS Catalog, a curated resource of SNP-trait associations

Danielle Welter¹, Jacqueline A. L. MacArthur¹, Joannella Morales¹, Tony Burdett¹, Peggy Hall¹, Heather Junkins¹, Alan Klemm¹, Paul Flicek¹, Teri A. Manolio¹, Lucia A. Hindorff¹, Helen Parkinson¹

National Institutes of Health¹

01 Jan 2014-Nucleic Acids Research

TL;DR: A number of recent improvements to the NHGRI Catalog of Published Genome-Wide Association Studies are presented, including novel ways for users to interact with the Catalog and changes to the curation infrastructure.

Abstract: The National Human Genome Research Institute (NHGRI) Catalog of Published Genome-Wide Association Studies (GWAS) Catalog provides a publicly available manually curated collection of published GWAS assaying at least 100000 single-nucleotide polymorphisms (SNPs) and all SNP-trait associations with P < 1 × 10⁻⁵. The Catalog includes 1751 curated publications of 11912 SNPs. In addition to the SNP-trait association data, the Catalog also publishes a quarterly diagram of all SNP-trait associations mapped to the SNPs’ chromosomal locations. The Catalog can be accessed via a tabular web interface, via a dynamic visualization on the human karyotype, as a downloadable tab-delimited file and as an OWL knowledge base. This article presents a number of recent improvements to the Catalog, including novel ways for users to interact with the Catalog and changes to the curation infrastructure.

2,755 citations

Journal Article

The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog).

Jacqueline A. L. MacArthur¹, Emily H. Bowler¹, Maria Cerezo¹, Laurent Gil¹, Peggy Hall², Emma Hastings¹, Heather Junkins², Aoife McMahon¹, Annalisa Milano¹, Joannella Morales¹, Zoë May Pendlington¹, Danielle Welter¹, Tony Burdett¹, Lucia A. Hindorff², Paul Flicek¹, Fiona Cunningham¹, Helen Parkinson¹

European Bioinformatics Institute¹, National Institutes of Health²

04 Jan 2017-Nucleic Acids Research

TL;DR: Improvements to the NHGRI-EBI GWAS Catalog improved the data release frequency by increasing automation of curation and providing scaling improvements, allowing the Catalog to adapt to the needs of evolving study design, genotyping technologies and user needs in the future.

Abstract: The NHGRI-EBI GWAS Catalog has provided data from published genome-wide association studies since 2008. In 2015, the database was redesigned and relocated to EMBL-EBI. The new infrastructure includes a new graphical user interface (www.ebi.ac.uk/gwas/), ontology supported search functionality and an improved curation interface. These developments have improved the data release frequency by increasing automation of curation and providing scaling improvements. The range of available Catalog data has also been extended with structured ancestry and recruitment information added for all studies. The infrastructure improvements also support scaling for larger arrays, exome and sequencing studies, allowing the Catalog to adapt to the needs of evolving study design, genotyping technologies and user needs in the future.

1,903 citations

Journal Article

ArrayExpress update—simplifying data submissions

Nikolay Kolesnikov¹, Emma Hastings¹, Maria Keays¹, Olga Melnichuk¹, Y. Amy Tang¹, Eleanor Williams¹, Miroslaw Dylag¹, Natalja Kurbatova¹, Marco Brandizi¹, Tony Burdett¹, Karyn Megy¹, Ekaterina Pilicheva¹, Gabriella Rustici¹, Andrew Tikhonov¹, Helen Parkinson¹, Robert Petryszak¹, Ugis Sarkans¹, Alvis Brazma¹

European Bioinformatics Institute¹

28 Jan 2015-Nucleic Acids Research

TL;DR: The main development over the last two years has been the release of a new data submission tool Annotare, which has reduced the average submission time almost 3-fold and will become the only submission route into ArrayExpress, alongside MAGE-TAB format-based pipelines in the near future.

Abstract: The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is an international functional genomics database at the European Bioinformatics Institute (EMBL-EBI) recommended by most journals as a repository for data supporting peer-reviewed publications. It contains data from over 7000 public sequencing and 42 000 array-based studies comprising over 1.5 million assays in total. The proportion of sequencing-based submissions has grown significantly over the last few years and has doubled in the last 18 months, whilst the rate of microarray submissions is growing slightly. All data in ArrayExpress are available in the MAGE-TAB format, which allows robust linking to data analysis and visualization tools and standardized analysis. The main development over the last two years has been the release of a new data submission tool Annotare, which has reduced the average submission time almost 3-fold. In the near future, Annotare will become the only submission route into ArrayExpress, alongside MAGE-TAB format-based pipelines. ArrayExpress is a stable and highly accessed resource. Our future tasks include automation of data flows and further integration with other EMBL-EBI resources for the representation of multi-omics data.

676 citations

Journal Article

Expression Atlas update—an integrated database of gene and protein expression in humans, animals and plants

Robert Petryszak¹, Maria Keays¹, Y. Amy Tang¹, Nuno A. Fonseca¹, Elisabet Barrera¹, Tony Burdett¹, Anja Füllgrabe¹, Alfonso Muñoz-Pomer Fuentes¹, Simon Jupp¹, Satu Koskinen¹, Oliver Mannion¹, Laura Huerta¹, Karyn Megy¹, Catherine Snow¹, Eleanor Williams¹, Mitra Barzine¹, Emma Hastings¹, Hendrik Weisser², James C. Wright², Pankaj Jaiswal³, Wolfgang Huber¹, Jyoti S. Choudhary², Helen Parkinson¹, Alvis Brazma¹

European Bioinformatics Institute¹, Wellcome Trust Sanger Institute², Oregon State University³

04 Jan 2016-Nucleic Acids Research

TL;DR: The first proteomics study in human tissues is now displayed alongside transcriptomics data in the same tissues, and novel analyses and visualisations include: ‘enrichment’ in each differential comparison of GO terms, Reactome, Plant Reactome pathways and InterPro domains.

Abstract: Expression Atlas (http://www.ebi.ac.uk/gxa) provides information about gene and protein expression in animal and plant samples of different cell types, organism parts, developmental stages, diseases and other conditions. It consists of selected microarray and RNA-sequencing studies from ArrayExpress, which have been manually curated, annotated with ontology terms, checked for high quality and processed using standardised analysis methods. Since the last update, Atlas has grown seven-fold (1572 studies as of August 2015), and incorporates baseline expression profiles of tissues from Human Protein Atlas, GTEx and FANTOM5, and of cancer cell lines from ENCODE, CCLE and Genentech projects. Plant studies constitute a quarter of Atlas data. For genes of interest, the user can view baseline expression in tissues, and differential expression for biologically meaningful pairwise comparisons—estimated using consistent methodology across all of Atlas. Our first proteomics study in human tissues is now displayed alongside transcriptomics data in the same tissues. Novel analyses and visualisations include: ‘enrichment’ in each differential comparison of GO terms, Reactome, Plant Reactome pathways and InterPro domains; hierarchical clustering (by baseline expression) of most variable genes and experimental conditions; and, for a given gene-condition, distribution of baseline expression across biological replicates.

509 citations

Cited by

Journal Article

limma powers differential expression analyses for RNA-sequencing and microarray studies

Matthew E. Ritchie¹, Belinda Phipson², Di Wu³, Yifang Hu¹, Charity W. Law⁴, Wei Shi¹, Gordon K. Smyth¹,⁵

Walter and Eliza Hall Institute of Medical Research¹, Royal Children's Hospital², Harvard University³, University of Zurich⁴, University of Melbourne⁵

20 Apr 2015-Nucleic Acids Research

TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

Abstract: limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

22,147 citations

Journal Article

NCBI GEO: archive for functional genomics data sets—update

Tanya Barrett¹, Stephen E. Wilhite¹, Pierre Ledoux¹, Carlos Evangelista¹, Irene F. Kim¹, Maxim Tomashevsky¹, Kimberly A. Marshall¹, Katherine Phillippy¹, Patti M. Sherman¹, Michelle Holko¹, Andrey Yefanov¹, Hye Seung Lee¹, Naigong Zhang¹, Cynthia L. Robertson¹, Nadezhda Serova¹, Sean Davis¹, Alexandra Soboleva¹

National Institutes of Health¹

27 Nov 2012-Nucleic Acids Research

TL;DR: The Gene Expression Omnibus is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community and supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable.

Abstract: The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data.

6,683 citations

Journal Article

The Molecular Signatures Database Hallmark Gene Set Collection

Arthur Liberzon¹, Chet Birger¹, Helga Thorvaldsdottir¹, Mahmoud Ghandi¹, Jill P. Mesirov², Pablo Tamayo²

Broad Institute¹, University of California, San Diego²

23 Dec 2015-Cell Systems

TL;DR: A combination of automated approaches and expert curation is used to develop a collection of "hallmark" gene sets, derived from multiple "founder" sets, that conveys a specific biological state or process and displays coherent expression in MSigDB.

Abstract: The Molecular Signatures Database (MSigDB) is one of the most widely used and comprehensive databases of gene sets for performing gene set enrichment analysis. Since its creation, MSigDB has grown beyond its roots in metabolic disease and cancer to include >10,000 gene sets. These better represent a wider range of biological processes and diseases, but the utility of the database is reduced by increased redundancy across, and heterogeneity within, gene sets. To address this challenge, here we use a combination of automated approaches and expert curation to develop a collection of “hallmark” gene sets as part of MSigDB. Each hallmark in this collection consists of a “refined” gene set, derived from multiple “founder” sets, that conveys a specific biological state or process and displays coherent expression. The hallmarks effectively summarize most of the relevant information of the original founder sets and, by reducing both variation and redundancy, provide more refined and concise inputs for gene set enrichment analysis.

6,062 citations

Journal Article

GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses.

Zefang Tang¹, Chenwei Li¹, Boxi Kang¹, Ge Gao¹, Cheng Li¹, Zemin Zhang

Peking University¹

03 Jul 2017-Nucleic Acids Research

TL;DR: GEPIA (Gene Expression Profiling Interactive Analysis) fills in the gap between cancer genomics big data and the delivery of integrated information to end users, thus helping unleash the value of the current data resources.

Abstract: Tremendous amount of RNA sequencing data have been produced by large consortium projects such as TCGA and GTEx, creating new opportunities for data mining and deeper understanding of gene functions. While certain existing web servers are valuable and widely used, many expression analysis functions needed by experimental biologists are still not adequately addressed by these tools. We introduce GEPIA (Gene Expression Profiling Interactive Analysis), a web-based tool to deliver fast and customizable functionalities based on TCGA and GTEx data. GEPIA provides key interactive and customizable functions including differential expression analysis, profiling plotting, correlation analysis, patient survival analysis, similar gene detection and dimensionality reduction analysis. The comprehensive expression analyses with simple clicking through GEPIA greatly facilitate data mining in wide research areas, scientific discussion and the therapeutic discovery process. GEPIA fills in the gap between cancer genomics big data and the delivery of integrated information to end users, thus helping unleash the value of the current data resources. GEPIA is available at http://gepia.cancer-pku.cn/.

5,980 citations

Journal Article

The PRIDE database and related tools and resources in 2019: improving support for quantification data.

Yasset Perez-Riverol¹, Attila Csordas¹, Jingwen Bai¹, Manuel Bernal-Llinares¹, Suresh Hewapathirana¹, Deepti J. Kundu¹, Avinash Inuganti¹, Johannes Griss¹,², Gerhard Mayer³, Martin Eisenacher³, Enrique Perez¹, Julian Uszkoreit³, Julianus Pfeuffer⁴, Timo Sachsenberg⁴, Şule Yılmaz⁵, Shivani Tiwary⁵, Juergen Cox⁵, Enrique Audain, Mathias Walzer¹, Andrew F. Jarnuczak¹, Tobias Ternent¹, Alvis Brazma¹, Juan Antonio Vizcaíno¹

European Bioinformatics Institute¹, Medical University of Vienna², Ruhr University Bochum³, University of Tübingen⁴, Max Planck Society⁵

08 Jan 2019-Nucleic Acids Research

TL;DR: Key statistics on the current data contents and volume of downloads are outlined, along with how PRIDE data are starting to be disseminated to added-value resources including Ensembl, UniProt and Expression Atlas.

Abstract: The PRoteomics IDEntifications (PRIDE) database (https://www.ebi.ac.uk/pride/) is the world’s largest data repository of mass spectrometry-based proteomics data, and is one of the founding members of the global ProteomeXchange (PX) consortium. In this manuscript, we summarize the developments in PRIDE resources and related tools since the previous update manuscript was published in Nucleic Acids Research in 2016. In the last 3 years, public data sharing through PRIDE (as part of PX) has definitely become the norm in the field. In parallel, data re-use of public proteomics data has increased enormously, with multiple applications. We first describe the new architecture of PRIDE Archive, the archival component of PRIDE. PRIDE Archive and the related data submission framework have been further developed to support the increase in submitted data volumes and additional data types. A new scalable and fault tolerant storage backend, Application Programming Interface and web interface have been implemented, as a part of an ongoing process. Additionally, we emphasize the improved support for quantitative proteomics data through the mzTab format. At last, we outline key statistics on the current data contents and volume of downloads, and how PRIDE data are starting to be disseminated to added-value resources including Ensembl, UniProt and Expression Atlas.

5,735 citations
