Author

Michael A. Gillette

Other affiliations: Michael E. DeBakey Veterans Affairs Medical Center in Houston, Baylor College of Medicine, Veterans Health Administration

Bio: Michael A. Gillette is an academic researcher from Harvard University. The author has contributed to research on proteomics and proteogenomics, has an h-index of 38, and has co-authored 76 publications receiving 37,340 citations. Previous affiliations of Michael A. Gillette include Michael E. DeBakey Veterans Affairs Medical Center in Houston and Baylor College of Medicine.

Topics: Proteomics, Proteogenomics, Proteome, Biomarker discovery, Medicine

Papers published on a yearly basis (chart not reproduced; years shown: 1995, 1996, 2001, 2005-2009, 2011-2023)

Papers

Journal Article

Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles

Aravind Subramanian¹, Pablo Tamayo¹, Vamsi K. Mootha², Sayan Mukherjee³, Benjamin L. Ebert², Michael A. Gillette², Amanda G. Paulovich⁴, Scott L. Pomeroy², Todd R. Golub², Eric S. Lander¹, Jill P. Mesirov¹

Massachusetts Institute of Technology¹, Harvard University², Duke University³, Fred Hutchinson Cancer Research Center⁴

25 Oct 2005-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: Gene Set Enrichment Analysis (GSEA) interprets gene expression data by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation.

Abstract: Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.

34,830 citations

Journal Article

Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses

Arindam Bhattacharjee¹, William G. Richards¹, Jane Staunton², Cheng Li¹, Stefano Monti², Priya Vasa¹, Christine Ladd², Javad Beheshti¹, Raphael Bueno¹, Michael A. Gillette², Massimo Loda¹, Griffin M. Weber¹, Eugene J. Mark¹, Eric S. Lander², Wing Hung Wong¹, Bruce E. Johnson¹, Todd R. Golub¹,², David J. Sugarbaker¹, Matthew Meyerson¹

Harvard University¹, Massachusetts Institute of Technology²

20 Nov 2001-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: A molecular taxonomy of lung carcinoma is generated and results suggest that integration of expression profile data with clinical parameters could aid in diagnosis of lung cancer patients.

Abstract: We have generated a molecular taxonomy of lung carcinoma, the leading cause of cancer death in the United States and worldwide. Using oligonucleotide microarrays, we analyzed mRNA expression levels corresponding to 12,600 transcript sequences in 186 lung tumor samples, including 139 adenocarcinomas resected from the lung. Hierarchical and probabilistic clustering of expression data defined distinct subclasses of lung adenocarcinoma. Among these were tumors with high relative expression of neuroendocrine genes and of type II pneumocyte genes, respectively. Retrospective analysis revealed a less favorable outcome for the adenocarcinomas with neuroendocrine gene expression. The diagnostic potential of expression profiling is emphasized by its ability to discriminate primary lung adenocarcinomas from metastases of extra-pulmonary origin. These results suggest that integration of expression profile data with clinical parameters could aid in diagnosis of lung cancer patients.

2,450 citations

Journal Article

Protein biomarker discovery and validation: the long and uncertain path to clinical utility.

Nader Rifai¹, Michael A. Gillette², Steven A. Carr²

Boston Children's Hospital¹, Massachusetts Institute of Technology²

01 Aug 2006-Nature Biotechnology

TL;DR: Better understanding of the overall process of biomarker discovery and validation and of the challenges and strategies inherent in each phase should improve experimental study design, in turn increasing the efficiency of biomarker development and facilitating the delivery and deployment of novel clinical tests.

Abstract: Better biomarkers are urgently needed to improve diagnosis, guide molecularly targeted therapy and monitor activity and therapeutic response across a wide spectrum of disease. Proteomics methods based on mass spectrometry hold special promise for the discovery of novel biomarkers that might form the foundation for new clinical blood tests, but to date their contribution to the diagnostic armamentarium has been disappointing. This is due in part to the lack of a coherent pipeline connecting marker discovery with well-established methods for validation. Advances in methods and technology now enable construction of a comprehensive biomarker pipeline from six essential process components: candidate discovery, qualification, verification, research assay optimization, biomarker validation and commercialization. Better understanding of the overall process of biomarker discovery and validation and of the challenges and strategies inherent in each phase should improve experimental study design, in turn increasing the efficiency of biomarker development and facilitating the delivery and deployment of novel clinical tests.

1,702 citations

Journal Article

Proteogenomics connects somatic mutations to signalling in breast cancer

Philipp Mertins¹, D. R. Mani¹, Kelly V. Ruggles², Michael A. Gillette¹,³, Karl R. Clauser¹, Pei Wang⁴, Xianlong Wang⁵, Jana W. Qiao¹, Song Cao⁶, Francesca Petralia⁴, Emily Kawaler², Filip Mundt¹,⁷, Karsten Krug¹, Zhidong Tu⁴, Jonathan T. Lei⁸, Michael L. Gatza⁹, Matthew D. Wilkerson⁹, Charles M. Perou⁹, Venkata Yellapantula⁶, Kuan-lin Huang⁶, Chenwei Lin⁵, Michael D. McLellan⁶, Ping Yan⁵, Sherri R. Davies⁶, R. Reid Townsend⁶, Steven J. Skates³, Jing Wang¹⁰, Bing Zhang¹⁰, Christopher R. Kinsinger¹¹, Mehdi Mesri¹¹, Henry Rodriguez¹¹, Li Ding⁶, Amanda G. Paulovich⁵, David Fenyö², Matthew J. Ellis⁸, Steven A. Carr¹

Broad Institute¹, New York University², Harvard University³, Icahn School of Medicine at Mount Sinai⁴, Fred Hutchinson Cancer Research Center⁵, Washington University in St. Louis⁶, Karolinska Institutet⁷, Baylor College of Medicine⁸, University of North Carolina at Chapel Hill⁹, Vanderbilt University¹⁰, National Institutes of Health¹¹

02 Jun 2016-Nature

TL;DR: It is demonstrated that proteogenomic analysis of breast cancer elucidates functional consequences of somatic mutations, narrows candidate nominations for driver genes within large deletions and amplified regions, and identifies therapeutic targets.

Abstract: Somatic mutations have been extensively characterized in breast cancer, but the effects of these genetic alterations on the proteomic landscape remain poorly understood. Here we describe quantitative mass-spectrometry-based proteomic and phosphoproteomic analyses of 105 genomically annotated breast cancers, of which 77 provided high-quality data. Integrated analyses provided insights into the somatic cancer genome including the consequences of chromosomal loss, such as the 5q deletion characteristic of basal-like breast cancer. Interrogation of the 5q trans-effects against the Library of Integrated Network-based Cellular Signatures, connected loss of CETN3 and SKP1 to elevated expression of epidermal growth factor receptor (EGFR), and SKP1 loss also to increased SRC tyrosine kinase. Global proteomic data confirmed a stromal-enriched group of proteins in addition to basal and luminal clusters, and pathway analysis of the phosphoproteome identified a G-protein-coupled receptor cluster that was not readily identified at the mRNA level. In addition to ERBB2, other amplicon-associated highly phosphorylated kinases were identified, including CDK12, PAK1, PTK2, RIPK2 and TLK2. We demonstrate that proteogenomic analysis of breast cancer elucidates the functional consequences of somatic mutations, narrows candidate nominations for driver genes within large deletions and amplified regions, and identifies therapeutic targets.

1,296 citations

Journal Article

Proteogenomic characterization of human colon and rectal cancer

Bing Zhang¹, Jing Wang¹, Xiaojing Wang¹, Jing Zhu¹, Qi Liu¹, Zhiao Shi¹, Matthew C. Chambers¹, Lisa J. Zimmerman¹, Kent Shaddox¹, Sangtae Kim², Sherri R. Davies³, Sean Wang⁴, Pei Wang⁵, Christopher R. Kinsinger⁶, Robert Rivers⁶, Henry Rodriguez⁶, R. Reid Townsend³, Matthew J. Ellis³, Steven A. Carr⁷,⁸, David L. Tabb¹, Robert J. Coffey¹, Robbert J.C. Slebos¹, Daniel C. Liebler¹, Michael A. Gillette⁸, Karl R. Clauser⁸, Eric Kuhn⁸, D. R. Mani⁸, Philipp Mertins⁸, Karen A. Ketchum, Amanda G. Paulovich⁴, Jeffrey R. Whiteaker⁴, Nathan Edwards⁹, Peter B. McGarvey⁹, Subha Madhavan⁹, Daniel W. Chan¹⁰, Akhilesh Pandey¹⁰, Ie Ming Shih¹⁰, Hui Zhang¹⁰, Zhen Zhang¹⁰, Heng Zhu¹⁰, Gordon Whiteley¹¹, Steven J. Skates⁸, Forest M. White⁷, Douglas A. Levine¹², Emily S. Boja⁶, Tara Hiltke⁶, Mehdi Mesri⁶, Kenna M. Shaw⁶, Stephen E. Stein¹³, David Fenyö¹⁴, Tao Liu², Jason E. McDermott², Samuel H. Payne², Karin D. Rodland², Richard D. Smith², Paul A. Rudnick, Michael Snyder¹⁵, Yingming Zhao¹⁶, Xian Chen¹⁷, David F. Ransohoff¹⁷, Andrew N. Hoofnagle¹⁸, Melinda E. Sanders¹, Yue Wang¹⁹, Li Ding³

Vanderbilt University¹, Pacific Northwest National Laboratory², Washington University in St. Louis³, Fred Hutchinson Cancer Research Center⁴, Icahn School of Medicine at Mount Sinai⁵, National Institutes of Health⁶, Massachusetts Institute of Technology⁷, Harvard University⁸, Georgetown University⁹, Johns Hopkins University¹⁰, Leidos¹¹, Memorial Sloan Kettering Cancer Center¹², National Institute of Standards and Technology¹³, New York University¹⁴, Stanford University¹⁵, University of Chicago¹⁶, University of North Carolina at Chapel Hill¹⁷, University of Washington¹⁸, Virginia Tech¹⁹

18 Sep 2014-Nature

TL;DR: Integrated proteogenomic analysis provides functional context to interpret genomic abnormalities and affords a new paradigm for understanding cancer biology.

Abstract: Extensive genomic characterization of human cancers presents the problem of inference from genomic abnormalities to cancer phenotypes. To address this problem, we analysed proteomes of colon and rectal tumours characterized previously by The Cancer Genome Atlas (TCGA) and performed integrated proteogenomic analyses. Somatic variants displayed reduced protein abundance compared to germline variants. Messenger RNA transcript abundance did not reliably predict protein abundance differences between tumours. Proteomics identified five proteomic subtypes in the TCGA cohort, two of which overlapped with the TCGA 'microsatellite instability/CpG island methylation phenotype' transcriptomic subtype, but had distinct mutation, methylation and protein expression patterns associated with different clinical outcomes. Although copy number alterations showed strong cis- and trans-effects on mRNA abundance, relatively few of these extend to the protein level. Thus, proteomics data enabled prioritization of candidate driver genes. The chromosome 20q amplicon was associated with the largest global changes at both mRNA and protein levels; proteomics data highlighted potential 20q candidates, including HNF4A (hepatocyte nuclear factor 4, alpha), TOMM34 (translocase of outer mitochondrial membrane 34) and SRC (SRC proto-oncogene, non-receptor tyrosine kinase). Integrated proteogenomic analysis provides functional context to interpret genomic abnormalities and affords a new paradigm for understanding cancer biology.

1,183 citations

Cited by

Journal Article

Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources.

Da-Wei Huang¹, Brad T. Sherman¹, Richard A. Lempicki¹

Science Applications International Corporation¹

01 Jan 2009-Nature Protocols

TL;DR: By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.

Abstract: DAVID bioinformatics resources consists of an integrated biological knowledgebase and analytic tools aimed at systematically extracting biological meaning from large gene/protein lists. This protocol explains how to use DAVID, a high-throughput and integrated data-mining environment, to analyze gene lists derived from high-throughput genomic experiments. The procedure first requires uploading a gene list containing any number of common gene identifiers followed by analysis using one or more text and pathway-mining tools such as gene functional classification, functional annotation chart or clustering and functional annotation table. By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.

31,015 citations

Journal Article

limma powers differential expression analyses for RNA-sequencing and microarray studies

Matthew E. Ritchie¹, Belinda Phipson², Di Wu³, Yifang Hu¹, Charity W. Law⁴, Wei Shi¹, Gordon K. Smyth¹,⁵

Walter and Eliza Hall Institute of Medical Research¹, Royal Children's Hospital², Harvard University³, University of Zurich⁴, University of Melbourne⁵

20 Apr 2015-Nucleic Acids Research

TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

Abstract: limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

22,147 citations

Journal Article

Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists

Da-Wei Huang¹, Brad T. Sherman¹, Richard A. Lempicki¹

Science Applications International Corporation¹

01 Jan 2009-Nucleic Acids Research

TL;DR: The survey will help tool designers/developers and experienced end users understand the underlying algorithms and pertinent details of particular tool categories/tools, enabling them to make the best choices for their particular research interests.

Abstract: Functional analysis of large gene lists, derived in most cases from emerging high-throughput genomic, proteomic and bioinformatics scanning approaches, is still a challenging and daunting task. The gene-annotation enrichment analysis is a promising high-throughput strategy that increases the likelihood for investigators to identify biological processes most pertinent to their study. Approximately 68 bioinformatics enrichment tools that are currently available in the community are collected in this survey. Tools are uniquely categorized into three major classes, according to their underlying enrichment algorithms. The comprehensive collections, unique tool classifications and associated questions/issues will provide a more comprehensive and up-to-date view regarding the advantages, pitfalls and recent trends in a simpler tool-class level rather than by a tool-by-tool approach. Thus, the survey will help tool designers/developers and experienced end users understand the underlying algorithms and pertinent details of particular tool categories/tools, enabling them to make the best choices for their particular research interests.

13,102 citations

Posted Content

Inductive Representation Learning on Large Graphs

William L. Hamilton, Rex Ying, Jure Leskovec

07 Jun 2017-arXiv: Social and Information Networks

TL;DR: GraphSAGE is presented, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data and outperforms strong baselines on three inductive node-classification benchmarks.

Abstract: Low-dimensional embeddings of nodes in large graphs have proved extremely useful in a variety of prediction tasks, from content recommendation to identifying protein functions. However, most existing approaches require that all nodes in the graph are present during training of the embeddings; these previous approaches are inherently transductive and do not naturally generalize to unseen nodes. Here we present GraphSAGE, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data. Instead of training individual embeddings for each node, we learn a function that generates embeddings by sampling and aggregating features from a node's local neighborhood. Our algorithm outperforms strong baselines on three inductive node-classification benchmarks: we classify the category of unseen nodes in evolving information graphs based on citation and Reddit post data, and we show that our algorithm generalizes to completely unseen graphs using a multi-graph dataset of protein-protein interactions.

7,926 citations
