Author

Robert Gentleman

Other affiliations: Harvard University, Brigham and Women's Hospital, Fred Hutchinson Cancer Research Center

Bio: Robert Gentleman is an academic researcher at Genentech. The author has contributed to research on Bioconductor and gene expression profiling, has an h-index of 52, and has co-authored 139 publications that have received 48,510 citations. Previous affiliations of Robert Gentleman include Harvard University and Brigham and Women's Hospital.

Papers published on a yearly basis (chart; years shown: 1994, 1996, 1997, 2000–2017, 2019–2021, 2023)

Papers

Journal Article•DOI•

Recurrent Loss of NFE2L2 Exon 2 Is a Mechanism for Nrf2 Pathway Activation in Human Cancers

Leonard D. Goldstein¹, James Lee¹, Florian Gnad¹, Christiaan Klijn¹, Annalisa Schaub¹, Jens Reeder¹, Anneleen Daemen¹, Corey E. Bakalarski¹, Thomas Holcomb¹, David S. Shames¹, Ryan J. Hartmaier², Juliann Chmielecki², Somasekar Seshagiri¹, Robert Gentleman¹, David Stokoe¹•Institutions (2)

Genentech¹, Foundation Medicine²

06 Sep 2016-Cell Reports

TL;DR: An analysis of splice variants in oncogenes revealed that such tumors express abnormal transcript variants from the NFE2L2 gene that lack exon 2, or exons 2 and 3, and encode Nrf2 protein isoforms missing the KEAP1 interaction domain.

142 citations

Journal Article•DOI•

Reproducible research: a bioinformatics case study.

Robert Gentleman¹•Institutions (1)

Harvard University¹

11 Jan 2005-Statistical Applications in Genetics and Molecular Biology

TL;DR: The author applies the compendium concept of Gentleman and Temple Lang (2003), in which documents are mixtures of code and text, to a seminal paper in bioinformatics, The Molecular Classification of Cancer (Golub et al., 1999). Rather than reproducing that paper exactly, the work demonstrates that such a reproduction is possible and concentrates on the usefulness of the compendium concept itself.

Abstract: While scientific research and the methodologies involved have gone through substantial technological evolution, the technology involved in the publication of the results of these endeavors has remained relatively stagnant. Publication is largely done in the same manner today as it was fifty years ago. Many journals have adopted electronic formats; however, their orientation and style are little different from a printed document. The documents tend to be static and take little advantage of computational resources that might be available. Recent work, Gentleman and Temple Lang (2003), suggests a methodology and basic infrastructure that can be used to publish documents in a substantially different way. Their approach is suitable for the publication of papers whose message relies on computation. Stated quite simply, Gentleman and Temple Lang (2003) propose a paradigm where documents are mixtures of code and text. Such documents may be self-contained or they may be a component of a compendium which provides the infrastructure needed to provide access to data and supporting software. These documents, or compendiums, can be processed in a number of different ways. One transformation will be to replace the code with its output, thereby providing the familiar, but limited, static document. In this paper we apply these concepts to a seminal paper in bioinformatics, namely The Molecular Classification of Cancer, Golub et al. (1999). The authors of that paper have generously provided data and other information that have allowed us to largely reproduce their results. Rather than reproduce this paper exactly, we demonstrate that such a reproduction is possible and instead concentrate on demonstrating the usefulness of the compendium concept itself.

142 citations

Journal Article•DOI•

Differential genomic targeting of the transcription factor TAL1 in alternate haematopoietic lineages

Carmen G. Palii¹, Carolina Perez-Iratxeta¹, Zizhen Yao², Yi Cao², Fengtao Dai¹,³, Jerry Davison², Harold L. Atkins¹, David S. Allan¹, F. Jeffrey Dilworth¹,³, Robert Gentleman², Stephen J. Tapscott², Marjorie Brand¹,³•Institutions (3)

Ottawa Hospital Research Institute¹, Fred Hutchinson Cancer Research Center², University of Ottawa³

02 Feb 2011-The EMBO Journal

TL;DR: In this article, the authors used a combination of ChIP sequencing and gene expression profiling to compare the function of TAL1 in normal erythroid and leukaemic T cells.

Abstract: TAL1/SCL is a master regulator of haematopoiesis whose expression promotes opposite outcomes depending on the cell type: differentiation in the erythroid lineage or oncogenesis in the T-cell lineage. Here, we used a combination of ChIP sequencing and gene expression profiling to compare the function of TAL1 in normal erythroid and leukaemic T cells. Analysis of the genome-wide binding properties of TAL1 in these two haematopoietic lineages revealed new insight into the mechanism by which transcription factors select their binding sites in alternate lineages. Our study shows limited overlap in the TAL1-binding profile between the two cell types with an unexpected preference for ETS and RUNX motifs adjacent to E-boxes in the T-cell lineage. Furthermore, we show that TAL1 interacts with RUNX1 and ETS1, and that these transcription factors are critically required for TAL1 binding to genes that modulate T-cell differentiation. Thus, our findings highlight a critical role of the cellular environment in modulating transcription factor binding, and provide insight into the mechanism by which TAL1 inhibits differentiation leading to oncogenesis in the T-cell lineage.

141 citations

Journal Article•DOI•

Gene expression profiles of B-lineage adult acute lymphocytic leukemia reveal genetic patterns that identify lineage derivation and distinct mechanisms of transformation.

Sabina Chiaretti¹, Xiaochun Li¹, Robert Gentleman¹, Antonella Vitale², Kathy S. Wang¹, Franco Mandelli², Robin Foà, Jerome Ritz¹•Institutions (2)

Harvard University¹, Sapienza University of Rome²

15 Oct 2005-Clinical Cancer Research

TL;DR: Genomic signatures are associated with phenotypically and molecularly well-defined subgroups of adult ALL. Genomic profiling also identifies genes associated with poor outcome in cases without molecular aberrations and specific genes that may be new therapeutic targets in adult ALL.

Abstract: Purpose: To characterize gene expression signatures in acute lymphocytic leukemia (ALL) cells associated with known genotypic abnormalities in adult patients. Experimental Design: Gene expression profiles from 128 adult patients with newly diagnosed ALL were characterized using high-density oligonucleotide microarrays. All patients were enrolled in the Italian GIMEMA multicenter clinical trial 0496 and samples had >90% leukemic cells. Uniform phenotypic, cytogenetic, and molecular data were also available for all cases. Results: T-lineage ALL was characterized by a homogeneous gene expression pattern, whereas several subgroups of B-lineage ALL were evident. Within B-lineage ALL, distinct signatures were associated with ALL1/AF4 and E2A/PBX1 gene rearrangements. Expression profiles associated with ALL1/AF4 and E2A/PBX1 are similar in adults and children. The BCR/ABL+ gene expression pattern was more heterogeneous and was most similar to ALL without known molecular rearrangements. We also identified a set of 83 genes that were highly expressed in leukemia blasts from patients without known molecular abnormalities who subsequently relapsed following therapy. Supervised analysis of kinase genes revealed a high-level FLT3 expression in a subset of cases without molecular rearrangements. Two other kinases (PRKCB1 and DDR1) were highly expressed in cases without molecular rearrangements, as well as in BCR/ABL-positive ALL. Conclusions: Genomic signatures are associated with phenotypically and molecularly well-defined subgroups of adult ALL. Genomic profiling also identifies genes associated with poor outcome in cases without molecular aberrations and specific genes that may be new therapeutic targets in adult ALL.

140 citations

Journal Article•DOI•

Global expression changes of constitutive and hormonally regulated genes during endometrial neoplastic transformation.

George L. Mutter¹, Jan P.A. Baak¹, Jeffrey T. Fitzgerald¹, Robert Gray², Donna Neuberg², Gregory A. Kust¹, Robert Gentleman², Steven R. Gullans¹, Lee-Jen Wei², Marsha A. Wilcox²•Institutions (2)

Brigham and Women's Hospital¹, Harvard University²

01 Nov 2001-Gynecologic Oncology

TL;DR: It is found that 100 genes which are hormonally regulated in normal tissues are expressed in a disordered and heterogeneous fashion in cancers, with tumors resembling proliferative more than secretory endometrium.

140 citations

Cited by

Journal Article•DOI•

Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2

Michael I. Love¹,², Wolfgang Huber, Simon Anders•Institutions (2)

Max Planck Society¹, Harvard University²

05 Dec 2014-Genome Biology

TL;DR: This work presents DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates, which enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.

Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html.

47,038 citations

Journal Article•DOI•

edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Mark D. Robinson¹, Davis J. McCarthy¹, Gordon K. Smyth¹•Institutions (1)

Walter and Eliza Hall Institute of Medical Research¹

01 Jan 2010-Bioinformatics

TL;DR: edgeR is a Bioconductor software package for examining differential expression of replicated count data. It uses an overdispersed Poisson model to account for both biological and technical variability, and empirical Bayes methods to moderate the degree of overdispersion across transcripts, improving the reliability of inference.

Abstract: Summary: It is expected that emerging digital gene expression (DGE) technologies will overtake microarray technologies in the near future for many functional genomics applications. One of the fundamental data analysis tasks, especially for gene expression studies, involves determining whether there is evidence that counts for a transcript or exon are significantly different across experimental conditions. edgeR is a Bioconductor software package for examining differential expression of replicated count data. An overdispersed Poisson model is used to account for both biological and technical variability. Empirical Bayes methods are used to moderate the degree of overdispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, provided at least one phenotype or experimental condition is replicated. The software may have other applications beyond sequencing data, such as proteome peptide count data. Availability: The package is freely available under the LGPL licence from the Bioconductor web site (http://bioconductor.org).

29,413 citations

Journal Article•DOI•

limma powers differential expression analyses for RNA-sequencing and microarray studies

Matthew E. Ritchie¹, Belinda Phipson², Di Wu³, Yifang Hu¹, Charity W. Law⁴, Wei Shi¹, Gordon K. Smyth¹,⁵•Institutions (5)

Walter and Eliza Hall Institute of Medical Research¹, Royal Children's Hospital², Harvard University³, University of Zurich⁴, University of Melbourne⁵

20 Apr 2015-Nucleic Acids Research

TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

Abstract: limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

22,147 citations

Journal Article•DOI•

The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data

Aaron McKenna¹, Matthew Hanna, Eric Banks, Andrey Sivachenko, Kristian Cibulskis, Andrew Kernytsky, Kiran V. Garimella, David Altshuler, Stacey Gabriel, Mark J. Daly, Mark A. DePristo•Institutions (1)

Broad Institute¹

01 Sep 2010-Genome Research

TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

Abstract: Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS—the 1000 Genome pilot alone includes nearly five terabases—make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

20,557 citations

Posted Content•DOI•

Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2

Michael I. Love¹, Wolfgang Huber, Simon Anders•Institutions (1)

Harvard University¹

17 Nov 2014-bioRxiv

Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-Seq data, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data. DESeq2 uses shrinkage estimation for dispersions and fold changes to improve stability and interpretability of the estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression and facilitates downstream tasks such as gene ranking and visualization. DESeq2 is available as an R/Bioconductor package.

17,014 citations
