Author

Robert Gentleman

Other affiliations: Harvard University, Brigham and Women's Hospital, Fred Hutchinson Cancer Research Center

Bio: Robert Gentleman is an academic researcher at Genentech. He has contributed to research on topics including Bioconductor and gene expression profiling. He has an h-index of 52 and has co-authored 139 publications that have received 48,510 citations. His previous affiliations include Harvard University and Brigham and Women's Hospital.

Papers published on a yearly basis

[Chart: papers per year; years shown: 1994, 1996, 1997, 2000-2017, 2019-2021, 2023]

Papers

Journal Article

flowCore: a Bioconductor package for high throughput flow cytometry.

Florian Hahne¹, Nolwenn LeMeur¹,², Ryan R. Brinkman³, Byron Ellis, Perry D. Haaland⁴, Deepayan Sarkar¹, Josef Spidlen³, Errol Strain⁴, Robert Gentleman¹

Fred Hutchinson Cancer Research Center¹, University of Rennes², BC Cancer Agency³, Research Triangle Park⁴

09 Apr 2009-BMC Bioinformatics

TL;DR: flowCore is an R package of flexible open source computational tools that constitutes a shared and extensible research platform, enabling collaboration between bioinformaticians, computer scientists, statisticians, biologists and clinicians and fostering the development of novel analytic methods for flow cytometry.

Abstract: Background: Recent advances in automation technologies have enabled the use of flow cytometry for high throughput screening, generating large complex data sets often in clinical trials or drug discovery settings. However, data management and data analysis methods have not advanced sufficiently far from the initial small-scale studies to support modeling in the presence of multiple covariates. Results: We developed a set of flexible open source computational tools in the R package flowCore to facilitate the analysis of these complex data. A key component of which is having suitable data structures that support the application of similar operations to a collection of samples or a clinical cohort. In addition, our software constitutes a shared and extensible research platform that enables collaboration between bioinformaticians, computer scientists, statisticians, biologists and clinicians. This platform will foster the development of novel analytic methods for flow cytometry. Conclusion: The software has been applied in the analysis of various data sets and its data structures have proven to be highly efficient in capturing and organizing the analytic work flow. Finally, a number of additional Bioconductor packages successfully build on the infrastructure provided by flowCore, open new avenues for flow data analysis.

468 citations

Journal Article

Genome-wide MyoD Binding in Skeletal Muscle Cells: A Potential for Broad Cellular Reprogramming

Yi Cao¹, Zizhen Yao¹, Deepayan Sarkar¹, Michael S. Lawrence¹, Gilson J. Sanchez¹,², Maura H. Parker¹, Kyle L. MacQuarrie¹,², Jerry Davison¹, Martin Morgan¹, Walter L. Ruzzo¹,², Robert Gentleman¹, Stephen J. Tapscott¹,²

Fred Hutchinson Cancer Research Center¹, University of Washington²

20 Apr 2010-Developmental Cell

TL;DR: MyoD was found to be constitutively bound to thousands of additional sites in both myoblasts and myotubes, and its genome-wide binding was associated with regional histone acetylation.

459 citations

Journal Article

DUX4 Activates Germline Genes, Retroelements, and Immune Mediators: Implications for Facioscapulohumeral Dystrophy

Linda Geng¹, Zizhen Yao¹, Lauren Snider¹, Abraham P. Fong¹, Jennifer N. Cech¹, Janet M. Young¹, Silvère M. van der Maarel², Walter L. Ruzzo³, Robert Gentleman⁴, Rabi Tawil⁵, Stephen J. Tapscott¹

Fred Hutchinson Cancer Research Center¹, Leiden University Medical Center², University of Washington³, Genentech⁴, University of Rochester⁵

17 Jan 2012-Developmental Cell

TL;DR: It is shown that DUX4 binds and activates LTR elements from a class of MaLR endogenous primate retrotransposons and suppresses the innate immune response to viral infection, at least in part through the activation of DEFB103, a human defensin that can inhibit muscle differentiation.

376 citations

Journal Article

Gene expression profile of adult T-cell acute lymphocytic leukemia identifies distinct subsets of patients with different response to therapy and survival

Sabina Chiaretti¹, Xiaochun Li¹,², Robert Gentleman¹,², Antonella Vitale¹,², Marco Vignetti¹,², Franco Mandelli¹,², Jerome Ritz³, Robin Foà¹,²

Brigham and Women's Hospital¹, Sapienza University of Rome², Harvard University³

01 Apr 2004-Blood

TL;DR: It is demonstrated that gene expression profiling can identify a limited number of genes that are predictive of response to induction therapy and remission duration in adult patients with T-ALL.

367 citations

Journal Article

MicroRNA Discovery and Profiling in Human Embryonic Stem Cells by Deep Sequencing of Small RNA Libraries

Merav Bar¹, Stacia K. Wyman¹, Brian R. Fritz¹, Junlin Qi², Kavita Garg¹,², Rachael K. Parkin¹, Evan M. Kroh¹, Ausra Bendoraite¹, Patrick S. Mitchell¹, Angelique M. Nelson², Walter L. Ruzzo², Carol B. Ware², Jerald P. Radich¹, Robert Gentleman¹, Hannele Ruohola-Baker², Muneesh Tewari¹

Fred Hutchinson Cancer Research Center¹, University of Washington²

01 Oct 2008-Stem Cells

TL;DR: The data indicate that hESC express a larger complement of miRNAs than previously appreciated, and they provide a resource for additional studies of miRNA regulation of hESC physiology.

Abstract: We used massively parallel pyrosequencing to discover and characterize microRNAs (miRNAs) expressed in human embryonic stem cells (hESC). Sequencing of small RNA cDNA libraries derived from undifferentiated hESC and from isogenic differentiating cultures yielded a total of 425,505 high-quality sequence reads. A custom data analysis pipeline delineated expression profiles for 191 previously annotated miRNAs, 13 novel miRNAs and 56 candidate miRNAs. Further characterization of a subset of the novel miRNAs in Dicer-knockdown hESC demonstrated Dicer-dependent expression, providing additional validation of our results. A set of 14 miRNAs (9 known and 5 novel) were noted to be expressed in undifferentiated hESC and then strongly down-regulated with differentiation. Functional annotation analysis of predicted targets of these miRNAs and comparison to a null model using non-hESC-expressed miRNAs identified statistically enriched functional categories, including chromatin remodeling and lineage-specific differentiation annotations. Finally, integration of our data with genome-wide chromatin immunoprecipitation data on OCT4, SOX2 and NANOG binding sites implicates these transcription factors in the regulation of nine of the novel/candidate miRNAs identified here. Comparison of our results to those of recent deep sequencing studies in mouse ESC and human ESC show that most of the novel/candidate miRNAs found here were not identified in the other studies. The data indicate that hESC express a larger complement of miRNAs than previously appreciated, and provide a resource for further studies of miRNA regulation of hESC physiology.

318 citations

Cited by

Journal Article

Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2

Michael I. Love¹,², Wolfgang Huber, Simon Anders

Max Planck Society¹, Harvard University²

05 Dec 2014-Genome Biology

TL;DR: This work presents DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates, which enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.

Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html .

47,038 citations

Journal Article

edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Mark D. Robinson¹, Davis J. McCarthy¹, Gordon K. Smyth¹

Walter and Eliza Hall Institute of Medical Research¹

01 Jan 2010-Bioinformatics

TL;DR: edgeR is a Bioconductor software package for examining differential expression of replicated count data. It uses an overdispersed Poisson model to account for both biological and technical variability, and empirical Bayes methods to moderate the degree of overdispersion across transcripts, improving the reliability of inference.

Abstract: Summary: It is expected that emerging digital gene expression (DGE) technologies will overtake microarray technologies in the near future for many functional genomics applications. One of the fundamental data analysis tasks, especially for gene expression studies, involves determining whether there is evidence that counts for a transcript or exon are significantly different across experimental conditions. edgeR is a Bioconductor software package for examining differential expression of replicated count data. An overdispersed Poisson model is used to account for both biological and technical variability. Empirical Bayes methods are used to moderate the degree of overdispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, provided at least one phenotype or experimental condition is replicated. The software may have other applications beyond sequencing data, such as proteome peptide count data. Availability: The package is freely available under the LGPL licence from the Bioconductor web site (http://bioconductor.org).

29,413 citations

Journal Article

limma powers differential expression analyses for RNA-sequencing and microarray studies

Matthew E. Ritchie¹, Belinda Phipson², Di Wu³, Yifang Hu¹, Charity W. Law⁴, Wei Shi¹, Gordon K. Smyth¹,⁵

Walter and Eliza Hall Institute of Medical Research¹, Royal Children's Hospital², Harvard University³, University of Zurich⁴, University of Melbourne⁵

20 Apr 2015-Nucleic Acids Research

TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

Abstract: limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

22,147 citations

Journal Article

The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data

Aaron McKenna¹, Matthew Hanna, Eric Banks, Andrey Sivachenko, Kristian Cibulskis, Andrew Kernytsky, Kiran V. Garimella, David Altshuler, Stacey Gabriel, Mark J. Daly, Mark A. DePristo

Broad Institute¹

01 Sep 2010-Genome Research

TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

Abstract: Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS—the 1000 Genome pilot alone includes nearly five terabases—make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

20,557 citations

Posted Content

Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2

Michael I. Love¹, Wolfgang Huber, Simon Anders

Harvard University¹

17 Nov 2014-bioRxiv

Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-Seq data, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data. DESeq2 uses shrinkage estimation for dispersions and fold changes to improve stability and interpretability of the estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression and facilitates downstream tasks such as gene ranking and visualization. DESeq2 is available as an R/Bioconductor package.

17,014 citations
