Author

James T. Robinson

Other affiliations: Massachusetts Institute of Technology, Baylor College of Medicine, Broad Institute

Bio: James T. Robinson is an academic researcher from University of California, San Diego. The author has contributed to research in topics: Gene & Zoom. The author has an h-index of 23 and has co-authored 41 publications receiving 30,957 citations. Previous affiliations of James T. Robinson include Massachusetts Institute of Technology & Baylor College of Medicine.

Topics: Gene, Zoom, Genome, Human genome, CTCF

Papers

Journal Article•DOI•

Integrative genomics viewer

James T. Robinson¹, Helga Thorvaldsdottir¹, Wendy Winckler¹, Mitchell Guttman¹, Eric S. Lander¹,², Gad Getz¹, Jill P. Mesirov¹

Massachusetts Institute of Technology¹, Harvard University²

10 Jan 2011-Nature Biotechnology

TL;DR: The article presents the Integrative Genomics Viewer, motivated by the need for efficient and intuitive visualization tools that scale to very large data sets and flexibly integrate multiple data types, including clinical data.

Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

10,798 citations

Journal Article•DOI•

Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration

Helga Thorvaldsdottir¹, James T. Robinson, Jill P. Mesirov

Broad Institute¹

01 Mar 2013-Briefings in Bioinformatics

TL;DR: The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive user experience at all levels of genome resolution.

Abstract: Data visualization is an essential component of genomic data analysis. However, the size and diversity of the data sets produced by today’s sequencing and array-based profiling methods present major challenges to visualization tools. The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive user experience at all levels of genome resolution. A key characteristic of IGV is its focus on the integrative nature of genomic studies, with support for both array-based and next-generation sequencing data, and the integration of clinical and phenotypic data. Although IGV is often used to view genomic data from public sources, its primary emphasis is to support researchers who wish to visualize and explore their own data sets or those from colleagues. To that end, IGV supports flexible loading of local and remote data sets, and is optimized to provide high-performance data visualization and exploration on standard desktop systems. IGV is freely available for download from http://www.broadinstitute.org/igv, under a GNU LGPL open-source license.

6,930 citations

Journal Article•DOI•

A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping

Suhas S.P. Rao¹, Miriam H. Huntley¹, Neva C. Durand, Elena K. Stamenova, Ivan D. Bochkov¹, James T. Robinson¹,², Adrian L. Sanborn¹, Ido Machol¹,³, Arina D. Omer¹,³, Eric S. Lander²,⁴,⁵, Erez Lieberman Aiden

Baylor College of Medicine¹, Broad Institute², Rice University³, Massachusetts Institute of Technology⁴, Harvard University⁵

18 Dec 2014-Cell

TL;DR: In situ Hi-C is used to probe the 3D architecture of genomes, constructing haploid and diploid maps of nine cell types, identifying ∼10,000 loops that frequently link promoters and enhancers, correlate with gene activation, and show conservation across cell types and species.

5,945 citations

Journal Article•DOI•

Comprehensive molecular profiling of lung adenocarcinoma: The Cancer Genome Atlas Research Network

Eric A. Collisson¹, Joshua D. Campbell², Angela N. Brooks²,³, +315 more (41 institutions)

01 Jan 2014-Nature

TL;DR: The authors report molecular profiling of 230 resected lung adenocarcinomas using messenger RNA, microRNA and DNA sequencing integrated with copy number, methylation and proteomic analyses.

Abstract: Adenocarcinoma of the lung is the leading cause of cancer death worldwide. Here we report molecular profiling of 230 resected lung adenocarcinomas using messenger RNA, microRNA and DNA sequencing integrated with copy number, methylation and proteomic analyses. High rates of somatic mutation were seen (mean 8.9 mutations per megabase). Eighteen genes were statistically significantly mutated, including RIT1 activating mutations and newly described loss-of-function MGA mutations which are mutually exclusive with focal MYC amplification. EGFR mutations were more frequent in female patients, whereas mutations in RBM10 were more common in males. Aberrations in NF1, MET, ERBB2 and RIT1 occurred in 13% of cases and were enriched in samples otherwise lacking an activated oncogene, suggesting a driver role for these events in certain tumours. DNA and mRNA sequence from the same tumour highlighted splicing alterations driven by somatic genomic changes, including exon 14 skipping in MET mRNA in 4% of cases. MAPK and PI(3)K pathway activity, when measured at the protein level, was explained by known mutations in only a fraction of cases, suggesting additional, unexplained mechanisms of pathway activation. These data establish a foundation for classification and further investigations of lung adenocarcinoma molecular pathogenesis.

4,104 citations

Journal Article•DOI•

Discovery and saturation analysis of cancer genes across 21 tumour types

Michael S. Lawrence¹, Petar Stojanov², Craig H. Mermel², James T. Robinson¹, Levi A. Garraway², Todd R. Golub³, Matthew Meyerson², Stacey Gabriel¹, Eric S. Lander⁴, Gad Getz²

Broad Institute¹, Harvard University², Howard Hughes Medical Institute³, Massachusetts Institute of Technology⁴

23 Jan 2014-Nature

TL;DR: It is found that large-scale genomic analysis can identify nearly all known cancer genes in these cancer types and 33 genes that were not previously known to be significantly mutated in cancer, including genes related to proliferation, apoptosis, genome stability, chromatin regulation, immune evasion, RNA processing and protein homeostasis.

Abstract: Although a few cancer genes are mutated in a high proportion of tumours of a given type (>20%), most are mutated at intermediate frequencies (2–20%). To explore the feasibility of creating a comprehensive catalogue of cancer genes, we analysed somatic point mutations in exome sequences from 4,742 human cancers and their matched normal-tissue samples across 21 cancer types. We found that large-scale genomic analysis can identify nearly all known cancer genes in these tumour types. Our analysis also identified 33 genes that were not previously known to be significantly mutated in cancer, including genes related to proliferation, apoptosis, genome stability, chromatin regulation, immune evasion, RNA processing and protein homeostasis. Down-sampling analysis indicates that larger sample sizes will reveal many more genes mutated at clinically important frequencies. We estimate that near-saturation may be achieved with 600–5,000 samples per tumour type, depending on background mutation frequency. The results may help to guide the next stage of cancer genomics. Comprehensive knowledge of the genes underlying human cancers is a critical foundation for cancer diagnostics, therapeutics, clinical-trial design and selection of rational combination therapies. It is now possible to use genomic analysis to identify cancer genes in an unbiased fashion, based on the presence of somatic mutations at a rate significantly higher than the expected background level. Systematic studies have revealed many new cancer genes, as well as new classes of cancer genes [1,2]. They have also made clear that, although some cancer genes are mutated at high frequencies, most cancer genes in most patients occur at intermediate frequencies (2–20%) or lower. Accordingly, a complete catalogue of mutations in this frequency class will be essential for recognizing dysregulated pathways and optimal targets for therapeutic intervention.
However, recent work suggests major gaps in our knowledge of cancer genes of intermediate frequency. For example, a study of 183 lung adenocarcinomas [3] found that 15% of patients lacked even a single mutation affecting any of the 10 known hallmarks of cancer, and 38% had 3 or fewer such mutations. In this paper, we analysed somatic point mutations (substitutions and small insertions and deletions) in nearly 5,000 human cancers and their matched normal-tissue samples (‘tumour–normal pairs’) across 21 tumour types. The questions that we examine here are: first, whether large-scale genomic analysis across tumour types can reliably identify all known cancer genes; second, whether it will reveal many new candidate cancer genes; and third, how far we are from having a complete catalogue of cancer genes (at least those of intermediate frequency). We used rigorous statistical methods to enumerate candidate cancer genes and then carefully inspected each gene to identify those with strong biological connections to cancer and mutational patterns consistent with the expected function. The analysis reveals nearly all known cancer genes and 33 novel candidates, including genes related to proliferation, apoptosis, genome stability, chromatin regulation, immune evasion, RNA processing and protein homeostasis. Importantly, the data show that the

2,565 citations

Cited by

Journal Article•DOI•

Full-length transcriptome assembly from RNA-Seq data without a reference genome.

Manfred Grabherr¹, Brian J. Haas¹, Moran Yassour¹,², Joshua Z. Levin¹, Dawn Thompson¹, Ido Amit¹, Xian Adiconis¹, Lin Fan¹, Raktima Raychowdhury¹, Qiandong Zeng¹, Zehua Chen¹, Evan Mauceli¹, Nir Hacohen¹, Andreas Gnirke¹, Nicholas Rhind³, Federica Di Palma¹, Bruce W. Birren¹, Chad Nusbaum¹, Kerstin Lindblad-Toh¹,⁴, Nir Friedman², Aviv Regev¹

Massachusetts Institute of Technology¹, Hebrew University of Jerusalem², University of Massachusetts Medical School³, Science for Life Laboratory⁴

01 Jul 2011-Nature Biotechnology

TL;DR: The authors present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available, providing a unified solution for transcriptome reconstruction in any sample.

Abstract: Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.

15,665 citations

Journal Article•DOI•

RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome

Bo Li¹, Colin N. Dewey¹

University of Wisconsin-Madison¹

04 Aug 2011-BMC Bioinformatics

TL;DR: It is shown that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads, and estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene.

Abstract: RNA-Seq is revolutionizing the way transcript abundances are measured. A key challenge in transcript quantification from RNA-Seq data is the handling of reads that map to multiple genes or isoforms. This issue is particularly important for quantification with de novo transcriptome assemblies in the absence of sequenced genomes, as it is difficult to determine which transcripts are isoforms of the same gene. A second significant issue is the design of RNA-Seq experiments, in terms of the number of reads, read length, and whether reads come from one or both ends of cDNA fragments. We present RSEM, a user-friendly software package for quantifying gene and isoform abundances from single-end or paired-end RNA-Seq data. RSEM outputs abundance estimates, 95% credibility intervals, and visualization files and can also simulate RNA-Seq data. In contrast to other existing tools, the software does not require a reference genome. Thus, in combination with a de novo transcriptome assembler, RSEM enables accurate transcript quantification for species without sequenced genomes. On simulated and real data sets, RSEM has superior or comparable performance to quantification methods that rely on a reference genome. Taking advantage of RSEM's ability to effectively use ambiguously-mapping reads, we show that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads. On the other hand, estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene. RSEM is an accurate and user-friendly software tool for quantifying transcript abundances from RNA-Seq data. As it does not rely on the existence of a reference genome, it is particularly useful for quantification with de novo transcriptome assemblies.
In addition, RSEM has enabled valuable guidance for cost-efficient design of quantification experiments with RNA-Seq, which is currently relatively expensive.

14,524 citations

Journal Article•DOI•

The cBio Cancer Genomics Portal: An Open Platform for Exploring Multidimensional Cancer Genomics Data

Ethan Cerami¹, Jianjiong Gao, Ugur Dogrusoz, Benjamin Gross, Selcuk Onur Sumer, Bulent Arman Aksoy, Anders Jacobsen, Caitlin Byrne, Michael Heuer, Erik G. Larsson, Yevgeniy Antipin, Boris Reva, Arthur P. Goldberg, Chris Sander, Nikolaus Schultz

Memorial Sloan Kettering Cancer Center¹

01 May 2012-Cancer Discovery

TL;DR: The cBio Cancer Genomics Portal significantly lowers the barriers between complex genomic data and cancer researchers who want rapid, intuitive, and high-quality access to molecular profiles and clinical attributes from large-scale cancer genomics projects and empowers researchers to translate these rich data sets into biologic insights and clinical applications.

Abstract: The cBio Cancer Genomics Portal (http://cbioportal.org) is an open-access resource for interactive exploration of multidimensional cancer genomics data sets, currently providing access to data from more than 5,000 tumor samples from 20 cancer studies. The cBio Cancer Genomics Portal significantly lowers the barriers between complex genomic data and cancer researchers who want rapid, intuitive, and high-quality access to molecular profiles and clinical attributes from large-scale cancer genomics projects and empowers researchers to translate these rich data sets into biologic insights and clinical applications.

11,912 citations

Journal Article•DOI•

Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal

Jianjiong Gao¹, Bulent Arman Aksoy¹, Ugur Dogrusoz², Gideon Dresdner¹, Benjamin Gross¹, S. Onur Sumer¹, Yichao Sun¹, Anders Jacobsen¹, Rileen Sinha¹, Erik Larsson³, Ethan Cerami¹, Chris Sander¹, Nikolaus Schultz¹

Memorial Sloan Kettering Cancer Center¹, Bilkent University², University of Gothenburg³

02 Apr 2013-Science Signaling

TL;DR: A practical guide to the analysis and visualization features of the cBioPortal for Cancer Genomics, which makes complex cancer genomics profiles accessible to researchers and clinicians without requiring bioinformatics expertise, thus facilitating biological discoveries.

Abstract: The cBioPortal for Cancer Genomics (http://cbioportal.org) provides a Web resource for exploring, visualizing, and analyzing multidimensional cancer genomics data. The portal reduces molecular profiling data from cancer tissues and cell lines into readily understandable genetic, epigenetic, gene expression, and proteomic events. The query interface combined with customized data storage enables researchers to interactively explore genetic alterations across samples, genes, and pathways and, when available in the underlying data, to link these to clinical outcomes. The portal provides graphical summaries of gene-level data from multiple platforms, network visualization and analysis, survival analysis, patient-centric queries, and software programmatic access. The intuitive Web interface of the portal makes complex cancer genomics profiles accessible to researchers and clinicians without requiring bioinformatics expertise, thus facilitating biological discoveries. Here, we provide a practical guide to the analysis and visualization features of the cBioPortal for Cancer Genomics.

10,947 citations

Journal Article•DOI•

Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks

Cole Trapnell¹, Adam Roberts², Loyal A. Goff¹,³,⁴, Geo Pertea⁵, Daehwan Kim⁶,⁷, David R. Kelley¹,³, Harold Pimentel², Steven L. Salzberg⁵, John L. Rinn¹,³, Lior Pachter²

Broad Institute¹, University of California, Berkeley², Harvard University³, Massachusetts Institute of Technology⁴, Johns Hopkins University⁵, University of Maryland, College Park⁶, Johns Hopkins University School of Medicine⁷

01 Mar 2012-Nature Protocols

TL;DR: This protocol begins with raw sequencing reads and produces a transcriptome assembly, lists of differentially expressed and regulated genes and transcripts, and publication-quality visualizations of analysis results, which takes less than 1 d of computer time for typical experiments and ∼1 h of hands-on time.

Abstract: Recent advances in high-throughput cDNA sequencing (RNA-seq) can reveal new genes and splice variants and quantify expression genome-wide in a single assay. The volume and complexity of data from RNA-seq experiments necessitate scalable, fast and mathematically principled analysis software. TopHat and Cufflinks are free, open-source software tools for gene discovery and comprehensive expression analysis of high-throughput mRNA sequencing (RNA-seq) data. Together, they allow biologists to identify new genes and new splice variants of known ones, as well as compare gene and transcript expression under two or more conditions. This protocol describes in detail how to use TopHat and Cufflinks to perform such analyses. It also covers several accessory tools and utilities that aid in managing data, including CummeRbund, a tool for visualizing RNA-seq analysis results. Although the procedure assumes basic informatics skills, these tools assume little to no background with RNA-seq analysis and are meant for novices and experts alike. The protocol begins with raw sequencing reads and produces a transcriptome assembly, lists of differentially expressed and regulated genes and transcripts, and publication-quality visualizations of analysis results. The protocol's execution time depends on the volume of transcriptome sequencing data and available computing resources but takes less than 1 d of computer time for typical experiments and ∼1 h of hands-on time.

10,913 citations
