Journal Article

The Cancer Genome Atlas Pan-Cancer analysis project

01 Oct 2013-Nature Genetics (Nature Publishing Group)-Vol. 45, Iss: 10, pp 1113-1120
TL;DR: The Pan-Cancer initiative compares the first 12 tumor types profiled by TCGA, offering a major opportunity to develop an integrated picture of commonalities, differences and emergent themes across tumor lineages.
Abstract: The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences and emergent themes across tumor lineages. The Pan-Cancer initiative compares the first 12 tumor types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumor types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile.

Citations
Journal Article
TL;DR: Salmon is the first transcriptome-wide quantifier to correct for fragment GC-content bias, which substantially improves the accuracy of abundance estimates and the sensitivity of subsequent differential expression analysis.
Abstract: We introduce Salmon, a lightweight method for quantifying transcript abundance from RNA-seq reads. Salmon combines a new dual-phase parallel inference algorithm and feature-rich bias models with an ultra-fast read mapping procedure. It is the first transcriptome-wide quantifier to correct for fragment GC-content bias, which, as we demonstrate here, substantially improves the accuracy of abundance estimates and the sensitivity of subsequent differential expression analysis.

6,095 citations

Journal Article
Zefang Tang, Chenwei Li, Boxi Kang, Ge Gao, Cheng Li, Zemin Zhang
TL;DR: GEPIA (Gene Expression Profiling Interactive Analysis) fills in the gap between cancer genomics big data and the delivery of integrated information to end users, thus helping unleash the value of the current data resources.
Abstract: Tremendous amounts of RNA sequencing data have been produced by large consortium projects such as TCGA and GTEx, creating new opportunities for data mining and deeper understanding of gene functions. While certain existing web servers are valuable and widely used, many expression analysis functions needed by experimental biologists are still not adequately addressed by these tools. We introduce GEPIA (Gene Expression Profiling Interactive Analysis), a web-based tool to deliver fast and customizable functionalities based on TCGA and GTEx data. GEPIA provides key interactive and customizable functions including differential expression analysis, profiling plotting, correlation analysis, patient survival analysis, similar gene detection and dimensionality reduction analysis. The comprehensive expression analyses with simple clicking through GEPIA greatly facilitate data mining in wide research areas, scientific discussion and the therapeutic discovery process. GEPIA fills in the gap between cancer genomics big data and the delivery of integrated information to end users, thus helping unleash the value of the current data resources. GEPIA is available at http://gepia.cancer-pku.cn/.

5,980 citations


Cites background from "The cancer genome atlas pan-cancer ..."

  • ...To solve the imbalance between the tumor and normal data which can cause inefficiency in various differential analyses, we download the TCGA and GTEx gene expression data that are re-computed from raw RNA-Seq data by the UCSC Xena project based on a uniform pipeline (Figure 1)....
  • ...We introduce GEPIA (Gene Expression Profiling Interactive Analysis), a web-based tool to deliver fast and customizable functionalities based on TCGA and GTEx data....
  • ...In recent years, the Cancer Genome Atlas (TCGA) (3) and Genotype-Tissue Expression (GTEx) (4,5) projects produced RNA-Seq data for tens of thousands of cancer and non-cancer samples, providing an unprecedented opportunity for many related fields including cancer biology....
  • ...By using GEPIA, experimental biologists can easily explore the large TCGA and GTEx datasets, ask specific questions and test their hypotheses....
  • ...GEPIA is an interactive web application for gene expression analysis based on 9736 tumors and 8587 normal samples from the TCGA and the GTEx databases, using the output of a standard processing pipeline for RNA sequencing data....

Journal Article
TL;DR: This review discusses the current status of the TCGA Research Network's structure, purpose, and achievements, including its goal of providing publicly available datasets to help improve diagnostic methods and treatment standards, and ultimately to prevent cancer.
Abstract: The Cancer Genome Atlas (TCGA) is a publicly funded project that aims to catalogue and discover major cancer-causing genomic alterations to create a comprehensive "atlas" of cancer genomic profiles. So far, TCGA researchers have analysed large cohorts of over 30 human tumours through large-scale genome sequencing and integrated multi-dimensional analyses. Studies of individual cancer types, as well as comprehensive pan-cancer analyses, have extended current knowledge of tumorigenesis. A major goal of the project was to provide publicly available datasets to help improve diagnostic methods, treatment standards, and finally to prevent cancer. This review discusses the current status of TCGA Research Network structure, purpose, and achievements.

2,530 citations

Journal Article
TL;DR: TMB measured by comprehensive genomic profiling strongly reflects TMB measured by whole exome sequencing, although modeling shows that the variance in measurement increases significantly below 0.5 Mb; many disease types have a substantial portion of patients with high TMB who might benefit from immunotherapy.
Abstract: High tumor mutational burden (TMB) is an emerging biomarker of sensitivity to immune checkpoint inhibitors and has been shown to be more significantly associated with response to PD-1 and PD-L1 blockade immunotherapy than PD-1 or PD-L1 expression, as measured by immunohistochemistry (IHC). The distribution of TMB and the subset of patients with high TMB has not been well characterized in the majority of cancer types. In this study, we compare TMB measured by a targeted comprehensive genomic profiling (CGP) assay to TMB measured by exome sequencing and simulate the expected variance in TMB when sequencing less than the whole exome. We then describe the distribution of TMB across a diverse cohort of 100,000 cancer cases and test for association between somatic alterations and TMB in over 100 tumor types. We demonstrate that measurements of TMB from comprehensive genomic profiling are strongly reflective of measurements from whole exome sequencing and model that below 0.5 Mb the variance in measurement increases significantly. We find that a subset of patients exhibits high TMB across almost all types of cancer, including many rare tumor types, and characterize the relationship between high TMB and microsatellite instability status. We find that TMB increases significantly with age, showing a 2.4-fold difference between age 10 and age 90 years. Finally, we investigate the molecular basis of TMB and identify genes and mutations associated with TMB level. We identify a cluster of somatic mutations in the promoter of the gene PMS2, which occur in 10% of skin cancers and are highly associated with increased TMB. These results show that a CGP assay targeting ~1.1 Mb of coding genome can accurately assess TMB compared with sequencing the whole exome. Using this method, we find that many disease types have a substantial portion of patients with high TMB who might benefit from immunotherapy. Finally, we identify novel, recurrent promoter mutations in PMS2, which may be another example of regulatory mutations contributing to tumorigenesis.

2,304 citations


Cites background or methods from "The cancer genome atlas pan-cancer ..."

  • ...across cancer types and found a wide distribution of TMB across ~20–30 cancer types [28, 51, 54]....
  • ...We also analyzed whole-exome sequencing data from 35 studies, published as part of TCGA, examining a total of 8917 cancer specimens [54]....
  • ...TCGA data were obtained from public repositories [54]....

References
Journal Article
04 Mar 2011-Cell
TL;DR: Recognition of the widespread applicability of these concepts will increasingly affect the development of new means to treat human cancer.

51,099 citations

Journal Article
07 Jan 2000-Cell
TL;DR: This work has been supported by the Department of the Army and the National Institutes of Health, and the author acknowledges the support and encouragement of the National Cancer Institute.

28,811 citations

Journal Article
17 Aug 2000-Nature
TL;DR: Variation in gene expression patterns in a set of 65 surgical specimens of human breast tumours from 42 different individuals was characterized using complementary DNA microarrays representing 8,102 human genes, providing a distinctive molecular portrait of each tumour.
Abstract: Human breast tumours are diverse in their natural history and in their responsiveness to treatments. Variation in transcriptional programs accounts for much of the biological diversity of human cells and tumours. In each cell, signal transduction and regulatory systems transduce information from the cell's identity to its environmental status, thereby controlling the level of expression of every gene in the genome. Here we have characterized variation in gene expression patterns in a set of 65 surgical specimens of human breast tumours from 42 different individuals, using complementary DNA microarrays representing 8,102 human genes. These patterns provided a distinctive molecular portrait of each tumour. Twenty of the tumours were sampled twice, before and after a 16-week course of doxorubicin chemotherapy, and two tumours were paired with a lymph node metastasis from the same patient. Gene expression patterns in two tumour samples from the same individual were almost always more similar to each other than either was to any other sample. Sets of co-expressed genes were identified for which variation in messenger RNA levels could be related to specific features of physiological variation. The tumours could be classified into subtypes distinguished by pervasive differences in their gene expression patterns.

14,768 citations

Journal Article
27 Jun 2002-Nature
TL;DR: BRAF somatic missense mutations are reported in 66% of malignant melanomas and at lower frequency in a wide range of human cancers, with a single substitution (V599E) accounting for 80%.
Abstract: Cancers arise owing to the accumulation of mutations in critical genes that alter normal programmes of cell proliferation, differentiation and death. As the first stage of a systematic genome-wide screen for these genes, we have prioritized for analysis signalling pathways in which at least one gene is mutated in human cancer. The RAS-RAF-MEK-ERK-MAP kinase pathway mediates cellular responses to growth signals. RAS is mutated to an oncogenic form in about 15% of human cancer. The three RAF genes code for cytoplasmic serine/threonine kinases that are regulated by binding RAS. Here we report BRAF somatic missense mutations in 66% of malignant melanomas and at lower frequency in a wide range of human cancers. All mutations are within the kinase domain, with a single substitution (V599E) accounting for 80%. Mutated BRAF proteins have elevated kinase activity and are transforming in NIH3T3 cells. Furthermore, RAS function is not required for the growth of cancer cell lines with the V599E mutation. As BRAF is a serine/threonine kinase that is commonly activated by somatic point mutation in human cancer, it may provide new therapeutic opportunities in malignant melanoma.

9,785 citations

Journal Article
04 Oct 2012-Nature
TL;DR: The ability to integrate information across platforms provided key insights into previously defined gene expression subtypes and demonstrated the existence of four main breast cancer classes when combining data from five platforms, each of which shows significant molecular heterogeneity.
Abstract: We analysed primary breast cancers by genomic DNA copy number arrays, DNA methylation, exome sequencing, messenger RNA arrays, microRNA sequencing and reverse-phase protein arrays. Our ability to integrate information across platforms provided key insights into previously defined gene expression subtypes and demonstrated the existence of four main breast cancer classes when combining data from five platforms, each of which shows significant molecular heterogeneity. Somatic mutations in only three genes (TP53, PIK3CA and GATA3) occurred at >10% incidence across all breast cancers; however, there were numerous subtype-associated and novel gene mutations including the enrichment of specific mutations in GATA3, PIK3CA and MAP3K1 with the luminal A subtype. We identified two novel protein-expression-defined subgroups, possibly produced by stromal/microenvironmental elements, and integrated analyses identified specific signalling pathways dominant in each molecular subtype including a HER2/phosphorylated HER2/EGFR/phosphorylated EGFR signature within the HER2-enriched expression subtype. Comparison of basal-like breast tumours with high-grade serous ovarian tumours showed many molecular commonalities, indicating a related aetiology and similar therapeutic opportunities. The biological finding of the four main breast cancer subtypes caused by different subsets of genetic and epigenetic abnormalities raises the hypothesis that much of the clinically observable plasticity and heterogeneity occurs within, and not across, these major biological subtypes of breast cancer.

9,355 citations