Journal ArticleDOI

Genomic analyses identify molecular subtypes of pancreatic cancer

Peter Bailey1, David K. Chang2, Katia Nones1,3, Amber L. Johns4, Ann-Marie Patch1,3, Marie-Claude Gingras5, David Miller1,4, Angelika N. Christ1, Timothy J. C. Bruxner1, Michael C.J. Quinn1,3, Craig Nourse1,2, Murtaugh Lc6, Ivon Harliwong1, Senel Idrisoglu1, Suzanne Manning1, Ehsan Nourbakhsh1, Shivangi Wani1,3, J. Lynn Fink1, Oliver Holmes1,3, Chin4, Matthew J. Anderson1, Stephen H. Kazakoff1,3, Conrad Leonard1,3, Felicity Newell1, Nicola Waddell1, Scott Wood1,3, Qinying Xu1,3, Peter J. Wilson1, Nicole Cloonan1,3, Karin S. Kassahn1,7,8, Darrin Taylor1, Kelly Quek1, Alan J. Robertson1, Lorena Pantano9, Laura Mincarelli2, Luis Navarro Sanchez2, Lisa Evers2, Jianmin Wu4, Mark Pinese4, Mark J. Cowley4, Jones2,4, Emily K. Colvin4, Adnan Nagrial4, Emily S. Humphrey4, Lorraine A. Chantrill4,10, Amanda Mawson4, Jeremy L. Humphris4, Angela Chou4,11, Marina Pajic4,12, Christopher J. Scarlett4,13, Andreia V. Pinho4, Marc Giry-Laterriere4, Ilse Rooman4, Jaswinder S. Samra14, James G. Kench4,15,16, Jessica A. Lovell4, Neil D. Merrett12, Christopher W. Toon4, Krishna Epari17, Nam Q. Nguyen18, Andrew Barbour19, Nikolajs Zeps20, Kim Moran-Jones2, Nigel B. Jamieson2, Janet Graham2,21, Fraser Duthie22, Karin A. Oien4,22, Hair J22, Robert Grützmann23, Anirban Maitra24, Christine A. Iacobuzio-Donahue25, Christopher L. Wolfgang26, Richard A. Morgan26, Rita T. Lawlor, Corbo, Claudio Bassi, Borislav Rusev, Paola Capelli27, Roberto Salvia, Giampaolo Tortora, Debabrata Mukhopadhyay28, Gloria M. Petersen28, Munzy Dm5, William E. Fisher5, Saadia A. Karim, Eshleman26, Ralph H. Hruban26, Christian Pilarsky23, Jennifer P. Morton, Owen J. Sansom2, Aldo Scarpa27, Elizabeth A. Musgrove2, Ulla-Maja Bailey2, Oliver Hofmann2,9, R. L. Sutherland4, David A. Wheeler5, Anthony J. Gill4,15, Richard A. Gibbs5, John V. Pearson1,3, Andrew V. Biankin, Sean M. Grimmond1,2,29
03 Mar 2016-Nature (Nature Publishing Group)-Vol. 531, Iss: 7592, pp 47-52
TL;DR: Detailed genomic analysis of 456 pancreatic ductal adenocarcinomas identified 32 recurrently mutated genes that aggregate into 10 pathways: KRAS, TGF-β, WNT, NOTCH, ROBO/SLIT signalling, G1/S transition, SWI-SNF, chromatin modification, DNA repair and RNA processing.
Abstract: Integrated genomic analysis of 456 pancreatic ductal adenocarcinomas identified 32 recurrently mutated genes that aggregate into 10 pathways: KRAS, TGF-β, WNT, NOTCH, ROBO/SLIT signalling, G1/S transition, SWI-SNF, chromatin modification, DNA repair and RNA processing. Expression analysis defined 4 subtypes: (1) squamous; (2) pancreatic progenitor; (3) immunogenic; and (4) aberrantly differentiated endocrine exocrine (ADEX) that correlate with histopathological characteristics. Squamous tumours are enriched for TP53 and KDM6A mutations, upregulation of the TP63∆N transcriptional network, hypermethylation of pancreatic endodermal cell-fate determining genes and have a poor prognosis. Pancreatic progenitor tumours preferentially express genes involved in early pancreatic development (FOXA2/3, PDX1 and MNX1). ADEX tumours displayed upregulation of genes that regulate networks involved in KRAS activation, exocrine (NR5A2 and RBPJL), and endocrine differentiation (NEUROD1 and NKX2-2). Immunogenic tumours contained upregulated immune networks including pathways involved in acquired immune suppression. These data infer differences in the molecular evolution of pancreatic cancer subtypes and identify opportunities for therapeutic development.
Citations
Journal ArticleDOI
17 Apr 2018-Immunity
TL;DR: An extensive immunogenomic analysis of more than 10,000 tumors comprising 33 diverse cancer types by utilizing data compiled by TCGA identifies six immune subtypes that encompass multiple cancer types and are hypothesized to define immune response patterns impacting prognosis.

3,246 citations

Journal ArticleDOI
TL;DR: An integrated multi-platform analysis of 150 pancreatic ductal adenocarcinoma specimens reveals a complex molecular landscape of PDAC and provides a roadmap for precision medicine.

1,259 citations


Cites background or methods or result from "Genomic analyses identify molecular..."

  • ...The percentage of PDAC samples with a mutation detected by automated calling is noted at the left....

    [...]

  • ...Previous analyses of gene expression have identified mRNA subtypes of pancreatic cancer (Bailey et al., 2016; Collisson et al., 2011; Moffitt et al., 2015)....

    [...]

  • ...Gene expression studies have identified subtypes of PDAC with prognostic and biological relevance (Bailey et al., 2016; Collisson et al., 2011; Moffitt et al., 2015)....

    [...]

  • ...Along the right side of the heatmap are red and blue indicators of whether or not a gene was significantly over- or under-expressed in one of the Bailey et al. subtypes (supplemental table, Bailey et al., 2016)....

    [...]

  • ...Our results suggest a potentially important relationship between non-coding RNAs and differentiation genes, including GATA6, that have previously been associated with classical/progenitor subtype tumors (Bailey et al., 2016; Collisson et al., 2011; Moffitt et al., 2015), as well as potentially new relationships between non-coding RNA and the more aggressive basal-like/squamous subtype tumors (Bailey et al....

    [...]

Journal ArticleDOI
TL;DR: This review aims to outline the most up-to-date knowledge of pancreatic adenocarcinoma risk, diagnostics, treatment and outcomes, while identifying gaps that aim to stimulate further research in this understudied malignancy.
Abstract: This review aims to outline the most up-to-date knowledge of pancreatic adenocarcinoma risk, diagnostics, treatment and outcomes, while identifying gaps that aim to stimulate further research in this understudied malignancy. Pancreatic adenocarcinoma is a lethal condition with a rising incidence, predicted to become the second leading cause of cancer death in some regions. It often presents at an advanced stage, which contributes to poor five-year survival rates of 2%-9%, ranking firmly last amongst all cancer sites in terms of prognostic outcomes for patients. Better understanding of the risk factors and symptoms associated with this disease is essential to inform both health professionals and the general population of potential preventive and/or early detection measures. The identification of high-risk patients who could benefit from screening to detect pre-malignant conditions such as pancreatic intraepithelial neoplasia, intraductal papillary mucinous neoplasms and mucinous cystic neoplasms is urgently required, however an acceptable screening test has yet to be identified. The management of pancreatic adenocarcinoma is evolving, with the introduction of new surgical techniques and medical therapies such as laparoscopic techniques and neo-adjuvant chemoradiotherapy, however this has only led to modest improvements in outcomes. The identification of novel biomarkers is desirable to move towards a precision medicine era, where pancreatic cancer therapy can be tailored to the individual patient, while unnecessary treatments that have negative consequences on quality of life could be prevented for others. Research efforts must also focus on the development of new agents and delivery systems. Overall, considerable progress is required to reduce the burden associated with pancreatic cancer. 
Recent, renewed efforts to fund large consortia and research into pancreatic adenocarcinoma are welcomed, but further streams will be necessary to facilitate the momentum needed to bring breakthroughs seen for other cancer sites.

951 citations

Journal ArticleDOI
TL;DR: A new population of CAFs that express MHC class II and CD74, but not classical co-stimulatory molecules, is described; these cells activate CD4+ T cells in an antigen-specific fashion in a model system, confirming their putative immune-modulatory capacity.
Abstract: Cancer-associated fibroblasts (CAF) are major players in the progression and drug resistance of pancreatic ductal adenocarcinoma (PDAC). CAFs constitute a diverse cell population consisting of several recently described subtypes, although the extent of CAF heterogeneity has remained undefined. Here we use single-cell RNA sequencing to thoroughly characterize the neoplastic and tumor microenvironment content of human and mouse PDAC tumors. We corroborate the presence of myofibroblastic CAFs and inflammatory CAFs and define their unique gene signatures in vivo. Moreover, we describe a new population of CAFs that express MHC class II and CD74, but do not express classic costimulatory molecules. We term this cell population "antigen-presenting CAFs" and find that they activate CD4+ T cells in an antigen-specific fashion in a model system, confirming their putative immune-modulatory capacity. Our cross-species analysis paves the way for investigating distinct functions of CAF subtypes in PDAC immunity and progression. SIGNIFICANCE: Appreciating the full spectrum of fibroblast heterogeneity in pancreatic ductal adenocarcinoma is crucial to developing therapies that specifically target tumor-promoting CAFs. This work identifies MHC class II-expressing CAFs with a capacity to present antigens to CD4+ T cells, and potentially to modulate the immune response in pancreatic tumors. See related commentary by Belle and DeNardo, p. 1001. This article is highlighted in the In This Issue feature, p. 983.

900 citations

Journal ArticleDOI
23 Nov 2017-Nature
TL;DR: In this paper, the authors used genetic, immunohistochemical and transcriptional immunoprofiling, computational biophysics, and functional assays to identify T-cell antigens in long-term survivors of pancreatic cancer.
Abstract: Pancreatic ductal adenocarcinoma is a lethal cancer with fewer than 7% of patients surviving past 5 years. T-cell immunity has been linked to the exceptional outcome of the few long-term survivors, yet the relevant antigens remain unknown. Here we use genetic, immunohistochemical and transcriptional immunoprofiling, computational biophysics, and functional assays to identify T-cell antigens in long-term survivors of pancreatic cancer. Using whole-exome sequencing and in silico neoantigen prediction, we found that tumours with both the highest neoantigen number and the most abundant CD8+ T-cell infiltrates, but neither alone, stratified patients with the longest survival. Investigating the specific neoantigen qualities promoting T-cell activation in long-term survivors, we discovered that these individuals were enriched in neoantigen qualities defined by a fitness model, and neoantigens in the tumour antigen MUC16 (also known as CA125). A neoantigen quality fitness model conferring greater immunogenicity to neoantigens with differential presentation and homology to infectious disease-derived peptides identified long-term survivors in two independent datasets, whereas a neoantigen quantity model ascribing greater immunogenicity to increasing neoantigen number alone did not. We detected intratumoural and lasting circulating T-cell reactivity to both high-quality and MUC16 neoantigens in long-term survivors of pancreatic cancer, including clones with specificity to both high-quality neoantigens and predicted cross-reactive microbial epitopes, consistent with neoantigen molecular mimicry. Notably, we observed selective loss of high-quality and MUC16 neoantigenic clones on metastatic progression, suggesting neoantigen immunoediting. Our results identify neoantigens with unique qualities as T-cell targets in pancreatic ductal adenocarcinoma. 
More broadly, we identify neoantigen quality as a biomarker for immunogenic tumours that may guide the application of immunotherapies.

774 citations

References
Journal ArticleDOI
TL;DR: This work presents DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates, which enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.
Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html .

47,038 citations

Journal ArticleDOI
TL;DR: The Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure outperforms other aligners by a factor of >50 in mapping speed.
Abstract: Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. Results: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. Availability and implementation: STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.

30,684 citations

Journal ArticleDOI
TL;DR: edgeR is a Bioconductor software package for examining differential expression of replicated count data. It uses an overdispersed Poisson model to account for both biological and technical variability, and empirical Bayes methods to moderate the degree of overdispersion across transcripts, improving the reliability of inference.
Abstract: Summary: It is expected that emerging digital gene expression (DGE) technologies will overtake microarray technologies in the near future for many functional genomics applications. One of the fundamental data analysis tasks, especially for gene expression studies, involves determining whether there is evidence that counts for a transcript or exon are significantly different across experimental conditions. edgeR is a Bioconductor software package for examining differential expression of replicated count data. An overdispersed Poisson model is used to account for both biological and technical variability. Empirical Bayes methods are used to moderate the degree of overdispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, provided at least one phenotype or experimental condition is replicated. The software may have other applications beyond sequencing data, such as proteome peptide count data. Availability: The package is freely available under the LGPL licence from the Bioconductor web site (http://bioconductor.org).

29,413 citations
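
The "overdispersed Poisson model" named in the edgeR abstract above is commonly realised as a gamma-Poisson (negative binomial) model, in which the variance of a count grows as the mean plus a dispersion term times the squared mean. The sketch below is only an illustration of that mean-variance relationship using NumPy; the function names `nb_variance` and `simulate_counts` are hypothetical and are not part of edgeR, and no empirical Bayes moderation is attempted.

```python
import numpy as np


def nb_variance(mu, dispersion):
    """Variance of a gamma-Poisson count with mean `mu`.

    Assumed parameterisation: Var = mu + dispersion * mu**2, so a
    dispersion of 0 reduces to the plain Poisson case (Var = mu).
    """
    mu = np.asarray(mu, dtype=float)
    return mu + dispersion * mu ** 2


def simulate_counts(mu, dispersion, size, seed=0):
    """Draw overdispersed counts by mixing Poisson rates over a gamma law.

    Requires dispersion > 0. The gamma rates have mean `mu` and variance
    dispersion * mu**2, which yields the variance given by `nb_variance`.
    """
    rng = np.random.default_rng(seed)
    shape = 1.0 / dispersion
    rates = rng.gamma(shape, scale=mu * dispersion, size=size)
    return rng.poisson(rates)


if __name__ == "__main__":
    counts = simulate_counts(mu=50.0, dispersion=0.2, size=200_000)
    print("empirical variance:", counts.var())
    print("model variance:    ", nb_variance(50.0, 0.2))
```

Comparing the two printed values shows how far the spread of replicated counts can exceed what a plain Poisson model (variance equal to the mean) would predict.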

Journal ArticleDOI
TL;DR: It is shown that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads, and estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene.
Abstract: RNA-Seq is revolutionizing the way transcript abundances are measured. A key challenge in transcript quantification from RNA-Seq data is the handling of reads that map to multiple genes or isoforms. This issue is particularly important for quantification with de novo transcriptome assemblies in the absence of sequenced genomes, as it is difficult to determine which transcripts are isoforms of the same gene. A second significant issue is the design of RNA-Seq experiments, in terms of the number of reads, read length, and whether reads come from one or both ends of cDNA fragments. We present RSEM, an user-friendly software package for quantifying gene and isoform abundances from single-end or paired-end RNA-Seq data. RSEM outputs abundance estimates, 95% credibility intervals, and visualization files and can also simulate RNA-Seq data. In contrast to other existing tools, the software does not require a reference genome. Thus, in combination with a de novo transcriptome assembler, RSEM enables accurate transcript quantification for species without sequenced genomes. On simulated and real data sets, RSEM has superior or comparable performance to quantification methods that rely on a reference genome. Taking advantage of RSEM's ability to effectively use ambiguously-mapping reads, we show that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads. On the other hand, estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene. RSEM is an accurate and user-friendly software tool for quantifying transcript abundances from RNA-Seq data. As it does not rely on the existence of a reference genome, it is particularly useful for quantification with de novo transcriptome assemblies. 
In addition, RSEM has enabled valuable guidance for cost-efficient design of quantification experiments with RNA-Seq, which is currently relatively expensive.

14,524 citations

Journal ArticleDOI
TL;DR: The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis that includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software.
Abstract: Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. The R package along with its source code and additional material are freely available at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/Rpackages/WGCNA

14,243 citations
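
The network-construction step mentioned in the WGCNA abstract above can be illustrated with a small sketch. It assumes the unsigned soft-thresholding form, in which the connection strength between two genes is the absolute Pearson correlation raised to a power; the function name `soft_threshold_adjacency` and the choice `beta=6` are illustrative assumptions, and this Python code is not the WGCNA R package itself.

```python
import numpy as np


def soft_threshold_adjacency(expr, beta=6):
    """Weighted adjacency matrix from an expression matrix.

    expr : array of shape (n_samples, n_genes), one column per gene.
    beta : soft-thresholding power (illustrative value, an assumption).

    Returns an (n_genes, n_genes) matrix with entries |cor(i, j)| ** beta,
    so weak correlations are pushed towards 0 while strong ones stay near 1.
    """
    expr = np.asarray(expr, dtype=float)
    cor = np.corrcoef(expr, rowvar=False)  # gene-by-gene Pearson correlation
    return np.abs(cor) ** beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    expr = rng.normal(size=(30, 4))
    expr[:, 1] = 2.0 * expr[:, 0] + 1.0  # gene 1 is perfectly correlated with gene 0
    adj = soft_threshold_adjacency(expr)
    print(np.round(adj, 3))
```

Raising the correlation to a power rather than applying a hard cutoff keeps the network weighted, which is the design choice that gives "weighted correlation network analysis" its name.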
