Author

Sohrab P. Shah

Bio: Sohrab P. Shah is an academic researcher at Memorial Sloan Kettering Cancer Center. The author has contributed to research on topics including cancer and diffuse large B-cell lymphoma. The author has an h-index of 68 and has co-authored 179 publications receiving 25,390 citations. Previous affiliations of Sohrab P. Shah include the University of British Columbia and the BC Cancer Agency.


Papers
Journal Article
21 Jun 2012-Nature
TL;DR: The results provide a novel molecular stratification of the breast cancer population, derived from the impact of somatic CNAs on the transcriptome, and identify novel subgroups with distinct clinical outcomes, which reproduced in the validation cohort.
Abstract: The elucidation of breast cancer subgroups and their molecular drivers requires integrated views of the genome and transcriptome from representative numbers of patients. We present an integrated analysis of copy number and gene expression in a discovery and validation set of 997 and 995 primary breast tumours, respectively, with long-term clinical follow-up. Inherited variants (copy number variants and single nucleotide polymorphisms) and acquired somatic copy number aberrations (CNAs) were associated with expression in 40% of genes, with the landscape dominated by cis- and trans-acting CNAs. By delineating expression outlier genes driven in cis by CNAs, we identified putative cancer genes, including deletions in PPP2R2A, MTAP and MAP2K4. Unsupervised analysis of paired DNA–RNA profiles revealed novel subgroups with distinct clinical outcomes, which reproduced in the validation cohort. These include a high-risk, oestrogen-receptor-positive 11q13/14 cis-acting subgroup and a favourable prognosis subgroup devoid of CNAs. Trans-acting aberration hotspots were found to modulate subgroup-specific gene networks, including a TCR deletion-mediated adaptive immune response in the ‘CNA-devoid’ subgroup and a basal-specific chromosome 5 deletion-associated mitotic network. Our results provide a novel molecular stratification of the breast cancer population, derived from the impact of somatic CNAs on the transcriptome.

4,722 citations

Journal Article
21 Jun 2012-Nature
TL;DR: It is shown that understanding the biology and therapeutic responses of patients with TNBC will require the determination of individual tumour clonal genotypes, and for the first time in an epithelial tumour subtype, the relative abundance of clonal frequencies among cases representative of the population is determined.
Abstract: Primary triple-negative breast cancers (TNBCs), a tumour type defined by lack of oestrogen receptor, progesterone receptor and ERBB2 gene amplification, represent approximately 16% of all breast cancers. Here we show in 104 TNBC cases that at the time of diagnosis these cancers exhibit a wide and continuous spectrum of genomic evolution, with some having only a handful of coding somatic aberrations in a few pathways, whereas others contain hundreds of coding somatic mutations. High-throughput RNA sequencing (RNA-seq) revealed that only approximately 36% of mutations are expressed. Using deep re-sequencing measurements of allelic abundance for 2,414 somatic mutations, we determine for the first time, to our knowledge, in an epithelial tumour subtype, the relative abundance of clonal frequencies among cases representative of the population. We show that TNBCs vary widely in their clonal frequencies at the time of diagnosis, with the basal subtype of TNBC showing more variation than non-basal TNBC. Although p53 (also known as TP53), PIK3CA and PTEN somatic mutations seem to be clonally dominant compared to other genes, in some tumours their clonal frequencies are incompatible with founder status. Mutations in cytoskeletal, cell shape and motility proteins occurred at lower clonal frequencies, suggesting that they occurred later during tumour progression. Taken together, our results show that understanding the biology and therapeutic responses of patients with TNBC will require the determination of individual tumour clonal genotypes.

1,821 citations

Journal Article
TL;DR: These data implicate ARID1A as a tumor-suppressor gene frequently disrupted in ovarian clear-cell and endometrioid carcinomas.
Abstract: Background: Ovarian clear-cell and endometrioid carcinomas may arise from endometriosis, but the molecular events involved in this transformation have not been described. Methods: We sequenced the whole transcriptomes of 18 ovarian clear-cell carcinomas and 1 ovarian clear-cell carcinoma cell line and found somatic mutations in ARID1A (the AT-rich interactive domain 1A [SWI-like] gene) in 6 of the samples. ARID1A encodes BAF250a, a key component of the SWI–SNF chromatin remodeling complex. We sequenced ARID1A in an additional 210 ovarian carcinomas and a second ovarian clear-cell carcinoma cell line and measured BAF250a expression by means of immunohistochemical analysis in an additional 455 ovarian carcinomas. Results: ARID1A mutations were seen in 55 of 119 ovarian clear-cell carcinomas (46%), 10 of 33 endometrioid carcinomas (30%), and none of the 76 high-grade serous ovarian carcinomas. Seventeen carcinomas had two somatic mutations each. Loss of the BAF250a protein correlated strongly with the ovarian c...

1,485 citations

Journal Article
TL;DR: Recurrent somatic mutations affecting the polycomb-group oncogene EZH2, which encodes a histone methyltransferase responsible for trimethylating Lys27 of histone H3 (H3K27), are reported; EZH2 proteins with mutant Tyr641 have reduced enzymatic activity in vitro.
Abstract: Marco Marra and colleagues identify somatic mutations in EZH2 in diffuse large B-cell lymphomas and follicular lymphomas. EZH2 is a histone methyltransferase that participates in trimethylation of H3 Lys27 (H3K27) as part of the PRC2 complex. The mutations alter a single tyrosine residue in the SET domain of EZH2 and reduce the ability of PRC2 to trimethylate H3K27 in vitro.

1,468 citations

Journal Article
TL;DR: This study sequences 173 genes in 2,433 primary breast tumours that have copy number aberration, gene expression and long-term clinical follow-up data, and determines associations between mutations, driver CNA profiles, clinical-pathological parameters and survival.
Abstract: The genomic landscape of breast cancer is complex, and inter- and intra-tumour heterogeneity are important challenges in treating the disease. In this study, we sequence 173 genes in 2,433 primary breast tumours that have copy number aberration (CNA), gene expression and long-term clinical follow-up data. We identify 40 mutation-driver (Mut-driver) genes, and determine associations between mutations, driver CNA profiles, clinical-pathological parameters and survival. We assess the clonal states of Mut-driver mutations, and estimate levels of intra-tumour heterogeneity using mutant-allele fractions. Associations between PIK3CA mutations and reduced survival are identified in three subgroups of ER-positive cancer (defined by amplification of 17q23, 11q13–14 or 8q24). High levels of intra-tumour heterogeneity are in general associated with a worse outcome, but highly aggressive tumours with 11q13–14 amplification have low levels of intra-tumour heterogeneity. These results emphasize the importance of genome-based stratification of breast cancer, and have important implications for designing therapeutic strategies.

1,205 citations


Cited by
28 Jul 2005
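TL;DR: PfEMP1 interacts with single or multiple receptors on infected erythrocytes, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion.
Abstract: Antigenic variation enables many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), expressed on the surface of infected erythrocytes, interacts with single or multiple receptors on infected erythrocytes, endothelial cells, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion. The var gene family encodes about 60 members per haploid genome, and switching transcription among different var gene variants provides the molecular basis for antigenic variation.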

18,940 citations

Journal Article
TL;DR: Preliminary clinical findings with blockers of additional immune-checkpoint proteins, such as programmed cell death protein 1 (PD1), indicate broad and diverse opportunities to enhance antitumour immunity with the potential to produce durable clinical responses.
Abstract: Immune checkpoints refer to the plethora of inhibitory pathways that are crucial to maintaining self-tolerance. Tumour cells induce immune checkpoints to evade immunosurveillance. This Review discusses the progress in targeting immune checkpoints, the considerations for combinatorial therapy and the potential for additional immune-checkpoint targets.

10,602 citations

Christopher M. Bishop
01 Jan 2006
TL;DR: This work covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal Article
04 Oct 2012-Nature
TL;DR: The ability to integrate information across platforms provided key insights into previously defined gene expression subtypes and demonstrated the existence of four main breast cancer classes when combining data from five platforms, each of which shows significant molecular heterogeneity.
Abstract: We analysed primary breast cancers by genomic DNA copy number arrays, DNA methylation, exome sequencing, messenger RNA arrays, microRNA sequencing and reverse-phase protein arrays. Our ability to integrate information across platforms provided key insights into previously defined gene expression subtypes and demonstrated the existence of four main breast cancer classes when combining data from five platforms, each of which shows significant molecular heterogeneity. Somatic mutations in only three genes (TP53, PIK3CA and GATA3) occurred at >10% incidence across all breast cancers; however, there were numerous subtype-associated and novel gene mutations including the enrichment of specific mutations in GATA3, PIK3CA and MAP3K1 with the luminal A subtype. We identified two novel protein-expression-defined subgroups, possibly produced by stromal/microenvironmental elements, and integrated analyses identified specific signalling pathways dominant in each molecular subtype including a HER2/phosphorylated HER2/EGFR/phosphorylated EGFR signature within the HER2-enriched expression subtype. Comparison of basal-like breast tumours with high-grade serous ovarian tumours showed many molecular commonalities, indicating a related aetiology and similar therapeutic opportunities. The biological finding of the four main breast cancer subtypes caused by different subsets of genetic and epigenetic abnormalities raises the hypothesis that much of the clinically observable plasticity and heterogeneity occurs within, and not across, these major biological subtypes of breast cancer.

9,355 citations

Journal Article
TL;DR: StringTie, a computational method that applies a network flow algorithm originally developed in optimization theory, together with optional de novo assembly, to assemble transcriptome sequencing data sets into transcripts, produces more complete and accurate reconstructions of genes and better estimates of expression levels than other leading transcript assembly programs.
Abstract: Methods used to sequence the transcriptome often produce more than 200 million short sequences. We introduce StringTie, a computational method that applies a network flow algorithm originally developed in optimization theory, together with optional de novo assembly, to assemble these complex data sets into transcripts. When used to analyze both simulated and real data sets, StringTie produces more complete and accurate reconstructions of genes and better estimates of expression levels, compared with other leading transcript assembly programs including Cufflinks, IsoLasso, Scripture and Traph. For example, on 90 million reads from human blood, StringTie correctly assembled 10,990 transcripts, whereas the next best assembly was of 7,187 transcripts by Cufflinks, which is a 53% increase in transcripts assembled. On a simulated data set, StringTie correctly assembled 7,559 transcripts, which is 20% more than the 6,310 assembled by Cufflinks. As well as producing a more complete transcriptome assembly, StringTie runs faster on all data sets tested to date compared with other assembly software, including Cufflinks.

6,594 citations