Showing papers by "Michael R. Stratton" published in 2019


Journal ArticleDOI
16 Aug 2019-Nature
TL;DR: Genome sequencing of hundreds of normal colonic crypts from 42 individuals sheds light on mutational processes and driver mutations in normal colorectal epithelial cells, indicating that adenomas and carcinomas are rare outcomes of a pervasive process of neoplastic change across morphologically normal colorectal epithelium.
Abstract: The colorectal adenoma-carcinoma sequence has provided a paradigmatic framework for understanding the successive somatic genetic changes and consequent clonal expansions that lead to cancer [1]. However, our understanding of the earliest phases of colorectal neoplastic changes, which may occur in morphologically normal tissue, is comparatively limited, as for most cancer types. Here we use whole-genome sequencing to analyse hundreds of normal crypts from 42 individuals. Signatures of multiple mutational processes were revealed; some of these were ubiquitous and continuous, whereas others were only found in some individuals, in some crypts or during certain periods of life. Probable driver mutations were present in around 1% of normal colorectal crypts in middle-aged individuals, indicating that adenomas and carcinomas are rare outcomes of a pervasive process of neoplastic change across morphologically normal colorectal epithelium. Colorectal cancers exhibit substantially increased mutational burdens relative to normal cells. Sequencing normal colorectal cells provides quantitative insights into the genomic and clonal evolution of cancer.

414 citations


Journal ArticleDOI
07 Mar 2019-Cell
TL;DR: Exome sequences of 1,001 human cancer cell lines and 577 xenografts revealed most common mutational signatures, indicating past activity of the underlying processes, usually in appropriate cancer types, and the underlying processes potentially retain patterns of activity and regulation operative in primary human cancers.

270 citations


Journal ArticleDOI
23 Oct 2019-Nature
TL;DR: It is shown that cirrhotic liver has a higher mutational burden than normal liver, and structural variants, including chromothripsis, were prominent in cirrhosis.
Abstract: The most common causes of chronic liver disease are excess alcohol intake, viral hepatitis and non-alcoholic fatty liver disease, with the clinical spectrum ranging in severity from hepatic inflammation to cirrhosis, liver failure or hepatocellular carcinoma (HCC). The genome of HCC exhibits diverse mutational signatures, resulting in recurrent mutations across more than 30 cancer genes [1-7]. Stem cells from normal livers have a low mutational burden and limited diversity of signatures [8], which suggests that the complexity of HCC arises during the progression to chronic liver disease and subsequent malignant transformation. Here, by sequencing whole genomes of 482 microdissections of 100-500 hepatocytes from 5 normal and 9 cirrhotic livers, we show that cirrhotic liver has a higher mutational burden than normal liver. Although rare in normal hepatocytes, structural variants, including chromothripsis, were prominent in cirrhosis. Driver mutations, such as point mutations and structural variants, affected 1-5% of clones. Clonal expansions of millimetres in diameter occurred in cirrhosis, with clones sequestered by the bands of fibrosis that surround regenerative nodules. Some mutational signatures were universal and equally active in both non-malignant hepatocytes and HCCs; some were substantially more active in HCCs than chronic liver disease; and others, arising from exogenous exposures, were present in a subset of patients. The activity of exogenous signatures between adjacent cirrhotic nodules varied by up to tenfold within each patient, as a result of clone-specific and microenvironmental forces. Synchronous HCCs exhibited the same mutational signatures as background cirrhotic liver, but with higher burden. Somatic mutations chronicle the exposures, toxicity, regeneration and clonal structure of liver tissue as it progresses from health to disease.

217 citations


Journal ArticleDOI
TL;DR: The SigProfilerMatrixGenerator tool is the first to provide support for classifying doublet base substitutions and small insertions and deletions, and is also faster and more memory efficient than existing tools that generate only a single matrix.
Abstract: Cancer genomes are peppered with somatic mutations imprinted by different mutational processes. The mutational pattern of a cancer genome can be used to identify and understand the etiology of the underlying mutational processes. A plethora of prior research has focused on examining mutational signatures and mutational patterns from single base substitutions and their immediate sequencing context. We recently demonstrated that further classification of small mutational events (including substitutions, insertions, deletions, and doublet substitutions) can be used to provide a deeper understanding of the mutational processes that have molded a cancer genome. However, there has been no standard tool that allows fast, accurate, and comprehensive classification for all types of small mutational events. Here, we present SigProfilerMatrixGenerator, a computational tool designed for optimized exploration and visualization of mutational patterns for all types of small mutational events. SigProfilerMatrixGenerator is written in Python with an R wrapper package provided for users that prefer working in an R environment. SigProfilerMatrixGenerator produces fourteen distinct matrices by considering transcriptional strand bias of individual events and by incorporating distinct classifications for single base substitutions, doublet base substitutions, and small insertions and deletions. While the tool provides a comprehensive classification of mutations, SigProfilerMatrixGenerator is also faster and more memory efficient than existing tools that generate only a single matrix. SigProfilerMatrixGenerator provides a standardized method for classifying small mutational events that is both efficient and scalable to large datasets. In addition to extending the classification of single base substitutions, the tool is the first to provide support for classifying doublet base substitutions and small insertions and deletions. 
SigProfilerMatrixGenerator is freely available at https://github.com/AlexandrovLab/SigProfilerMatrixGenerator with extensive documentation at https://osf.io/s93d5/wiki/home/.
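
To make the idea of classifying single base substitutions by their immediate sequence context concrete, here is a minimal Python sketch of the widely used 96-channel scheme (six pyrimidine-referenced substitution types, each with 16 flanking-base combinations). It is an illustration only and not the SigProfilerMatrixGenerator implementation: the function names `sbs96_channel`, `all_sbs96_channels` and `count_sbs96` are hypothetical, and the input is assumed to be already-extracted trinucleotide contexts rather than VCF records plus a reference genome.

```python
from collections import Counter
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sbs96_channel(context, alt):
    """Map a substitution to its 96-channel label, e.g. 'T[C>T]A'.

    context: 3-base reference sequence centred on the mutated base
             (hypothetical input format, assumed pre-extracted).
    alt:     the mutant base.
    """
    context, alt = context.upper(), alt.upper()
    if len(context) != 3 or not set(context) <= set(COMPLEMENT):
        raise ValueError("context must be three bases from A, C, G, T")
    if alt not in COMPLEMENT or alt == context[1]:
        raise ValueError("alt must be a base different from the reference")
    # Purine references are reported on the opposite strand so that the
    # mutated base is always a pyrimidine (C or T).
    if context[1] in "AG":
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


def all_sbs96_channels():
    """List all 96 channel labels in a fixed order."""
    substitutions = [("C", a) for a in "AGT"] + [("T", a) for a in "ACG"]
    return [
        f"{five}[{ref}>{alt}]{three}"
        for ref, alt in substitutions
        for five, three in product("ACGT", repeat=2)
    ]


def count_sbs96(mutations):
    """Count (context, alt) pairs into a 96-entry dict, zeros included."""
    counts = Counter(sbs96_channel(context, alt) for context, alt in mutations)
    return {channel: counts.get(channel, 0) for channel in all_sbs96_channels()}


if __name__ == "__main__":
    example = [("TCA", "T"), ("TGA", "A"), ("ACG", "T")]
    matrix_column = count_sbs96(example)
    print({k: v for k, v in matrix_column.items() if v})
```

Folding purine references onto the pyrimidine strand is what reduces the 192 possible strand-specific events to 96 channels; one such column per sample gives a mutational matrix of the kind the abstract describes.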

138 citations


Journal ArticleDOI
TL;DR: It is found that circular, and not linear, CNOT2 levels are predictive for progression-free survival time to aromatase inhibitor therapy in advanced breast cancer patients, and found that circCNOT2 is detectable in cell-free RNA from plasma.
Abstract: Circular RNAs (circRNAs) are a class of RNAs that is under increasing scrutiny, although their functional roles are debated. We analyzed RNA-seq data of 348 primary breast cancers and developed a method to identify circRNAs that does not rely on unmapped reads or known splice junctions. We identified 95,843 circRNAs, of which 20,441 were found recurrently. Of the circRNAs that match exon boundaries of the same gene, 668 showed a poor or even negative (R < 0.2) correlation with the expression level of the linear gene. In silico analysis showed only a minority (8.5%) of circRNAs could be explained by known splicing events. Both these observations suggest that specific regulatory processes for circRNAs exist. We confirmed the presence of circRNAs of CNOT2, CREBBP, and RERE in an independent pool of primary breast cancers. We identified circRNA profiles associated with subgroups of breast cancers and with biological and clinical features, such as amount of tumor lymphocytic infiltrate and proliferation index. siRNA-mediated knockdown of circCNOT2 was shown to significantly reduce viability of the breast cancer cell lines MCF-7 and BT-474, further underlining the biological relevance of circRNAs. Furthermore, we found that circular, and not linear, CNOT2 levels are predictive for progression-free survival time to aromatase inhibitor (AI) therapy in advanced breast cancer patients, and found that circCNOT2 is detectable in cell-free RNA from plasma. We showed that circRNAs are abundantly present, show characteristics of being specifically regulated, are associated with clinical and biological properties, and thus are relevant in breast cancer.

79 citations


Journal ArticleDOI
06 Dec 2019-Science
TL;DR: Phylogenetic analyses of bilateral tumors indicated that clonal expansions can evolve before the divergence of left and right kidney primordia, and reveal embryonal precursors from which unilateral and multifocal cancers develop.
Abstract: Adult cancers often arise from premalignant clonal expansions. Whether the same is true of childhood tumors has been unclear. To investigate whether Wilms tumor (nephroblastoma; a childhood kidney cancer) develops from a premalignant background, we examined the phylogenetic relationship between tumors and corresponding normal tissues. In 14 of 23 cases studied (61%), we found premalignant clonal expansions in morphologically normal kidney tissues that preceded tumor development. These clonal expansions were defined by somatic mutations shared between tumor and normal tissues but absent from blood cells. We also found hypermethylation of the H19 locus, a known driver of Wilms tumor development, in 58% of the expansions. Phylogenetic analyses of bilateral tumors indicated that clonal expansions can evolve before the divergence of left and right kidney primordia. These findings reveal embryonal precursors from which unilateral and multifocal cancers develop.

72 citations


Journal ArticleDOI
Adrian Baez-Ortega1, Kevin Gori1, Andrea Strakova1, Janice L Allen, Karen M Allum, Leontine Bansse-Issa2, Thinlay N Bhutia3, Jocelyn L Bisson4, Jocelyn L Bisson1, Cristóbal Briceño5, Artemio Castillo Domracheva6, Anne M Corrigan7, Hugh R Cran, Jane T Crawford, Eric Davis8, Karina F de Castro, Andrigo Barboza De Nardi9, Anna P de Vos, Laura Delgadillo Keenan, Edward M Donelan, Adela R. Espinoza Huerta, Ibikunle A Faramade, Mohammed Fazil, Eleni Fotopoulou, Skye N Fruean, Fanny Gallardo-Arrieta10, Olga Glebova, Pagona G. Gouletsou11, Rodrigo F Häfelin Manrique5, Joaquim Henriques, Rodrigo dos Santos Horta, Natalia Ignatenko, Yaghouba Kane12, Cathy King, Debbie Koenig, Ada Krupa13, Steven J. Kruzeniski, Young Mi Kwon1, Marta Lanza-Perea7, Mihran Lazyan, Adriana M Lopez Quintana, Thibault Losfelt, Gabriele Marino14, Simón Martínez Castañeda15, Mayra F Martínez-López16, Michael C. Meyer, Edward J. Migneco, Berna Nakanwagi, Karter B. Neal, Winifred Neunzig, Máire Ní Leathlobhair1, Sally J Nixon, Antonio Ortega-Pacheco17, Francisco Pedraza-Ordoñez18, Maria C. Peleteiro19, Katherine Polak, Ruth J. Pye, John F Reece, Jose Rojas Gutierrez, Haleema Sadia20, Sheila K Schmeling, Olga Shamanova, Alan G. Sherlock, Maximilian R Stammnitz1, Audrey E Steenland-Smit2, Alla Svitich, Lester J. Tapia Martínez, Ismail Thoya Ngoka21, Cristian G. Torres5, Elizabeth M Tudor22, Mirjam G van der Wel, Bogdan A. Viţălaru, Sevil Atalay Vural23, Oliver Walkinton, Jinhong Wang1, Alvaro S Wehrle-Martinez, Sophie A.E. Widdowson, Michael R. Stratton24, Ludmil B. Alexandrov25, Inigo Martincorena24, Elizabeth P. Murchison1 
02 Aug 2019-Science
TL;DR: The phylogenetic history of the CTVT lineage suggests that neutral genetic drift is the dominant evolutionary force operating on cancer over the long term, in contrast to the ongoing positive selection that is often observed in short-lived human cancers.
Abstract: The canine transmissible venereal tumor (CTVT) is a cancer lineage that arose several millennia ago and survives by "metastasizing" between hosts through cell transfer. The somatic mutations in this cancer record its phylogeography and evolutionary history. We constructed a time-resolved phylogeny from 546 CTVT exomes and describe the lineage's worldwide expansion. Examining variation in mutational exposure, we identify a highly context-specific mutational process that operated early in the cancer's evolution but subsequently vanished, correlate ultraviolet-light mutagenesis with tumor latitude, and describe tumors with heritable hyperactivity of an endogenous mutational process. CTVT displays little evidence of ongoing positive selection, and negative selection is detectable only in essential genes. We illustrate how long-lived clonal organisms capture changing mutagenic environments, and reveal that neutral genetic drift is the dominant feature of long-term cancer evolution.

56 citations


Journal ArticleDOI
TL;DR: In breast cancer the authors find that hyper-variability of partially methylated domains is the prime source of DNA methylation variation and that these domains fuel CpG island hypermethylation.
Abstract: Global loss of DNA methylation and CpG island (CGI) hypermethylation are key epigenomic aberrations in cancer. Global loss manifests itself in partially methylated domains (PMDs) which extend up to megabases. However, the distribution of PMDs within and between tumor types, and their effects on key functional genomic elements including CGIs are poorly defined. We comprehensively show that loss of methylation in PMDs occurs in a large fraction of the genome and represents the prime source of DNA methylation variation. PMDs are hypervariable in methylation level, size and distribution, and display elevated mutation rates. They impose intermediate DNA methylation levels incognizant of functional genomic elements including CGIs, underpinning a CGI methylator phenotype (CIMP). Repression effects on tumor suppressor genes are negligible as they are generally excluded from PMDs. The genomic distribution of PMDs reports tissue-of-origin and may represent tissue-specific silent regions which tolerate instability at the epigenetic, transcriptomic and genetic level.

42 citations


Journal ArticleDOI
Serena Nik-Zainal1, Serena Nik-Zainal2, Helen Davies2, Johan Staaf3, Manasa Ramakrishna2, Dominik Glodzik2, Xueqing Zou2, Inigo Martincorena2, Ludmil B. Alexandrov2, Ludmil B. Alexandrov4, Sancha Martin2, David C. Wedge2, Peter Van Loo5, Peter Van Loo2, Young Seok Ju2, Marcel Smid6, Arie B. Brinkman7, Sandro Morganella8, Miriam Ragle Aure9, Ole Christian Lingjærde9, Anita Langerød9, Markus Ringnér3, Sung-Min Ahn10, Sandrine Boyault, Jane E. Brock11, Annegien Broeks12, Adam Butler2, Christine Desmedt13, Luc Dirix14, Serge Dronov2, Aquila Fatima11, John A. Foekens6, Moritz Gerstung2, Gerrit K. J. Hooijer15, Se Jin Jang16, David Jones2, Hyung-Yong Kim17, Tari A. King18, Savitri Krishnamurthy19, Hee Jin Lee16, Jeong-Yeon Lee17, Yang Li2, Stuart McLaren2, Andrew Menzies2, Ville Mustonen2, Sarah O’Meara2, Iris Pauporté, Xavier Pivot20, Colin A. Purdie21, Keiran Raine2, Kamna Ramakrishnan2, Germán Fg Rodríguez-González6, Gilles Romieu22, Anieta M. Sieuwerts6, Peter T. Simpson23, Rebecca Shepherd2, Lucy Stebbings2, Olafur A. Stefansson24, Jon W. Teague2, Stefania Tommasi, Isabelle Treilleux, Gert Van den Eynden14, Peter B. Vermeulen14, Anne Vincent-Salomon25, Lucy R. Yates2, Carlos Caldas26, Laura Van't Veer12, Andrew Tutt27, Andrew Tutt28, Stian Knappskog29, Benita Kiat Tee Bk Tan30, Jos Jonkers12, Åke Borg3, Naoto T. Ueno19, Christos Sotiriou13, Alain Viari31, P. Andrew Futreal2, P. Andrew Futreal19, Peter J. Campbell2, Paul N. Span7, Steven Van Laere14, Sunil R. Lakhani32, Sunil R. Lakhani23, Jorunn E. Eyfjord24, Alastair M Thompson21, Alastair M Thompson19, Ewan Birney8, Hendrik G. Stunnenberg7, Marc J. van de Vijver15, John W.M. Martens6, Anne Lise Børresen-Dale9, Andrea L. Richardson11, Gu Kong17, Gilles Thomas, Michael R. Stratton2 
07 Feb 2019-Nature
TL;DR: In the Methods section of this Article, ‘greater than’ should have been ‘less than’ in the sentence ‘Putative regions of clustered rearrangements were identified as having an average inter-rearrangement distance that was at least 10 times greater than the whole-genome average for the individual sample’.
Abstract: In the Methods section of this Article, ‘greater than’ should have been ‘less than’ in the sentence ‘Putative regions of clustered rearrangements were identified as having an average inter-rearrangement distance that was at least 10 times greater than the whole-genome average for the individual sample.’ The Article has not been corrected.
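
The corrected criterion (an inter-rearrangement distance at least 10 times less than the sample's whole-genome average) can be illustrated with a short Python sketch. This is a simplification under stated assumptions, not the Article's method: the function name `clustered_regions`, the `min_breakpoints` parameter and the rule of requiring every gap in a run to be below the threshold (which guarantees the run's average is too) are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def clustered_regions(
    positions: List[int],
    genome_avg_distance: float,
    fold: float = 10.0,
    min_breakpoints: int = 3,
) -> List[Tuple[int, int]]:
    """Flag runs of breakpoints on one chromosome that are densely packed.

    positions:           breakpoint coordinates on a single chromosome.
    genome_avg_distance: the sample's whole-genome average distance between
                         neighbouring rearrangement breakpoints.
    fold:                how many times smaller than the average a gap must
                         be (10 in the corrected sentence).
    min_breakpoints:     hypothetical minimum run length to report.
    """
    threshold = genome_avg_distance / fold
    pos = sorted(positions)
    regions = []
    run_start = 0
    for i in range(1, len(pos) + 1):
        # Close the current run at the end of the data or at a wide gap.
        if i == len(pos) or pos[i] - pos[i - 1] > threshold:
            if i - run_start >= min_breakpoints:
                regions.append((pos[run_start], pos[i - 1]))
            run_start = i
    return regions


if __name__ == "__main__":
    breakpoints = [100, 200, 300, 1_000_000,
                   5_000_000, 5_000_050, 5_000_120, 5_000_200]
    print(clustered_regions(breakpoints, genome_avg_distance=10_000))
```

With the uncorrected wording ("greater than") the comparison would select sparse regions instead of dense ones, which is why the direction of the inequality matters.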

13 citations


Journal ArticleDOI
TL;DR: In a subset of T-cell acute lymphoblastic leukaemia (T-ALL) the authors identified non-methionine mutations of the key modifiable H3 residues, lysine (K) 27 and 36, indicating that H3K27 and H3K36 mutations are recurrent in T-ALL.
Abstract: Mutations affecting key modifiable histone type 3 (H3; Table SI) residues are frequent oncogenic events in certain solid tumours (Feinberg et al, 2016), and have also recently been implicated in a subset of acute myeloid leukaemia (AML) (Lehnertz et al, 2017). Here, we systematically reviewed the somatic mutations in >20 000 cancer specimens to identify tumours harbouring H3 mutations. In a subset of T-cell acute lymphoblastic leukaemia (T-ALL) we identified non-methionine mutations of the key modifiable H3 residues, lysine (K) 27 and 36. The starting point of our investigation was a search for H3 hotspot mutations in 1020 human cancer cell lines (Table SII). In two cell lines, both derived from T-ALL, we found lysine-to-arginine mutations at H3K27 and H3K36 (Table I). One of the cell lines, LOUCY, is derived from a NOTCH1 wild-type adult T-ALL (Ben-Bassat et al, 1990). The second, CML-T1, was derived from the T-lymphoblastic blast crisis of chronic myeloid leukaemia (Kuriyama et al, 1989). Ten further T-ALL cell lines lacked coding H3 mutations (Table SIII). In solid tumours, H3K27 and H3K36 are typically mutated to methionine (Fig 1) (Feinberg et al, 2016). However, recent functional studies of H3 lysine-to-isoleucine mutations in AML demonstrate that the latter also dramatically alter global H3 methylation and acetylation patterns (Lehnertz et al, 2017). Therefore, we speculated that lysine-to-non-methionine mutations may also be drivers of a subset of T-ALL. We next searched for canonical H3 mutations in a published targeted sequencing study of 633 epigenetic regulator genes in >1000 childhood tumours encompassing 21 cancer subtypes (Huether et al, 2014). Amongst 91 T-ALL specimens, there were two cases with canonical H3 mutations: H3F3A p.K27R and H3F3A p.K36R (Table I). Both mutations were clonal, with a variant allele fraction (VAF) of 38% and 55%, respectively.
Among the 37 tumours with H3K mutations, lysine-to-arginine mutations were restricted to T-ALL (P = 0.001502; Fisher’s exact test). We then extended our screen for H3 mutations to 18 704 tumours, encompassing >60 cancer types other than T-ALL (Tables SIV and SV). This dataset comprised 8764 internally sequenced specimens and 9940 TCGA samples re-analysed using an in-house variant calling pipeline as previously described (Martincorena et al, 2017). We identified only one neomorphic H3 mutation in an acute leukaemia specimen: a previously reported HIST1H3D p.K27M mutation in an adult AML case (TCGA-AB2927-03) (Lehnertz et al, 2017). Finally, we examined an additional T-ALL cohort by capillary sequencing of recurrently mutated modifiable residues K27, G34, and K36 across four frequently mutated H3 genes (Tables SVI and SVII). The cohort comprised 38 T-ALL cases described in detail previously (Maser et al, 2007). One specimen from a 30-year-old patient harboured a H3F3A p.K27N mutation (Figure S1). Interestingly, a H3F3A p.K27N mutation and a H3F3A p.K27T variant were previously identified in a T-ALL RNA sequencing study (n = 31) (Atak et al, 2013). Collectively, our findings indicate that H3K27 and H3K36 mutations are recurrent in T-ALL, a result we were able to reproduce across multiple different cohorts encompassing adult and paediatric cases. This finding is congruent with the fact that mutations in SETD2 and EZH2, methyltransferases that catalyse trimethylation (me3) of H3K36 and H3K27, respectively, are frequent T-ALL drivers (Belver & Ferrando, 2016). Disruptive SETD2 alterations occur in 7.8% of early T cell precursor acute lymphoblastic leukaemia (ETP-ALL), an aggressive subtype with stem cell-like features (Belver & Ferrando, 2016). Interestingly, both T-ALL specimens with H3K36R mutations originated from ETP-ALL (Table I).
Notably, mutually exclusive SETD2 and H3K36/H3K34 mutations are reported in paediatric high grade glioma, where both result in reduced H3K36me3 mediated by SETD2 (Feinberg et al, 2016). It is unclear whether a similar co-mutation pattern exists in T-ALL, as H3 genes have not been included in targeted sequencing panels used by the largest T-ALL genomic studies (Belver & Ferrando, 2016). The role of H3K27 modifications in T-ALL pathogenesis is complex (Belver & Ferrando, 2016). It is plausible that mutations affecting this residue could impact the activity of several histone modifiers with established roles in T-ALL pathogenesis. Loss-of-function mutations in EZH2 or other core components of Polycomb repressive complex 2 (PRC2) are found in 42% of ETP-ALL and 25% of T-ALL overall (Belver & Ferrando, 2016). Impaired PRC2 catalytic activity in T-ALL is associated with reduced H3K27me3, stemness and poor prognosis (Belver & Ferrando, 2016). H3F3A p.K27M mutations appear to act predominantly by blocking H3K27 di- and trimethylation and increasing H3K27 acetylation (Feinberg et al, 2016). Recent work demonstrates that H3K27I mutations in AML are associated with similar changes in H3 modification patterns (Lehnertz et al, 2017), suggesting that other non-methionine mutations at modifiable H3 residues may influence the activity of PRC2 and …

7 citations


Proceedings ArticleDOI
TL;DR: Remarkably, recurrent acquisitions of certain cancer-associated mutations, particularly those that are advantageous to cell growth, proliferation and migration, are identified, and it is shown that such events occur early in life, potentially even before adolescence.
Abstract: Human endometrium is a highly dynamic tissue that undergoes hundreds of cycles of breakdown, rapid repair and remodelling in response to the oscillating levels of oestrogen and progesterone during female reproductive years. The marked regenerative capacity of this tissue’s epithelial compartment is maintained by intra-glandular adult stem cells (ASCs) that reside within the stratum basalis which is retained during menstruation. Although the endometrial ASCs were first described over a decade ago, they remain poorly characterised in comparison to their counterparts in other tissues, such as the small and the large intestines. In particular, the size of the stem cell pool within individual glands, the rates of their division, and mutational landscape are largely unknown. In this study, we isolated 215 morphologically normal endometrial glands from women aged 19 to 81 using laser capture microscopy. Analysis of whole-genome sequencing data identified that the overwhelming majority of the glands were clonal cell populations, and thus originated from a single ASC. Somatic mutations were found to accumulate at a linear rate during adult life. Elevated body mass index (BMI), a well-recognised risk factor for endometrial cancer, accelerated the rate of mutation acquisition. Surprisingly, despite the heterogeneity in age, reproductive history and BMI in our cohort, we find relatively homogenous mutational processes within normal endometrium. Comparison with cancer shows a lower somatic mutation burden and fewer operative signatures. Remarkably, we not only identify recurrent acquisitions of certain cancer-associated mutations, particularly those that are advantageous to cell growth, proliferation and migration, but also show that such events occur early in life, potentially even before adolescence.
Over time, these mutant ASCs serve as a reservoir for the acquisition of further driver mutations to the extent that in some cases, the entire sampled endometrium becomes ‘neoplastic’ on the genomic level while still retaining the apparently normal phenotype. In older individuals, we observe a shift in the spectrum of acquired cancer-associated mutations, possibly reflecting post-menopausal changes in the levels of sex-steroid hormones and the resultant tissue microenvironment. Citation Format: Luiza Moore, Daniel Leongamornlert, Tim Coorens, Mathijs Sanders, Peter Ellis, Francesco Maura, Kevin Dawson, Simon F. Brunner, Jyoti Nangalia, Henry Lee-Six, Raheleh Rahbari, Patrick Tarpey, Yvette Hooks, Krishnaa Mahbubani, Christine A. Iacobuzio-Donahue, Jan J. Brosens, Inigo Martincorena, Kourosh Saeb-Parsy, Peter J. Campbell, Michael R. Stratton. The mutational landscape of normal human endometrial epithelium [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 970.

Posted ContentDOI
29 May 2019-bioRxiv
TL;DR: SigProfilerMatrixGenerator is a computational tool designed for optimized exploration and visualization of mutational patterns for all types of small mutational events and is the first to provide support for classifying doublet base substitutions and small insertions and deletions.
Abstract: Background: Cancer genomes are peppered with somatic mutations imprinted by different mutational processes. The mutational pattern of a cancer genome can be used to identify and understand the etiology of the underlying mutational processes. A plethora of prior research has focused on examining mutational signatures and mutational patterns from single base substitutions and their immediate sequencing context. We recently demonstrated that further classification of small mutational events (including substitutions, insertions, deletions, and doublet substitutions) can be used to provide a deeper understanding of the mutational processes that have molded a cancer genome. However, there has been no standard tool that allows fast, accurate, and comprehensive classification for all types of small mutational events. Results: Here, we present SigProfilerMatrixGenerator, a computational tool designed for optimized exploration and visualization of mutational patterns for all types of small mutational events. SigProfilerMatrixGenerator is written in Python with an R wrapper package provided for users that prefer working in an R environment. SigProfilerMatrixGenerator produces fourteen distinct matrices by considering transcriptional strand bias of individual events and by incorporating distinct classifications for single base substitutions, doublet base substitutions, and small insertions and deletions. While the tool provides a comprehensive classification of mutations, SigProfilerMatrixGenerator is also faster and more memory efficient than existing tools that generate only a single matrix. Conclusions: SigProfilerMatrixGenerator provides a standardized method for classifying small mutational events that is both efficient and scalable to large datasets. In addition to extending the classification of single base substitutions, the tool is the first to provide support for classifying doublet base substitutions and small insertions and deletions.
SigProfilerMatrixGenerator is freely available at https://github.com/AlexandrovLab/SigProfilerMatrixGenerator with extensive documentation at https://osf.io/s93d5/wiki/home/.

Posted ContentDOI
11 Nov 2019-bioRxiv
TL;DR: The base substitution rate of affected colonic epithelial cells was estimated to be doubled after IBD onset, and non-synonymous mutations in ARID1A, PIGR and ZC3H12A, and genes in the interleukin 17 and Toll-like receptor pathways, were under positive selection in colonic crypts from IBD patients.
Abstract: Inflammatory bowel disease (IBD) is a chronic inflammatory disease associated with increased risk of gastrointestinal cancers [1–3], but our understanding of the effects of IBD on the mutational profile and clonal structure of the colon is limited. Here, we isolated and whole-genome sequenced 370 colonic crypts from 45 IBD patients, and compared these to 413 crypts from 41 non-IBD controls. We estimated the base substitution rate of affected colonic epithelial cells to be doubled after IBD onset. This change was primarily driven by acceleration of mutational processes ubiquitously observed in normal colon, and we did not detect an IBD-specific mutational process. In contrast to the normal colon, where clonal expansions outside the confines of the crypt are rare, we observed widespread millimeter-scale clonal expansions. We also found that non-synonymous mutations in ARID1A, PIGR and ZC3H12A, and genes in the interleukin 17 and Toll-like receptor pathways, were under positive selection in colonic crypts from IBD patients. With the exception of ARID1A, these genes and pathways have not been previously associated with cancer risk. Our results provide new insights into the consequences of chronic intestinal inflammation on the mutational profile and clonal structure of colonic epithelia and point to potential therapeutic targets for IBD.