
Germline loss of MBD4 predisposes to leukaemia due to a mutagenic cascade driven by 5mC

TL;DR: This preprint describes a novel cancer predisposition syndrome, resulting from germline biallelic inactivation of MBD4, that leads to acute myeloid leukaemia (AML), and highlights a critical interaction with somatic mutations in DNMT3A that accelerates leukaemogenesis and accounts for the conserved path to AML.
Abstract: Cytosine methylation is essential for normal mammalian development, yet also provides a major mutagenic stimulus. Methylcytosine (5mC) is prone to spontaneous deamination, which introduces cytosine to thymine transition mutations (C>T) upon replication. Cells endure hundreds of 5mC deamination events each day and an intricate repair network is engaged to restrict this damage. Central to this network are the DNA glycosylases MBD4 and TDG, which recognise T:G mispairing and initiate base excision repair (BER). Here we describe a novel cancer predisposition syndrome resulting from germline biallelic inactivation of MBD4 that leads to the development of acute myeloid leukaemia (AML). These leukaemias have an extremely high burden of C>T mutations, specifically in the context of methylated CG dinucleotides (CG>TG). This dependence on 5mC as a source of mutations may explain the remarkable observation that MBD4-deficient AMLs share a common set of driver mutations, including biallelic mutations in DNMT3A and hotspot mutations in IDH1/IDH2. By assessing serial samples taken over the course of treatment, we highlight a critical interaction with somatic mutations in DNMT3A that accelerates leukaemogenesis and accounts for the conserved path to AML. MBD4-deficiency was also detected, rarely, in sporadic cancers, which display the same mutational signature. Collectively these cancers provide a model of 5mC-dependent hypermutation and reveal factors that shape its mutagenic influence.

Summary (2 min read)

Affiliation

  • MBD4-deficiency was also detected, rarely, in sporadic cancers, which display the same mutational signature.
  • Both cases exhibited an elevated mutation rate and strong enrichment for CG>TG mutations (Fig. 1d, Extended Data Fig. 1a).
  • This shift in functional activity – the expansion of DNMT3A-mutant clones – increases the likelihood that cells with biallelic DNMT3A mutations will emerge, which appears to be key for initiating AML in MBD4-deficient patients.
  • The authors confirmed that recombinant DNMT3A enhances TDG glycosylase activity in vitro (Fig. 4a), but had no impact on MBD4 glycosylase activity (Extended Data Fig. 7).

Contributions

  • All authors discussed the results and agree with the conclusions presented.
  • C, Relative mutation rate in different genomic features per Mb of CG dinucleotides (CG corrected), or corrected for methylation status in CD34+ cells (5mC corrected).
  • Each coloured area is proportional to the representation of the clone and vertical lines indicate sampling points [31].
  • B, Schematic representation of the repair pathways governing T:G mismatch repair and the combined influence of germline mutations in MBD4 and somatic mutations in DNMT3A (at top) in AML.

Extended Data References – pg. 20-21

  • Supplementary Information Somatic mutations detected in MBD4-deficient AML at diagnosis (hg19).
  • A quality score is provided; variants with a score >0.5 were used for mutation signature analysis.

AML cases

  • Sanger sequencing traces were generated from cloned PCR products after amplification from DNA (top).
  • B, A schematic of the MBD4 gene is shown at top together with the position of two candidate loss-of-function variants that impact splice sites.
  • Sites with mutations were typically fully methylated in the control sample.
  • Individual values are plotted (n=2) and the bar shows the mean.
  • The relative mutation rate was calculated per bin based on CG or 5mCG abundance (as in a).

Clinical synopsis

  • The AML was negative for NPM1, FLT3 and CEBPA mutations.
  • She had induction chemotherapy (high dose cytarabine, idarubicin and etoposide) and achieved complete morphologic and cytogenetic remission.
  • Bone marrow examination 5 weeks post allogeneic HSCT showed complete morphologic and cytogenetic remission; and full donor chimerism.
  • Relapsed AML (of WEHI-AML-1 origin) occurred 11 weeks post allogeneic HSCT.

Methods

  • Patient characteristics and sample collection EMC-AML-1, WEHI-AML-1 and WEHI-AML-2 were diagnosed with AML and treated with combination chemotherapy as per the protocols at their respective institutions [see Clinical Synopsis].
  • They gave informed consent according to the Declaration of Helsinki for participation in research and for collection of samples over the course of their treatment.
  • DNA libraries were quantified and used for both whole genome sequencing and whole exome sequencing.
  • Reduced representation bisulfite sequencing (RRBS) For WEHI-AML-1 and WEHI-AML-2, between 75 and 100 ng of DNA was used to construct RRBS libraries using the Ovation RRBS Methyl-Seq System (NuGEN, San Carlos, CA, USA).
  • DNA was restriction enzyme digested using MspI followed by ligation with indexed adaptors.

RNA sequencing

  • For WEHI-AML-1 and WEHI-AML-2, total RNA was extracted using TRIzol (Thermo Fisher Scientific, Waltham, MA, USA) as per manufacturer’s instructions.
  • As the mutations occurred almost exclusively in a CG context, the rate of CG>TG mutations per CG was calculated for each genomic feature.
  • Transcriptional strand and expression level: Transcriptional strand bias analysis was performed by determining the template and non-template strands per gene as reported in Ensembl v75 [13].
  • Libraries were generated as per manufacturer’s instructions and the sequencing was performed on a MiSeq.

Site-directed mutagenesis and cloning

  • And anti-sense 5’- TTGTATTTCCAGGGCGGCACGACTGGGCTGGAGAGTCT-3’. QuikChange II XL Site-Directed Mutagenesis Kit (Agilent Technologies, Santa Clara, CA, USA) was used to generate the DNMT3A and MBD4 mutants.
  • Proteins were verified by SDS-PAGE using a NuPage Novex 4-12% Bis-Tris Protein Gel run in a Bis-Tris XCell SureLock™ Mini-Cell system (Thermo Fisher Scientific, Waltham, MA, USA) with 1x MOPS at 200V for 90 minutes.
  • MBD4 and TDG glycosylase activity assays MBD4 and TDG glycosylase activity assays were performed as described (Hashimoto et al., NAR, 2012).

Data availability

  • Sequencing data from WEHI-AML-1 and WEHI-AML-2 have been deposited at the European Genome Phenome Archive (EGA) [EGAS00001002581].
  • The data are available for ethically approved research into haematological malignancy upon completion of a data transfer agreement.
  • Sequencing data from EMC-AML-1 were sourced from the dbGaP under accession phs001027.
  • TCGA data were downloaded from the GDC Data Commons.
  • Code and data to reproduce the figures are made available through GitHub (https://github.com/MathijsSanders/AML-RoaMeR).

Extended Data References

  • Distinct evolution and dynamics of epigenetic and genetic heterogeneity in acute myeloid leukemia.
  • The UCSC Genome Browser database: 2017 update.


Title
Germline loss of MBD4 predisposes to leukaemia due to a mutagenic cascade driven by 5mC
Authors
Mathijs A. Sanders (1,8), Edward Chew (2,3,4,5,8), Christoffer Flensburg (2,4), Annelieke Zeilemaker (1), Sarah E. Miller (2), Adil S. al Hinai (1,6), Ashish Bajel (3,5), Bram Luiken (1), Melissa Rijken (1), Tamara Mclennan (7), Remco M. Hoogenboezem (1), François G. Kavelaars (1), Marnie E. Blewitt (4,7), Eric M. Bindels (1), Warren S. Alexander (2,4), Bob Löwenberg (1), Andrew W. Roberts (2,3,4,5), Peter J.M. Valk (1,9)*, Ian J. Majewski (2,4,9)*
Affiliation
(1) Department of Hematology, Erasmus University Medical Center, Rotterdam, The Netherlands
(2) Division of Cancer and Haematology, The Walter and Eliza Hall Institute of Medical Research, Parkville, Australia
(3) Department of Clinical Haematology and Bone Marrow Transplantation, Royal Melbourne Hospital, Parkville, Australia
(4) Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, Australia
(5) Victorian Comprehensive Cancer Centre, Parkville, Australia
(6) National Genetic Center, Royal Hospital, Ministry of Health, Sultanate of Oman
(7) Division of Molecular Medicine, The Walter and Eliza Hall Institute of Medical Research, Parkville, Australia
(8) These authors contributed equally to this work
(9) These authors jointly directed this work
* Correspondence
Peter J.M. Valk
Department of Hematology
Erasmus University Medical Center
Em: p.valk@erasmusmc.nl
Ian J. Majewski
Cancer and Haematology Division
The Walter and Eliza Hall Institute of Medical Research
Em: majewski@wehi.edu.au
bioRxiv preprint doi: https://doi.org/10.1101/180588; this version posted November 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Cytosine methylation is essential for normal mammalian development, yet also
provides a major mutagenic stimulus. Methylcytosine (5mC) is prone to spontaneous
deamination, which introduces cytosine to thymine transition mutations (C>T) upon
replication [1]. Cells endure hundreds of 5mC deamination events each day and an
intricate repair network is engaged to restrict this damage. Central to this network
are the DNA glycosylases MBD4 [2] and TDG [3,4], which recognise T:G mispairing and
initiate base excision repair (BER). Here we describe a novel cancer predisposition
syndrome resulting from germline biallelic inactivation of MBD4 that leads to the
development of acute myeloid leukaemia (AML). These leukaemias have an
extremely high burden of C>T mutations, specifically in the context of methylated CG
dinucleotides (CG>TG). This dependence on 5mC as a source of mutations may
explain the remarkable observation that MBD4-deficient AMLs share a common set
of driver mutations, including biallelic mutations in DNMT3A and hotspot mutations in
IDH1/IDH2. By assessing serial samples taken over the course of treatment, we
highlight a critical interaction with somatic mutations in DNMT3A that accelerates
leukaemogenesis and accounts for the conserved path to AML. MBD4-deficiency
was also detected, rarely, in sporadic cancers, which display the same mutational
signature. Collectively these cancers provide a model of 5mC-dependent
hypermutation and reveal factors that shape its mutagenic influence.
We identified three patients with AML, including two siblings, that were distinctive
because of their early age of onset (all <35 years old) and an extremely high
mutational burden (~33-fold above what is typical for AML) (Fig. 1a, Clinical
Synopsis). Virtually all of the somatic mutations identified were C>T in the context of
a CG dinucleotide (>95% of SNVs) (Fig. 1b, Extended Data Fig. 1). This differs
markedly from the distribution of C>T mutations in AML generally and is more
refined than the mutational signature ascribed to ageing, which includes a strong
contribution from 5mC deamination [5]. All three cases carried rare germline loss-of-function
variants in the gene encoding the DNA glycosylase MBD4 [2] (Fig. 1c,
Extended Data Table 1). Case EMC-AML-1 carried a homozygous MBD4 in-frame
deletion of Histidine 567 (His567) in the glycosylase domain. An in vitro glycosylase
assay confirmed that loss of His567 results in a catalytically inactive MBD4 protein
(Fig. 1c). The siblings (WEHI-AML-1, WEHI-AML-2) were compound heterozygotes
with a frameshift in exon 3 and a variant that disrupts the splice acceptor of exon 7
(Fig. 1c, Extended Data Table 1). Analysis of the MBD4 mRNA allowed for phasing
of the variants to distinct alleles and confirmed aberrant splicing that excludes exon 7
and disrupts the glycosylase domain (Extended Data Fig. 2). MBD4 has not
previously been associated with haematological malignancy, but somatic mutations
have been detected in sporadic colon cancers with mismatch repair (MMR)
deficiency [6,7]. Two patients (EMC-AML-1, WEHI-AML-2) also had colorectal polyps, a
common manifestation of DNA repair defects, including those associated with loss of
BER components MUTYH [8-10] and NTHL1 [11].
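The CG>TG classification used throughout this section can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names and the toy variant list are hypothetical, and each variant is assumed to be given as (reference base, alternate base, 5' neighbour, 3' neighbour) on the reference strand. A C>T change followed by G, or its reverse-strand equivalent (a G>A change preceded by C), is counted as CG>TG.

```python
def is_cg_to_tg(ref, alt, prev_base, next_base):
    """Return True if an SNV is a C>T change at a CG dinucleotide on either strand."""
    ref, alt = ref.upper(), alt.upper()
    if ref == "C" and alt == "T":
        # Forward strand: the mutated C must be followed by G.
        return next_base.upper() == "G"
    if ref == "G" and alt == "A":
        # Reverse strand: the mutated G must be preceded by C.
        return prev_base.upper() == "C"
    return False


def cg_to_tg_fraction(snvs):
    """Fraction of SNVs that are CG>TG; returns 0.0 for an empty list."""
    if not snvs:
        return 0.0
    return sum(is_cg_to_tg(*snv) for snv in snvs) / len(snvs)


# Hypothetical toy variants, for illustration only.
toy_snvs = [
    ("C", "T", "A", "G"),  # ACG>ATG: CG>TG on the forward strand
    ("G", "A", "C", "T"),  # CG>CA: CG>TG on the reverse strand
    ("C", "T", "A", "A"),  # C>T outside a CG context
    ("A", "G", "C", "G"),  # not a C>T change
]
fraction = cg_to_tg_fraction(toy_snvs)
print(fraction)
```

Applied to a real call set, a fraction above 0.95 would correspond to the ">95% of SNVs" reported above.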
We accessed large cancer databases to explore the link between MBD4-deficiency
and the distinctive CG>TG signature. Analysis of the Cancer Genome Atlas (TCGA)
identified nine cases, from 10,683 total, that carried germline loss-of-function
variants in MBD4 (Fig. 1c, Extended Data Table 1). In two cases, a uveal
melanoma (TCGA-UVM-1) and a glioblastoma multiforme (TCGA-GBM-1), splice
site mutations were accompanied by loss of the wildtype MBD4 allele due to somatic
copy number alterations (Extended Data Fig. 3a). Analysis of RNA sequencing from
both tumours confirmed aberrant splicing of MBD4, predicted to result in protein
truncation and loss of function (Extended Data Fig. 3b). Both cases exhibited an
elevated mutation rate and strong enrichment for CG>TG mutations (Fig. 1d,
Extended Data Fig. 1a). This signature was also observed in a glioma cell line,
SW1783, that carries a homozygous truncating variant in MBD4 at Leu563
(Extended Data Fig. 1a). The cancers that retained a wildtype allele did not display
a prominent CG>TG signature (Fig. 1d). These results suggest both alleles of MBD4
must be inactivated to block its repair activity, which is consistent with other BER-associated
cancer syndromes [8,11]. Analysis of a larger cohort will be required to
determine whether heterozygous loss of MBD4 predisposes to cancer.
Whole genome sequencing and methylation profiling were performed to refine the
mutational signature associated with MBD4-deficiency in AML. While MBD4 is
known to interact with the MMR pathway [12], MBD4-deficient leukaemias were
largely devoid of small insertions and deletions, suggesting MMR remains intact.
Overall, >15,000 substitution mutations were identified in each AML genome, of
which >90% were CG>TG (Fig. 2a, Extended Data Fig. 1b). The proportion of
mutations was higher in the context of the ACG triplet and lower in the context of
TCG, with CCG and GCG being intermediate. This difference remained after
correction for trimer abundance and methylation status (Fig. 2b), and was found to
be significant in the exome data from the five MBD4-deficient cancers (p = 0.007937,
Mann-Whitney U test) (Extended Data Fig. 1). The ACA trimer was the most
commonly mutated site outside of a CG context, and this matches the most common
site of non-CG methylation [13]. The mutation rate for a given region was linked to 5mC
abundance. Sparsely methylated regions, such as promoters and CG islands, were
rarely mutated (Fig. 2c). Correcting for 5mC abundance revealed a consistent
mutation rate across different genomic features (Fig. 2c). Reduced representation
bisulfite sequencing (RRBS) confirmed that >95% of CG sites mutated in the AML
were fully methylated in matched normal bone marrow available for two cases (Fig.
2d). Assessment of the mutated sites in each AML directly revealed ~50%
methylation, indicating the non-mutated CG site on the alternate allele was
methylated (Fig. 2d). Similar results were obtained when we assessed sites mutated
in the MBD4-deficient glioblastoma (Extended Data Fig. 4). We next assessed the
influence of genetic and epigenetic features known to influence mutation rate [14].
Extending the analysis of sequence context to include one base either side of the
CG identified higher mutation rates in the context of a 3’ cytosine (NCGC), with the
highest rate at ACGC (Fig. 2e). The relative mutation rate was not influenced by the
transcriptional strand (Extended Data Fig. 5a), but was higher in late replicating
regions (Fig. 2f) and at lowly expressed genes (Extended Data Fig. 5b).
Collectively these results suggest that while 5mC is the dominant factor contributing
to the mutation rate, the local sequence context, replication timing and expression
status also contribute. The differences between tetramers and enrichment in late
replicating regions were also evident in rare germline CG>TG SNPs from the
gnomAD database [15], indicating this phenomenon is not restricted to cancer
(Extended Data Fig. 5c).
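Two calculations in this paragraph can be sketched in a few lines. The first normalises mutation counts by the number of available sites in each sequence context, which is the general idea behind the trimer-abundance correction; the counts and site totals below are hypothetical and the function names are illustrative only. The second computes the smallest two-sided p-value an exact Mann-Whitney U test can return for two groups without ties. For two groups of five this is 2/252, which rounds to the reported 0.007937. That the comparison involved two groups of five values with complete separation is an inference from the p-value, not something stated in the text.

```python
from math import comb


def corrected_rates(mutation_counts, site_counts):
    """Mutations per available site for each sequence context."""
    return {ctx: mutation_counts[ctx] / site_counts[ctx] for ctx in mutation_counts}


def relative_rates(rates):
    """Express each rate relative to the mean rate across contexts."""
    mean_rate = sum(rates.values()) / len(rates)
    return {ctx: rate / mean_rate for ctx, rate in rates.items()}


def min_two_sided_exact_p(n1, n2):
    """Smallest two-sided p-value of an exact Mann-Whitney U test without ties.

    It is reached when every value in one group exceeds every value in the other.
    """
    return 2 / comb(n1 + n2, n1)


# Hypothetical counts, chosen only to mirror the reported ordering ACG > CCG, GCG > TCG.
toy_mutations = {"ACG": 300, "CCG": 200, "GCG": 180, "TCG": 100}
toy_sites = {"ACG": 1000, "CCG": 1000, "GCG": 900, "TCG": 1000}

rates = corrected_rates(toy_mutations, toy_sites)
relative = relative_rates(rates)
p_min = min_two_sided_exact_p(5, 5)
print(relative)
print(round(p_min, 6))
```

Dividing by site abundance before comparing contexts prevents a context from appearing highly mutated simply because it is common in the genome.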
The three cases with germline MBD4-deficiency shared a common path to AML.
They acquired biallelic DNMT3A mutations and IDH1/IDH2 hotspot mutations, all of
which were CG>TG (Fig. 3). Biallelic DNMT3A mutations are uncommon in AML,
affecting ~3% of patients in TCGA-AML, and when considering they also have
coincident IDH1/IDH2 mutations, it is highly unlikely that these three individuals
share this pattern of driver mutations by chance. These mutations impact 5mC at
multiple levels, namely deposition (DNMT3A), removal (IDH1/IDH2) and repair (MBD4),
and this convergence suggests that modulating DNA methylation is central to AML
pathogenesis in MBD4-deficient cases. Analysis of sequential bone marrow biopsies
taken during treatment and single cell genotyping allowed us to refine the order of
somatic mutation acquisition in two cases (EMC-AML-1, WEHI-AML-1) (Fig. 3a-b,
Extended Data Fig. 6). DNMT3A mutations present in the AML at diagnosis were
also detected in non-malignant bone marrow populations in both cases, indicating
that these mutations are among the first acquired. Mutations in DNMT3A enhance
the self-renewal capacity of haematopoietic stem cells (HSCs) and are associated
with age-related clonal haematopoiesis [16-19]. In both patients, marked expansion of
clones carrying DNMT3A mutations occurred with treatment (Fig. 3a-b), suggesting
a strong advantage over normal HSCs. EMC-AML-1 experienced multiple clonal
outgrowths, with nine distinct DNMT3A mutations, and repeated selection of clones
with biallelic mutations. This shift in functional activity, the expansion of DNMT3A-mutant
clones, increases the likelihood that cells with biallelic DNMT3A mutations
will emerge, which appears to be key for initiating AML in MBD4-deficient patients.
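The argument that the shared driver pattern is unlikely to arise by chance can be made concrete with a rough calculation. This is a sketch under strong assumptions that the text does not make: the three cases are treated as independent, and the ~3% TCGA-AML frequency of biallelic DNMT3A mutations is applied to each. The function name is hypothetical.

```python
def prob_all_share(frequency, n_patients):
    """Probability that n independent patients all carry a feature of the given frequency."""
    return frequency ** n_patients


p_biallelic_dnmt3a = 0.03  # ~3% of TCGA-AML patients, as stated in the text
p_three = prob_all_share(p_biallelic_dnmt3a, 3)
print(f"{p_three:.1e}")
```

Under these assumptions the probability is about 2.7 in 100,000. Requiring a coincident IDH1/IDH2 hotspot mutation in each case would lower it further.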
There is a marked discrepancy between the substantial mutation burden in MBD4-
deficient AMLs and the modest 2-3 fold increase in mutation rate in MBD4-deficient
mice [20,21]. It is unclear whether this difference is a reflection of longer disease latency
in humans, as compared to mice, or whether somatic mutations in the AML further
compromise DNA repair. Mutations in DNMT3A and IDH1/IDH2 have been
associated with altered DNA repair in model systems [22,23]. It also remains unclear
why TDG, a glycosylase with overlapping substrate specificity, does not compensate
for MBD4 loss. One possible explanation stems from the observation that
DNMT3A/B can directly stimulate TDG glycosylase activity [24,25]. We confirmed that
recombinant DNMT3A enhances TDG glycosylase activity in vitro (Fig. 4a), but had
no impact on MBD4 glycosylase activity (Extended Data Fig. 7). Mutant forms of
DNMT3A showed weaker stimulation, and even inhibited TDG at higher concentrations
(Fig. 4a). We propose a model for AML pathogenesis whereby inhibition of DNMT3A
contributes in two ways: loss of one allele enables expansion of a premalignant
clone, then acquisition of a second DNMT3A mutation increases the CG>TG
mutation rate due to impaired TDG activity (Fig. 4b). Supporting this model, the
premalignant clone identified in WEHI-AML-1, which had a monoallelic DNMT3A
mutation, did not carry additional mutations that would suggest an elevated mutation
rate. The sporadic cancers that became MBD4-deficient (TCGA-UVM-1 and TCGA-GBM-1)
did not acquire mutations in DNMT3A or IDH1/IDH2, which may indicate
that this interaction is specific to the haematopoietic compartment.
The last five years have seen a concerted effort to define mutational processes that
shape the cancer genome [5]. Deamination of 5mC is the most common source of
somatic mutations and this damage continues to accumulate with age [26]. Our results
highlight the important role for MBD4 in safeguarding against damage wrought by
5mC deamination. One manifestation of this damage is clonal haematopoiesis, a
phenomenon typically observed in people >70 years of age. Individuals with biallelic
loss of MBD4 in the germline sustain high levels of damage from 5mC deamination
and experience clonal expansions decades earlier, which eventually progress to
AML. There are more than 40 million 5mC residues in the genome, yet these
individuals develop the same type of cancer, AML, with a common set of driver
mutations. Our results indicate this convergence results from the combination of a
highly restricted mutational signature, which accesses a select set of driver genes,
and the dual role of DNMT3A, which regulates HSC function and directly contributes
to DNA repair. This interaction between mutational process, driver landscape and
stem cell biology has broad implications, and may explain the tissue restricted
pattern of disease in this and other cancer predisposition syndromes.
Acknowledgements
The authors would like to thank Simon He, Anita Rijneveld, Kirsten van Lom and
Kirsten Gussinklo for providing clinical information and reviewing samples; Meaghan
Wall for assistance with cytogenetics; Naomi Sprigg for assistance with sample
collection; Elwin Rombouts for assistance with single cell sorting; Hideharu
Hashimoto and Xiaodong Cheng for the TDG expression vector; Sari van Rossum
and Joyce Lebbink for assistance with recombinant protein isolation; the
Australasian Leukaemia and Lymphoma Group for access to clinical samples; and
Stephen Wilcox for technical assistance with sequencing. Additional sequencing was
performed at The Australian Genome Research Facility (Melbourne, Australia) and
the Kinghorn Centre for Clinical Genomics (Sydney, Australia). Sean Grimmond,
Jason Wong, Oliver Sieber, Alicia Oshlack and Stephen Nutt provided valuable
feedback on the manuscript.
This work was made possible through support from the Australian National Health
and Medical Research Council (NHMRC) (Program Grant 1113577, to W.S.A and
A.W.R), an Independent Research Institutes Infrastructure Support Scheme Grant
(9000220), a Victorian State Government Operational Infrastructure Support Grant,
the Netherlands Organisation for Scientific Research (NWO) and the Center for
Translational Molecular Medicine (CTMM). M.A.S is supported by a grant from
CTMM (GR03O-102) and a Rubicon fellowship from NWO (019.153LW.038), E.C. is
a recipient of a PhD scholarship from the Leukaemia Foundation of Australia, A.H. is
a recipient of a PhD scholarship from the Ministry of Health - Sultanate of Oman,
M.E.B is supported by the Bellberry-Viertel fellowship, W.S.A and A.W.R are
supported by fellowships from NHMRC (1058344 and 1079560, respectively), and
I.J.M. is supported by the Victorian Cancer Agency. We wish to acknowledge the

Citations
Journal ArticleDOI
TL;DR: SuperFreq is a cancer exome sequencing analysis pipeline that integrates identification of somatic single nucleotide variants (SNVs) and copy number alterations (CNAs) and clonal tracking for both and can be applied in many different experimental settings for the analysis of exomes and other capture libraries.
Abstract: Analysing multiple cancer samples from an individual patient can provide insight into the way the disease evolves. Monitoring the expansion and contraction of distinct clones helps to reveal the mutations that initiate the disease and those that drive progression. Existing approaches for clonal tracking from sequencing data typically require the user to combine multiple tools that are not purpose-built for this task. Furthermore, most methods require a matched normal (non-tumour) sample, which limits the scope of application. We developed SuperFreq, a cancer exome sequencing analysis pipeline that integrates identification of somatic single nucleotide variants (SNVs) and copy number alterations (CNAs) and clonal tracking for both. SuperFreq does not require a matched normal and instead relies on unrelated controls. When analysing multiple samples from a single patient, SuperFreq cross checks variant calls to improve clonal tracking, which helps to separate somatic from germline variants, and to resolve overlapping CNA calls. To demonstrate our software we analysed 304 cancer-normal exome samples across 33 cancer types in The Cancer Genome Atlas (TCGA) and evaluated the quality of the SNV and CNA calls. We simulated clonal evolution through in silico mixing of cancer and normal samples in known proportion. We found that SuperFreq identified 93% of clones with a cellular fraction of at least 50% and mutations were assigned to the correct clone with high recall and precision. In addition, SuperFreq maintained a similar level of performance for most aspects of the analysis when run without a matched normal. SuperFreq is highly versatile and can be applied in many different experimental settings for the analysis of exomes and other capture libraries. We demonstrate an application of SuperFreq to leukaemia patients with diagnosis and relapse samples.

33 citations

Journal ArticleDOI
TL;DR: This review considers both coding and non-coding driver mutations, and discusses how such mutations might be identified from cancer sequencing datasets, and some of the tools and database that are available for the annotation of somatic variants and the identification of cancer driver genes.
Abstract: In the last decade, the costs of genome sequencing have decreased considerably. The commencement of large-scale cancer sequencing projects has enabled cancer genomics to join the big data revolution. One of the challenges still facing cancer genomics research is determining which are the driver mutations in an individual cancer, as these contribute only a small subset of the overall mutation profile of a tumour. Focusing primarily on somatic single nucleotide mutations in this review, we consider both coding and non-coding driver mutations, and discuss how such mutations might be identified from cancer sequencing datasets. We describe some of the tools and database that are available for the annotation of somatic variants and the identification of cancer driver genes. We also address the use of genome-wide variation in mutation load to establish background mutation rates from which to identify driver mutations under positive selection. Finally, we describe the ways in which mutational signatures can act as clues for the identification of cancer drivers, as these mutations may cause, or arise from, certain mutational processes. By defining the molecular changes responsible for driving cancer development, new cancer treatment strategies may be developed or novel preventative measures proposed.

23 citations


Cites background from "Germline loss of MBD4 predisposes t..."

  • ...numbers of C > T mutations (associated with signature 1, following the deamination of methylated cytosines), researchers uncovered a germline mutation in the DNA glycosylase MBD4 that may predispose cells to subsequently developing certain driver mutations that accelerate oncogenesis (Sanders et al. 2017)....


Posted ContentDOI
30 Jul 2018-bioRxiv
TL;DR: SuperFreq is a cancer exome sequencing analysis pipeline that integrates identification of somatic single nucleotide variants (SNVs) and copy number alterations (CNAs) and clonal tracking for both and can be applied in many different experimental settings for the analysis of exomes and other capture libraries.
Abstract: Motivation Analysing multiple tumour samples from an individual cancer patient allows insight into the way the disease evolves. Monitoring the expansion and contraction of distinct clones helps to reveal the mutations that initiate the disease and those that drive progression; therefore, the ability to identify and track clones using genomics data is of great interest. Existing approaches for clonal tracking typically require the user to combine multiple tools that are not purpose-made. Furthermore, most methods require a matched normal (non-tumour) sample, which limits the scope of application. Results We have built superFreq, a cancer exome sequencing analysis tool that calls and annotates somatic SNVs and CNAs and attributes them to clones. SuperFreq makes use of unrelated control samples and does not require matched normal samples. We demonstrate the ability of superFreq to track clones by combining real samples in known proportions to simulating a multi-sample analysis. In addition, we compared superFreq to other somatic SNV callers and CNA callers on exome sequencing data from cancer-normal pairs, including 304 participants gathered from 33 cancer types in The Cancer Genome Atlas (TCGA). SuperFreq offers a reliable platform to identify somatic mutations and to track clones. SuperFreq recalled 91% of somatic SNVs identified by a consensus of four other methods, with a median of 1 additional somatic SNV per sample that was not found by any other method. CNA calls from superFreq showed good agreement with those generated by Sequenza, or those from ASCAT generated using matched SNP arrays. Using our simulated data set for testing multi-sample clonal tracking, we found that superFreq identified 93% of clones with a cellular fraction of at least 50%, and mutations were assigned to clones with high recall and close to 100% precision. 
In addition, SuperFreq maintained a similar level of performance for most aspects of the analysis without a matched normal control. SuperFreq is a highly adaptable method and has already been used in multiple different projects. Availability SuperFreq is implemented in R and available on github at https://github.com/ChristofferFlensburg/superFreq.

22 citations


Cites background from "Germline loss of MBD4 predisposes t..."

  • ...SuperFreq was designed to detect and track somatic mutations in exomes, and it has been applied to study breast cancer metastasis [2, 21], lung cancer xenografts [22], gastric cancer organoids [23], and myeloid leukaemia [24]....


Posted ContentDOI
16 Jan 2018-bioRxiv
TL;DR: Molecular processes similar to those that shape population-scale human genome variation also underlie the rapid evolution of an infant ultra-mutated leukemia, one of the earliest manifestations of cancer hypermutation recorded.
Abstract: Background: Mixed lineage leukemia/Histone-lysine N-methyltransferase 2A gene rearrangements occur in 80% of infant acute lymphoblastic leukemia, but the role of cooperating events is unknown. While infant leukemias typically carry few somatic lesions, we identified a case with over 100 somatic point mutations per megabase and here report unique genomic features of this case. Results: The patient presented at 82 days of age, one of the earliest manifestations of cancer hypermutation recorded. The transcriptional profile showed global similarities to canonical cases. Coding lesions were predominantly clonal and almost entirely targeting alleles reported in human genetic variation databases, with a notable exception in the mismatch repair gene MSH2. There were no rare germline alleles or somatic mutations affecting the proof-reading polymerase genes POLE or POLD1; however, there was a predicted damaging mutation in the error-prone replicative polymerase POLK. The patient's diagnostic leukemia transcriptome was depleted of rare and low-frequency germline alleles due to loss-of-heterozygosity, while somatic point mutations targeted low-frequency and common human alleles in proportions that offset this discrepancy. Somatic signatures of ultra-mutations were highly correlated with germline single nucleotide polymorphic sites, indicating a common role for 5-methylcytosine deamination, DNA mismatch repair and DNA adducts. Conclusions: These data suggest that molecular processes similar to those shaping population-scale human genome variation also underlie the rapid evolution of an infant ultra-mutated leukemia.
References
Journal ArticleDOI
TL;DR: It is shown that the DNA methyltransferase Dnmt3a interacts with thymine DNA glycosylase (TDG) in vitro, suggesting a mechanistic link between DNA repair and remethylation at sites affected by methylcytosine deamination.
Abstract: While methylcytosines serve as the fifth base encoding epigenetic information, they are also a dangerous endogenous mutagen due to their intrinsic instability. Methylcytosine undergoes spontaneous deamination, at a rate much higher than cytosine, to generate thymine. In mammals, two repair enzymes, thymine DNA glycosylase (TDG) and methyl-CpG binding domain 4 (MBD4), have evolved to counteract the mutagenic effect of methylcytosines. Both recognize G/T mismatches arising from methylcytosine deamination and initiate base-excision repair that corrects them to G/C pairs. However, the mechanism by which the methylation status of the repaired cytosines is restored has remained unknown. We show here that the DNA methyltransferase Dnmt3a interacts with TDG. Both the PWWP domain and the catalytic domain of Dnmt3a are able to mediate the interaction with TDG at its N-terminus. The interaction affects the enzymatic activity of both proteins: Dnmt3a positively regulates the glycosylase activity of TDG, while TDG inhibits the methylation activity of Dnmt3a in vitro. These data suggest a mechanistic link between DNA repair and remethylation at sites affected by methylcytosine deamination.

126 citations

Journal ArticleDOI
TL;DR: The simplicity, power, and flexibility of this tool make it valuable for visualizing tumor evolution, and it has potential utility in both research and clinical settings.
Abstract: Massively-parallel sequencing at depth is now enabling tumor heterogeneity and evolution to be characterized in unprecedented detail. Tracking these changes in clonal architecture often provides insight into therapeutic response and resistance. In complex cases involving multiple timepoints, standard visualizations, such as scatterplots, can be difficult to interpret. Current data visualization methods are also typically manual and laborious, and often only approximate subclonal fractions. We have developed an R package that accurately and intuitively displays changes in clonal structure over time. It requires simple input data and produces illustrative and easy-to-interpret graphs suitable for diagnosis, presentation, and publication. The simplicity, power, and flexibility of this tool make it valuable for visualizing tumor evolution, and it has potential utility in both research and clinical settings. The fishplot package is available at https://github.com/chrisamiller/fishplot .

121 citations

Journal ArticleDOI
TL;DR: It is demonstrated that MUTYH inactivation results in a particular mutational signature, which may serve as a useful marker of BER‐related genomic instability in new cancer types.
Abstract: Germline alterations in DNA repair genes are implicated in cancer predisposition and can result in characteristic mutational signatures. However, specific mutational signatures associated with base excision repair (BER) defects remain to be characterized. Here, by analysing a series of colorectal cancers (CRCs) using exome sequencing, we identified a particular spectrum of somatic mutations characterized by an enrichment of C > A transversions in NpCpA or NpCpT contexts in three tumours from a MUTYH-associated polyposis (MAP) patient and in two cases harbouring pathogenic germline MUTYH mutations. In two series of adrenocortical carcinomas (ACCs), we identified four tumours with a similar signature also presenting germline MUTYH mutations. Taken together, these findings demonstrate that MUTYH inactivation results in a particular mutational signature, which may serve as a useful marker of BER-related genomic instability in new cancer types. Copyright © 2017 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.

111 citations

Journal ArticleDOI
TL;DR: The recognition mechanism of flipped-out 5hmU bases in the MBD4cat active site supports the potential role of MBD4, together with TDG, in maintenance of genome stability and active DNA demethylation in mammals.
Abstract: Active DNA demethylation in mammals occurs via hydroxylation of 5-methylcytosine to 5-hydroxymethylcytosine (5hmC) by the ten-eleven translocation family of proteins (TETs). 5hmC residues in DNA can be further oxidized by TETs to 5-carboxylcytosines and/or deaminated by the Activation Induced Deaminase/Apolipoprotein B mRNA-editing enzyme complex family proteins to 5-hydroxymethyluracil (5hmU). Excision and replacement of these intermediates is initiated by DNA glycosylases such as thymine-DNA glycosylase (TDG), methyl-binding domain protein 4 (MBD4) and single-strand specific monofunctional uracil-DNA glycosylase 1 in the base excision repair pathway. Here, we report detailed biochemical and structural characterization of human MBD4, which contains mismatch-specific TDG activity. The full-length protein as well as the catalytic domain (residues 426–580) of human MBD4 (MBD4cat) can remove 5hmU opposite G with good efficiency. We also report six crystal structures of human MBD4cat: an unliganded form and five binary complexes with duplex DNA containing a T•G, 5hmU•G or AP•G (apurinic/apyrimidinic) mismatch at the target base pair. These structures reveal that MBD4cat uses a base flipping mechanism to specifically recognize thymine and 5hmU. The recognition mechanism of flipped-out 5hmU bases in the MBD4cat active site supports the potential role of MBD4, together with TDG, in maintenance of genome stability and active DNA demethylation in mammals.

77 citations

Journal ArticleDOI
TL;DR: It is demonstrated that Tdg and Dnmt3b colocalize to heterochromatin, that T.G mismatch repair efficiency is reduced upon loss of DNA methyltransferase expression, and that an RNA component is required for correct T.G mismatch repair.

52 citations
