Showing papers by "John Douglas Mcpherson" published in 2014


Journal ArticleDOI
20 Feb 2014-Nature
TL;DR: Highly purified haematopoietic stem cells, progenitor and mature cell fractions from the blood of AML patients were found to contain recurrent DNMT3A mutations at high allele frequency, but without coincident NPM1 mutations (NPM1c) present in AML blasts.
Abstract: In acute myeloid leukaemia (AML), the cell of origin, nature and biological consequences of initiating lesions, and order of subsequent mutations remain poorly understood, as AML is typically diagnosed without observation of a pre-leukaemic phase. Here, highly purified haematopoietic stem cells (HSCs), progenitor and mature cell fractions from the blood of AML patients were found to contain recurrent DNMT3A mutations (DNMT3A(mut)) at high allele frequency, but without coincident NPM1 mutations (NPM1c) present in AML blasts. DNMT3A(mut)-bearing HSCs showed a multilineage repopulation advantage over non-mutated HSCs in xenografts, establishing their identity as pre-leukaemic HSCs. Pre-leukaemic HSCs were found in remission samples, indicating that they survive chemotherapy. Therefore DNMT3A(mut) arises early in AML evolution, probably in HSCs, leading to a clonally expanded pool of pre-leukaemic HSCs from which AML evolves. Our findings provide a paradigm for the detection and treatment of pre-leukaemic clones before the acquisition of additional genetic lesions engenders greater therapeutic resistance.

1,142 citations


Journal ArticleDOI
17 Apr 2014-Nature
TL;DR: This corrigendum to Nature 506, 328–333 (2014) adds Fouad Yousif (Ontario Institute for Cancer Research) to the author list and to the Author Contributions for performing and analysing targeted sequencing.
Abstract: Nature 506, 328–333 (2014); doi:10.1038/nature13038 Author Fouad Yousif (of the Ontario Institute for Cancer Research, Toronto, Canada) should have been included in the author list after Andrew M. K. Brown with affiliation number 7 and listed in the Author Contributions as performing and analysing targeted sequencing; these omissions have been corrected in the online versions of this Article.

466 citations


Journal ArticleDOI
TL;DR: Functional studies demonstrated that this kinase-activating alteration likely constitutes a driver of PLGA.
Abstract: Polymorphous low-grade adenocarcinoma (PLGA) is the second most frequent type of malignant tumor of the minor salivary glands. We identified PRKD1 hotspot mutations encoding p.Glu710Asp in 72.9% of PLGAs but not in other salivary gland tumors. Functional studies demonstrated that this kinase-activating alteration likely constitutes a driver of PLGA.

176 citations


Journal ArticleDOI
TL;DR: Next-generation sequencing identified causative or candidate genes for 13 of 27 Noonan syndrome (NS) patients lacking known NS-associated mutations, including gain-of-function alleles in Ras-like without CAAX 1 (RIT1) and mitogen-activated protein kinase kinase 1 (MAP2K1) and previously unseen loss-of-function variants in RAS p21 protein activator 2 (RASA2) that are likely to cause NS in these patients.
Abstract: Noonan syndrome (NS) is a relatively common genetic disorder, characterized by typical facies, short stature, developmental delay, and cardiac abnormalities. Known causative genes account for 70–80% of clinically diagnosed NS patients, but the genetic basis for the remaining 20–30% of cases is unknown. We performed next-generation sequencing on germ-line DNA from 27 NS patients lacking a mutation in the known NS genes. We identified gain-of-function alleles in Ras-like without CAAX 1 (RIT1) and mitogen-activated protein kinase kinase 1 (MAP2K1) and previously unseen loss-of-function variants in RAS p21 protein activator 2 (RASA2) that are likely to cause NS in these patients. Expression of the mutant RASA2, MAP2K1, or RIT1 alleles in heterologous cells increased RAS-ERK pathway activation, supporting a causative role in NS pathogenesis. Two patients had more than one disease-associated variant. Moreover, the diagnosis of an individual initially thought to have NS was revised to neurofibromatosis type 1 based on an NF1 nonsense mutation detected in this patient. Another patient harbored a missense mutation in NF1 that resulted in decreased protein stability and impaired ability to suppress RAS-ERK activation; however, this patient continues to exhibit a NS-like phenotype. In addition, a nonsense mutation in RPS6KA3 was found in one patient initially diagnosed with NS whose diagnosis was later revised to Coffin–Lowry syndrome. Finally, we identified other potential candidates for new NS genes, as well as potential carrier alleles for unrelated syndromes. Taken together, our data suggest that next-generation sequencing can provide a useful adjunct to RASopathy diagnosis and emphasize that the standard clinical categories for RASopathies might not be adequate to describe all patients.

156 citations


Journal ArticleDOI
TL;DR: NGS had the greatest detection sensitivity, largest dynamic range of detection and highest accuracy in differential expression analysis when compared with gold-standard quantitative real-time PCR, showing the superior sensitivity, accuracy and robustness of NGS for the comprehensive profiling of miRNAs in both frozen and FFPE tissues.

122 citations


Journal ArticleDOI
TL;DR: Analysis of morphological characters reinterpreted on a 27-gene paleognath topology indicates that many characters are convergent in the ratites, probably as the result of adaptation to a cursorial life style.
Abstract: One of the most startling discoveries in avian molecular phylogenetics is that the volant tinamous are embedded in the flightless ratites, but this topology remains controversial because recent morphological phylogenies place tinamous as the closest relative of a monophyletic ratite clade. Here, we integrate new phylogenomic sequences from 1,448 nuclear DNA loci totaling almost 1 million bp from the extinct little bush moa, Chilean tinamou, and emu with available sequences from ostrich, elegant crested tinamou, four neognaths, and the green anole. Phylogenetic analysis using standard homogeneous models and heterogeneous models robust to common topological artifacts recovered compelling support for ratite paraphyly with the little bush moa closest to tinamous within ratites. Ratite paraphyly was further corroborated by eight independent CR1 retroposon insertions. Analysis of morphological characters reinterpreted on a 27-gene paleognath topology indicates that many characters are convergent in the ratites, probably as the result of adaptation to a cursorial life style.

82 citations


Journal ArticleDOI
TL;DR: The results suggest that the quality and safety of human iPSCs might be enhanced by using antioxidants in the growth media during the generation and maintenance of iPSCs.
Abstract: Somatic cells can be reprogrammed to induced pluripotent stem cells (iPSCs) using oncogenic transcription factors. However, this method leads to genetic aberrations in iPSCs via unknown mechanisms, which may limit their clinical use. Here, we demonstrate that the supplementation of growth media with antioxidants reduces the genome instability of cells transduced with the reprogramming factors. Antioxidant supplementation did not affect transgene expression level or silencing kinetics. Importantly, iPSCs made with antioxidants had significantly fewer de novo copy number variations, but not fewer coding point mutations, than iPSCs made without antioxidants. Our results suggest that the quality and safety of human iPSCs might be enhanced by using antioxidants in the growth media during the generation and maintenance of iPSCs.

72 citations


Journal ArticleDOI
01 Oct 2014-Cancer
TL;DR: This study describes the knowledge, attitudes, and expectations of patients with advanced cancer toward genomic testing in cancer (GTC), which characterizes genes that play an important role in the development and growth of a patient's cancer.
Abstract: BACKGROUND Genomic testing in cancer (GTC) characterizes genes that play an important role in the development and growth of a patient's cancer. This form of DNA testing is currently being studied for its ability to guide cancer therapy. The objective of the current study was to describe patients' knowledge, attitudes, and expectations toward GTC. METHODS A 42-item self-administered GTC questionnaire was developed by a multidisciplinary group and patient pretesting. The questionnaire was distributed to patients with advanced cancer who were referred to the Princess Margaret Cancer Center for a phase 1 clinical trial or GTC testing. RESULTS Results were reported from 98 patients with advanced cancer, representing 66% of the patients surveyed. Seventy-six percent of patients were interested in learning more about GTC, and 64% reported that GTC would significantly improve their cancer care. The median score on a 12-item questionnaire to assess knowledge of cancer genomics was 8 of 12 items correct (67%; interquartile range, 7-9 of 12 items correct [58%-75%]). Scores were associated significantly with patients' education level (P < .0001). Sixty-six percent of patients would consent to a needle biopsy, and 39% would consent to an invasive surgical biopsy if required for GTC. Only 48% of patients reported having sufficient knowledge to make an informed decision to pursue GTC whereas 34% of patients indicated a need for formal genetic counseling. CONCLUSIONS Patients with advanced cancer are motivated to participate in GTC. Patients require further education to understand the difference between somatic and germline mutations in the context of GTC. Educational programs are needed to support patients interested in pursuing GTC. Cancer 2014;120:3066–3073. © 2014 American Cancer Society.

68 citations


Journal ArticleDOI
TL;DR: ShatterProof is computationally efficient, having low memory requirements and near linear computation time, allowing it to become a standard component of sequencing analysis pipelines, enabling researchers to routinely and accurately assess samples for chromothripsis.
Abstract: Background Chromothripsis, a newly discovered type of complex genomic rearrangement, has been implicated in the evolution of several types of cancers. To date, it has been described in bone cancer, SHH-medulloblastoma and acute myeloid leukemia, amongst others, however there are still no formal or automated methods for detecting or annotating it in high throughput sequencing data. As such, findings of chromothripsis are difficult to compare and many cases likely escape detection altogether.

51 citations


Journal ArticleDOI
TL;DR: Genetic, expression and immunohistochemical data implicate COLCA1 and COLCA2 in the pathogenesis of colon cancer.
Abstract: A locus on human chromosome 11q23 tagged by marker rs3802842 was associated with colorectal cancer (CRC) in a genome-wide association study; this finding has been replicated in case–control studies worldwide. In order to identify biologic factors at this locus that are related to the etiopathology of CRC, we used microarray-based target selection methods, coupled to next-generation sequencing, to study 103 kb at the 11q23 locus. We genotyped 369 putative variants from 1,030 patients with CRC (cases) and 1,061 individuals without CRC (controls) from the Ontario Familial Colorectal Cancer Registry. Two previously uncharacterized genes, COLCA1 and COLCA2, were found to be co-regulated genes that are transcribed from opposite strands. Expression levels of COLCA1 and COLCA2 transcripts correlate with rs3802842 genotypes. In colon tissues, COLCA1 co-localizes with crystalloid granules of eosinophils and granular organelles of mast cells, neutrophils, macrophages, dendritic cells and differentiated myeloid-derived cell lines. COLCA2 is present in the cytoplasm of normal epithelial, immune and other cell lineages, as well as tumor cells. Tissue microarray analysis demonstrates the association of rs3802842 with lymphocyte density in the lamina propria (p = 0.014) and levels of COLCA1 in the lamina propria (p = 0.00016) and COLCA2 (tumor cells, p = 0.0041 and lamina propria, p = 6 × 10–5). In conclusion, genetic, expression and immunohistochemical data implicate COLCA1 and COLCA2 in the pathogenesis of colon cancer. Histologic analyses indicate the involvement of immune pathways.

43 citations


Journal ArticleDOI
01 Apr 2014-PLOS ONE
TL;DR: The results confirm the value of targeted MPS for investigating DD/ID in children for diagnostic purposes; however, targeted gene MPS was less likely to provide a genetic diagnosis for children whose phenotype includes autism.
Abstract: Developmental delay and/or intellectual disability (DD/ID) affects 1–3% of all children. At least half of these are thought to have a genetic etiology. Recent studies have shown that massively parallel sequencing (MPS) using a targeted gene panel is particularly suited for diagnostic testing for genetically heterogeneous conditions. We report on our experiences with using massively parallel sequencing of a targeted gene panel of 355 genes for investigating the genetic etiology of eight patients with a wide range of phenotypes including DD/ID, congenital anomalies and/or autism spectrum disorder. Targeted sequence enrichment was performed using the Agilent SureSelect Target Enrichment Kit and sequenced on the Illumina HiSeq2000 using paired-end reads. For all eight patients, 81–84% of the targeted regions achieved read depths of at least 20×, with average read depths overlapping targets ranging from 322× to 798×. Causative variants were successfully identified in two of the eight patients: a nonsense mutation in the ATRX gene and a canonical splice site mutation in the L1CAM gene. In a third patient, a canonical splice site variant in the USP9X gene could likely explain all or some of her clinical phenotypes. These results confirm the value of targeted MPS for investigating DD/ID in children for diagnostic purposes. However, targeted gene MPS was less likely to provide a genetic diagnosis for children whose phenotype includes autism.
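
The two coverage figures quoted above (the share of targeted bases at 20× or more, and the average depth over targets) can be illustrated with a minimal sketch. It assumes per-base read depths over the targeted regions are already available as a list of integers; the function name `coverage_summary` and the input format are hypothetical and are not taken from the study's pipeline.

```python
def coverage_summary(depths, min_depth=20):
    """Return (fraction of bases at >= min_depth, mean depth).

    `depths` is a hypothetical list of per-base read depths across
    all targeted positions; it is an assumption of this sketch.
    """
    if not depths:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths), sum(depths) / len(depths)


# Toy example: five targeted positions, three of them at >= 20x.
fraction, mean_depth = coverage_summary([0, 10, 20, 30, 40])
print(fraction, mean_depth)
```

A single summary like this hides how evenly depth is distributed, which is why the abstract reports both the breadth at 20× and the average depth.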

Journal ArticleDOI
TL;DR: A revolution in DNA sequencing technology has enabled new insights from thousands of genomes sequenced across taxa.
Abstract: A revolution in DNA sequencing technology has enabled new insights from thousands of genomes sequenced across taxa.

Journal ArticleDOI
TL;DR: WaveCNV, a software package to identify copy number alterations by detecting breakpoints of CNVs using translation-invariant discrete wavelet transforms and assign digitized copy numbers to each event using next-generation sequencing data, is developed.
Abstract: Motivation: Copy number variations (CNVs) are a major source of genomic variability and are especially significant in cancer. Until recently microarray technologies have been used to characterize CNVs in genomes. However, advances in next-generation sequencing technology offer significant opportunities to deduce copy number directly from genome sequencing data. Unfortunately cancer genomes differ from normal genomes in several aspects that make them far less amenable to copy number detection. For example, cancer genomes are often aneuploid and an admixture of diploid/non-tumor cell fractions. Also patient-derived xenograft models can be laden with mouse contamination that strongly affects accurate assignment of copy number. Hence, there is a need to develop analytical tools that can take into account cancer-specific parameters for detecting CNVs directly from genome sequencing data. Results: We have developed WaveCNV, a software package to identify copy number alterations by detecting breakpoints of CNVs using translation-invariant discrete wavelet transforms and assign digitized copy numbers to each event using next-generation sequencing data. We also assign alleles specifying the chromosomal ratio following duplication/loss. We verified copy number calls using both microarray (correlation coefficient 0.97) and quantitative polymerase chain reaction (correlation coefficient 0.94) and found them to be highly concordant. We demonstrate its utility in pancreatic primary and xenograft sequencing data. Availability and implementation: Source code and executables are available at https://github.com/WaveCNV. The segmentation algorithm is implemented in MATLAB, and copy number assignment is implemented in Perl. Contact: lakshmi.muthuswamy@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.

Journal ArticleDOI
11 Apr 2014-PLOS ONE
TL;DR: Each individual sample is sequenced in multiple pools providing more accurate variant calling than a single pool or a multiplexed approach, which provides a powerful method for rare variant detection in regions of interest at a reduced cost to the researcher.
Abstract: We describe a method for pooling and sequencing DNA from a large number of individual samples while preserving information regarding sample identity. DNA from 576 individuals was arranged into four 12 row by 12 column matrices and then pooled by row and by column resulting in 96 total pools with 12 individuals in each pool. Pooling of DNA was carried out in a two-dimensional fashion, such that DNA from each individual is present in exactly one row pool and exactly one column pool. By considering the variants observed in the rows and columns of a matrix we are able to trace rare variants back to the specific individuals that carry them. The pooled DNA samples were enriched over a 250 kb region previously identified by GWAS to significantly predispose individuals to lung cancer. All 96 pools (12 row and 12 column pools from 4 matrices) were barcoded and sequenced on an Illumina HiSeq 2000 instrument with an average depth of coverage greater than 4,000×. Verification based on Ion PGM sequencing confirmed the presence of 91.4% of confidently classified SNVs assayed. In this way, each individual sample is sequenced in multiple pools providing more accurate variant calling than a single pool or a multiplexed approach. This provides a powerful method for rare variant detection in regions of interest at a reduced cost to the researcher.
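
The row-and-column pooling design described above can be sketched as follows. This is a minimal illustration, assuming individuals are addressed by hypothetical `(matrix, row, column)` indices and that each pool has already been called as positive or negative for a given variant; the function names are invented for this sketch and do not come from the paper.

```python
from itertools import product

MATRICES, ROWS, COLS = 4, 12, 12  # 4 x 12 x 12 = 576 individuals


def pools_for(matrix, row, col):
    """Return the one row pool and one column pool containing an individual."""
    return ("row", matrix, row), ("col", matrix, col)


def all_pools():
    """Enumerate every distinct pool in the design."""
    pools = set()
    for m, r, c in product(range(MATRICES), range(ROWS), range(COLS)):
        pools.update(pools_for(m, r, c))
    return pools


def candidate_carriers(positive_pools):
    """Individuals whose row pool AND column pool both show the variant."""
    positive = set(positive_pools)
    return [
        (m, r, c)
        for m, r, c in product(range(MATRICES), range(ROWS), range(COLS))
        if all(p in positive for p in pools_for(m, r, c))
    ]


print(len(all_pools()))  # 12 row + 12 column pools in each of 4 matrices
print(candidate_carriers([("row", 0, 3), ("col", 0, 7)]))
```

One property of this kind of design is worth noting: if two carriers sit in different rows and different columns of the same matrix, the intersection yields four candidates rather than two, so the decoding is unambiguous only for variants that are rare within a matrix.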

Posted ContentDOI
24 Dec 2014-bioRxiv
TL;DR: This benchmarking exercise has highlighted several fundamental parameters to consider in high-throughput sequencing, which will allow for better optimization and planning of both basic and translational studies.
Abstract: As next-generation sequencing becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Through the International Cancer Genome Consortium (ICGC), we compared sequencing pipelines at five independent centers (CNAG, DKFZ, OICR, RIKEN and WTSI) using a single tumor-blood DNA pair. Analyses by each center and with one standardized algorithm revealed significant discrepancies. Although most pipelines performed well for coding mutations, library preparation methods and sequencing coverage metrics clearly influenced downstream results. PCR-free methods showed reduced GC-bias and more even coverage. Increasing sequencing depth to ~100x (about three times current standards) showed a benefit, as long as the tumor:control coverage ratio remained balanced. To become part of routine clinical care, high-throughput sequencing must be globally compatible and comparable. This benchmarking exercise has highlighted several fundamental parameters to consider in this regard, which will allow for better optimization and planning of both basic and translational studies.

Journal ArticleDOI
13 Feb 2014-PLOS ONE
TL;DR: Both STR and SNP genotyping methods of sample identification are evaluated, with a focus on paired FFPE tumor/normal DNA samples intended for next-generation sequencing (NGS), to enable cost savings by reducing rework.
Abstract: Short tandem repeat (STR) analysis, such as the AmpFlSTR® Identifiler® Plus kit, is a standard, PCR-based human genotyping method used in the field of forensics. Misidentification of cell line and tissue DNA can be costly if not detected early; therefore it is necessary to have quality control measures such as STR profiling in place. A major issue in large-scale research studies involving archival formalin-fixed paraffin embedded (FFPE) tissues is that varying levels of DNA degradation can result in failure to correctly identify samples using STR genotyping. PCR amplification of STRs of several hundred base pairs is not always possible when DNA is degraded. The Sample ID Plus® panel from Sequenom allows for human DNA identification and authentication using SNP genotyping. In comparison to lengthy STR amplicons, this multiplexing PCR assay requires amplification of only 76–139 base pairs, and utilizes 47 SNPs to discriminate between individual samples. In this study, we evaluated both STR and SNP genotyping methods of sample identification, with a focus on paired FFPE tumor/normal DNA samples intended for next-generation sequencing (NGS). The ability to successfully validate the identity of FFPE samples can enable cost savings by reducing rework.
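
The idea of confirming that a tumor and a normal sample come from the same person by comparing SNP genotypes can be sketched as below. This is an illustrative assumption-laden example, not the Sequenom panel's software: the genotype encoding (two-letter strings, `None` for a failed call), the SNP identifiers, and the function name `genotype_concordance` are all hypothetical.

```python
def genotype_concordance(sample_a, sample_b):
    """Compare genotype calls at SNPs successfully called in both samples.

    Each sample is a dict mapping a SNP identifier to a two-letter
    genotype string such as "AG", or None when the call failed
    (an encoding assumed for this sketch).
    Returns (number of SNPs compared, fraction matching or None).
    """
    shared = [
        snp
        for snp in sample_a
        if sample_a[snp] is not None and sample_b.get(snp) is not None
    ]
    if not shared:
        return 0, None
    # Sort alleles so that "AG" and "GA" count as the same genotype.
    matches = sum(
        1 for snp in shared if sorted(sample_a[snp]) == sorted(sample_b[snp])
    )
    return len(shared), matches / len(shared)


# Hypothetical calls; SNP names are placeholders, not panel markers.
tumor = {"snp1": "AG", "snp2": "CC", "snp3": None, "snp4": "TT"}
normal = {"snp1": "GA", "snp2": "CC", "snp3": "AA", "snp4": "CT"}
print(genotype_concordance(tumor, normal))
```

Reporting the number of SNPs compared alongside the concordance matters for degraded FFPE DNA, where failed calls can leave too few markers to discriminate between individuals. Any pass/fail threshold would be a study-specific choice and is deliberately left out here.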

Journal ArticleDOI
TL;DR: SeqControl, a framework for predicting sequencing quality and coverage using a set of 15 metrics describing overall coverage, coverage distribution, basewise coverage and basewise quality, is developed.
Abstract: As high-throughput sequencing continues to increase in speed and throughput, routine clinical and industrial application draws closer. These 'production' settings will require enhanced quality monitoring and quality control to optimize output and reduce costs. We developed SeqControl, a framework for predicting sequencing quality and coverage using a set of 15 metrics describing overall coverage, coverage distribution, basewise coverage and basewise quality. Using whole-genome sequences of 27 prostate cancers and 26 normal references, we derived multivariate models that predict sequencing quality and depth. SeqControl robustly predicted how much sequencing was required to reach a given coverage depth (area under the curve (AUC) = 0.993), accurately classified clinically relevant formalin-fixed, paraffin-embedded samples, and made predictions from as little as one-eighth of a sequencing lane (AUC = 0.967). These techniques can be immediately incorporated into existing sequencing pipelines to monitor data quality in real time. SeqControl is available at http://labs.oicr.on.ca/Boutros-lab/software/SeqControl/.
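
For readers unfamiliar with the AUC values quoted above, the following minimal sketch computes the area under the ROC curve using its standard rank-based definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The abstract does not state how SeqControl computes AUC, so this is a generic illustration; the function name and the toy scores are assumptions.

```python
def auc(scores_pos, scores_neg):
    """Rank-based AUC: P(positive score > negative score), ties count 0.5."""
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Toy scores: e.g. predicted probability that a sample reaches target depth.
print(auc([0.9, 0.8, 0.4], [0.5, 0.3]))
```

An AUC of 1.0 means every positive case outranks every negative case, and 0.5 corresponds to random ordering.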

Posted ContentDOI
24 Dec 2014-bioRxiv
TL;DR: It is concluded that somatic mutation calling remains an unsolved problem and critical issues that need to be addressed before this valuable technology can be routinely used to inform clinical decision-making are highlighted.
Abstract: The emergence of next generation DNA sequencing technology is enabling high-resolution cancer genome analysis. Large-scale projects like the International Cancer Genome Consortium (ICGC) are systematically scanning cancer genomes to identify recurrent somatic mutations. Second generation DNA sequencing, however, is still an evolving technology and procedures, both experimental and analytical, are constantly changing. Thus the research community is still defining a set of best practices for cancer genome data analysis, with no single protocol emerging to fulfil this role. Here we describe an extensive benchmark exercise to identify and resolve issues of somatic mutation calling. Whole genome sequence datasets comprising tumor-normal pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, were shared within the ICGC and submissions of somatic mutation calls were compared to verified mutations and to each other. Varying strategies to call mutations, incomplete awareness of sources of artefacts, and even lack of agreement on what constitutes an artefact or real mutation manifested in widely varying mutation call rates and somewhat low concordance among submissions. We conclude that somatic mutation calling remains an unsolved problem. However, we have identified many issues that are easy to remedy that are presented here. Our study highlights critical issues that need to be addressed before this valuable technology can be routinely used to inform clinical decision-making.

Book ChapterDOI
01 Oct 2014
TL;DR: This personalized medicine approach brings the hope that, by understanding the specific underlying genetic alterations driving malignant growth of a tumor, treatment regimens specifically targeting the aberrant gene will be possible, improving treatment outcomes.
Abstract: Cancer is a disease of the genome and all cancers arise due to alterations in DNA that allow the cell to override the checks and balances that are essential for controlled growth and differentiation. To date, the clinical management of cancer has largely been guided by knowing the tissue of origin and histopathological appearance of the tumor with only a limited use of measurement of altered protein expression and DNA aberrations. Sanger/capillary sequencing is the clinical mainstay for DNA sequencing but suffers from low detection sensitivity in heterogeneous tumor cell populations and limited throughput. Next-generation sequencing platforms enable massively parallel analysis of DNA generating millions of template sequences simultaneously. This heralds a new era of cancer genome investigation whereby the determination of the full spectrum of molecular changes in a tumor is now possible on a large scale. Rapid DNA analysis to derive a molecular profile of a tumour with respect to somatic mutations can be used to match specific drugs or treatments to the individual patient. This personalized medicine approach brings the hope that by understanding the specific underlying genetic alterations driving malignant growth of a tumor, treatment regimens specifically targeting the aberrant gene will be possible improving treatment outcomes.