Journal ArticleDOI

Genome sequencing of 161 Mycobacterium tuberculosis isolates from China identifies genes and intergenic regions associated with drug resistance

TL;DR: It is suggested that the drug resistance-associated genes identified here likely contain essentially all the nonsynonymous SNPs that have arisen as a result of drug pressure in these isolates and should represent a near-complete set of drug resistance-associated genes for these isolates and antibiotics.
Abstract: The worldwide emergence of multidrug-resistant (MDR) and extensively drug-resistant (XDR) tuberculosis threatens to make this disease incurable. Drug resistance mechanisms are only partially understood, and whether the current understanding of the genetic basis of drug resistance in M. tuberculosis is sufficiently comprehensive remains unclear. Here we sequenced and analyzed 161 isolates with a range of drug resistance profiles, discovering 72 new genes, 28 intergenic regions (IGRs), 11 nonsynonymous SNPs and 10 IGR SNPs with strong, consistent associations with drug resistance. On the basis of our examination of the dN/dS ratios of nonsynonymous to synonymous SNPs among the isolates, we suggest that the drug resistance-associated genes identified here likely contain essentially all the nonsynonymous SNPs that have arisen as a result of drug pressure in these isolates and should thus represent a near-complete set of drug resistance-associated genes for these isolates and antibiotics. Our work indicates that the genetic basis of drug resistance is more complex than previously anticipated and provides a strong foundation for elucidating unknown drug resistance mechanisms.
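The dN/dS reasoning in the abstract can be illustrated with a small worked example. The sketch below is not the authors' pipeline: it is a simplified Nei-Gojobori-style estimate under stated assumptions (standard codon assignments, SNPs given as 0-based positions in a single in-frame coding sequence, changes to stop codons counted as nonsynonymous, no multiple-hit correction). The function names `synonymous_sites`, `classify_snp` and `dn_ds` are hypothetical.

```python
from itertools import product

# Standard codon assignments in TCAG order (assumed adequate for illustration).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(codon):
    """Expected number of synonymous sites in one codon (between 0 and 3)."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            # Each of the 3 possible changes at a position contributes 1/3.
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                s += 1.0 / 3.0
    return s


def classify_snp(ref_codon, alt_codon):
    """Label a single-codon change by whether the amino acid is preserved."""
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same else "nonsynonymous"


def dn_ds(ref_cds, snps):
    """Crude dN/dS for one coding sequence.

    ref_cds: in-frame reference coding sequence (A/C/G/T).
    snps: list of (0-based position in ref_cds, alternate base).
    Returns None when no synonymous change is observed.
    """
    usable = len(ref_cds) - len(ref_cds) % 3
    codons = [ref_cds[i:i + 3] for i in range(0, usable, 3)]
    syn_sites = sum(synonymous_sites(c) for c in codons)
    nonsyn_sites = 3 * len(codons) - syn_sites

    syn_changes = 0
    nonsyn_changes = 0
    for pos, alt in snps:
        ref_codon = codons[pos // 3]
        offset = pos % 3
        alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
        if classify_snp(ref_codon, alt_codon) == "synonymous":
            syn_changes += 1
        else:
            nonsyn_changes += 1

    if syn_changes == 0 or syn_sites == 0:
        return None
    # Changes per available site, nonsynonymous over synonymous.
    return (nonsyn_changes / nonsyn_sites) / (syn_changes / syn_sites)
```

For the toy sequence `ATGGCTGAA` with one synonymous change (`GCT` to `GCC`) and one nonsynonymous change (`GAA` to `AAA`), `dn_ds("ATGGCTGAA", [(5, "C"), (6, "A")])` gives 4/23, about 0.17. Values well above 1 in a gene set are usually read as a sign of positive selection, such as drug pressure.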
Citations
Journal ArticleDOI
Ludmil B. Alexandrov, Serena Nik-Zainal, David C. Wedge, Samuel Aparicio, Sam Behjati, Andrew V. Biankin, Graham R. Bignell, Niccolo Bolli, Åke Borg, Anne Lise Børresen-Dale, Sandrine Boyault, Birgit Burkhardt, Adam Butler, Carlos Caldas, Helen Davies, Christine Desmedt, Roland Eils, Jorunn E. Eyfjord, John A. Foekens, Mel Greaves, Fumie Hosoda, Barbara Hutter, Tomislav Ilicic, Sandrine Imbeaud, Marcin Imielinski, Natalie Jäger, David T. W. Jones, David T. Jones, Stian Knappskog, Marcel Kool, Sunil R. Lakhani, Carlos López-Otín, Sancha Martin, Nikhil C. Munshi, Hiromi Nakamura, Paul A. Northcott, Marina Pajic, Elli Papaemmanuil, Angelo Paradiso, John V. Pearson, Xose S. Puente, Keiran Raine, Manasa Ramakrishna, Andrea L. Richardson, Julia Richter, Philip Rosenstiel, Matthias Schlesner, Ton N. Schumacher, Paul N. Span, Jon W. Teague, Yasushi Totoki, Andrew Tutt, Rafael Valdés-Mas, Marit M. van Buuren, Laura van ’t Veer, Anne Vincent-Salomon, Nicola Waddell, Lucy R. Yates, ICGC PedBrain, Jessica Zucman-Rossi, P. Andrew Futreal, Ultan McDermott, Peter Lichter, Matthew Meyerson, Sean M. Grimmond, Reiner Siebert, Elias Campo, Tatsuhiro Shibata, Stefan M. Pfister, Peter J. Campbell, Michael R. Stratton
22 Aug 2013-Nature
TL;DR: It is shown that hypermutation localized to small genomic regions, ‘kataegis’, is found in many cancer types, and the results reveal the diversity of mutational processes underlying the development of cancer.
Abstract: All cancers are caused by somatic mutations; however, understanding of the biological processes generating these mutations is limited. The catalogue of somatic mutations from a cancer genome bears the signatures of the mutational processes that have been operative. Here we analysed 4,938,362 mutations from 7,042 cancers and extracted more than 20 distinct mutational signatures. Some are present in many cancer types, notably a signature attributed to the APOBEC family of cytidine deaminases, whereas others are confined to a single cancer class. Certain signatures are associated with age of the patient at cancer diagnosis, known mutagenic exposures or defects in DNA maintenance, but many are of cryptic origin. In addition to these genome-wide mutational signatures, hypermutation localized to small genomic regions, 'kataegis', is found in many cancer types. The results reveal the diversity of mutational processes underlying the development of cancer, with potential implications for understanding of cancer aetiology, prevention and therapy.

7,904 citations

Journal ArticleDOI
15 Mar 2018-Nature
TL;DR: The data suggest that 7–8% of the children in this cohort carry an unambiguous predisposing germline variant and that nearly 50% of paediatric neoplasms harbour a potentially druggable event, which is highly relevant for the design of future clinical trials.
Abstract: Pan-cancer analyses that examine commonalities and differences among various cancer types have emerged as a powerful way to obtain novel insights into cancer biology. Here we present a comprehensive analysis of genetic alterations in a pan-cancer cohort including 961 tumours from children, adolescents, and young adults, comprising 24 distinct molecular types of cancer. Using a standardized workflow, we identified marked differences in terms of mutation frequency and significantly mutated genes in comparison to previously analysed adult cancers. Genetic alterations in 149 putative cancer driver genes separate the tumours into two classes: small mutation and structural/copy-number variant (correlating with germline variants). Structural variants, hyperdiploidy, and chromothripsis are linked to TP53 mutation status and mutational signatures. Our data suggest that 7-8% of the children in this cohort carry an unambiguous predisposing germline variant and that nearly 50% of paediatric neoplasms harbour a potentially druggable event, which is highly relevant for the design of future clinical trials.

958 citations

Journal ArticleDOI
TL;DR: In this article, the authors analyzed 127 pediatric HGGs, including diffuse intrinsic pontine gliomas (DIPGs) and non-brainstem HGGs (NBS-HGGs), by whole-genome, whole-exome and/or transcriptome sequencing.
Abstract: Pediatric high-grade glioma (HGG) is a devastating disease with a less than 20% survival rate 2 years after diagnosis. We analyzed 127 pediatric HGGs, including diffuse intrinsic pontine gliomas (DIPGs) and non-brainstem HGGs (NBS-HGGs), by whole-genome, whole-exome and/or transcriptome sequencing. We identified recurrent somatic mutations in ACVR1 exclusively in DIPGs (32%), in addition to previously reported frequent somatic mutations in histone H3 genes, TP53 and ATRX, in both DIPGs and NBS-HGGs. Structural variants generating fusion genes were found in 47% of DIPGs and NBS-HGGs, with recurrent fusions involving the neurotrophin receptor genes NTRK1, NTRK2 and NTRK3 in 40% of NBS-HGGs in infants. Mutations targeting receptor tyrosine kinase-RAS-PI3K signaling, histone modification or chromatin remodeling, and cell cycle regulation were found in 68%, 73% and 59% of pediatric HGGs, respectively, including in DIPGs and NBS-HGGs. This comprehensive analysis provides insights into the unique and shared pathways driving pediatric HGG within and outside the brainstem.

848 citations

Journal ArticleDOI
TL;DR: Heuristics for reliably detecting gene fusion events in RNA-seq data are developed and applied to nearly 7,000 samples from The Cancer Genome Atlas and are able to discover several novel and recurrent fusions involving kinases.
Abstract: Human cancer genomes harbour a variety of alterations leading to the deregulation of key pathways in tumour cells. The genomic characterization of tumours has uncovered numerous genes recurrently mutated, deleted or amplified, but gene fusions have not been characterized as extensively. Here we develop heuristics for reliably detecting gene fusion events in RNA-seq data and apply them to nearly 7,000 samples from The Cancer Genome Atlas. We thereby are able to discover several novel and recurrent fusions involving kinases. These findings have immediate clinical implications and expand the therapeutic options for cancer patients, as approved or exploratory drugs exist for many of these kinases.

746 citations

Journal ArticleDOI
TL;DR: In this article, the authors combine two large sets of high-throughput sequencing data to delineate the genetic alterations and affected pathways in grade II and III gliomas, which comprise three distinct subtypes characterized by discrete sets of mutations and distinct clinical behaviors.
Abstract: Grade II and III gliomas are generally slowly progressing brain cancers, many of which eventually transform into more aggressive tumors. Despite recent findings of frequent mutations in IDH1 and other genes, knowledge about their pathogenesis is still incomplete. Here, combining two large sets of high-throughput sequencing data, we delineate the entire picture of genetic alterations and affected pathways in these glioma types, with sensitive detection of driver genes. Grade II and III gliomas comprise three distinct subtypes characterized by discrete sets of mutations and distinct clinical behaviors. Mutations showed significant positive and negative correlations and a chronological hierarchy, as inferred from different allelic burdens among coexisting mutations, suggesting that there is functional interplay between the mutations that drive clonal selection. Extensive serial and multi-regional sampling analyses further supported this finding and also identified a high degree of temporal and spatial heterogeneity generated during tumor expansion and relapse, which is likely shaped by the complex but ordered processes of multiple clonal selection and evolutionary events.

675 citations

References
Journal ArticleDOI
TL;DR: By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.
Abstract: DAVID bioinformatics resources consists of an integrated biological knowledgebase and analytic tools aimed at systematically extracting biological meaning from large gene/protein lists. This protocol explains how to use DAVID, a high-throughput and integrated data-mining environment, to analyze gene lists derived from high-throughput genomic experiments. The procedure first requires uploading a gene list containing any number of common gene identifiers followed by analysis using one or more text and pathway-mining tools such as gene functional classification, functional annotation chart or clustering and functional annotation table. By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.

31,015 citations

Journal ArticleDOI
11 Jun 1998-Nature
TL;DR: The complete genome sequence of the best-characterized strain of Mycobacterium tuberculosis, H37Rv, has been determined and analysed in order to improve the understanding of the biology of this slow-growing pathogen and to help the conception of new prophylactic and therapeutic interventions.
Abstract: Countless millions of people have died from tuberculosis, a chronic infectious disease caused by the tubercle bacillus. The complete genome sequence of the best-characterized strain of Mycobacterium tuberculosis, H37Rv, has been determined and analysed in order to improve our understanding of the biology of this slow-growing pathogen and to help the conception of new prophylactic and therapeutic interventions. The genome comprises 4,411,529 base pairs, contains around 4,000 genes, and has a very high guanine + cytosine content that is reflected in the biased amino-acid content of the proteins. M. tuberculosis differs radically from other bacteria in that a very large portion of its coding capacity is devoted to the production of enzymes involved in lipogenesis and lipolysis, and to two new families of glycine-rich proteins with a repetitive structure that may represent a source of antigenic variation.

7,779 citations

Journal ArticleDOI
TL;DR: A new algorithm for finding tandem repeats which works without the need to specify either the pattern or pattern size is presented and its ability to detect tandem repeats that have undergone extensive mutational change is demonstrated.
Abstract: A tandem repeat in DNA is two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats have been shown to cause human disease, may play a variety of regulatory and evolutionary roles and are important laboratory and analytic tools. Extensive knowledge about pattern size, copy number, mutational history, etc. for tandem repeats has been limited by the inability to easily detect them in genomic sequence data. In this paper, we present a new algorithm for finding tandem repeats which works without the need to specify either the pattern or pattern size. We model tandem repeats by percent identity and frequency of indels between adjacent pattern copies and use statistically based recognition criteria. We demonstrate the algorithm’s speed and its ability to detect tandem repeats that have undergone extensive mutational change by analyzing four sequences: the human frataxin gene, the human β T cell receptor locus sequence and two yeast chromosomes. These sequences range in size from 3 kb up to 700 kb. A World Wide Web server interface at c3.biomath.mssm.edu/trf.html has been established for automated use of the program.

6,577 citations

Journal ArticleDOI
TL;DR: SOAP2 is a significantly improved version of the short oligonucleotide alignment program that both reduces computer memory usage and increases alignment speed at an unprecedented rate and is compatible with both single- and paired-end reads.
Abstract: SOAP2 is a significantly improved version of the short oligonucleotide alignment program that both reduces computer memory usage and increases alignment speed at an unprecedented rate. We used a Burrows Wheeler Transformation (BWT) compression index to substitute the seed strategy for indexing the reference sequence in the main memory. We tested it on the whole human genome and found that this new algorithm reduced memory usage from 14.7 to 5.4 GB and improved alignment speed by 20-30 times. SOAP2 is compatible with both single- and paired-end reads. Additionally, this tool now supports multiple text and compressed file formats. A consensus builder has also been developed for consensus assembly and SNP detection from alignment of short reads on a reference genome.
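To make the idea of a BWT-based index concrete, here is a toy sketch of exact read matching by backward search over a Burrows-Wheeler transform. It is not SOAP2's implementation: the function names (`build_bwt`, `build_fm_index`, `find`) are hypothetical, the suffix array is built by naive sorting, and the occurrence table is stored uncompressed, so it only suits short illustrative sequences.

```python
def build_bwt(text):
    """Return (BWT string, suffix array) for text plus a '$' terminator."""
    text += "$"  # '$' sorts before A, C, G, T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT is the character preceding each sorted suffix.
    return "".join(text[i - 1] for i in sa), sa


def build_fm_index(bwt_str):
    """Build the two lookup tables used by backward search.

    C[c]: number of characters in the text strictly smaller than c.
    occ[c][i]: number of occurrences of c in bwt_str[:i].
    """
    counts = {}
    for ch in bwt_str:
        counts[ch] = counts.get(ch, 0) + 1
    C = {}
    total = 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]
    occ = {ch: [0] * (len(bwt_str) + 1) for ch in counts}
    for i, ch in enumerate(bwt_str):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return C, occ


def find(pattern, C, occ, sa):
    """Return sorted 0-based positions where pattern occurs exactly."""
    lo, hi = 0, len(sa)
    # Extend the match one character at a time, from the pattern's end.
    for ch in reversed(pattern):
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


reference = "ACGTACGTTACG"
bwt_str, sa = build_bwt(reference)
C, occ = build_fm_index(bwt_str)
hits = find("ACG", C, occ, sa)  # positions 0, 4 and 9
```

Each query takes time proportional to the read length rather than the reference length, which is the property that makes this family of indexes attractive for short-read alignment. Production aligners additionally compress the tables and handle mismatches, which this sketch does not attempt.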

3,502 citations

Journal ArticleDOI
TL;DR: A novel method based on strain-dependent hybridization patterns of in vitro-amplified DNA with multiple spacer oligonucleotides was found to differentiate M. bovis from M. tuberculosis, a distinction which is often difficult to make by traditional methods.
Abstract: Widespread use of DNA restriction fragment length polymorphism (RFLP) to differentiate strains of Mycobacterium tuberculosis to monitor the transmission of tuberculosis has been hampered by the need to culture this slow-growing organism and by the level of technical sophistication needed for RFLP typing. We have developed a simple method which allows simultaneous detection and typing of M. tuberculosis in clinical specimens and reduces the time between suspicion of the disease and typing from 1 or several months to 1 or 3 days. The method is based on polymorphism of the chromosomal DR locus, which contains a variable number of short direct repeats interspersed with nonrepetitive spacers. The method is referred to as spacer oligotyping or "spoligotyping" because it is based on strain-dependent hybridization patterns of in vitro-amplified DNA with multiple spacer oligonucleotides. Most of the clinical isolates tested showed unique hybridization patterns, whereas outbreak strains shared the same spoligotype. The types obtained from direct examination of clinical samples were identical to those obtained by using DNA from cultured M. tuberculosis. This preliminary study shows that the novel method may be a useful tool for rapid disclosure of linked outbreak cases in a community, in hospitals, or in other institutions and for monitoring of transmission of multidrug-resistant M. tuberculosis. Unexpectedly, spoligotyping was found to differentiate M. bovis from M. tuberculosis, a distinction which is often difficult to make by traditional methods.

2,845 citations
