Showing papers in "Nature Genetics in 2014"


Journal Article
TL;DR: The ability of CADD to prioritize functional, deleterious and pathogenic variants across many functional categories, effect sizes and genetic architectures is unmatched by any current single-annotation method.
Abstract: Our capacity to sequence human genomes has exceeded our ability to interpret genetic variation. Current genomic annotations tend to exploit a single information type (e.g. conservation) and/or are restricted in scope (e.g. to missense changes). Here, we describe Combined Annotation Dependent Depletion (CADD), a framework that objectively integrates many diverse annotations into a single, quantitative score. We implement CADD as a support vector machine trained to differentiate 14.7 million high-frequency human derived alleles from 14.7 million simulated variants. We pre-compute “C-scores” for all 8.6 billion possible human single nucleotide variants and enable scoring of short insertions/deletions. C-scores correlate with allelic diversity, annotations of functionality, pathogenicity, disease severity, experimentally measured regulatory effects, and complex trait associations, and highly rank known pathogenic variants within individual genomes. The ability of CADD to prioritize functional, deleterious, and pathogenic variants across many functional categories, effect sizes and genetic architectures is unmatched by any current annotation.

4,956 citations


Journal Article
Andrew R. Wood, Tõnu Esko, Jian Yang, Sailaja Vedantam +441 more; Institutions (132)
TL;DR: This article identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height, and all common variants together captured 60% of heritability.
Abstract: Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700 and ∼9,500 SNPs explained ∼21%, ∼24% and ∼29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.

1,872 citations


Journal Article
TL;DR: This article conducted a meta-analysis of Parkinson's disease genome-wide association studies using a common set of 7,893,274 variants across 13,708 cases and 95,282 controls.
Abstract: We conducted a meta-analysis of Parkinson's disease genome-wide association studies using a common set of 7,893,274 variants across 13,708 cases and 95,282 controls. Twenty-six loci were identified as having genome-wide significant association; these and 6 additional previously reported loci were then tested in an independent set of 5,353 cases and 5,551 controls. Of the 32 tested SNPs, 24 replicated, including 6 newly identified loci. Conditional analyses within loci showed that four loci, including GBA, GAK-DGKQ, SNCA and the HLA region, contain a secondary independent risk variant. In total, we identified and replicated 28 independent risk variants for Parkinson's disease across 24 loci. Although the effect of each individual locus was small, risk profile analysis showed substantial cumulative risk in a comparison of the highest and lowest quintiles of genetic risk (odds ratio (OR) = 3.31, 95% confidence interval (CI) = 2.55–4.30; P = 2 × 10^−16). We also show six risk loci associated with proximal gene expression or DNA methylation.

1,636 citations
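
The odds ratio and confidence interval quoted in the abstract above can be sanity-checked with a short calculation. The following is a minimal sketch, assuming the interval is a Wald-type 95% interval that is symmetric on the log-odds scale with a critical value of 1.96; the abstract does not say how the interval was computed, and the function names are illustrative only.

```python
import math

def log_or_and_se(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Return the log odds ratio and the standard error implied by a reported CI.

    Assumes the CI is symmetric on the log scale (Wald-type interval);
    this is an assumption, not something stated in the abstract.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_or, se

def ci_from_log_or(log_or, se, z_crit=1.96):
    """Rebuild the CI on the odds-ratio scale from the log OR and its SE."""
    return math.exp(log_or - z_crit * se), math.exp(log_or + z_crit * se)

# Values quoted in the abstract: OR = 3.31, 95% CI = 2.55-4.30
log_or, se = log_or_and_se(3.31, 2.55, 4.30)
low, high = ci_from_log_or(log_or, se)
print(f"log(OR) = {log_or:.3f}, implied SE = {se:.3f}")
print(f"reconstructed 95% CI = {low:.2f} to {high:.2f}")
```

Under that assumption the reconstructed bounds agree with the reported interval to two decimal places, which suggests the reported interval is consistent with symmetry on the log scale.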


Journal Article
TL;DR: It was found that 73–75% of identified ccRCC driver aberrations were subclonal, confounding estimates of driver mutation prevalence, and the proportion of C>T transitions at CpG sites increased during tumor progression.
Abstract: Clear cell renal carcinomas (ccRCCs) can display intratumor heterogeneity (ITH). We applied multiregion exome sequencing (M-seq) to resolve the genetic architecture and evolutionary histories of ten ccRCCs. Ultra-deep sequencing identified ITH in all cases. We found that 73-75% of identified ccRCC driver aberrations were subclonal, confounding estimates of driver mutation prevalence. ITH increased with the number of biopsies analyzed, without evidence of saturation in most tumors. Chromosome 3p loss and VHL aberrations were the only ubiquitous events. The proportion of C>T transitions at CpG sites increased during tumor progression. M-seq permits the temporal resolution of ccRCC evolution and refines mutational signatures occurring during tumor development.

1,105 citations


Journal Article
TL;DR: Two independent domestications from genetic pools that diverged before human colonization are confirmed, and a set of genes linked with increased leaf and seed size is identified and combined with quantitative trait locus data from Mesoamerican cultivars.
Abstract: Common bean (Phaseolus vulgaris L.) is the most important grain legume for human consumption and has a role in sustainable agriculture owing to its ability to fix atmospheric nitrogen. We assembled 473 Mb of the 587-Mb genome and genetically anchored 98% of this sequence in 11 chromosome-scale pseudomolecules. We compared the genome for the common bean against the soybean genome to find changes in soybean resulting from polyploidy. Using resequencing of 60 wild individuals and 100 landraces from the genetically differentiated Mesoamerican and Andean gene pools, we confirmed 2 independent domestications from genetic pools that diverged before human colonization. Less than 10% of the 74 Mb of sequence putatively involved in domestication was shared by the two domestication events. We identified a set of genes linked with increased leaf and seed size and combined these results with quantitative trait locus data from Mesoamerican cultivars. Genes affected by domestication may be useful for genomics-enabled crop improvement.

1,012 citations


Journal Article
TL;DR: Autism's narrow-sense heritability is estimated at ∼52.4%, with most due to common variation; rare de novo mutations contribute substantially to individual liability, yet their contribution to variance in liability, 2.6%, is modest compared to that of heritable variation.
Abstract: Joseph Buxbaum and colleagues use an epidemiological sample from Sweden to investigate the genetic architecture of autism spectrum disorders. They conclude that most inherited risk for autism is determined by common variation and that rare variation explains a smaller fraction of total heritability.

1,011 citations


Journal Article
TL;DR: The most comprehensive exploration of genetic loci influencing human metabolism thus far, comprising 7,824 adult individuals from 2 European population studies, is reported, with genome-wide significant associations at 145 metabolic loci and their biochemical connectivity with more than 400 metabolites in human blood.
Abstract: Genome-wide association scans with high-throughput metabolic profiling provide unprecedented insights into how genetic variation influences metabolism and complex disease. Here we report the most comprehensive exploration of genetic loci influencing human metabolism thus far, comprising 7,824 adult individuals from 2 European population studies. We report genome-wide significant associations at 145 metabolic loci and their biochemical connectivity with more than 400 metabolites in human blood. We extensively characterize the resulting in vivo blueprint of metabolism in human blood by integrating it with information on gene expression, heritability and overlap with known loci for complex disorders, inborn errors of metabolism and pharmacological targets. We further developed a database and web-based resources for data mining and results visualization. Our findings provide new insights into the role of inherited variation in blood metabolic diversity and identify potential new opportunities for drug development and for understanding disease.

985 citations


Journal Article
Anubha Mahajan, Min Jin Go, Weihua Zhang, Jennifer E. Below +392 more; Institutions (104)
TL;DR: In this paper, the authors aggregated published meta-analyses of genome-wide association studies (GWAS), including 26,488 cases and 83,964 controls of European, east Asian, south Asian and Mexican and Mexican American ancestry.
Abstract: To further understanding of the genetic basis of type 2 diabetes (T2D) susceptibility, we aggregated published meta-analyses of genome-wide association studies (GWAS), including 26,488 cases and 83,964 controls of European, east Asian, south Asian and Mexican and Mexican American ancestry. We observed a significant excess in the directional consistency of T2D risk alleles across ancestry groups, even at SNPs demonstrating only weak evidence of association. By following up the strongest signals of association from the trans-ethnic meta-analysis in an additional 21,491 cases and 55,647 controls of European ancestry, we identified seven new T2D susceptibility loci. Furthermore, we observed considerable improvements in the fine-mapping resolution of common variant association signals at several T2D susceptibility loci. These observations highlight the benefits of trans-ethnic GWAS for the discovery and characterization of complex trait loci and emphasize an exciting opportunity to extend insight into the genetic architecture and pathogenesis of human diseases across populations of diverse ancestry.

954 citations


Journal Article
TL;DR: This model is used to identify ∼1,000 genes that are significantly lacking in functional coding variation in non-ASD samples and are enriched for de novo loss-of-function mutations identified in ASD cases, suggesting that the role of de novo mutations in ASDs might reside in fundamental neurodevelopmental processes.
Abstract: Mark Daly and colleagues present a statistical framework to evaluate the role of de novo mutations in human disease by calibrating a model of de novo mutation rates at the individual gene level. The mutation probabilities defined by their model and list of constrained genes can be used to help identify genetic variants that have a significant role in disease.

952 citations


Journal Article
TL;DR: The performance of Platypus is demonstrated by comparing with SAMtools and GATK on whole-genome and exome-capture data, by identifying de novo variation in 15 parent-offspring trios with high sensitivity and specificity, and by estimating human leukocyte antigen genotypes directly from variant calls.
Abstract: High-throughput DNA sequencing technology has transformed genetic research and is starting to make an impact on clinical practice. However, analyzing high-throughput sequencing data remains challenging, particularly in clinical settings where accuracy and turnaround times are critical. We present a new approach to this problem, implemented in a software package called Platypus. Platypus achieves high sensitivity and specificity for SNPs, indels and complex polymorphisms by using local de novo assembly to generate candidate variants, followed by local realignment and probabilistic haplotype estimation. It is an order of magnitude faster than existing tools and generates calls from raw aligned read data without preprocessing. We demonstrate the performance of Platypus in clinically relevant experimental designs by comparing with SAMtools and GATK on whole-genome and exome-capture data, by identifying de novo variation in 15 parent-offspring trios with high sensitivity and specificity, and by estimating human leukocyte antigen genotypes directly from variant calls.

940 citations


Journal Article
TL;DR: An exome-wide association study of liver fat content and adeno-associated virus–mediated short hairpin RNA knockdown of Tm6sf2 in mice indicate that TM6SF2 activity is required for normal VLDL secretion and that impaired TM6SF2 function causally contributes to NAFLD.
Abstract: Helen Hobbs, Jonathan Cohen and colleagues identify a nonsynonymous variant in TM6SF2 associated with susceptibility to nonalcoholic fatty liver disease (NAFLD). They further show that knockdown of Tm6sf2 in mice results in increased liver triglyceride content and reduced very-low-density lipoprotein (VLDL) secretion, suggesting that impaired TM6SF2 function contributes causally to disease risk.

Journal Article
TL;DR: Results from applying multiple sequentially Markovian coalescent (MSMC) to genome sequences from nine populations across the world suggest that the genetic separation of non-African ancestors from African Yoruban ancestors started long before 50,000 years ago and give information about human population history as recent as 2,000 years ago.
Abstract: The availability of complete human genome sequences from populations across the world has given rise to new population genetic inference methods that explicitly model ancestral relationships under recombination and mutation. So far, application of these methods to evolutionary history more recent than 20,000-30,000 years ago and to population separations has been limited. Here we present a new method that overcomes these shortcomings. The multiple sequentially Markovian coalescent (MSMC) analyzes the observed pattern of mutations in multiple individuals, focusing on the first coalescence between any two individuals. Results from applying MSMC to genome sequences from nine populations across the world suggest that the genetic separation of non-African ancestors from African Yoruban ancestors started long before 50,000 years ago and give information about human population history as recent as 2,000 years ago, including the bottleneck in the peopling of the Americas and separations within Africa, East Asia and Europe.

Journal Article
TL;DR: A new hormone, erythroferrone (ERFE), is identified that mediates hepcidin suppression during stress erythropoiesis and is greatly increased in Hbb(th3/+) mice with thalassemia intermedia, where it contributes to the suppression of hepcidin and the systemic iron overload characteristic of this disease.
Abstract: Recovery from blood loss requires a greatly enhanced supply of iron to support expanded erythropoiesis. After hemorrhage, suppression of the iron-regulatory hormone hepcidin allows increased iron absorption and mobilization from stores. We identified a new hormone, erythroferrone (ERFE), that mediates hepcidin suppression during stress erythropoiesis. ERFE is produced by erythroblasts in response to erythropoietin. ERFE-deficient mice fail to suppress hepcidin rapidly after hemorrhage and exhibit a delay in recovery from blood loss. ERFE expression is greatly increased in Hbb(th3/+) mice with thalassemia intermedia, where it contributes to the suppression of hepcidin and the systemic iron overload characteristic of this disease.

Journal Article
TL;DR: The advantages and pitfalls of MLMA methods as a function of study design are described and quantified, and recommendations for the application of these methods in practical settings are provided.
Abstract: Mixed linear models are emerging as a method of choice for conducting genetic association studies in humans and other organisms. The advantages of the mixed-linear-model association (MLMA) method include the prevention of false positive associations due to population or relatedness structure and an increase in power obtained through the application of a correction that is specific to this structure. An underappreciated point is that MLMA can also increase power in studies without sample structure by implicitly conditioning on associated loci other than the candidate locus. Numerous variations on the standard MLMA approach have recently been published, with a focus on reducing computational cost. These advances provide researchers applying MLMA methods with many options to choose from, but we caution that MLMA methods are still subject to potential pitfalls. Here we describe and quantify the advantages and pitfalls of MLMA methods as a function of study design and provide recommendations for the application of these methods in practical settings.

Journal Article
TL;DR: In this article, the authors analyzed 127 pediatric HGGs, including diffuse intrinsic pontine gliomas (DIPGs) and non-brainstem HGGs (NBS-HGGs), by whole-genome, whole-exome and/or transcriptome sequencing.
Abstract: Pediatric high-grade glioma (HGG) is a devastating disease with a less than 20% survival rate 2 years after diagnosis. We analyzed 127 pediatric HGGs, including diffuse intrinsic pontine gliomas (DIPGs) and non-brainstem HGGs (NBS-HGGs), by whole-genome, whole-exome and/or transcriptome sequencing. We identified recurrent somatic mutations in ACVR1 exclusively in DIPGs (32%), in addition to previously reported frequent somatic mutations in histone H3 genes, TP53 and ATRX, in both DIPGs and NBS-HGGs. Structural variants generating fusion genes were found in 47% of DIPGs and NBS-HGGs, with recurrent fusions involving the neurotrophin receptor genes NTRK1, NTRK2 and NTRK3 in 40% of NBS-HGGs in infants. Mutations targeting receptor tyrosine kinase-RAS-PI3K signaling, histone modification or chromatin remodeling, and cell cycle regulation were found in 68%, 73% and 59% of pediatric HGGs, respectively, including in DIPGs and NBS-HGGs. This comprehensive analysis provides insights into the unique and shared pathways driving pediatric HGG within and outside the brainstem.

Journal Article
TL;DR: A multidimensional and comprehensive genomic landscape that highlights the molecular complexity of gastric cancer and provides a road map to facilitate genome-guided personalized therapy is illustrated.
Abstract: Gastric cancer is a heterogeneous disease with diverse molecular and histological subtypes. We performed whole-genome sequencing in 100 tumor-normal pairs, along with DNA copy number, gene expression and methylation profiling, for integrative genomic analysis. We found subtype-specific genetic and epigenetic perturbations and unique mutational signatures. We identified previously known (TP53, ARID1A and CDH1) and new (MUC6, CTNNA2, GLI3, RNF43 and others) significantly mutated driver genes. Specifically, we found RHOA mutations in 14.3% of diffuse-type tumors but not in intestinal-type tumors (P < 0.001). The mutations clustered in recurrent hotspots affecting functional domains and caused defective RHOA signaling, promoting escape from anoikis in organoid cultures. The top perturbed pathways in gastric cancer included adherens junction and focal adhesion, in which RHOA and other mutated genes we identified participate as key players. These findings illustrate a multidimensional and comprehensive genomic landscape that highlights the molecular complexity of gastric cancer and provides a road map to facilitate genome-guided personalized therapy.

Journal Article
TL;DR: The genome size of the hot pepper was approximately fourfold larger than that of its close relative tomato, and the genome showed an accumulation of Gypsy and Caulimoviridae family elements.
Abstract: Doil Choi and colleagues report the genome sequence of the hot pepper, Capsicum annuum, as well as the resequencing of two cultivated peppers and a wild species, Capsicum chinense. Comparative genomic analysis across Solanaceae provides insights into genome expansion, pungency, ripening and disease resistance in hot peppers.

Journal Article
TL;DR: Comparative transcriptome studies showed the key role of the nucleotide binding site (NBS)-encoding gene family in resistance to Verticillium dahliae and the involvement of ethylene in the development of cotton fiber cells.
Abstract: Yu-Xian Zhu, Jun Wang, Shuxun Yu and colleagues report sequencing and assembly of the genome of cultivated cotton, Gossypium arboreum. Comparison with the Gossypium raimondii genome sequence provides insights into genome evolution and speciation, and identifies two shared whole-genome duplication events occurring before the speciation event around 2–13 million years ago.

Journal Article
TL;DR: The 1000 bull genomes project supports the goal of accelerating the rates of genetic gain in domestic cattle while at the same time considering animal health and welfare by providing the annotated sequence variants and genotypes of key ancestor bulls.
Abstract: The 1000 bull genomes project supports the goal of accelerating the rates of genetic gain in domestic cattle while at the same time considering animal health and welfare by providing the annotated sequence variants and genotypes of key ancestor bulls. In the first phase of the 1000 bull genomes project, we sequenced the whole genomes of 234 cattle to an average of 8.3-fold coverage. This sequencing includes data for 129 individuals from the global Holstein-Friesian population, 43 individuals from the Fleckvieh breed and 15 individuals from the Jersey breed. We identified a total of 28.3 million variants, with an average of 1.44 heterozygous sites per kilobase for each individual. We demonstrate the use of this database in identifying a recessive mutation underlying embryonic death and a dominant mutation underlying lethal chondrodysplasia. We also performed genome-wide association studies for milk production and curly coat, using imputed sequence variants, and identified variants associated with these traits in cattle.

Journal Article
TL;DR: A comprehensive analysis of tomato evolution based on the genome sequences of 360 accessions provides evidence that domestication and improvement focused on two independent sets of quantitative trait loci (QTLs), resulting in modern tomato fruit ∼100 times larger than its ancestor.
Abstract: The histories of crop domestication and breeding are recorded in genomes. Although tomato is a model species for plant biology and breeding, the nature of human selection that altered its genome remains largely unknown. Here we report a comprehensive analysis of tomato evolution based on the genome sequences of 360 accessions. We provide evidence that domestication and improvement focused on two independent sets of quantitative trait loci (QTLs), resulting in modern tomato fruit ∼100 times larger than its ancestor. Furthermore, we discovered a major genomic signature for modern processing tomatoes, identified the causative variants that confer pink fruit color and precisely visualized the linkage drag associated with wild introgressions. This study outlines the accomplishments as well as the costs of historical selection and provides molecular insights toward further improvement.

Journal Article
Laurent C. Francioli, Androniki Menelaou, Sara L. Pulit, Freerk van Dijk, Pier Francesco Palamara, Clara C. Elbers, Pieter B. Neerincx, Kai Ye, Victor Guryev, Wigard P. Kloosterman, Patrick Deelen, Abdel Abdellaoui, Elisabeth M. van Leeuwen, Mannis van Oven, Martijn Vermaat, Mingkun Li, Jeroen F. J. Laros, Lennart C. Karssen, Alexandros Kanterakis, Najaf Amin, Jouke-Jan Hottenga, Eric-Wubbo Lameijer, Mathijs Kattenberg, Martijn Dijkstra, Heorhiy Byelas, Jessica van Setten, Barbera D. C. van Schaik, Jan Bot, Isaac J. Nijman, Ivo Renkens, Tobias Marschall, Alexander Schönhuth, Jayne Y. Hehir-Kwa, Robert E. Handsaker, Paz Polak, Mashaal Sohail, Dana Vuzman, Fereydoun Hormozdiari, David van Enckevort, Hailiang Mei, Vyacheslav Koval, Matthijs Moed, K. Joeri van der Velde, Fernando Rivadeneira, Karol Estrada, Carolina Medina-Gomez, Aaron Isaacs, Steven A. McCarroll, Marian Beekman, Anton J. M. de Craen, H. Eka D. Suchiman, Albert Hofman, Ben A. Oostra, André G. Uitterlinden, Gonneke Willemsen, Mathieu Platteel, Jan H. Veldink, Leonard H. van den Berg, Steven J. Pitts, Shobha Potluri, Purnima Sundar, David R. Cox, Shamil R. Sunyaev, Johan T. den Dunnen, Mark Stoneking, Peter de Knijff, Manfred Kayser, Qibin Li, Yingrui Li, Yuanping Du, Ruoyan Chen, Hongzhi Cao, Ning Li, Sujie Cao, Jun Wang, Jasper A. Bovenberg, Itsik Pe'er, P. Eline Slagboom, Cornelia M. van Duijn, Dorret I. Boomsma, Gert-Jan B. van Ommen, Paul I.W. de Bakker, Morris A. Swertz, Cisca Wijmenga
TL;DR: The Genome of the Netherlands (GoNL) Project is described, in which the whole genomes of 250 Dutch parent-offspring families were sequenced and a haplotype map of 20.4 million single-nucleotide variants and 1.2 million insertions and deletions was constructed.
Abstract: Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch parent-offspring families and constructed a haplotype map of 20.4 million single-nucleotide variants and 1.2 million insertions and deletions. The intermediate coverage (∼13×) and trio design enabled extensive characterization of structural variation, including midsize events (30-500 bp) previously poorly catalogued and de novo mutations. We demonstrate that the quality of the haplotypes boosts imputation accuracy in independent samples, especially for lower frequency alleles. Population genetic analyses demonstrate fine-scale structure across the country and support multiple ancient migrations, consistent with historical changes in sea level and flooding. The GoNL Project illustrates how single-population whole-genome sequencing can provide detailed characterization of genetic variation and may guide the design of future population studies.

Journal Article
TL;DR: It is found that massive gene loss occurred in the wake of sex-chromosome 'birth' in Cynoglossus semilaevis, and the sex chromosomes of these fish are derived from the same ancestral vertebrate protochromosome as the avian W and Z chromosomes.
Abstract: Genetic sex determination by W and Z chromosomes has developed independently in different groups of organisms. To better understand the evolution of sex chromosomes and the plasticity of sex-determination mechanisms, we sequenced the whole genomes of a male (ZZ) and a female (ZW) half-smooth tongue sole (Cynoglossus semilaevis). In addition to insights into adaptation to a benthic lifestyle, we find that the sex chromosomes of these fish are derived from the same ancestral vertebrate protochromosome as the avian W and Z chromosomes. Notably, the same gene on the Z chromosome, dmrt1, which is the male-determining gene in birds, showed convergent evolution of features that are compatible with a similar function in tongue sole. Comparison of the relatively young tongue sole sex chromosomes with those of mammals and birds identified events that occurred during the early phase of sex-chromosome evolution. Pertinent to the current debate about heterogametic sex-chromosome decay, we find that massive gene loss occurred in the wake of sex-chromosome 'birth'.

Journal Article
TL;DR: This study provides new insights into the genetic basis of follicular lymphoma and the clonal dynamics of transformation and suggests that personalizing therapies to target key genetic alterations in the CPC represents an attractive therapeutic strategy.
Abstract: Follicular lymphoma is an incurable malignancy, with transformation to an aggressive subtype representing a critical event during disease progression. Here we performed whole-genome or whole-exome sequencing on 10 follicular lymphoma-transformed follicular lymphoma pairs followed by deep sequencing of 28 genes in an extension cohort, and we report the key events and evolutionary processes governing tumor initiation and transformation. Tumor evolution occurred through either a 'rich' or 'sparse' ancestral common progenitor clone (CPC). We identified recurrent mutations in linker histone, JAK-STAT signaling, NF-κB signaling and B cell developmental genes. Longitudinal analyses identified early driver mutations in chromatin regulator genes (CREBBP, EZH2 and KMT2D (MLL2)), whereas mutations in EBF1 and regulators of NF-κB signaling (MYD88 and TNFAIP3) were gained at transformation. Collectively, this study provides new insights into the genetic basis of follicular lymphoma and the clonal dynamics of transformation and suggests that personalizing therapies to target key genetic alterations in the CPC represents an attractive therapeutic strategy.

Journal Article
TL;DR: A combination of hotspot TERT promoter mutation, TERT focal amplification and viral genome integration occurs in more than 68% of cases, implicating TERT as a central and ancestry-independent node of hepatocarcinogenesis.
Abstract: Diverse epidemiological factors are associated with hepatocellular carcinoma (HCC) prevalence in different populations. However, the global landscape of the genetic changes in HCC genomes underpinning different epidemiological and ancestral backgrounds still remains uncharted. Here a collection of data from 503 liver cancer genomes from different populations uncovered 30 candidate driver genes and 11 core pathway modules. Furthermore, a collaboration of two large-scale cancer genome projects comparatively analyzed the trans-ancestry substitution signatures in 608 liver cancer cases and identified unique mutational signatures that predominantly contribute to Asian cases. This work elucidates previously unexplored ancestry-associated mutational processes in HCC development. A combination of hotspot TERT promoter mutation, TERT focal amplification and viral genome integration occurs in more than 68% of cases, implicating TERT as a central and ancestry-independent node of hepatocarcinogenesis. Newly identified alterations in genes encoding metabolic enzymes, chromatin remodelers and a high proportion of mTOR pathway activations offer potential therapeutic and diagnostic opportunities.

Journal Article
TL;DR: Analysis of comprehensive mapping of transcription start sites in human lymphoblastoid B cell and chronic myelogenous leukemic ENCODE Tier 1 cell lines identifies a common architecture of initiation, including tightly spaced (110 bp apart) divergent initiation, similar frequencies of core promoter sequence elements, highly positioned flanking nucleosomes and two modes of transcription factor binding.
Abstract: Despite the conventional distinction between them, promoters and enhancers share many features in mammals, including divergent transcription and similar modes of transcription factor binding. Here we examine the architecture of transcription initiation through comprehensive mapping of transcription start sites (TSSs) in human lymphoblastoid B cell (GM12878) and chronic myelogenous leukemic (K562) ENCODE Tier 1 cell lines. Using a nuclear run-on protocol called GRO-cap, which captures TSSs for both stable and unstable transcripts, we conduct detailed comparisons of thousands of promoters and enhancers in human cells. These analyses identify a common architecture of initiation, including tightly spaced (110 bp apart) divergent initiation, similar frequencies of core promoter sequence elements, highly positioned flanking nucleosomes and two modes of transcription factor binding. Post-initiation transcript stability provides a more fundamental distinction between promoters and enhancers than patterns of histone modification and association of transcription factors or co-activators. These results support a unified model of transcription initiation at promoters and enhancers.

Journal ArticleDOI
TL;DR: The mutational profile of ESCC closely resembles those of squamous cell carcinomas of other tissues but differs from that of esophageal adenocarcinoma, and mutations in epigenetic modulators carry prognostic and potentially therapeutic implications.
Abstract: Esophageal squamous cell carcinoma (ESCC) is one of the deadliest cancers. We performed exome sequencing on 113 tumor-normal pairs, yielding a mean of 82 non-silent mutations per tumor, and 8 cell lines. The mutational profile of ESCC closely resembles those of squamous cell carcinomas of other tissues but differs from that of esophageal adenocarcinoma. Genes involved in cell cycle and apoptosis regulation were mutated in 99% of cases by somatic alterations of TP53 (93%), CCND1 (33%), CDKN2A (20%), NFE2L2 (10%) and RB1 (9%). Histone modifier genes were frequently mutated, including KMT2D (also called MLL2; 19%), KMT2C (MLL3; 6%), KDM6A (7%), EP300 (10%) and CREBBP (6%). EP300 mutations were associated with poor survival. The Hippo and Notch pathways were dysregulated by mutations in FAT1, FAT2, FAT3 or FAT4 (27%) or AJUBA (JUB; 7%) and NOTCH1, NOTCH2 or NOTCH3 (22%) or FBXW7 (5%), respectively. These results define the mutational landscape of ESCC and highlight mutations in epigenetic modulators with prognostic and potentially therapeutic implications.

Journal ArticleDOI
TL;DR: An expanded CNV morbidity map was created from 29,085 children with developmental delay in comparison to 19,584 healthy controls, identifying 70 significant CNVs, and an integrated analysis of CNV and single-nucleotide variant (SNV) data pinpointed 10 genes enriched for putative loss of function.
Abstract: Copy number variants (CNVs) are associated with many neurocognitive disorders; however, these events are typically large, and the underlying causative genes are unclear. We created an expanded CNV morbidity map from 29,085 children with developmental delay in comparison to 19,584 healthy controls, identifying 70 significant CNVs. We resequenced 26 candidate genes in 4,716 additional cases with developmental delay or autism and 2,193 controls. An integrated analysis of CNV and single-nucleotide variant (SNV) data pinpointed 10 genes enriched for putative loss of function. Follow-up of a subset of affected individuals identified new clinical subtypes of pediatric disease and the genes responsible for disease-associated CNVs. These genetic changes include haploinsufficiency of SETBP1 associated with intellectual disability and loss of expressive language and truncations of ZMYND11 in individuals with autism, aggression and complex neuropsychiatric features. This combined CNV and SNV approach facilitates the rapid discovery of new syndromes and genes involved in neuropsychiatric disease despite extensive genetic heterogeneity.

Journal ArticleDOI
TL;DR: Exome sequencing and SNP array analysis of 45 ACCs identified recurrent gene alterations, validated in an independent cohort of 77 ACCs, and showed that aggressive and indolent ACCs correspond to two distinct molecular entities driven by different oncogenic alterations.
Abstract: Adrenocortical carcinomas (ACCs) are aggressive cancers originating in the cortex of the adrenal gland. Despite overall poor prognosis, ACC outcome is heterogeneous. We performed exome sequencing and SNP array analysis of 45 ACCs and identified recurrent alterations in known driver genes (CTNNB1, TP53, CDKN2A, RB1 and MEN1) and in genes not previously reported in ACC (ZNRF3, DAXX, TERT and MED12), which we validated in an independent cohort of 77 ACCs. ZNRF3, encoding a cell surface E3 ubiquitin ligase, was the most frequently altered gene (21%) and is a potential new tumor suppressor gene related to the β-catenin pathway. Our integrated genomic analyses further identified two distinct molecular subgroups with opposite outcome. The C1A group of ACCs with poor outcome displayed numerous mutations and DNA methylation alterations, whereas the C1B group of ACCs with good prognosis displayed specific deregulation of two microRNA clusters. Thus, aggressive and indolent ACCs correspond to two distinct molecular entities driven by different oncogenic alterations.

Journal ArticleDOI
TL;DR: A draft genome of domesticated C. carpio (strain Songpu) is presented, whose current assembly contains 52,610 protein-coding genes and approximately 92.3% coverage of its paleotetraploidized genome (2n = 100).
Abstract: The common carp, Cyprinus carpio, is one of the most important cyprinid species and globally accounts for 10% of freshwater aquaculture production. Here we present a draft genome of domesticated C. carpio (strain Songpu), whose current assembly contains 52,610 protein-coding genes and approximately 92.3% coverage of its paleotetraploidized genome (2n = 100). The latest round of whole-genome duplication has been estimated to have occurred approximately 8.2 million years ago. Genome resequencing of 33 representative individuals from worldwide populations demonstrates a single origin for C. carpio in 2 subspecies (C. carpio haematopterus and C. carpio carpio). Integrative genomic and transcriptomic analyses were used to identify loci potentially associated with traits including scaling patterns and skin color. In combination with the high-resolution genetic map, the draft genome paves the way for better molecular studies and improved genome-assisted breeding of C. carpio and other closely related species.

Journal ArticleDOI
TL;DR: A new monoallelic inflammasome defect is described that expands the monogenic autoinflammatory disease spectrum to include MAS and suggests new targets for therapy.
Abstract: Inflammasomes are innate immune sensors that respond to pathogen- and damage-associated signals with caspase-1 activation, interleukin (IL)-1β and IL-18 secretion, and macrophage pyroptosis. The discovery that dominant gain-of-function mutations in NLRP3 cause the cryopyrin-associated periodic syndromes (CAPS) and trigger spontaneous inflammasome activation and IL-1β oversecretion led to successful treatment with IL-1-blocking agents. Herein we report a de novo missense mutation (c.1009A>T, encoding p.Thr337Ser) affecting the nucleotide-binding domain of the inflammasome component NLRC4 that causes early-onset recurrent fever flares and macrophage activation syndrome (MAS). Functional analyses demonstrated spontaneous inflammasome formation and production of the inflammasome-dependent cytokines IL-1β and IL-18, with the latter exceeding the levels seen in CAPS. The NLRC4 mutation caused constitutive caspase-1 cleavage in cells transduced with mutant NLRC4 and increased production of IL-18 in both patient-derived and mutant NLRC4-transduced macrophages. Thus, we describe a new monoallelic inflammasome defect that expands the monogenic autoinflammatory disease spectrum to include MAS and suggests new targets for therapy.