Showing papers by "Michael Boehnke" published in 2018
••
TL;DR: To gain insight into the role of alcohol consumption in the genetic architecture of hypertension, a large two-stage investigation incorporating joint testing of main genetic effects and single nucleotide variant (SNV)-alcohol consumption interactions was conducted.
Abstract: Heavy alcohol consumption is an established risk factor for hypertension; the mechanism by which alcohol consumption impacts blood pressure (BP) regulation remains unknown. We hypothesized that a genome-wide association study accounting for gene-alcohol consumption interaction for BP might identify additional BP loci and contribute to the understanding of alcohol-related BP regulation. We conducted a large two-stage investigation incorporating joint testing of main genetic effects and single nucleotide variant (SNV)-alcohol consumption interactions. In Stage 1, genome-wide discovery meta-analyses in ≈131K individuals across several ancestry groups yielded 3,514 SNVs (245 loci) with suggestive evidence of association (P < 1.0 × 10−5). In Stage 2, these SNVs were tested for independent external replication in ≈440K individuals across multiple ancestries. We identified and replicated (at Bonferroni correction threshold) five novel BP loci (380 SNVs in 21 genes) and 49 previously reported BP loci (2,159 SNVs in 109 genes) in European ancestry, and in multi-ancestry meta-analyses (P < 5.0 × 10−8). For African ancestry samples, we detected 18 potentially novel BP loci (P < 5.0 × 10−8) in Stage 1 that warrant further replication. Additionally, correlated meta-analysis identified eight novel BP loci (11 genes). Several genes in these loci (e.g., PINX1, GATA4, BLK, FTO and GABBR2) have been previously reported to be associated with alcohol consumption. These findings provide insights into the role of alcohol consumption in the genetic architecture of hypertension.
1,218 citations
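The joint test of main genetic effects and SNV-alcohol interactions mentioned in this abstract can be illustrated with a minimal sketch. It assumes a 2-degree-of-freedom Wald statistic built from a per-variant main-effect estimate, an interaction estimate, and their covariance; the abstract does not state the exact statistic used, and the function name `joint_2df_test` and the input values are hypothetical.

```python
import math


def joint_2df_test(beta_main, beta_int, var_main, var_int, cov):
    """Wald test of H0: beta_main = beta_int = 0 (2 degrees of freedom).

    All inputs are assumed to come from a single regression model that
    includes both a main SNV term and an SNV x exposure term.
    """
    det = var_main * var_int - cov ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form b' V^{-1} b, written out for the 2x2 case.
    chi2 = (
        beta_main ** 2 * var_int
        - 2.0 * beta_main * beta_int * cov
        + beta_int ** 2 * var_main
    ) / det
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value


# Hypothetical estimates for one variant (not taken from the study).
chi2, p_value = joint_2df_test(0.3, 0.4, 0.01, 0.01, 0.0)
print(chi2, p_value)
```

A joint test of this kind can flag a variant whose main effect and interaction effect are each modest but jointly strong, which is one motivation for modelling the exposure explicitly.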
••
University of Oxford1, University of Michigan2, Wellcome Trust Sanger Institute3, Amgen4, University of Cambridge5, University of Copenhagen6, University of Liverpool7, University of Freiburg8, Boston University9, University of Tartu10, Erasmus University Medical Center11, Leiden University Medical Center12, Pasteur Institute13, Icahn School of Medicine at Mount Sinai14, UCLA Medical Center15, Vanderbilt University Medical Center16, Wake Forest University17, National University of Singapore18, London North West Healthcare NHS Trust19, Imperial College London20, Charité21, Innsbruck Medical University22, Washington University in St. Louis23, Queen Mary University of London24, University of Southern Denmark25, National and Kapodistrian University of Athens26, Robertson Centre for Biostatistics27, University of Exeter28, Uppsala University29, University of Düsseldorf30, Steno Diabetes Center31, Aalborg University32, University of Eastern Finland33, Broad Institute34, Frederiksberg Hospital35, Lund University36, University of Bergen37, Technische Universität München38, University of North Carolina at Chapel Hill39, Ninewells Hospital40, University of Edinburgh41, University of Minnesota42, University of Glasgow43, Ludwig Maximilian University of Munich44, University of Iceland45, Aarhus University46, Science for Life Laboratory47, Stanford University48, University of Helsinki49, National Institutes of Health50, University of Dundee51, Harvard University52
TL;DR: Combining 32 genome-wide association studies with high-density imputation provides a comprehensive view of the genetic contribution to type 2 diabetes in individuals of European ancestry with respect to locus discovery, causal-variant resolution, and mechanistic insight.
Abstract: We expanded GWAS discovery for type 2 diabetes (T2D) by combining data from 898,130 European-descent individuals (9% cases), after imputation to high-density reference panels. With these data, we (i) extend the inventory of T2D-risk variants (243 loci, 135 newly implicated in T2D predisposition, comprising 403 distinct association signals); (ii) enrich discovery of lower-frequency risk alleles (80 index variants with minor allele frequency <5%, 14 with estimated allelic odds ratio >2); (iii) substantially improve fine-mapping of causal variants (at 51 signals, one variant accounted for >80% posterior probability of association (PPA)); (iv) extend fine-mapping through integration of tissue-specific epigenomic information (islet regulatory annotations extend the number of variants with PPA >80% to 73); (v) highlight validated therapeutic targets (18 genes with associations attributable to coding variants); and (vi) demonstrate enhanced potential for clinical translation (genome-wide chip heritability explains 18% of T2D risk; individuals in the extremes of a T2D polygenic risk score differ more than ninefold in prevalence).
1,136 citations
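The posterior probability of association (PPA) and credible sets referred to in this abstract can be sketched as follows. This is a minimal illustration that assumes a single causal variant per signal, equal prior probability for every variant, and per-variant log Bayes factors as input; the abstract does not describe the authors' actual fine-mapping procedure, and the function names and numbers are hypothetical.

```python
import math


def posterior_probabilities(log_bayes_factors):
    """Convert per-variant log Bayes factors into PPAs that sum to 1."""
    largest = max(log_bayes_factors)
    # Subtract the maximum before exponentiating to avoid overflow.
    weights = [math.exp(x - largest) for x in log_bayes_factors]
    total = sum(weights)
    return [w / total for w in weights]


def credible_set(ppas, coverage=0.99):
    """Return indices of the smallest set of variants reaching `coverage`."""
    order = sorted(range(len(ppas)), key=lambda i: ppas[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += ppas[i]
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical Bayes factors for four variants at one signal.
ppas = posterior_probabilities([math.log(90), math.log(6), math.log(3), math.log(1)])
print(ppas)
print(credible_set(ppas, coverage=0.95))
```

Under these assumptions, a signal where one variant carries more than 80% of the posterior mass corresponds to the "PPA >80%" cases counted in the abstract.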
••
Evangelos Evangelou1, Evangelos Evangelou2, Helen R. Warren3, Helen R. Warren4 +338 more•Institutions (93)
TL;DR: In this article, the largest genetic association study of blood pressure traits (systolic, diastolic and pulse pressure) to date in over 1 million people of European ancestry was conducted.
Abstract: High blood pressure is a highly heritable and modifiable risk factor for cardiovascular disease. We report the largest genetic association study of blood pressure traits (systolic, diastolic and pulse pressure) to date in over 1 million people of European ancestry. We identify 535 novel blood pressure loci that not only offer new biological insights into blood pressure regulation but also highlight shared genetic architecture between blood pressure and lifestyle exposures. Our findings identify new biological pathways for blood pressure regulation with potential for improved cardiovascular disease prevention in the future.
728 citations
••
TL;DR: For the first time, specific loci that distinguish between BD and SCZ are discovered and polygenic components underlying multiple symptom dimensions are identified that point to the utility of genetics to inform symptomology and potential treatment.
569 citations
••
TL;DR: It is suggested that many of the putative atrial fibrillation genes act via cardiac structural remodeling, potentially in the form of an ‘atrial cardiomyopathy’2, either during fetal heart development or as a response to stress in the adult heart.
Abstract: To identify genetic variation underlying atrial fibrillation, the most common cardiac arrhythmia, we performed a genome-wide association study of >1,000,000 people, including 60,620 atrial fibrillation cases and 970,216 controls. We identified 142 independent risk variants at 111 loci and prioritized 151 functional candidate genes likely to be involved in atrial fibrillation. Many of the identified risk variants fall near genes where more deleterious mutations have been reported to cause serious heart defects in humans (GATA4, MYH6, NKX2-5, PITX2, TBX5)1, or near genes important for striated muscle function and integrity (for example, CFL2, MYH7, PKP2, RBM20, SGCG, SSPN). Pathway and functional enrichment analyses also suggested that many of the putative atrial fibrillation genes act via cardiac structural remodeling, potentially in the form of an 'atrial cardiomyopathy'2, either during fetal heart development or as a response to stress in the adult heart.
447 citations
01 Jan 2018
Abstract: We expanded GWAS discovery for type 2 diabetes (T2D) by combining data from 898,130 European-descent individuals (9% cases), after imputation to high-density reference panels. With these data, we (i) extend the inventory of T2D-risk variants (243 loci, 135 newly implicated in T2D predisposition, comprising 403 distinct association signals); (ii) enrich discovery of lower-frequency risk alleles (80 index variants with minor allele frequency <5%, 14 with estimated allelic odds ratio >2); (iii) substantially improve fine-mapping of causal variants (at 51 signals, one variant accounted for >80% posterior probability of association (PPA)); (iv) extend fine-mapping through integration of tissue-specific epigenomic information (islet regulatory annotations extend the number of variants with PPA >80% to 73); (v) highlight validated therapeutic targets (18 genes with associations attributable to coding variants); and (vi) demonstrate enhanced potential for clinical translation (genome-wide chip heritability explains 18% of T2D risk; individuals in the extremes of a T2D polygenic risk score differ more than ninefold in prevalence). Combining 32 genome-wide association studies with high-density imputation provides a comprehensive view of the genetic contribution to type 2 diabetes in individuals of European ancestry with respect to locus discovery, causal-variant resolution, and mechanistic insight.
379 citations
••
TL;DR: Trans-ethnic analyses of exome array data identify new risk loci for type 2 diabetes and fine-mapping analyses using genome-wide association data show that the index coding variants represent the likely causal variants at only a subset of these loci.
Abstract: We aggregated coding variant data for 81,412 type 2 diabetes cases and 370,832 controls of diverse ancestry, identifying 40 coding variant association signals (P < 2.2 × 10−7); of these, 16 map outside known risk-associated loci. We make two important observations. First, only five of these signals are driven by low-frequency variants: even for these, effect sizes are modest (odds ratio ≤1.29). Second, when we used large-scale genome-wide association data to fine-map the associated variants in their regional context, accounting for the global enrichment of complex trait associations in coding sequence, compelling evidence for coding variant causality was obtained for only 16 signals. At 13 others, the associated coding variants clearly represent ‘false leads’ with potential to generate erroneous mechanistic inference. Coding variant associations offer a direct route to biological insight for complex diseases and identification of validated therapeutic targets; however, appropriate mechanistic inference requires careful specification of their causal contribution to disease predisposition.
318 citations
••
TL;DR: Exome-wide analysis identifies rare and low-frequency coding variants associated with body mass index that confirm enrichment of neuronal genes and provide new evidence for adipocyte and energy expenditure biology, widening the potential of genetically supported therapeutic targets in obesity.
Abstract: Genome-wide association studies (GWAS) have identified >250 loci for body mass index (BMI), implicating pathways related to neuronal biology. Most GWAS loci represent clusters of common, noncoding variants from which pinpointing causal genes remains challenging. Here we combined data from 718,734 individuals to discover rare and low-frequency (minor allele frequency (MAF) < 5%) coding variants associated with BMI. We identified 14 coding variants in 13 genes, of which 8 variants were in genes (ZBTB7B, ACHE, RAPGEF3, RAB21, ZFHX3, ENTPD6, ZFR2 and ZNF169) newly implicated in human obesity, 2 variants were in genes (MC4R and KSR2) previously observed to be mutated in extreme obesity and 2 variants were in GIPR. The effect sizes of rare variants are ~10 times larger than those of common variants, with the largest effect observed in carriers of an MC4R mutation introducing a stop codon (p.Tyr35Ter, MAF = 0.01%), who weighed ~7 kg more than non-carriers. Pathway analyses based on the variants associated with BMI confirm enrichment of neuronal genes and provide new evidence for adipocyte and energy expenditure biology, widening the potential of genetically supported therapeutic targets in obesity.
252 citations
••
TL;DR: In this article, the authors performed two genome-wide association studies (GWASs), on HapMap and 1000 Genomes imputed data, of circulating amounts of CRP by using data from 88 studies comprising 204,402 European individuals.
Abstract: C-reactive protein (CRP) is a sensitive biomarker of chronic low-grade inflammation and is associated with multiple complex diseases. The genetic determinants of chronic inflammation remain largely unknown, and the causal role of CRP in several clinical outcomes is debated. We performed two genome-wide association studies (GWASs), on HapMap and 1000 Genomes imputed data, of circulating amounts of CRP by using data from 88 studies comprising 204,402 European individuals. Additionally, we performed in silico functional analyses and Mendelian randomization analyses with several clinical outcomes. The GWAS meta-analyses of CRP revealed 58 distinct genetic loci (p < 5 × 10−8). After adjustment for body mass index in the regression analysis, the associations at all except three loci remained. The lead variants at the distinct loci explained up to 7.0% of the variance in circulating amounts of CRP. We identified 66 gene sets that were organized in two substantially correlated clusters, one mainly composed of immune pathways and the other characterized by metabolic pathways in the liver. Mendelian randomization analyses revealed a causal protective effect of CRP on schizophrenia and a risk-increasing effect on bipolar disorder. Our findings provide further insights into the biology of inflammation and could lead to interventions for treating inflammation and its clinical consequences.
244 citations
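The GWAS meta-analyses described in this abstract combine per-study estimates for each variant. A minimal sketch of one common approach, inverse-variance-weighted fixed-effect meta-analysis, is shown here; the abstract does not say which meta-analysis model the authors used, so the method, the function name `fixed_effect_meta`, and the inputs are assumptions for illustration only.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance-weighted fixed-effect meta-analysis for one variant."""
    if len(betas) != len(standard_errors):
        raise ValueError("betas and standard_errors must have equal length")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


# Hypothetical effect estimates from two studies (not taken from the paper).
print(fixed_effect_meta([0.10, 0.20], [0.10, 0.10]))
```

Pooling in this way shrinks the standard error as studies are added, which is why a variant can reach genome-wide significance in the combined data without doing so in any single study.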
••
Yun J. Sung1, Thomas W. Winkler2, Lisa de las Fuentes1, Amy R. Bentley3 +326 more•Institutions (104)
TL;DR: The identified loci show strong evidence for regulatory features and support shared pathophysiology with cardiometabolic and addiction traits and highlight a role in BP regulation for biological candidates such as modulators of vascular structure and function.
Abstract: Genome-wide association analysis advanced understanding of blood pressure (BP), a major risk factor for vascular conditions such as coronary heart disease and stroke. Accounting for smoking behavior may help identify BP loci and extend our knowledge of its genetic architecture. We performed genome-wide association meta-analyses of systolic and diastolic BP incorporating gene-smoking interactions in 610,091 individuals. Stage 1 analysis examined ∼18.8 million SNPs and small insertion/deletion variants in 129,913 individuals from four ancestries (European, African, Asian, and Hispanic) with follow-up analysis of promising variants in 480,178 additional individuals from five ancestries. We identified 15 loci that were genome-wide significant (p < 5 × 10−8) in stage 1 and formally replicated in stage 2. A combined stage 1 and 2 meta-analysis identified 66 additional genome-wide significant loci (13, 35, and 18 loci in European, African, and trans-ancestry, respectively). A total of 56 known BP loci were also identified by our results (p < 5 × 10−8). Of the newly identified loci, ten showed significant interaction with smoking status, but none of them were replicated in stage 2. Several loci were identified in African ancestry, highlighting the importance of genetic studies in diverse populations. The identified loci show strong evidence for regulatory features and support shared pathophysiology with cardiometabolic and addiction traits. They also highlight a role in BP regulation for biological candidates such as modulators of vascular structure and function (CDKN1B, BCAR1-CFDP1, PXDN, EEA1), ciliopathies (SDCCAG8, RPGRIP1L), telomere maintenance (TNKS, PINX1, AKTIP), and central dopaminergic signaling (MSRA, EBF2).
110 citations
••
Regeneron1, University of Oxford2, University of Michigan3, Copenhagen University Hospital4, Amgen5, Lund University6, Broad Institute7, University of Cambridge8, Tunghai University9, University of Pennsylvania10, Duke University11, University of Helsinki12, National Autonomous University of Mexico13, University of Liverpool14, University of Tartu15, deCODE genetics16, UCLA Medical Center17, National Taiwan University18, Veterans Health Administration19, Harvard University20, Norwegian University of Science and Technology21, Levanger Hospital22, Churchill Hospital23
TL;DR: It is found that predicted loss-of-function variants in ANGPTL4 are associated with glucose homeostasis and reduced risk of type 2 diabetes and that Angptl4−/− mice on a high-fat diet show improved insulin sensitivity.
Abstract: Angiopoietin-like 4 (ANGPTL4) is an endogenous inhibitor of lipoprotein lipase that modulates lipid levels, coronary atherosclerosis risk, and nutrient partitioning. We hypothesize that loss of ANGPTL4 function might improve glucose homeostasis and decrease risk of type 2 diabetes (T2D). We investigate protein-altering variants in ANGPTL4 among 58,124 participants in the DiscovEHR human genetics study, with follow-up studies in 82,766 T2D cases and 498,761 controls. Carriers of p.E40K, a variant that abolishes ANGPTL4 ability to inhibit lipoprotein lipase, have lower odds of T2D (odds ratio 0.89, 95% confidence interval 0.85–0.92, p = 6.3 × 10−10), lower fasting glucose, and greater insulin sensitivity. Predicted loss-of-function variants are associated with lower odds of T2D among 32,015 cases and 84,006 controls (odds ratio 0.71, 95% confidence interval 0.49–0.99, p = 0.041). Functional studies in Angptl4-deficient mice confirm improved insulin sensitivity and glucose homeostasis. In conclusion, genetic inactivation of ANGPTL4 is associated with improved glucose homeostasis and reduced risk of T2D.
••
University of Michigan1, Norwegian University of Science and Technology2, Science for Life Laboratory3, University of Exeter4, Icahn School of Medicine at Mount Sinai5, University of Copenhagen6, Nord-Trøndelag Hospital Trust7, Centro Nacional de Investigaciones Cardiovasculares8, University of Tromsø9, Geisinger Health System10, Stanford University11
TL;DR: Pathway and functional enrichment analyses suggested that many AF-associated genetic variants act through a mechanism of impaired muscle cell differentiation and tissue formation during fetal heart development.
Abstract: Atrial fibrillation (AF) is a common cardiac arrhythmia and a major risk factor for stroke, heart failure, and premature death. The pathogenesis of AF remains poorly understood, which contributes to the current lack of highly effective treatments. To understand the genetic variation and biology underlying AF, we undertook a genome-wide association study (GWAS) of 6,337 AF individuals and 61,607 AF-free individuals from Norway, including replication in an additional 30,679 AF individuals and 278,895 AF-free individuals. Through genotyping and dense imputation mapping from whole-genome sequencing, we tested almost nine million genetic variants across the genome and identified seven risk loci, including two novel loci. One novel locus (lead single-nucleotide variant [SNV] rs12614435; p = 6.76 × 10−18) comprised intronic and several highly correlated missense variants situated in the I-, A-, and M-bands of titin, which is the largest protein in humans and responsible for the passive elasticity of heart and skeletal muscle. The other novel locus (lead SNV rs56202902; p = 1.54 × 10−11) covered a large, gene-dense chromosome 1 region that has previously been linked to cardiac conduction. Pathway and functional enrichment analyses suggested that many AF-associated genetic variants act through a mechanism of impaired muscle cell differentiation and tissue formation during fetal heart development.
••
TL;DR: These estimates are more accurate than previously published results based on ancestrally older variants without considering genomic features, and provide the most refined portrait to date of the factors contributing to genome-wide variability of the human germline mutation rate.
Abstract: A detailed understanding of the genome-wide variability of single-nucleotide germline mutation rates is essential to studying human genome evolution. Here, we use ~36 million singleton variants from 3560 whole-genome sequences to infer fine-scale patterns of mutation rate heterogeneity. Mutability is jointly affected by adjacent nucleotide context and diverse genomic features of the surrounding region, including histone modifications, replication timing, and recombination rate, sometimes suggesting specific mutagenic mechanisms. Remarkably, GC content, DNase hypersensitivity, CpG islands, and H3K36 trimethylation are associated with both increased and decreased mutation rates depending on nucleotide context. We validate these estimated effects in an independent dataset of ~46,000 de novo mutations, and confirm our estimates are more accurate than previously published results based on ancestrally older variants without considering genomic features. Our results thus provide the most refined portrait to date of the factors contributing to genome-wide variability of the human germline mutation rate.
01 Jan 2018
TL;DR: This paper performed two genome-wide association studies (GWASs) of circulating amounts of C-reactive protein (CRP) by using data from 88 studies comprising 204,402 European individuals.
Abstract: C-reactive protein (CRP) is a sensitive biomarker of chronic low-grade inflammation and is associated with multiple complex diseases. The genetic determinants of chronic inflammation remain largely unknown, and the causal role of CRP in several clinical outcomes is debated. We performed two genome-wide association studies (GWASs), on HapMap and 1000 Genomes imputed data, of circulating amounts of CRP by using data from 88 studies comprising 204,402 European individuals. Additionally, we performed in silico functional analyses and Mendelian randomization analyses with several clinical outcomes. The GWAS meta-analyses of CRP revealed 58 distinct genetic loci (p < 5 × 10−8). After adjustment for body mass index in the regression analysis, the associations at all except three loci remained. The lead variants at the distinct loci explained up to 7.0% of the variance in circulating amounts of CRP. We identified 66 gene sets that were organized in two substantially correlated clusters, one mainly composed of immune pathways and the other characterized by metabolic pathways in the liver. Mendelian randomization analyses revealed a causal protective effect of CRP on schizophrenia and a risk-increasing effect on bipolar disorder. Our findings provide further insights into the biology of inflammation and could lead to interventions for treating inflammation and its clinical consequences.
••
TL;DR: It is found that promoter-interacting elements in human adipocytes are enriched for adipose-related transcription factor motifs, such as PPARG and CEBPB, and contribute to heritability of cis-regulated gene expression.
Abstract: Increased adiposity is a hallmark of obesity and overweight, which affect 2.2 billion people world-wide. Understanding the genetic and molecular mechanisms that underlie obesity-related phenotypes can help to improve treatment options and drug development. Here we perform promoter Capture Hi-C in human adipocytes to investigate interactions between gene promoters and distal elements as a transcription-regulating mechanism contributing to these phenotypes. We find that promoter-interacting elements in human adipocytes are enriched for adipose-related transcription factor motifs, such as PPARG and CEBPB, and contribute to heritability of cis-regulated gene expression. We further intersect these data with published genome-wide association studies for BMI and BMI-related metabolic traits to identify the genes that are under genetic cis regulation in human adipocytes via chromosomal interactions. This integrative genomics approach identifies four cis-eQTL-eGene relationships associated with BMI or obesity-related traits, including rs4776984 and MAP2K5, which we further confirm by EMSA, and highlights 38 additional candidate genes.
••
TL;DR: metaUSAT is a novel unified association test of a single genetic variant with multiple traits that uses only summary statistics from existing GWAS and can provide novel insights into the genetic architecture of a common disease or traits.
Abstract: Genome-wide association studies (GWAS) for complex diseases have focused primarily on single-trait analyses for disease status and disease-related quantitative traits. For example, GWAS on risk factors for coronary artery disease analyze genetic associations of plasma lipids such as total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides (TGs) separately. However, traits are often correlated and a joint analysis may yield increased statistical power for association over multiple univariate analyses. Recently several multivariate methods have been proposed that require individual-level data. Here, we develop metaUSAT (where USAT is unified score-based association test), a novel unified association test of a single genetic variant with multiple traits that uses only summary statistics from existing GWAS. Although the existing methods either perform well when most correlated traits are affected by the genetic variant in the same direction or are powerful when only a few of the correlated traits are associated, metaUSAT is designed to be robust to the association structure of correlated traits. metaUSAT does not require individual-level data and can test genetic associations of categorical and/or continuous traits. One can also use metaUSAT to analyze a single trait over multiple studies, appropriately accounting for overlapping samples, if any. metaUSAT provides an approximate asymptotic P-value for association and is computationally efficient for implementation at a genome-wide level. Simulation experiments show that metaUSAT maintains proper type-I error at low error levels. It has similar and sometimes greater power to detect association across a wide array of scenarios compared to existing methods, which are usually powerful for some specific association scenarios only. 
When applied to plasma lipids summary data from the METSIM and the T2D-GENES studies, metaUSAT detected genome-wide significant loci beyond the ones identified by univariate analyses. Evidence from larger studies suggests that the variants additionally detected by our test are, indeed, associated with lipid levels in humans. In summary, metaUSAT can provide novel insights into the genetic architecture of a common disease or traits.
01 Jan 2018
TL;DR: In this article, the authors performed genome-wide association meta-analyses of systolic and diastolic BP incorporating gene-smoking interactions in 610,091 individuals.
Abstract: Genome-wide association analysis advanced understanding of blood pressure (BP), a major risk factor for vascular conditions such as coronary heart disease and stroke. Accounting for smoking behavior may help identify BP loci and extend our knowledge of its genetic architecture. We performed genome-wide association meta-analyses of systolic and diastolic BP incorporating gene-smoking interactions in 610,091 individuals. Stage 1 analysis examined ∼18.8 million SNPs and small insertion/deletion variants in 129,913 individuals from four ancestries (European, African, Asian, and Hispanic) with follow-up analysis of promising variants in 480,178 additional individuals from five ancestries. We identified 15 loci that were genome-wide significant (p < 5 × 10−8) in stage 1 and formally replicated in stage 2. A combined stage 1 and 2 meta-analysis identified 66 additional genome-wide significant loci (13, 35, and 18 loci in European, African, and trans-ancestry, respectively). A total of 56 known BP loci were also identified by our results (p < 5 × 10−8). Of the newly identified loci, ten showed significant interaction with smoking status, but none of them were replicated in stage 2. Several loci were identified in African ancestry, highlighting the importance of genetic studies in diverse populations. The identified loci show strong evidence for regulatory features and support shared pathophysiology with cardiometabolic and addiction traits. They also highlight a role in BP regulation for biological candidates such as modulators of vascular structure and function (CDKN1B, BCAR1-CFDP1, PXDN, EEA1), ciliopathies (SDCCAG8, RPGRIP1L), telomere maintenance (TNKS, PINX1, AKTIP), and central dopaminergic signaling (MSRA, EBF2).
••
University of Oxford1, University of Michigan2, Amgen3, University of Cambridge4, Novo Nordisk5, University of Liverpool6, University of Freiburg7, Boston University8, University of Tartu9, Erasmus University Medical Center10, Leiden University11, Pasteur Institute12, Icahn School of Medicine at Mount Sinai13, Wellcome Trust Sanger Institute14, Charité15, Innsbruck Medical University16, Washington University in St. Louis17, University of Southern Denmark18, National and Kapodistrian University of Athens19, Harvard University20, Robertson Centre for Biostatistics21, University of Exeter22, Uppsala University23, Steno Diabetes Center24, University of Eastern Finland25, Lund University26, Technische Universität München27, University of North Carolina at Chapel Hill28, Ninewells Hospital29, University of Minnesota30, University of Glasgow31, University of Iceland32, National Institutes of Health33, Aarhus University34, Stanford University35, Leiden University Medical Center36
TL;DR: Increases in sample size and variant diversity deliver enhanced discovery and single-variant resolution of causal T2D-risk alleles, and the consequent impact on mechanistic insights and clinical translation is highlighted.
Abstract: We aggregated genome-wide genotyping data from 32 European-descent GWAS (74,124 T2D cases, 824,006 controls) imputed to high-density reference panels of >30,000 sequenced haplotypes. Analysis of ~27M variants (~21M with minor allele frequency [MAF] < 5%) identified 243 genome-wide significant loci (p < 5 × 10−8; MAF 0.02%-50%; odds ratio [OR] 1.04-8.05), 135 not previously implicated in T2D predisposition. Conditional analyses revealed 160 additional distinct association signals (p < 10−5) within the identified loci. The combined set of 403 T2D-risk signals includes 56 low-frequency (0.5% ≤ MAF < 5%) and 24 rare (MAF < 0.5%) index variants, including 14 with estimated allelic OR > 2. Forty-one of the signals displayed effect-size heterogeneity between BMI-unadjusted and adjusted analyses. Increased sample size and improved imputation led to substantially more precise localisation of causal variants than previously attained: at 51 signals, the lead variant after fine-mapping accounted for >80% posterior probability of association (PPA) and at 18 of these, PPA exceeded 99%. Integration with islet regulatory annotations enriched for T2D association further reduced median credible set size (from 42 variants to 32) and extended the number of index variants with PPA >80% to 73. Although most signals mapped to regulatory sequence, we identified 18 genes as human validated therapeutic targets through coding variants that are causal for disease. Genome-wide chip heritability accounted for 18% of T2D risk, and individuals in the 2.5% extremes of a polygenic risk score generated from the GWAS data differed >9-fold in risk. Our observations highlight how increases in sample size and variant diversity deliver enhanced discovery and single-variant resolution of causal T2D-risk alleles, and the consequent impact on mechanistic insights and clinical translation.
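The polygenic risk score comparison in this abstract (individuals in the 2.5% extremes of the score) can be sketched as follows. The sketch assumes the conventional definition of a score as the sum of risk-allele dosages weighted by log odds ratios; the abstract does not specify how the authors built their score, and the function names and example values are hypothetical.

```python
import math


def polygenic_risk_score(dosages, odds_ratios):
    """Sum of risk-allele dosages (0 to 2) weighted by per-variant log odds ratios."""
    if len(dosages) != len(odds_ratios):
        raise ValueError("dosages and odds_ratios must have equal length")
    return sum(d * math.log(o) for d, o in zip(dosages, odds_ratios))


def tail_indices(scores, fraction=0.025):
    """Return indices of individuals in the bottom and top `fraction` of scores."""
    k = max(1, int(len(scores) * fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:k], order[-k:]


# Hypothetical dosages for one individual at three variants.
print(polygenic_risk_score([0, 1, 2], [1.10, 1.20, 1.05]))

# Hypothetical scores for five individuals; fraction chosen so each tail has one person.
print(tail_indices([0.3, -1.2, 0.8, 2.5, 0.1], fraction=0.2))
```

Comparing disease prevalence between the two tails returned by `tail_indices` is the kind of contrast the abstract summarises as a greater than 9-fold difference in risk.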
••
TL;DR: The data suggest that rs7163757 contributes to genetic risk of islet dysfunction and T2D by increasing NFAT-mediated islet enhancer activity and modulating C2CD4B, and possibly C2CD4A, expression in (patho)physiologic states.
Abstract: Genome-wide association studies (GWASs) and functional genomics approaches implicate enhancer disruption in islet dysfunction and type 2 diabetes (T2D) risk. We applied genetic fine-mapping and functional (epi)genomic approaches to a T2D- and proinsulin-associated 15q22.2 locus to identify a most likely causal variant, determine its direction of effect, and elucidate plausible target genes. Fine-mapping and conditional analyses of proinsulin levels of 8,635 non-diabetic individuals from the METSIM study support a single association signal represented by a cluster of 16 strongly associated (p < 10−17) variants in high linkage disequilibrium (r² > 0.8) with the GWAS index SNP rs7172432. These variants reside in an evolutionarily and functionally conserved islet and β cell stretch or super enhancer; the most strongly associated variant (rs7163757, p = 3 × 10−19) overlaps a conserved islet open chromatin site. DNA sequence containing the rs7163757 risk allele displayed 2-fold higher enhancer activity than the non-risk allele in reporter assays (p […] in vitro, suggesting that it could be a factor mediating risk-allele effects. Finally, the rs7163757 proinsulin-raising and T2D risk allele (C) was associated with increased expression of C2CD4B, and possibly C2CD4A, both of which were induced by inflammatory cytokines, in human islets. Together, these data suggest that rs7163757 contributes to genetic risk of islet dysfunction and T2D by increasing NFAT-mediated islet enhancer activity and modulating C2CD4B, and possibly C2CD4A, expression in (patho)physiologic states.
••
TL;DR: The results demonstrate that profiling DNA methylation in adipose tissue is a powerful tool for understanding the molecular effects of metabolic syndrome on adipose tissue, and can be used in conjunction with traditional genetic analyses to further characterize this disorder.
Abstract: Most epigenome-wide association studies to date have been conducted in blood. However, metabolic syndrome is mediated by a dysregulation of adiposity and therefore it is critical to study adipose tissue in order to understand the effects of this syndrome on epigenomes. To determine if natural variation in DNA methylation was associated with metabolic syndrome traits, we profiled global methylation levels in subcutaneous abdominal adipose tissue. We measured association between 32 clinical traits related to diabetes and obesity in 201 people from the Metabolic Syndrome in Men cohort. We performed epigenome-wide association studies between DNA methylation levels and traits, and identified associations for 13 clinical traits in 21 loci. We prioritized candidate genes in these loci using expression quantitative trait loci, and identified 18 high confidence candidate genes, including known and novel genes associated with diabetes and obesity traits. Using methylation deconvolution, we examined which cell types may be mediating the associations, and concluded that most of the loci we identified were specific to adipocytes. We determined whether the abundance of cell types varies with metabolic traits, and found that macrophages increased in abundance with the severity of metabolic syndrome traits. Finally, we developed a DNA methylation-based biomarker to assess type 2 diabetes risk in adipose tissue. In conclusion, our results demonstrate that profiling DNA methylation in adipose tissue is a powerful tool for understanding the molecular effects of metabolic syndrome on adipose tissue, and can be used in conjunction with traditional genetic analyses to further characterize this disorder.
••
University of Michigan1, University of Eastern Finland2, Imperial College London3, University of Pennsylvania4, University of Cambridge5, University of North Carolina at Chapel Hill6, National Institutes of Health7, University of Oulu8, University of Iceland9, University of Mississippi Medical Center10
TL;DR: Five novel signals at established amino acid loci provide further insight into the molecular mechanisms of amino acid metabolism and potentially, their perturbations in disease, and are among the first applications of gene-based tests to identify new loci for amino acid levels.
Abstract: Comprehensive metabolite profiling captures many highly heritable traits, including amino acid levels, which are potentially sensitive biomarkers for disease pathogenesis. To better understand the contribution of genetic variation to amino acid levels, we performed single variant and gene-based tests of association between nine serum amino acids (alanine, glutamine, glycine, histidine, isoleucine, leucine, phenylalanine, tyrosine, and valine) and 16.6 million genotyped and imputed variants in 8545 non-diabetic Finnish men from the METabolic Syndrome In Men (METSIM) study with replication in Northern Finland Birth Cohort (NFBC1966). We identified five novel loci associated with amino acid levels (P < 5×10⁻⁸): LOC157273/PPP1R3B with glycine (rs9987289, P = 2.3×10⁻²⁶); ZFHX3 (chr16:73326579, minor allele frequency (MAF) = 0.42%, P = 3.6×10⁻⁹), LIPC (rs10468017, P = 1.5×10⁻⁸), and WWOX (rs9937914, P = 3.8×10⁻⁸) with alanine; and TRIB1 with tyrosine (rs28601761, P = 8×10⁻⁹). Gene-based tests identified two novel genes harboring missense variants of MAF <1% that show aggregate association with amino acid levels: PYCR1 with glycine (Pgene = 1.5×10⁻⁶) and BCAT2 with valine (Pgene = 7.4×10⁻⁷); neither gene was implicated by single variant association tests. These findings are among the first applications of gene-based tests to identify new loci for amino acid levels. In addition to the seven novel gene associations, we identified five independent signals at established amino acid loci, including two rare variant signals at GLDC (rs138640017, MAF = 0.95%, Pconditional = 5.8×10⁻⁴⁰) with glycine levels and HAL (rs141635447, MAF = 0.46%, Pconditional = 9.4×10⁻¹¹) with histidine levels. Examination of all single variant association results in our data revealed a strong inverse relationship between effect size and MAF (Ptrend < 0.001). These novel signals provide further insight into the molecular mechanisms of amino acid metabolism and potentially, their perturbations in disease.
••
TL;DR: A novel framework to select tag SNPs using the reference panel of 26 populations from Phase 3 of the 1000 Genomes Project, which demonstrates increased imputation accuracy for rare variants and examines array design strategies that contrast multi-ethnic cohorts vs. single populations.
Abstract: The emergence of very large cohorts in genomic research has facilitated a focus on genotype-imputation strategies to power rare variant association. These strategies have benefited from improvements in imputation methods and association tests; however, little attention has been paid to ways in which array design can increase rare variant association power. Therefore, we developed a novel framework to select tag SNPs using the reference panel of 26 populations from Phase 3 of the 1000 Genomes Project. We evaluate tag SNP performance via mean imputed r² at untyped sites using leave-one-out internal validation and standard imputation methods, rather than pairwise linkage disequilibrium. Moving beyond pairwise metrics allows us to account for haplotype diversity across the genome to improve imputation accuracy and demonstrates population-specific biases from pairwise estimates. We also examine array design strategies that contrast multi-ethnic cohorts vs. single populations, and show that a boost in performance for the former can be obtained by prioritizing tag SNPs that contribute information across multiple populations simultaneously. Using our framework, we demonstrate increased imputation accuracy for rare variants (frequency < 1%) by 0.5-3.1% for an array of one million sites and 0.7-7.1% for an array of 500,000 sites, depending on the population. Finally, we show how recent explosive growth in non-African populations means tag SNPs capture on average 30% fewer other variants than in African populations. The unified framework presented here will enable investigators to make informed decisions for the design of new arrays, and help empower the next phase of rare variant association for global health.
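As a rough illustration of the evaluation metric named in this abstract, the sketch below computes a mean imputed r² across untyped sites. It assumes r² at a site is the squared Pearson correlation between true genotypes (0/1/2) and imputed dosages, which is a common definition but is not spelled out in the abstract. The function names and data are hypothetical.

```python
from statistics import mean


def squared_correlation(truth, dosage):
    """Squared Pearson correlation between true genotypes and imputed dosages."""
    mt, md = mean(truth), mean(dosage)
    cov = sum((t - mt) * (d - md) for t, d in zip(truth, dosage))
    var_t = sum((t - mt) ** 2 for t in truth)
    var_d = sum((d - md) ** 2 for d in dosage)
    if var_t == 0 or var_d == 0:
        # Monomorphic site or constant dosage: correlation is undefined,
        # so this sketch scores it as 0 (an assumption).
        return 0.0
    return cov * cov / (var_t * var_d)


def mean_imputed_r2(sites):
    """Average r² over sites; `sites` is a list of (truth, dosage) pairs."""
    return mean(squared_correlation(t, d) for t, d in sites)


# Hypothetical data: two untyped sites, four individuals each.
sites = [
    ([0, 1, 2, 1], [0.1, 0.9, 1.8, 1.2]),
    ([0, 0, 1, 0], [0.0, 0.2, 0.7, 0.1]),
]
print(round(mean_imputed_r2(sites), 3))
```

In a leave-one-out design, `truth` would hold the genotypes of the held-out individuals and `dosage` the values imputed from the remaining reference panel.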
••
University of Michigan1, University of Texas Health Science Center at Houston2, Broad Institute3, University of Texas at Austin4, University of Exeter5, Regeneron6, University of Lübeck7, McGill University8, University of Oxford9, Boston University10, National Institutes of Health11, Seoul National University12, University of Chicago13, Colorado School of Public Health14, National University of Singapore15, Ninewells Hospital16, University of Liverpool17, University of Maryland, Baltimore18, Harvard University19, University Health System20, Vanderbilt University21, University of North Carolina at Chapel Hill22, University of California, San Francisco23, Systems Research Institute24, University of Mississippi Medical Center25, University of Haifa26, Albert Einstein College of Medicine27, University of Texas Health Science Center at San Antonio28
TL;DR: The results from deep whole-genome analysis of large Mexican-American pedigrees are described, suggesting that large-effect rare variants account for only a modest fraction of the genetic risk of these traits in this sample of families.
Abstract: A major challenge in evaluating the contribution of rare variants to complex disease is identifying enough copies of the rare alleles to permit informative statistical analysis. To investigate the contribution of rare variants to the risk of type 2 diabetes (T2D) and related traits, we performed deep whole-genome analysis of 1,034 members of 20 large Mexican-American families with high prevalence of T2D. If rare variants of large effect accounted for much of the diabetes risk in these families, our experiment was powered to detect association. Using gene expression data on 21,677 transcripts for 643 pedigree members, we identified evidence for large-effect rare-variant cis-expression quantitative trait loci that could not be detected in population studies, validating our approach. However, we did not identify any rare variants of large effect associated with T2D, or the related traits of fasting glucose and insulin, suggesting that large-effect rare variants account for only a modest fraction of the genetic risk of these traits in this sample of families. Reliable identification of large-effect rare variants will require larger samples of extended pedigrees or different study designs that further enrich for such variants.
••
TL;DR: The approach identifies salient T2D genetically anchored and physiologically informed pathways, and supports use of genetics to deconstruct T2D heterogeneity.
Abstract: Type 2 diabetes (T2D) is a heterogeneous disease for which 1) disease-causing pathways are incompletely understood and 2) sub-classification may improve patient management. Unlike other biomarkers, germline genetic markers do not change with disease progression or treatment. In this paper we test whether a germline genetic approach informed by physiology can be used to deconstruct T2D heterogeneity. First, we aimed to categorize genetic loci into groups representing likely disease mechanistic pathways. Second, we asked whether the novel clusters of genetic loci we identified have any broad clinical consequence, as assessed in four independent cohorts of individuals with T2D. In an effort to identify mechanistic pathways driven by established T2D genetic loci, we applied Bayesian nonnegative matrix factorization clustering to genome-wide association results for 94 independent T2D genetic loci and 47 diabetes-related traits. We identified five robust clusters of T2D loci and traits, each with distinct tissue-specific enhancer enrichment based on analysis of epigenomic data from 28 cell types. Two clusters contained variant-trait associations indicative of reduced beta-cell function, differing from each other by high vs. low proinsulin levels. The three other clusters displayed features of insulin resistance: obesity-mediated (high BMI, waist circumference), "lipodystrophy-like" fat distribution (low BMI, adiponectin, HDL-cholesterol, and high triglycerides), and disrupted liver lipid metabolism (low triglycerides). Increased cluster genetic risk scores (GRSs) were associated with distinct clinical outcomes, including increased blood pressure, coronary artery disease, and stroke risk. We evaluated the potential for clinical impact of these clusters in four studies containing participants with T2D (METSIM, N=487; Ashkenazi, N=509; Partners Biobank, N=2,065; UK Biobank, N=14,813). Individuals with T2D in the top genetic risk score decile for each cluster reproducibly exhibited the predicted cluster-associated phenotypes, with ~30% of all participants assigned to just one cluster top decile. Our approach identifies salient T2D genetically anchored and physiologically informed pathways, and supports use of genetics to deconstruct T2D heterogeneity. Classification of patients by these genetic pathways may offer a step toward genetically informed T2D patient management.
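The top-decile assignment described in this abstract can be sketched as follows. The sketch assumes a cluster genetic risk score is a weighted sum of risk-allele dosages using cluster-specific variant weights; the abstract does not give the weighting scheme, so the weights, variant IDs, and function names here are hypothetical.

```python
def genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0-2) over the variants in `weights`."""
    return sum(w * dosages.get(variant, 0.0) for variant, w in weights.items())


def top_decile(scores):
    """Return the IDs of individuals whose score falls in the top 10%."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_top = max(1, len(ranked) // 10)
    return set(ranked[:n_top])


# Hypothetical cluster weights and genotype dosages (illustrative only).
cluster_weights = {"v1": 0.2, "v2": 0.1}
cohort = {f"person_{i}": {"v1": i % 3, "v2": (i * 7) % 3} for i in range(20)}

scores = {pid: genetic_risk_score(d, cluster_weights) for pid, d in cohort.items()}
print(sorted(top_decile(scores)))
```

Repeating this for each of the five clusters, each with its own weights, would give one top-decile set per cluster, from which the share of participants falling in exactly one set can be counted.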
01 Jan 2018
TL;DR: In this article, the authors performed a fine-mapping analysis to winnow the number of putative causal variants within associated loci to identify specific variants contributing to the biological etiology of substance use behavior.
Abstract: BACKGROUND
Smoking and alcohol use have been associated with common genetic variants in multiple loci. Rare variants within these loci hold promise in the identification of biological mechanisms in substance use. Exome arrays and genotype imputation can now efficiently genotype rare nonsynonymous and loss of function variants. Such variants are expected to have deleterious functional consequences and to contribute to disease risk.
METHODS
We analyzed ∼250,000 rare variants from 16 independent studies genotyped with exome arrays and augmented this dataset with imputed data from the UK Biobank. Associations were tested for five phenotypes: cigarettes per day, pack-years, smoking initiation, age of smoking initiation, and alcoholic drinks per week. We conducted stratified heritability analyses, single-variant tests, and gene-based burden tests of nonsynonymous/loss-of-function coding variants. We performed a novel fine-mapping analysis to winnow the number of putative causal variants within associated loci.
RESULTS
Meta-analytic sample sizes ranged from 152,348 to 433,216, depending on the phenotype. Rare coding variation explained 1.1% to 2.2% of phenotypic variance, reflecting 11% to 18% of the total single nucleotide polymorphism heritability of these phenotypes. We identified 171 genome-wide associated loci across all phenotypes. Fine mapping identified putative causal variants with double base-pair resolution at 24 of these loci, and between three and 10 variants for 65 loci. Twenty loci contained rare coding variants in the 95% credible intervals.
CONCLUSIONS
Rare coding variation significantly contributes to the heritability of smoking and alcohol use. Fine-mapping genome-wide association study loci identifies specific variants contributing to the biological etiology of substance use behavior.
••
TL;DR: Opportunities of using allele-specific expression (ASE) to discover cis-acting genotype-environment interactions (GxE)—genetic effects on gene expression that depend on an environmental condition are explored.
Abstract: From whole organisms to individual cells, responses to environmental conditions are influenced by genetic makeup, where the effect of genetic variation on a trait depends on the environmental context. RNA-sequencing quantifies gene expression as a molecular trait, and is capable of capturing both genetic and environmental effects. In this study, we explore opportunities of using allele-specific expression (ASE) to discover cis-acting genotype-environment interactions (GxE)—genetic effects on gene expression that depend on an environmental condition. Treating 17 common, clinical traits as approximations of the cellular environment of 267 skeletal muscle biopsies, we identify 10 candidate environmental response expression quantitative trait loci (reQTLs) across 6 traits (12 unique gene-environment trait pairs; 10% FDR per trait) including sex, systolic blood pressure, and low-density lipoprotein cholesterol. Although using ASE is in principle a promising approach to detect GxE effects, replication of such signals can be challenging as validation requires harmonization of environmental traits across cohorts and a sufficient sampling of heterozygotes for a transcribed SNP. Comprehensive discovery and replication will require large human transcriptome datasets, or the integration of multiple transcribed SNPs, coupled with standardized clinical phenotyping.
••
Evangelos Evangelou1, Evangelos Evangelou2, Helen R. Warren3, Helen R. Warren4 +337 more•Institutions (93)
TL;DR: In the version of this article originally published, the name of author Martin H. de Borst was coded incorrectly in the XML and the error has now been corrected in the HTML version of the paper.
Abstract: In the version of this article originally published, the name of author Martin H. de Borst was coded incorrectly in the XML. The error has now been corrected in the HTML version of the paper.
••
Simone de Jong1, Simone de Jong2, Mateus Jose Abdalla Diniz3, Andiara Calado Saloma Rodrigues3 +373 more•Institutions (6)
TL;DR: Studying patterns of assortative mating and anticipation, it appears increased polygenic risk is contributed by affected individuals who married into the family, resulting in an increasing genetic risk over generations, which may explain the observation of anticipation in mood disorders.
Abstract: Psychiatric disorders are thought to have a complex genetic pathology consisting of interplay of common and rare variation. Traditionally, pedigrees are used to shed light on the latter only, while here we discuss the application of polygenic risk scores to also highlight patterns of common genetic risk. We analyze polygenic risk scores for psychiatric disorders in a large pedigree (n ~ 260) in which 30% of family members suffer from major depressive disorder or bipolar disorder. Studying patterns of assortative mating and anticipation, it appears increased polygenic risk is contributed by affected individuals who married into the family, resulting in an increasing genetic risk over generations. This may explain the observation of anticipation in mood disorders, whereby onset is earlier and the severity increases over the generations of a family. Joint analyses of rare and common variation may be a powerful way to understand the familial genetics of psychiatric disorders.
••
TL;DR: A method to combine summary statistics across participating studies and consistently estimate joint effects, even when the contributed summary statistics contain large amounts of missing values is developed.
Abstract: Meta-analysis of genetic association studies increases sample size and the power for mapping complex traits. Existing methods are mostly developed for datasets without missing values, i.e. the summary association statistics are measured for all variants in contributing studies. In practice, genotype imputation is not always effective. This may be the case when targeted genotyping/sequencing assays are used or when the un-typed genetic variant is rare. Therefore, contributed summary statistics often contain missing values. Existing methods for imputing missing summary association statistics and using imputed values in meta-analysis, approximate conditional analysis, or simple strategies such as complete case analysis all have theoretical limitations. Applying these approaches can bias genetic effect estimates and lead to seriously inflated type-I or type-II errors in conditional analysis, which is a critical tool for identifying independently associated variants. To address this challenge and complement imputation methods, we developed a method to combine summary statistics across participating studies and consistently estimate joint effects, even when the contributed summary statistics contain large amounts of missing values. Based on this estimator, we proposed a score statistic called PCBS (partial correlation based score statistic) for conditional analysis of single-variant and gene-level associations. Through extensive analysis of simulated and real data, we showed that the new method produces well-calibrated type-I errors and is substantially more powerful than existing approaches. We applied the proposed approach to one of the largest meta-analyses to date for the cigarettes-per-day phenotype. Using the new method, we identified multiple novel independently associated variants at known loci for tobacco use, which were otherwise missed by alternative methods. 
Together, the phenotypic variance explained by these variants was 1.1%, improving that of previously reported associations by 71%. These findings illustrate the extent of locus allelic heterogeneity and can help pinpoint causal variants.
••
TL;DR: Experiments in rabbits with heart failure and left atrial dilation identified a heterogeneous distributed molecular switch from MYH6 to MYH7 in the left atrium, which resulted in contractile and functional heterogeneity and may predispose to initiation and maintenance of atrial arrhythmia.
Abstract: To understand the genetic variation underlying atrial fibrillation (AF), the most common cardiac arrhythmia, we performed a genome-wide association study (GWAS) of >1 million people, including 60,620 AF cases and 970,216 controls. We identified 163 independent risk variants at 111 loci and prioritized 165 candidate genes likely to be involved in AF. Many of the identified risk variants fall near genes where more deleterious mutations have been reported to cause serious heart defects in humans or mice (MYH6, NKX2-5, PITX2, TBC1D32, TBX5), or near genes important for striated muscle function and integrity (e.g., MYH7, PKP2, SSPN, SGCA). Experiments in rabbits with heart failure and left atrial dilation identified a heterogeneous distributed molecular switch from MYH6 to MYH7 in the left atrium, which resulted in contractile and functional heterogeneity and may predispose to initiation and maintenance of atrial arrhythmia.