
Showing papers by "Gonçalo R. Abecasis" published in 2017


Journal ArticleDOI
Robert A. Scott, Laura J. Scott, Reedik Mägi, Letizia Marullo, +213 more; Institutions (66)
01 Nov 2017-Diabetes
TL;DR: This article conducted a meta-analysis of genome-wide association data from 26,676 T2D case and 132,532 control subjects of European ancestry after imputation using the 1000 Genomes multiethnic reference panel.
Abstract: To characterize type 2 diabetes (T2D)-associated variation across the allele frequency spectrum, we conducted a meta-analysis of genome-wide association data from 26,676 T2D case and 132,532 control subjects of European ancestry after imputation using the 1000 Genomes multiethnic reference panel. Promising association signals were followed up in additional data sets (of 14,545 or 7,397 T2D case and 38,994 or 71,604 control subjects). We identified 13 novel T2D-associated loci (P < 5 × 10⁻⁸), including variants near the GLP2R, GIP, and HLA-DQA1 genes. Our analysis brought the total number of independent T2D associations to 128 distinct signals at 113 loci. Despite substantially increased sample size and more complete coverage of low-frequency variation, all novel associations were driven by common single nucleotide variants. Credible sets of potentially causal variants were generally larger than those based on imputation with earlier reference panels, consistent with resolution of causal signals to common risk haplotypes. Stratification of T2D-associated loci based on T2D-related quantitative trait associations revealed tissue-specific enrichment of regulatory annotations in pancreatic islet enhancers for loci influencing insulin secretion and in adipocytes, monocytes, and hepatocytes for insulin action-associated loci. These findings highlight the predominant role played by common variants of modest effect and the diversity of biological mechanisms influencing T2D pathophysiology.

601 citations


Journal ArticleDOI
Dajiang J. Liu, Gina M. Peloso, Haojie Yu, +285 more; Institutions (91)
TL;DR: It is found that beta-thalassemia trait carriers displayed lower TC and were protected from coronary artery disease (CAD), and only some mechanisms of lowering LDL-C appeared to increase risk for type 2 diabetes (T2D); and TG-lowering alleles involved in hepatic production of TG-rich lipoproteins tracked with higher liver fat, higher risk for T2D, and lower risk for CAD.
Abstract: We screened variants on an exome-focused genotyping array in >300,000 participants (replication in >280,000 participants) and identified 444 independent variants in 250 loci significantly associated with total cholesterol (TC), high-density-lipoprotein cholesterol (HDL-C), low-density-lipoprotein cholesterol (LDL-C), and/or triglycerides (TG). At two loci (JAK2 and A1CF), experimental analysis in mice showed lipid changes consistent with the human data. We also found that: (i) beta-thalassemia trait carriers displayed lower TC and were protected from coronary artery disease (CAD); (ii) excluding the CETP locus, there was not a predictable relationship between plasma HDL-C and risk for age-related macular degeneration; (iii) only some mechanisms of lowering LDL-C appeared to increase risk for type 2 diabetes (T2D); and (iv) TG-lowering alleles involved in hepatic production of TG-rich lipoproteins (TM6SF2 and PNPLA3) tracked with higher liver fat, higher risk for T2D, and lower risk for CAD, whereas TG-lowering alleles involved in peripheral lipolysis (LPL and ANGPTL4) had no effect on liver fat but decreased risks for both T2D and CAD.

465 citations


01 Jan 2017
TL;DR: The results demonstrate that sufficiently large sample sizes can uncover rare and low-frequency variants of moderate-to-large effect associated with polygenic human phenotypes, and that these variants implicate relevant genes and pathways.
Abstract: Height is a highly heritable, classic polygenic trait with approximately 700 common associated variants identified through genome-wide association studies so far. Here, we report 83 height-associated coding variants with lower minor-allele frequencies (in the range of 0.1–4.8%) and effects of up to 2 centimetres per allele (such as those in IHH, STC2, AR and CRISPLD2), greater than ten times the average effect of common variants. In functional follow-up studies, rare height-increasing alleles of STC2 (giving an increase of 1–2 centimetres per allele) compromised proteolytic inhibition of PAPP-A and increased cleavage of IGFBP-4 in vitro, resulting in higher bioavailability of insulin-like growth factors. These 83 height-associated variants overlap genes that are mutated in monogenic growth disorders and highlight new biological candidates (such as ADAMTS3, IL11RA and NOX4) and pathways (such as proteoglycan and glycosaminoglycan synthesis) involved in growth. Our results demonstrate that sufficiently large sample sizes can uncover rare and low-frequency variants of moderate-to-large effect associated with polygenic human phenotypes, and that these variants implicate relevant genes and pathways.

407 citations


Journal ArticleDOI
TL;DR: A genome-wide association study of a broad allergic disease phenotype that considers the presence of any one of these three diseases identified 136 independent risk variants, including 73 not previously reported, which implicate 132 nearby genes in allergic disease pathophysiology.
Abstract: Asthma, hay fever (or allergic rhinitis) and eczema (or atopic dermatitis) often coexist in the same individuals, partly because of a shared genetic origin. To identify shared risk variants, we performed a genome-wide association study (GWAS; n = 360,838) of a broad allergic disease phenotype that considers the presence of any one of these three diseases. We identified 136 independent risk variants (P < 3 × 10⁻⁸), including 73 not previously reported, which implicate 132 nearby genes in allergic disease pathophysiology. Disease-specific effects were detected for only six variants, confirming that most represent shared risk factors. Tissue-specific heritability and biological process enrichment analyses suggest that shared risk variants influence lymphocyte-mediated immunity. Six target genes provide an opportunity for drug repositioning, while for 36 genes CpG methylation was found to influence transcription independently of genetic effects. Asthma, hay fever and eczema partly coexist because they share many genetic risk variants that dysregulate the expression of immune-related genes.

378 citations


Journal ArticleDOI
Philip C Haycock, Stephen Burgess, Aayah Nounu, Jie Zheng, +194 more; Institutions (88)
TL;DR: A Mendelian randomization study, using single nucleotide polymorphisms (SNPs) strongly associated with telomere length in the general population, suggests that longer telomeres likely increase risk for several cancers but reduce risk for some non-neoplastic diseases, including cardiovascular diseases.
Abstract: IMPORTANCE: The causal direction and magnitude of the association between telomere length and incidence of cancer and non-neoplastic diseases is uncertain owing to the susceptibility of observational studies to confounding and reverse causation. OBJECTIVE: To conduct a Mendelian randomization study, using germline genetic variants as instrumental variables, to appraise the causal relevance of telomere length for risk of cancer and non-neoplastic diseases. DATA SOURCES: Genomewide association studies (GWAS) published up to January 15, 2015. STUDY SELECTION: GWAS of noncommunicable diseases that assayed germline genetic variation and did not select cohort or control participants on the basis of preexisting diseases. Of 163 GWAS of noncommunicable diseases identified, summary data from 103 were available. DATA EXTRACTION AND SYNTHESIS: Summary association statistics for single nucleotide polymorphisms (SNPs) that are strongly associated with telomere length in the general population. MAIN OUTCOMES AND MEASURES: Odds ratios (ORs) and 95% confidence intervals (CIs) for disease per standard deviation (SD) higher telomere length due to germline genetic variation. RESULTS: Summary data were available for 35 cancers and 48 non-neoplastic diseases, corresponding to 420 081 cases (median cases, 2526 per disease) and 1 093 105 controls (median, 6789 per disease). Increased telomere length due to germline genetic variation was generally associated with increased risk for site-specific cancers. The strongest associations (ORs [95% CIs] per 1-SD change in genetically increased telomere length) were observed for glioma, 5.27 (3.15-8.81); serous low-malignant-potential ovarian cancer, 4.35 (2.39-7.94); lung adenocarcinoma, 3.19 (2.40-4.22); neuroblastoma, 2.98 (1.92-4.62); bladder cancer, 2.19 (1.32-3.66); melanoma, 1.87 (1.55-2.26); testicular cancer, 1.76 (1.02-3.04); kidney cancer, 1.55 (1.08-2.23); and endometrial cancer, 1.31 (1.07-1.61). 
Associations were stronger for rarer cancers and at tissue sites with lower rates of stem cell division. There was generally little evidence of association between genetically increased telomere length and risk of psychiatric, autoimmune, inflammatory, diabetic, and other non-neoplastic diseases, except for coronary heart disease (OR, 0.78 [95% CI, 0.67-0.90]), abdominal aortic aneurysm (OR, 0.63 [95% CI, 0.49-0.81]), celiac disease (OR, 0.42 [95% CI, 0.28-0.61]) and interstitial lung disease (OR, 0.09 [95% CI, 0.05-0.15]). CONCLUSIONS AND RELEVANCE: It is likely that longer telomeres increase risk for several cancers but reduce risk for some non-neoplastic diseases, including cardiovascular diseases.
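
The summary-data approach described in this abstract (per-SNP association statistics combined into an odds ratio per SD of genetically increased telomere length) can be illustrated with an inverse-variance-weighted (IVW) estimate. This is a minimal sketch, assuming the common IVW estimator; the abstract does not say which estimator the study used, and the function names and all numbers below are hypothetical.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance-weighted causal estimate from per-SNP summary statistics.

    beta_exposure: SNP effects on the exposure (e.g. SD of telomere length per allele)
    beta_outcome:  SNP effects on the outcome (log odds ratio per allele)
    se_outcome:    standard errors of beta_outcome
    """
    num = den = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        weight = (bx / se) ** 2        # first-order inverse variance of the Wald ratio
        num += weight * (by / bx)      # Wald ratio = outcome effect / exposure effect
        den += weight
    return num / den, math.sqrt(1.0 / den)

def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log odds ratio and its standard error to an OR with a 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical summary statistics for three SNPs (illustration only, not study data).
bx = [0.10, 0.20, 0.15]
by = [0.05, 0.10, 0.075]
se = [0.01, 0.02, 0.03]

beta, beta_se = ivw_estimate(bx, by, se)
print(odds_ratio_with_ci(beta, beta_se))
```

The weights favour SNPs whose outcome association is precisely estimated relative to their effect on the exposure, which is why strongly associated instruments dominate the pooled estimate.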

376 citations


Journal ArticleDOI
Eleanor Wheeler, Aaron Leong, Ching-Ti Liu, Marie-France Hivert, +255 more; Institutions (89)
TL;DR: This multiancestry study investigates the effect of genetic risk scores composed of erythrocytic or glycemic variants on incident diabetes prediction and on prevalent diabetes screening performance, and recommends investigating the possible benefits of screening for the G6PD genotype along with using HbA1c to diagnose T2D in populations of African ancestry or groups where G6PD deficiency is common.
Abstract: Background: Glycated hemoglobin (HbA1c) is used to diagnose type 2 diabetes (T2D) and assess glycemic control in patients with diabetes. Previous genome-wide association studies (GWAS) have identif ...

304 citations


Journal ArticleDOI
TL;DR: A TNFSF13B variant was associated with multiple sclerosis and SLE, and its effects were clarified at the population, cellular, and molecular levels.
Abstract: Background: Genomewide association studies of autoimmune diseases have mapped hundreds of susceptibility regions in the genome. However, only for a few association signals has the causal gene been identified, and for even fewer have the causal variant and underlying mechanism been defined. Coincident associations of DNA variants affecting both the risk of autoimmune disease and quantitative immune variables provide an informative route to explore disease mechanisms and drug-targetable pathways. Methods: Using case–control samples from Sardinia, Italy, we performed a genomewide association study in multiple sclerosis followed by TNFSF13B locus–specific association testing in systemic lupus erythematosus (SLE). Extensive phenotyping of quantitative immune variables, sequence-based fine mapping, cross-population and cross-phenotype analyses, and gene-expression studies were used to identify the causal variant and elucidate its mechanism of action. Signatures of positive selection were also investigated. Results: A...

279 citations


Journal ArticleDOI
Mariaelisa Graff, Robert A. Scott, Anne E. Justice, Kristin L. Young, +346 more; Institutions (101)
TL;DR: In additional genome-wide meta-analyses adjusting for PA and interaction with PA, 11 novel adiposity loci are identified, suggesting that accounting for PA or other environmental factors that contribute to variation in adiposity may facilitate gene discovery.
Abstract: Physical activity (PA) may modify the genetic effects that give rise to increased risk of obesity. To identify adiposity loci whose effects are modified by PA, we performed genome-wide interaction meta-analyses of BMI and BMI-adjusted waist circumference and waist-hip ratio from up to 200,452 adults of European (n = 180,423) or other ancestry (n = 20,029). We standardized PA by categorizing it into a dichotomous variable where, on average, 23% of participants were categorized as inactive and 77% as physically active. While we replicate the interaction with PA for the strongest known obesity-risk locus in the FTO gene, of which the effect is attenuated by ~30% in physically active individuals compared to inactive individuals, we do not identify additional loci that are sensitive to PA. In additional genome-wide meta-analyses adjusting for PA and interaction with PA, we identify 11 novel adiposity loci, suggesting that accounting for PA or other environmental factors that contribute to variation in adiposity may facilitate gene discovery.

275 citations



Journal ArticleDOI
TL;DR: A meta-analysis of psoriasis genome-wide association studies, including data from eight different Caucasian cohorts with a combined effective sample size >39,000 individuals, provides a comprehensive layout for the genetic architecture of common variants for psoriasis.
Abstract: Psoriasis is a complex disease of skin with a prevalence of about 2%. We conducted the largest meta-analysis of genome-wide association studies (GWAS) for psoriasis to date, including data from eight different Caucasian cohorts, with a combined effective sample size >39,000 individuals. We identified 16 additional psoriasis susceptibility loci achieving genome-wide significance, increasing the number of identified loci to 63 for European-origin individuals. Functional analysis highlighted the roles of interferon signalling and the NFκB cascade, and we showed that the psoriasis signals are enriched in regulatory elements from different T cells (CD8+ T-cells and CD4+ T-cells including TH0, TH1 and TH17). The identified loci explain ∼28% of the genetic heritability and generate a discriminatory genetic risk score (AUC = 0.76 in our sample) that is significantly correlated with age at onset (p = 2 × 10⁻⁸⁹). This study provides a comprehensive layout for the genetic architecture of common variants for psoriasis.

237 citations


01 Jan 2017
TL;DR: In this article, the effect of genetic risk-scores comprised of erythrocytic or glycemic variants on incident diabetes prediction and on prevalent diabetes screening performance was investigated.
Abstract: Background: Glycated hemoglobin (HbA1c) is used to diagnose type 2 diabetes (T2D) and assess glycemic control in patients with diabetes. Previous genome-wide association studies (GWAS) have identified 18 HbA1c-associated genetic variants. These variants proved to be classifiable by their likely biological action as erythrocytic (also associated with erythrocyte traits) or glycemic (associated with other glucose-related traits). In this study, we tested the hypotheses that, in a very large scale GWAS, we would identify more genetic variants associated with HbA1c and that HbA1c variants implicated in erythrocytic biology would affect the diagnostic accuracy of HbA1c. We therefore expanded the number of HbA1c-associated loci and tested the effect of genetic risk-scores comprised of erythrocytic or glycemic variants on incident diabetes prediction and on prevalent diabetes screening performance. Throughout this multiancestry study, we kept a focus on interancestry differences in HbA1c genetics performance that might influence race-ancestry differences in health outcomes. Methods & findings: Using genome-wide association meta-analyses in up to 159,940 individuals from 82 cohorts of European, African, East Asian, and South Asian ancestry, we identified 60 common genetic variants associated with HbA1c. We classified variants as implicated in glycemic, erythrocytic, or unclassified biology and tested whether additive genetic scores of erythrocytic variants (GS-E) or glycemic variants (GS-G) were associated with higher T2D incidence in multiethnic longitudinal cohorts (N = 33,241). Nineteen glycemic and 22 erythrocytic variants were associated with HbA1c at genome-wide significance. GS-G was associated with higher T2D risk (incidence OR = 1.05, 95% CI 1.04–1.06, per HbA1c-raising allele, p = 3 × 10⁻²⁹); whereas GS-E was not (OR = 1.00, 95% CI 0.99–1.01, p = 0.60).
In Europeans and Asians, erythrocytic variants in aggregate had only modest effects on the diagnostic accuracy of HbA1c. Yet, in African Americans, the X-linked G6PD G202A variant (T-allele frequency 11%) was associated with an absolute decrease in HbA1c of 0.81%-units (95% CI 0.66–0.96) per allele in hemizygous men, and 0.68%-units (95% CI 0.38–0.97) in homozygous women. The G6PD variant may cause approximately 2% (N = 0.65 million, 95% CI 0.55–0.74) of African American adults with T2D to remain undiagnosed when screened with HbA1c. Limitations include the smaller sample sizes for non-European ancestries and the inability to classify approximately one-third of the variants. Further studies in large multiethnic cohorts with HbA1c, glycemic, and erythrocytic traits are required to better determine the biological action of the unclassified variants. Conclusions: As G6PD deficiency can be clinically silent until illness strikes, we recommend investigation of the possible benefits of screening for the G6PD genotype along with using HbA1c to diagnose T2D in populations of African ancestry or groups where G6PD deficiency is common. Screening with direct glucose measurements, or genetically-informed HbA1c diagnostic thresholds in people with G6PD deficiency, may be required to avoid missed or delayed diagnoses.

Journal ArticleDOI
Anne E. Justice, Thomas W. Winkler, Mary F. Feitosa, Misa Graff, +367 more; Institutions (97)
TL;DR: The results suggest that tobacco smoking may alter the genetic susceptibility to overall adiposity and body fat distribution, and highlight the importance of accounting for environment in genetic analyses.
Abstract: Few genome-wide association studies (GWAS) account for environmental exposures, like smoking, potentially impacting the overall trait variance when investigating the genetic contribution to obesity-related traits. Here, we use GWAS data from 51,080 current smokers and 190,178 nonsmokers (87% European descent) to identify loci influencing BMI and central adiposity, measured as waist circumference and waist-to-hip ratio both adjusted for BMI. We identify 23 novel genetic loci, and 9 loci with convincing evidence of gene-smoking interaction (GxSMK) on obesity-related traits. We show consistent direction of effect for all identified loci and significance for 18 novel and for 5 interaction loci in an independent study sample. These loci highlight novel biological functions, including response to oxidative stress, addictive behaviour, and regulatory functions emphasizing the importance of accounting for environment in genetic analyses. Our results suggest that tobacco smoking may alter the genetic susceptibility to overall adiposity and body fat distribution.

01 Jan 2017
TL;DR: Results from genetic risk score models raise the possibility of a precision medicine approach through early lifestyle intervention to offset the impact of blood pressure–raising genetic variants on future cardiovascular disease risk.
Abstract: Elevated blood pressure is the leading heritable risk factor for cardiovascular disease worldwide. We report genetic association of blood pressure (systolic, diastolic, pulse pressure) among UK Biobank participants of European ancestry with independent replication in other cohorts, and robust validation of 107 independent loci. We also identify new independent variants at 11 previously reported blood pressure loci. In combination with results from a range of in silico functional analyses and wet bench experiments, our findings highlight new biological pathways for blood pressure regulation enriched for genes expressed in vascular tissues and identify potential therapeutic targets for hypertension. Results from genetic risk score models raise the possibility of a precision medicine approach through early lifestyle intervention to offset the impact of blood pressure–raising genetic variants on future cardiovascular disease risk.

Journal ArticleDOI
07 Mar 2017-JAMA
TL;DR: The presence of rare damaging mutations in LPL was significantly associated with higher triglyceride levels and presence of coronary artery disease, and further research is needed to assess whether there are causal mechanisms by which heterozygous lipoprotein lipase deficiency could lead to coronary artery disease.
Abstract: Importance: The activity of lipoprotein lipase (LPL) is the rate-determining step in clearing triglyceride-rich lipoproteins from the circulation. Mutations that damage the LPL gene (LPL) lead to lifelong deficiency in enzymatic activity and can provide insight into the relationship of LPL to human disease. Objective: To determine whether rare and/or common variants in LPL are associated with early-onset coronary artery disease (CAD). Design, Setting, and Participants: In a cross-sectional study, LPL was sequenced in 10 CAD case-control cohorts of the multinational Myocardial Infarction Genetics Consortium and a nested CAD case-control cohort of the Geisinger Health System DiscovEHR cohort between 2010 and 2015. Common variants were genotyped in up to 305 699 individuals of the Global Lipids Genetics Consortium and up to 120 600 individuals of the CARDIoGRAM Exome Consortium between 2012 and 2014. Study-specific estimates were pooled via meta-analysis. Exposures: Rare damaging mutations in LPL included loss-of-function variants and missense variants annotated as pathogenic in a human genetics database or predicted to be damaging by computer prediction algorithms trained to identify mutations that impair protein function. Common variants in the LPL gene region included those independently associated with circulating triglyceride levels. Main Outcomes and Measures: Circulating lipid levels and CAD. Results: Among 46 891 individuals with LPL gene sequencing data available, the mean (SD) age was 50 (12.6) years and 51% were female. A total of 188 participants (0.40%; 95% CI, 0.35%-0.46%) carried a damaging mutation in LPL, including 105 of 32 646 control participants (0.32%) and 83 of 14 245 participants with early-onset CAD (0.58%).
Compared with 46 703 noncarriers, the 188 heterozygous carriers of an LPL damaging mutation displayed higher plasma triglyceride levels (19.6 mg/dL; 95% CI, 4.6-34.6 mg/dL) and higher odds of CAD (odds ratio = 1.84; 95% CI, 1.35-2.51; P LPL variants resulted in an odds ratio for CAD of 1.51 (95% CI, 1.39-1.64; P = 1.1 × 10⁻²²) per 1-SD increase in triglycerides. Conclusions and Relevance: The presence of rare damaging mutations in LPL was significantly associated with higher triglyceride levels and presence of coronary artery disease. However, further research is needed to assess whether there are causal mechanisms by which heterozygous lipoprotein lipase deficiency could lead to coronary artery disease.
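
The carrier counts quoted in this abstract are enough to reproduce the carrier frequencies and to compute a crude, unadjusted odds ratio. The sketch below does only that; it is not the study's analysis, which pooled study-specific estimates via meta-analysis, so the crude value need not equal the reported 1.84. The function name and the Wald-type confidence interval are assumptions made for illustration.

```python
import math

def carrier_summary(carriers_cases, n_cases, carriers_controls, n_controls):
    """Carrier frequencies and a crude odds ratio with a Wald 95% CI from a 2x2 table."""
    a, b = carriers_cases, n_cases - carriers_cases            # cases: carriers, noncarriers
    c, d = carriers_controls, n_controls - carriers_controls   # controls: carriers, noncarriers
    crude_or = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)       # Woolf's standard error
    ci = (math.exp(math.log(crude_or) - 1.96 * se_log_or),
          math.exp(math.log(crude_or) + 1.96 * se_log_or))
    return {
        "freq_cases": a / n_cases,
        "freq_controls": c / n_controls,
        "freq_overall": (a + c) / (n_cases + n_controls),
        "crude_or": crude_or,
        "ci95": ci,
    }

# Counts taken from the abstract: 83 of 14 245 cases and 105 of 32 646 controls.
summary = carrier_summary(83, 14245, 105, 32646)
for key, value in summary.items():
    print(key, value)
```

The frequencies should round to the 0.58%, 0.32%, and 0.40% quoted in the abstract, which is a quick consistency check on the counts.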

Journal ArticleDOI
TL;DR: The Whole Genome Sequencing for Psychiatric Disorders Consortium will integrate data for 18,000 individuals with psychiatric disorders, beginning with autism spectrum disorder, schizophrenia, bipolar disorder, and major depressive disorder, along with over 150,000 controls.
Abstract: As technology advances, whole genome sequencing (WGS) is likely to supersede other genotyping technologies. The rate of this change depends on its relative cost and utility. Variants identified uniquely through WGS may reveal novel biological pathways underlying complex disorders and provide high-resolution insight into when, where, and in which cell type these pathways are affected. Alternatively, cheaper and less computationally intensive approaches may yield equivalent insights. Understanding the role of rare variants in the noncoding gene-regulating genome through pilot WGS projects will be critical to determining which of these two extremes best represents reality. With large cohorts, well-defined risk loci, and a compelling need to understand the underlying biology, psychiatric disorders have a role to play in this preliminary WGS assessment. The Whole Genome Sequencing for Psychiatric Disorders Consortium will integrate data for 18,000 individuals with psychiatric disorders, beginning with autism spectrum disorder, schizophrenia, bipolar disorder, and major depressive disorder, along with over 150,000 controls.

Journal ArticleDOI
Louise V. Wain, Ahmad Vaez, Rick Jansen, Roby Joehanes, +267 more; Institutions (74)
TL;DR: 48 genes with evidence for involvement in blood pressure regulation that are significant in multiple resources are identified and these robustly implicated genes may provide new leads for therapeutic innovation.
Abstract: Elevated blood pressure is a major risk factor for cardiovascular disease and has a substantial genetic contribution. Genetic variation influencing blood pressure has the potential to identify new pharmacological targets for the treatment of hypertension. To discover additional novel blood pressure loci, we used 1000 Genomes Project-based imputation in 150 134 European ancestry individuals and sought significant evidence for independent replication in a further 228 245 individuals. We report 6 new signals of association in or near HSPB7, TNXB, LRP12, LOC283335, SEPT9, and AKT2, and provide new replication evidence for a further 2 signals in EBF2 and NFKBIA. Combining large whole-blood gene expression resources totaling 12 607 individuals, we investigated all novel and previously reported signals and identified 48 genes with evidence for involvement in blood pressure regulation that are significant in multiple resources. Three novel kidney-specific signals were also detected. These robustly implicated genes may provide new leads for therapeutic innovation.

Journal ArticleDOI
TL;DR: A computationally fast score-test-based method estimates the distribution of the test statistic by using the saddlepoint approximation; it can control type I error rates while replicating previously known association signals even for traits with a very small number of cases and a large number of controls.
Abstract: The availability of electronic health record (EHR)-based phenotypes allows for genome-wide association analyses in thousands of traits and has great potential to enable identification of genetic variants associated with clinical phenotypes. We can interpret the phenome-wide association study (PheWAS) result for a single genetic variant by observing its association across a landscape of phenotypes. Because a PheWAS can test thousands of binary phenotypes, and most of them have unbalanced or often extremely unbalanced case-control ratios (1:10 or 1:600, respectively), existing methods cannot provide an accurate and scalable way to test for associations. Here, we propose a computationally fast score-test-based method that estimates the distribution of the test statistic by using the saddlepoint approximation. Our method is much (∼100 times) faster than the state-of-the-art Firth's test. It can also adjust for covariates and control type I error rates even when the case-control ratio is extremely unbalanced. Through application to PheWAS data from the Michigan Genomics Initiative, we show that the proposed method can control type I error rates while replicating previously known association signals even for traits with a very small number of cases and a large number of controls.
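
To make the idea of a saddlepoint-approximated score test concrete, here is a minimal sketch under strong simplifying assumptions that are not taken from the abstract: no covariates, a constant null case probability `mu`, and a one-sided upper-tail p-value from the Barndorff-Nielsen form of the saddlepoint formula. The score statistic is S = Σ gᵢ(Yᵢ − μ) with Yᵢ ~ Bernoulli(μ); all function names are hypothetical, and the fallback to a normal approximation near the mean is a choice made for this sketch.

```python
import math

def normal_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def cgf(t, genotypes, mu):
    """Return K(t), K'(t), K''(t), the cumulant generating function of S and its derivatives."""
    k0 = k1 = k2 = 0.0
    for g in genotypes:
        e = math.exp(g * t)
        denom = 1.0 - mu + mu * e
        k0 += math.log(denom) - t * g * mu
        k1 += g * mu * e / denom - g * mu
        k2 += g * g * mu * (1.0 - mu) * e / (denom * denom)
    return k0, k1, k2

def saddlepoint_upper_tail(s, genotypes, mu, bracket=30.0):
    """Approximate P(S >= s) under the null; assumes the saddlepoint lies in [-bracket, bracket]."""
    variance = cgf(0.0, genotypes, mu)[2]
    z = s / math.sqrt(variance)
    if abs(z) < 1e-3:
        return normal_sf(z)            # formula is singular at the mean; use the normal tail
    lo, hi = -bracket, bracket
    for _ in range(200):               # bisection on K'(t) = s (K' is increasing in t)
        mid = 0.5 * (lo + hi)
        if cgf(mid, genotypes, mu)[1] < s:
            lo = mid
        else:
            hi = mid
    t_hat = 0.5 * (lo + hi)
    k0, _, k2 = cgf(t_hat, genotypes, mu)
    w = math.copysign(math.sqrt(2.0 * (t_hat * s - k0)), t_hat)
    v = t_hat * math.sqrt(k2)
    return normal_sf(w + math.log(v / w) / w)

# Hypothetical example: 100 carriers of one allele each, 10% case rate, 20 carrier cases.
genotypes = [1] * 100
mu = 0.1
s_obs = 20 - sum(genotypes) * mu
print("saddlepoint:", saddlepoint_upper_tail(s_obs, genotypes, mu))
print("normal     :", normal_sf(s_obs / math.sqrt(sum(genotypes) * mu * (1 - mu))))
```

With a skewed null distribution such as this one, the normal approximation understates the upper-tail probability, which is the kind of miscalibration under unbalanced case-control ratios that the abstract describes.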

Journal ArticleDOI
TL;DR: The data demonstrate that most of the low-frequency or rare coding variants associated with lipids are population specific, and that examining genomic data across diverse ancestries may facilitate the identification of functional genes at associated loci.
Abstract: Most genome-wide association studies have been of European individuals, even though most genetic variation in humans is seen only in non-European samples. To search for novel loci associated with blood lipid levels and clarify the mechanism of action at previously identified lipid loci, we used an exome array to examine protein-coding genetic variants in 47,532 East Asian individuals. We identified 255 variants at 41 loci that reached chip-wide significance, including 3 novel loci and 14 East Asian-specific coding variant associations. After a meta-analysis including >300,000 European samples, we identified an additional nine novel loci. Sixteen genes were identified by protein-altering variants in both East Asians and Europeans, and thus are likely to be functional genes. Our data demonstrate that most of the low-frequency or rare coding variants associated with lipids are population specific, and that examining genomic data across diverse ancestries may facilitate the identification of functional genes at associated loci.

Journal ArticleDOI
TL;DR: A genome-wide association scan of 466 BAV cases and 4,660 age, sex and ethnicity-matched controls with replication in up to 1,326 cases and 8,103 controls identifies association with a noncoding variant 151 kb from the gene encoding the cardiac-specific transcription factor, GATA4.
Abstract: Bicuspid aortic valve (BAV) is a heritable congenital heart defect and an important risk factor for valvulopathy and aortopathy. Here we report a genome-wide association scan of 466 BAV cases and 4,660 age, sex and ethnicity-matched controls with replication in up to 1,326 cases and 8,103 controls. We identify association with a noncoding variant 151 kb from the gene encoding the cardiac-specific transcription factor, GATA4, and near-significance for p.Ser377Gly in GATA4. GATA4 was interrupted by CRISPR-Cas9 in induced pluripotent stem cells from healthy donors. The disruption of GATA4 significantly impaired the transition from endothelial cells into mesenchymal cells, a critical step in heart valve development. Bicuspid aortic valve (BAV) is the most common human congenital cardiovascular malformation. Here, the authors perform a genome-wide association study for BAV and identify risk variants in the gene region of cardiac-specific transcription factor GATA4 and implicate GATA4 in heart valve development.


Journal ArticleDOI
TL;DR: Therapies that inhibit CETP (cholesteryl ester transfer protein) have failed to demonstrate a reduction in risk for coronary heart disease (CHD).
Abstract: Rationale:Therapies that inhibit CETP (cholesteryl ester transfer protein) have failed to demonstrate a reduction in risk for coronary heart disease (CHD) Human DNA sequence variants that truncate

Journal ArticleDOI
TL;DR: Almost all SSHs coalesce in the post-Nuragic, Nuragic and Neolithic-Copper Age periods; for some rare SSHs, however, the possibility that they were on the island prior to the Neolithic is not dismissed, a scenario that would agree with archeological evidence of a Mesolithic occupation of Sardinia.
Abstract: Sardinians are “outliers” in the European genetic landscape and, according to paleogenomic nuclear data, the closest to early European Neolithic farmers. To learn more about their genetic ancestry, we analyzed 3,491 modern and 21 ancient mitogenomes from Sardinia. We observed that 78.4% of modern mitogenomes cluster into 89 haplogroups that most likely arose in situ. For each Sardinian-specific haplogroup (SSH), we also identified the upstream node in the phylogeny, from which non-Sardinian mitogenomes radiate. This provided minimum and maximum time estimates for the presence of each SSH on the island. In agreement with demographic evidence, almost all SSHs coalesce in the post-Nuragic, Nuragic and Neolithic-Copper Age periods. For some rare SSHs, however, we could not dismiss the possibility that they might have been on the island prior to the Neolithic, a scenario that would be in agreement with archeological evidence of a Mesolithic occupation of Sardinia.

Posted ContentDOI
10 Mar 2017-bioRxiv
TL;DR: The results suggest that as imputation reference panels become larger and more diverse, estimates of the frequency distribution of causal variants will become increasingly unbiased and the vast majority of trait narrow-sense heritability will be accounted for.
Abstract: Heritability, $h^2$, is a foundational concept in genetics, critical to understanding the genetic basis of complex traits. Recently developed methods that estimate heritability from genotyped SNPs, $h^2_{\mathrm{SNP}}$, explain substantially more genetic variance than genome-wide significant loci, but less than classical estimates from twins and families. However, $h^2_{\mathrm{SNP}}$ estimates have yet to be comprehensively compared under a range of genetic architectures, making it difficult to draw conclusions from sometimes conflicting published estimates. Here, we used thousands of real whole genome sequences to simulate realistic phenotypes under a variety of genetic architectures, including those from very rare causal variants. We compared the performance of ten methods across different types of genotypic data (commercial SNP array positions, whole genome sequence variants, and imputed variants) and under differing causal variant frequencies, levels of stratification, and relatedness thresholds. These results provide guidance in interpreting past results and choosing optimal approaches for future studies. We then applied the two methods (GREML-MS and GREML-LDMS) that best estimated overall $h^2_{\mathrm{SNP}}$ and the causal variant frequency spectra to six phenotypes in the UK Biobank using imputed genome-wide variants. Our results suggest that as imputation reference panels become larger and more diverse, estimates of the frequency distribution of causal variants will become increasingly unbiased and the vast majority of trait narrow-sense heritability will be accounted for.
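To make the idea of simulating phenotypes at a target heritability concrete, here is a minimal toy sketch. It is not the authors' pipeline: the abstract describes simulations from real whole genome sequences, whereas this example draws random genotypes, and the function names `random_genotypes` and `simulate_phenotypes` are hypothetical. It assumes a purely additive model in which the noise variance is scaled so that genetic variance divided by phenotypic variance is close to the requested value.

```python
import random
from statistics import pvariance


def random_genotypes(n_people, n_variants, allele_freq, rng):
    """Toy genotypes: each entry is a count (0, 1 or 2) of the effect allele."""
    return [
        [(rng.random() < allele_freq) + (rng.random() < allele_freq)
         for _ in range(n_variants)]
        for _ in range(n_people)
    ]


def simulate_phenotypes(genotypes, effects, h2, rng):
    """Return (phenotypes, genetic_values) for a target heritability h2.

    Assumes an additive model: phenotype = genetic value + Gaussian noise,
    with the noise variance set to var_g * (1 - h2) / h2.
    """
    if not 0 < h2 <= 1:
        raise ValueError("h2 must be in (0, 1]")
    # Additive genetic value: allele counts weighted by per-variant effects.
    genetic = [sum(x * b for x, b in zip(row, effects)) for row in genotypes]
    var_g = pvariance(genetic)
    noise_sd = (var_g * (1 - h2) / h2) ** 0.5
    phenotypes = [g + rng.gauss(0.0, noise_sd) for g in genetic]
    return phenotypes, genetic
```

In a sample of a few thousand simulated individuals, `pvariance(genetic) / pvariance(phenotypes)` should land near the requested `h2`, with sampling noise around it.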

Journal ArticleDOI
TL;DR: The method dramatically improves both computational speed and posterior sampling convergence by taking advantage of the block-wise LD structures in human genomes, and efficiently integrates functional information in GWASs, helping identify functional associated variants and the underlying biology.
Abstract: Genome-wide association studies (GWASs) have identified many complex loci. However, most loci reside in noncoding regions and have unknown biological functions. Integrative analysis that incorporates known functional information into GWASs can help elucidate the underlying biological mechanisms and prioritize important functional variants. Hence, we develop a flexible Bayesian variable selection model with efficient computational techniques for such integrative analysis. Different from previous approaches, our method models the effect-size distribution and probability of causality for variants with different annotations and jointly models genome-wide variants to account for linkage disequilibrium (LD), thus prioritizing associations based on the quantification of the annotations and allowing for multiple associated variants per locus. Our method dramatically improves both computational speed and posterior sampling convergence by taking advantage of the block-wise LD structures in human genomes. In simulations, our method accurately quantifies the functional enrichment and performs more powerfully for prioritizing the true associations than alternative methods, where the power gain is especially apparent when multiple associated variants in LD reside in the same locus. We applied our method to an in-depth GWAS of age-related macular degeneration with 33,976 individuals and 9,857,286 variants. We find the strongest enrichment for causality among non-synonymous variants (54× more likely to be causal, 1.4× larger effect sizes) and variants in transcription, repressed Polycomb, and enhancer regions, as well as identify five additional candidate loci beyond the 32 known AMD risk loci. In conclusion, our method is shown to efficiently integrate functional information in GWASs, helping identify functional associated-variants and underlying biology.

Journal ArticleDOI
01 Jul 2017-Diabetes
TL;DR: The allelic spectrum for coding variants in AKT2 associated with disorders of glucose homeostasis is extended and bidirectional effects of variants within the pleckstrin homology domain of AKT2 are demonstrated.
Abstract: To identify novel coding association signals and facilitate characterization of mechanisms influencing glycemic traits and type 2 diabetes risk, we analyzed 109,215 variants derived from exome array genotyping together with an additional 390,225 variants from exome sequence in up to 39,339 normoglycemic individuals from five ancestry groups. We identified a novel association between the coding variant (p.Pro50Thr) in AKT2 and fasting plasma insulin (FI), a gene in which rare fully penetrant mutations are causal for monogenic glycemic disorders. The low-frequency allele is associated with a 12% increase in FI levels. This variant is present at 1.1% frequency in Finns but virtually absent in individuals from other ancestries. Carriers of the FI-increasing allele had increased 2-h insulin values, decreased insulin sensitivity, and increased risk of type 2 diabetes (odds ratio 1.05). In cellular studies, the AKT2-Thr50 protein exhibited a partial loss of function. We extend the allelic spectrum for coding variants in AKT2 associated with disorders of glucose homeostasis and demonstrate bidirectional effects of variants within the pleckstrin homology domain of AKT2.

Posted ContentDOI
17 Jul 2017-bioRxiv
TL;DR: The Genetic Association Study (GAS) Power Calculator is developed to provide users with a simple, browser-based interface for computing the power of genetic association studies.
Abstract: Motivation: Statistical power calculations are crucial in designing genetic association studies. They help guide tradeoffs between large sample sizes and detailed assessments of genotype and phenotype, help determine which studies are viable, and help interpret research findings. To facilitate widespread use of power analysis in the design and interpretation of genetic studies, it is important to enable users to calculate power and visualize the effect of different models and design choices in convenient, interactive tools that are easily accessible. Results: We developed the Genetic Association Study (GAS) Power Calculator to provide users with a simple, browser-based interface that computes the power of genetic association studies. Availability: The GAS Power Calculator can be accessed from the web interface at http://csg.sph.umich.edu/abecasis/gas_power_calculator/. Source code is available at https://github.com/jenlij/GAS-power-calculator.
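As an illustration of the kind of calculation such a tool performs, the sketch below approximates the power of a two-sided allelic case-control test. It is not the GAS Power Calculator's own code or necessarily its method: it assumes Hardy-Weinberg equilibrium, a multiplicative genotype relative risk, and a normal approximation for the difference in allele frequencies, and the function name `allelic_test_power` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def allelic_test_power(n_cases, n_controls, risk_allele_freq,
                       genotype_relative_risk, prevalence, alpha=5e-8):
    """Approximate power of a two-sided allelic case-control test.

    Assumptions (not taken from the source): Hardy-Weinberg equilibrium,
    multiplicative risk model, normal approximation.
    """
    p, g = risk_allele_freq, genotype_relative_risk
    # Population genotype frequencies for 0, 1 and 2 risk alleles.
    geno_freq = [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]
    rel_risk = [1.0, g, g * g]
    # Baseline penetrance chosen so the overall prevalence is matched.
    baseline = prevalence / sum(f * r for f, r in zip(geno_freq, rel_risk))
    penetrance = [baseline * r for r in rel_risk]
    if max(penetrance) > 1:
        raise ValueError("model implies a penetrance above 1")
    # Fraction of risk alleles carried by each genotype.
    dose = [0.0, 0.5, 1.0]
    p_case = sum(d * f * k for d, f, k in
                 zip(dose, geno_freq, penetrance)) / prevalence
    p_ctrl = sum(d * f * (1 - k) for d, f, k in
                 zip(dose, geno_freq, penetrance)) / (1 - prevalence)
    # Each person contributes two alleles to the comparison.
    se = sqrt(p_case * (1 - p_case) / (2 * n_cases)
              + p_ctrl * (1 - p_ctrl) / (2 * n_controls))
    ncp = (p_case - p_ctrl) / se
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(ncp - z_crit) + norm.cdf(-ncp - z_crit)
```

With a relative risk of 1 the two allele frequencies coincide, so the returned power reduces to the significance level `alpha`, which is a quick sanity check on the formula.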

Journal ArticleDOI
01 May 2017-Genetics
TL;DR: The model for predicting disease progression risk demonstrated satisfactory performance in both cohorts, and its use is recommended with baseline AMD severity scores plus baseline age, education level, and smoking status, either with or without GRS.
Abstract: Age-related macular degeneration (AMD) is a leading cause of blindness in the developed world. While many AMD susceptibility variants have been identified, their influence on AMD progression has not been elucidated. Using data from two large clinical trials, Age-Related Eye Disease Study (AREDS) and AREDS2, we evaluated the effects of 34 known risk variants on disease progression. In doing so, we calculated the eye-level time-to-late AMD and modeled them using a bivariate survival analysis approach, appropriately accounting for between-eye correlation. We then derived a genetic risk score (GRS) based on these 34 risk variants, and analyzed its effect on AMD progression. Finally, we used the AREDS data to fit prediction models of progression based on demographic and environmental factors, eye-level AMD severity scores and the GRS and tested the models using the AREDS2 cohort. We observed that GRS was significantly associated with AMD progression in both cohorts, with a stronger effect in AREDS than in AREDS2 (AREDS: hazard ratio (HR) = 1.34, $P = 1.6 \times 10^{-22}$; AREDS2: HR = 1.11, $P = 2.1 \times 10^{-4}$). For prediction of AMD progression, addition of GRS to the demographic/environmental risk factors considerably improved the prediction performance. However, when the baseline eye-level severity scores were included as the predictors, any other risk factors including the GRS only provided small additional predictive power. Our model for predicting the disease progression risk demonstrated satisfactory performance in both cohorts, and we recommend its use with baseline AMD severity scores plus baseline age, education level, and smoking status, either with or without GRS.
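A genetic risk score of this kind is commonly computed as a weighted sum of risk allele counts. The sketch below shows that generic construction and how a per-unit hazard ratio scales with a score difference under a proportional hazards model. The abstract does not state the weights or the unit of the GRS, so the weighting scheme and the function names `genetic_risk_score` and `hazard_ratio_for_difference` are assumptions for illustration only.

```python
import math


def genetic_risk_score(risk_allele_counts, weights):
    """Weighted sum of risk allele counts (0, 1 or 2 per variant).

    The weights are hypothetical here; a common choice is the log odds
    ratio of each variant from a prior association study.
    """
    if len(risk_allele_counts) != len(weights):
        raise ValueError("need one weight per variant")
    return math.fsum(c * w for c, w in zip(risk_allele_counts, weights))


def hazard_ratio_for_difference(hr_per_unit, score_difference):
    """Hazard ratio implied by a score difference under proportional hazards.

    If one unit of the score multiplies the hazard by hr_per_unit, then a
    difference of d units multiplies it by hr_per_unit ** d.
    """
    return math.exp(math.log(hr_per_unit) * score_difference)
```

For instance, if the reported HR of 1.34 applied per one unit of the score, two individuals differing by two units would differ in hazard by a factor of 1.34 squared, though the abstract does not say which unit the authors used.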

Posted ContentDOI
07 Jul 2017-bioRxiv
TL;DR: The WGSPD consortium will integrate data for 18,000 individuals with psychiatric disorders, beginning with autism spectrum disorder, schizophrenia, bipolar disorder, and major depressive disorder, along with over 150,000 controls, to assess the role of rare variants in the noncoding gene-regulating genome.
Abstract: As technology advances, whole genome sequencing (WGS) is likely to supersede other genotyping technologies. The rate of this change depends on its relative cost and utility. Variants identified uniquely through WGS may reveal novel biological pathways underlying complex disorders and provide high-resolution insight into when, where, and in which cell type these pathways are affected. Alternatively, cheaper and less computationally intensive approaches may yield equivalent insights. Understanding the role of rare variants in the noncoding gene-regulating genome, through pilot WGS projects, will be critical to determine which of these two extremes best represents reality. With large cohorts, well-defined risk loci, and a compelling need to understand the underlying biology, psychiatric disorders have a role to play in this preliminary WGS assessment. The WGSPD consortium will integrate data for 18,000 individuals with psychiatric disorders, beginning with autism spectrum disorder, schizophrenia, bipolar disorder, and major depressive disorder, along with over 150,000 controls.

01 May 2017
TL;DR: CETP PTV carrier status was associated with reduced risk for CHD and, compared with noncarriers, carriers of PTV at CETP displayed higher high-density lipoprotein cholesterol, lower low-density lipoprotein cholesterol, lower triglycerides, and lower risk for CHD.
Abstract: Rationale: Therapies that inhibit CETP (cholesteryl ester transfer protein) have failed to demonstrate a reduction in risk for coronary heart disease (CHD). Human DNA sequence variants that truncate the CETP gene may provide insight into the efficacy of CETP inhibition. Objective: To test whether protein-truncating variants (PTVs) at the CETP gene were associated with plasma lipid levels and CHD. Methods and Results: We sequenced the exons of the CETP gene in 58 469 participants from 12 case–control studies (18 817 CHD cases, 39 652 CHD-free controls). We defined PTV as those that lead to a premature stop, disrupt canonical splice sites, or lead to insertions/deletions that shift frame. We also genotyped 1 Japanese-specific PTV in 27 561 participants from 3 case–control studies (14 286 CHD cases, 13 275 CHD-free controls). We tested association of CETP PTV carrier status with both plasma lipids and CHD. Among 58 469 participants with CETP gene-sequencing data available, average age was 51.5 years and 43% were women; 1 in 975 participants carried a PTV at the CETP gene. Compared with noncarriers, carriers of PTV at CETP had higher high-density lipoprotein cholesterol (effect size, 22.6 mg/dL; 95% confidence interval, 18–27; $P < 1.0 \times 10^{-4}$), lower low-density lipoprotein cholesterol (−12.2 mg/dL; 95% confidence interval, −23 to −0.98; P=0.033), and lower triglycerides (−6.3%; 95% confidence interval, −12 to −0.22; P=0.043). CETP PTV carrier status was associated with reduced risk for CHD (summary odds ratio, 0.70; 95% confidence interval, 0.54–0.90; $P = 5.1 \times 10^{-3}$). Conclusions: Compared with noncarriers, carriers of PTV at CETP displayed higher high-density lipoprotein cholesterol, lower low-density lipoprotein cholesterol, lower triglycerides, and lower risk for CHD.
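The reported odds ratio, confidence interval, and P value can be checked against each other with a short calculation. The sketch below recovers an approximate standard error of the log odds ratio from the interval bounds and derives a two-sided P value. It assumes a Wald-type interval that is symmetric on the log scale, which the abstract does not state, and the function name `p_value_from_or_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def p_value_from_or_ci(odds_ratio, ci_low, ci_high, level=0.95):
    """Return (se, z, p) implied by an odds ratio and its confidence interval.

    Assumes the interval is exp(log(OR) +/- z_crit * se), i.e. symmetric
    on the log scale; rounding of the published bounds limits precision.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(0.5 + level / 2)
    # Width of the interval on the log scale equals 2 * z_crit * se.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = 2 * norm.cdf(-abs(z))
    return se, z, p
```

Applied to the CHD result (`p_value_from_or_ci(0.70, 0.54, 0.90)`), this gives a P value of the same order as the reported one; an exact match is not expected because the published bounds are rounded to two decimals.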