Showing papers by "Adam Auton" published in 2018
••
TL;DR: This work introduces a new method for polygenic prediction, LDpred-funct, that leverages trait-specific functional enrichments to increase prediction accuracy and fits priors using the recently developed baseline-LD model.
Abstract: Genetic variants in functional regions of the genome are enriched for complex trait heritability. Here, we introduce a new method for polygenic prediction, LDpred-funct, that leverages trait-specific functional enrichments to increase prediction accuracy. We fit priors using the recently developed baseline-LD model, which includes coding, conserved, regulatory and LD-related annotations. We analytically estimate posterior mean causal effect sizes and then use cross-validation to regularize these estimates, improving prediction accuracy for sparse architectures. LDpred-funct attained higher prediction accuracy than other polygenic prediction methods in simulations using real genotypes. We applied LDpred-funct to predict 16 highly heritable traits in the UK Biobank. We used association statistics from British-ancestry samples as training data (avg N=365K) and samples of other European ancestries as validation data (avg N=22K), to minimize confounding. LDpred-funct attained a +27% relative improvement in prediction accuracy (avg prediction R² = 0.173; highest R² = 0.417 for height) compared to existing methods that do not incorporate functional information, consistent with simulations. For height, meta-analyzing training data from UK Biobank and 23andMe cohorts (total N=1107K; higher heritability in UK Biobank cohort) increased prediction R² to 0.429. Our results show that modeling functional enrichment substantially improves polygenic prediction accuracy, bringing polygenic prediction of complex traits closer to clinical utility.
57 citations
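The LDpred-funct abstract above mentions analytically estimated posterior mean effect sizes under annotation-informed priors. As a rough illustration only, the sketch below shows Gaussian-prior shrinkage in the simplest possible setting: standardized genotypes and phenotype, no LD between SNPs, and a per-SNP prior variance supplied by the caller. The function names and the no-LD simplification are assumptions made for this example; this is not the authors' implementation, which accounts for LD and adds a cross-validated regularization step.

```python
import numpy as np

def posterior_mean_effects(beta_hat, prior_var, n):
    """Shrink marginal effect estimates under a per-SNP Gaussian prior.

    Assumptions (illustrative, not from the paper): no LD between SNPs,
    beta_hat_j ~ N(beta_j, 1/n) and beta_j ~ N(0, prior_var_j).
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    prior_var = np.asarray(prior_var, dtype=float)
    # Posterior mean of a normal-normal model: prior variance over total variance.
    shrink = prior_var / (prior_var + 1.0 / n)
    return shrink * beta_hat

def prediction_r2(y, score):
    """Squared Pearson correlation between phenotype and polygenic score."""
    return float(np.corrcoef(y, score)[0, 1] ** 2)
```

Under these assumptions, a SNP with a larger prior variance (for example, one in an annotation enriched for heritability) is shrunk less than a SNP with the same marginal estimate but a smaller prior variance.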
••
TL;DR: This work introduces LDpred-funct, a new method for polygenic prediction that leverages trait-specific functional priors to increase prediction accuracy, and applies it to predict 21 highly heritable traits in the UK Biobank.
Abstract: Genetic variants in functional regions of the genome are enriched for complex trait heritability. Here, we introduce a new method for polygenic prediction, LDpred-funct, that leverages trait-specific functional priors to increase prediction accuracy. We fit priors using the recently developed baseline-LD model, which includes coding, conserved, regulatory and LD-related annotations. We analytically estimate posterior mean causal effect sizes and then use cross-validation to regularize these estimates, improving prediction accuracy for sparse architectures. LDpred-funct attained higher prediction accuracy than other polygenic prediction methods in simulations using real genotypes. We applied LDpred-funct to predict 21 highly heritable traits in the UK Biobank. We used association statistics from British-ancestry samples as training data (avg N=373K) and samples of other European ancestries as validation data (avg N=22K), to minimize confounding. LDpred-funct attained a +4.6% relative improvement in average prediction accuracy (avg prediction R² = 0.144; highest R² = 0.413 for height) compared to SBayesR (the best method that does not incorporate functional information). For height, meta-analyzing training data from UK Biobank and 23andMe cohorts (total N=1107K; higher heritability in UK Biobank cohort) increased prediction R² to 0.431. Our results show that incorporating functional priors improves polygenic prediction accuracy, consistent with the functional architecture of complex traits.
49 citations
••
Erasmus University Rotterdam, University of Bristol, King's College London, Wellcome Trust Centre for Human Genetics, University of Oxford, University of Western Australia, University of Newcastle, University of Melbourne, University of Warwick, QIMR Berghofer Medical Research Institute, University College London
TL;DR: Although this genetic study found no strong evidence for the hypothesis that 2D:4D ratio is a direct biomarker of prenatal exposure to androgens in healthy individuals, the findings do not explicitly exclude this possibility, and pathways involving testosterone may become apparent as the size of the discovery sample increases further.
Abstract: The ratio of the length of the index finger to that of the ring finger (2D:4D) is sexually dimorphic and is commonly used as a non-invasive biomarker of prenatal androgen exposure. Most association studies of 2D:4D ratio with a diverse range of sex-specific traits have typically involved small sample sizes and have been difficult to replicate, raising questions around the utility and precise meaning of the measure. In the largest genome-wide association meta-analysis of 2D:4D ratio to date (N = 15 661, with replication N = 75 821), we identified 11 loci (9 novel) explaining 3.8% of the variance in mean 2D:4D ratio. We also found weak evidence for association (β = 0.06; P = 0.02) between 2D:4D ratio and sensitivity to testosterone [length of the CAG microsatellite repeat in the androgen receptor (AR) gene] in females only. Furthermore, genetic variants associated with (adult) testosterone levels and/or sex hormone-binding globulin were not associated with 2D:4D ratio in our sample. Although we were unable to find strong evidence from our genetic study to support the hypothesis that 2D:4D ratio is a direct biomarker of prenatal exposure to androgens in healthy individuals, our findings do not explicitly exclude this possibility, and pathways involving testosterone may become apparent as the size of the discovery sample increases further. Our findings provide new insight into the underlying biology shaping 2D:4D variation in the general population.
41 citations
••
VU University Amsterdam, Erasmus University Rotterdam, University of Zurich, Harvard University, University of Colorado Boulder, Hospital for Special Surgery, University of Southern California, University of Amsterdam, University of Copenhagen, Statens Serum Institut, University of Toronto, University of Queensland, University of Essex, Broad Institute, University of Oxford, German Institute for Economic Research, Max Planck Society, Pompeu Fabra University, University of Edinburgh, University of Oulu, University of California, San Diego, University of Lübeck, University of Konstanz, University of North Carolina at Chapel Hill, University of Bern, Karolinska Institutet, St. Joseph's Healthcare Hamilton, Ludwig Maximilian University of Munich, University of Cologne, University College London, University of Chicago, Imperial College London, University of Tartu, Stockholm School of Economics, Catalan Institution for Research and Advanced Studies, University of Mainz, Uniformed Services University of the Health Sciences, Western General Hospital, University of Minnesota, New York University, National Bureau of Economic Research
TL;DR: Bioinformatics analyses imply that genes near general-risk-tolerance-associated SNPs are highly expressed in brain tissues and point to a role for glutamatergic and GABAergic neurotransmission.
Abstract: Humans vary substantially in their willingness to take risks. In a combined sample of over one million individuals, we conducted genome-wide association studies (GWAS) of general risk tolerance, adventurousness, and risky behaviors in the driving, drinking, smoking, and sexual domains. We identified 611 approximately independent genetic loci associated with at least one of our phenotypes, including 124 with general risk tolerance. We report evidence of substantial shared genetic influences across general risk tolerance and risky behaviors: 72 of the 124 general risk tolerance loci contain a lead SNP for at least one of our other GWAS, and general risk tolerance is moderately to strongly genetically correlated (|r̂_g| ~ 0.25 to 0.50) with a range of risky behaviors. Bioinformatics analyses imply that genes near general-risk-tolerance-associated SNPs are highly expressed in brain tissues and point to a role for glutamatergic and GABAergic neurotransmission. We find no evidence of enrichment for genes previously hypothesized to relate to risk tolerance.
19 citations
••
TL;DR: This work introduces cov-LDSC, a method that provides robust h_g² estimates from GWAS summary statistics and in-sample LD estimates in admixed populations; in simulations it is robust to all simulation parameters.
Abstract: All summary statistics-based methods to estimate the heritability of SNPs (h_g²) rely on accurate linkage disequilibrium (LD) calculations. In admixed populations, such as African Americans and Latinos, LD estimates are influenced by admixture and can result in biased h_g² estimates. Here, we introduce covariate-adjusted LD score regression (cov-LDSC), a method to provide robust h_g² estimates from GWAS summary statistics and in-sample LD estimates in admixed populations. In simulations, we observed that unadjusted LDSC underestimates h_g² by 10%-60%; in contrast, cov-LDSC is robust to all simulation parameters. We applied cov-LDSC to approximately 170,000 Latino, 47,000 African American and 135,000 European individuals in three quantitative and five dichotomous phenotypes. Our results show that most traits have high concordance of h_g² between ethnic groups; for example, in the 23andMe cohort, estimates of h_g² for BMI are 0.22 ± 0.01, 0.23 ± 0.03 and 0.22 ± 0.01 in Latino, African American and European populations respectively. However, for age at menarche, we observe population-specific heritability differences, with estimates of h_g² of 0.10 ± 0.03, 0.33 ± 0.13 and 0.19 ± 0.01 in Latino, African American and European populations respectively.
7 citations
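The cov-LDSC abstract above builds on LD score regression, which estimates h_g² from the slope of association chi-square statistics regressed on per-SNP LD scores. The sketch below illustrates only that regression step, assuming the standard model E[chi²_j] = (N · h_g² / M) · ℓ_j + intercept and using unweighted least squares for brevity. The function name `ldsc_h2` and its arguments are hypothetical, and the covariate adjustment of the LD scores that gives cov-LDSC its name is not shown; the LD scores are taken as given.

```python
import numpy as np

def ldsc_h2(chi2, ld_scores, n, m):
    """Estimate SNP heritability from chi-square statistics and LD scores.

    Assumed model: E[chi2_j] = (n * h2 / m) * ld_j + intercept.
    Uses ordinary (unweighted) least squares, a simplification of the
    weighted regression used in practice.
    """
    chi2 = np.asarray(chi2, dtype=float)
    ld_scores = np.asarray(ld_scores, dtype=float)
    # Design matrix: one column for LD scores, one for the intercept.
    design = np.column_stack([ld_scores, np.ones_like(ld_scores)])
    (slope, intercept), *_ = np.linalg.lstsq(design, chi2, rcond=None)
    # Rescale the slope from per-SNP units to total SNP heritability.
    return slope * m / n, intercept
```

In this framing, LD scores that are biased by admixture would distort the slope and therefore the h_g² estimate, which is the problem the abstract describes for unadjusted LDSC.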