Unraveling the polygenic architecture of complex traits using blood eQTL meta-analysis

Urmo Võsa, +100 more
- 19 Oct 2018 - 
- pp 447367
TLDR
It is observed that cis-eQTLs can be detected for 88% of the studied genes, but that they have a different genetic architecture compared to disease-associated variants, limiting the ability to use cis-eQTLs to pinpoint causal genes within susceptibility loci.
Abstract
While many disease-associated variants have been identified through genome-wide association studies, their downstream molecular consequences remain unclear. To identify these effects, we performed cis- and trans-expression quantitative trait locus (eQTL) analysis in blood from 31,684 individuals through the eQTLGen Consortium. We observed that cis-eQTLs can be detected for 88% of the studied genes, but that they have a different genetic architecture compared to disease-associated variants, limiting our ability to use cis-eQTLs to pinpoint causal genes within susceptibility loci. In contrast, trans-eQTLs (detected for 37% of 10,317 studied trait-associated variants) were more informative. Multiple unlinked variants, associated to the same complex trait, often converged on trans-genes that are known to play central roles in disease etiology. We observed the same when ascertaining the effect of polygenic scores calculated for 1,263 genome-wide association study (GWAS) traits. Expression levels of 13% of the studied genes correlated with polygenic scores, and many resulting genes are known to drive these traits.


Unraveling the polygenic architecture of complex traits using blood eQTL meta-analysis

Urmo Võsa*# (1,2), Annique Claringbould*# (1), Harm-Jan Westra** (1), Marc Jan Bonder** (1), Patrick Deelen** (1,3), Biao Zeng (4), Holger Kirsten (5), Ashis Saha (6), Roman Kreuzhuber (7,8), Silva Kasela (2), Natalia Pervjakova (2), Isabel Alvaes (9), Marie-Julie Fave (9), Mawusse Agbessi (9), Mark Christiansen (10), Rick Jansen (11), Ilkka Seppälä (12), Lin Tong (13), Alexander Teumer (14), Katharina Schramm (15,16), Gibran Hemani (17), Joost Verlouw (18), Hanieh Yaghootkar (19), Reyhan Sönmez (20,21), Andrew Brown (22,23,21), Viktorija Kukushkina (2), Anette Kalnapenkis (2), Sina Rüeger (24), Eleonora Porcu (24), Jaanika Kronberg-Guzman (2), Johannes Kettunen (25), Joseph Powell (26), Bernett Lee (27), Futao Zhang (28), Wibowo Arindrarto (29), Frank Beutner (30), BIOS Consortium, Harm Brugge (1), i2QTL Consortium, Julia Dmitreva (31), Mahmoud Elansary (31), Benjamin P. Fairfax (32), Michel Georges (31), Bastiaan T. Heijmans (29), Mika Kähönen (33), Yungil Kim (34,35), Julian C. Knight (32), Peter Kovacs (36), Knut Krohn (37), Shuang Li (1), Markus Loeffler (5), Urko M. Marigorta (4), Hailang Mei (38), Yukihide Momozawa (31,39), Martina Müller-Nurasyid (15,16,40), Matthias Nauck (41), Michel Nivard (42), Brenda Penninx (11), Jonathan Pritchard (43), Olli Raitakari (44), Olaf Rotzchke (27), Eline P. Slagboom (29), Coen D.A. Stehouwer (45), Michael Stumvoll (46), Patrick Sullivan (47), Peter A.C. ‘t Hoen (48), Joachim Thiery (49), Anke Tönjes (46), Jenny van Dongen (11), Maarten van Iterson (29), Jan Veldink (50), Uwe Völker (51), Cisca Wijmenga (1), Morris Swertz (3), Anand Andiappan (27), Grant W. Montgomery (52), Samuli Ripatti (53), Markus Perola (54), Zoltan Kutalik (24), Emmanouil Dermitzakis (22,23,21), Sven Bergmann (20,21), Timothy Frayling (19), Joyce van Meurs (18), Holger Prokisch (55,56), Habibul Ahsan (13), Brandon Pierce (13), Terho Lehtimäki (12), Dorret Boomsma (11), Bruce M. Psaty (10,57), Sina A. Gharib (58,10), Philip Awadalla (9), Lili Milani (2), Willem Ouwehand (7,59), Kate Downes (7), Oliver Stegle (8,60,61), Alexis Battle (62), Jian Yang (28,63), Peter M. Visscher (28), Markus Scholz (5), Gregory Gibson (4), Tõnu Esko (2), Lude Franke# (1)

* These authors contributed equally to this work.
** These authors contributed equally to this work.
1. Department of Genetics, University Medical Centre Groningen, Groningen, The Netherlands
2. Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu 51010, Estonia
3. Genomics Coordination Center, University Medical Centre Groningen, Groningen, The Netherlands
4. School of Biological Sciences, Georgia Tech, Atlanta, United States of America
5. Institut für Medizinische Informatik, Statistik und Epidemiologie, LIFE Leipzig Research Center for Civilization Diseases, Universität Leipzig, Leipzig, Germany
6. Department of Computer Science, Johns Hopkins University, Baltimore, United States of America
7. Department of Haematology, University of Cambridge and NHS Blood and Transplant Cambridge Biomedical Campus, Cambridge, United Kingdom
8. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
9. Computational Biology, Ontario Institute for Cancer Research, Toronto, Canada
10. Cardiovascular Health Research Unit, University of Washington, Seattle, United States of America
11. Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
12. Department of Clinical Chemistry, Fimlab Laboratories and Faculty of Medicine and Life Sciences, University of Tampere, Tampere, Finland
13. Department of Public Health Sciences, University of Chicago, Chicago, United States of America
14. Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany
15. Institute of Genetic Epidemiology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
16. Department of Medicine I, University Hospital Munich, Ludwig Maximilian’s University, München, Germany
bioRxiv preprint doi: https://doi.org/10.1101/447367; this version posted October 19, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

17. MRC Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
18. Department of Internal Medicine, Erasmus Medical Centre, Rotterdam, The Netherlands
19. Exeter Medical School, University of Exeter, Exeter, United Kingdom
20. Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
21. Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
22. Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
23. Institute of Genetics and Genomics in Geneva (iGE3), University of Geneva, Geneva, Switzerland
24. Lausanne University Hospital, Lausanne, Switzerland
25. University of Helsinki, Helsinki, Finland
26. Garvan Institute of Medical Research, Garvan-Weizmann Centre for Cellular Genomics, Sydney, Australia
27. Singapore Immunology Network, Agency for Science, Technology and Research, Singapore, Singapore
28. Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia
29. Leiden University Medical Center, Leiden, The Netherlands
30. Heart Center Leipzig, Universität Leipzig, Leipzig, Germany
31. Unit of Animal Genomics, WELBIO, GIGA-R & Faculty of Veterinary Medicine, University of Liege, 1 Avenue de l'Hôpital, Liège 4000, Belgium
32. Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, United Kingdom
33. Department of Clinical Physiology, Tampere University Hospital and Faculty of Medicine and Life Sciences, University of Tampere, Tampere, Finland
34. Department of Computer Science, Johns Hopkins University, Baltimore, United States of America
35. Genetics and Genomic Science Department, Icahn School of Medicine at Mount Sinai, New York, United States of America
36. IFB Adiposity Diseases, Universität Leipzig, Leipzig, Germany
37. Interdisciplinary Center for Clinical Research, Faculty of Medicine, Universität Leipzig, Leipzig, Germany
38. Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands
39. Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Kanagawa 230-0045, Japan
40. DZHK (German Centre for Cardiovascular Research), partner site Munich Heart Alliance, Munich, Germany
41. Institute of Clinical Chemistry and Laboratory Medicine, Greifswald University Hospital, Greifswald, Germany
42. Faculty of Genes, Behavior and Health, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
43. Stanford University, Stanford, United States of America
44. Turku University Hospital and University of Turku, Turku, Finland
45. Department of Internal Medicine, Maastricht University Medical Centre, Maastricht, The Netherlands
46. Department of Medicine, Universität Leipzig, Leipzig, Germany
47. Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
48. Center for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center Nijmegen, Nijmegen, The Netherlands
49. Institute for Laboratory Medicine, LIFE Leipzig Research Center for Civilization Diseases, Universität Leipzig, Leipzig, Germany
50. University Medical Center Utrecht, Utrecht, The Netherlands
51. Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald, Germany
52. Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia
53. Statistical and Translational Genetics, University of Helsinki, Helsinki, Finland
54. National Institute for Health and Welfare, University of Helsinki, Helsinki, Finland
55. Institute of Human Genetics, Helmholtz Zentrum München, Neuherberg, Germany
56. Institute of Human Genetics, Technical University Munich, Munich, Germany.
57. Kaiser Permanente Washington Health Research Institute, Seattle, WA, United States of America
58. Department of Medicine, University of Washington, Seattle, United States of America
59. Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton Cambridge, United Kingdom
60. Genome Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
61. Division of Computational Genomics and Systems Genetics, German Cancer Research Center, 69120 Heidelberg, Germany
62. Departments of Biomedical Engineering and Computer Science, Johns Hopkins University, Baltimore, United States of America
63. Institute for Advanced Research, Wenzhou Medical University, Wenzhou, Zhejiang 325027, China
# Correspondence can be addressed to:
Urmo Võsa (urmo.vosa@gmail.com)
Annique Claringbould (anniqueclaringbould@gmail.com)
Lude Franke (ludefranke@gmail.com)
Summary
While many disease-associated variants have been identified through genome-wide association
studies, their downstream molecular consequences remain unclear.
To identify these effects, we performed cis- and trans-expression quantitative trait locus (eQTL)
analysis in blood from 31,684 individuals through the eQTLGen Consortium.
We observed that cis-eQTLs can be detected for 88% of the studied genes, but that they have a
different genetic architecture compared to disease-associated variants, limiting our ability to use
cis-eQTLs to pinpoint causal genes within susceptibility loci.
In contrast, trans-eQTLs (detected for 37% of 10,317 studied trait-associated variants) were more
informative. Multiple unlinked variants, associated to the same complex trait, often converged on
trans-genes that are known to play central roles in disease etiology.
We observed the same when ascertaining the effect of polygenic scores calculated for 1,263
genome-wide association study (GWAS) traits. Expression levels of 13% of the studied genes
correlated with polygenic scores, and many resulting genes are known to drive these traits.
Main text
Expression quantitative trait loci (eQTLs) have become a common tool to interpret the regulatory
mechanisms of the variants associated with complex traits through genome-wide association
studies (GWAS). Cis-eQTLs, where gene expression levels are affected by a nearby single
nucleotide polymorphism (SNP) (<1 megabase; Mb), in particular, have been widely used for this
purpose. However, cis-eQTLs from the Genotype-Tissue Expression (GTEx) project explain only a
modest proportion of disease heritability [1].
In contrast, trans-eQTLs, where the SNP is located distal to the gene (>5Mb) or on other
chromosomes, can provide insight into the effects of a single variant on many genes. Trans-eQTLs
identified before [1–7] have already been used to identify putative key driver genes that contribute to
disease [8]. However, trans-eQTL effects are generally much weaker than those of cis-eQTLs,
requiring a larger sample size for detection.
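
The distance-based definitions above can be illustrated with a minimal Python sketch. The function name, the "intermediate" label for pairs between 1 Mb and 5 Mb, and the use of a single reference coordinate per gene are assumptions made for illustration; they are not taken from the paper.

```python
# Thresholds taken from the definitions in the text.
CIS_MAX_DISTANCE = 1_000_000    # cis: SNP within 1 Mb of the gene
TRANS_MIN_DISTANCE = 5_000_000  # trans: SNP more than 5 Mb away, or on another chromosome


def classify_eqtl(snp_chr: str, snp_pos: int, gene_chr: str, gene_pos: int) -> str:
    """Label a SNP-gene pair as 'cis', 'trans' or 'intermediate'.

    gene_pos is one reference coordinate for the gene; which coordinate to use
    (for example the gene midpoint or the TSS) is an assumption left to the caller.
    """
    if snp_chr != gene_chr:
        return "trans"
    distance = abs(snp_pos - gene_pos)
    if distance < CIS_MAX_DISTANCE:
        return "cis"
    if distance > TRANS_MIN_DISTANCE:
        return "trans"
    # Pairs between the two thresholds fit neither definition (hypothetical label).
    return "intermediate"
```

Under these assumptions, a SNP 500 kb from a gene on the same chromosome is labelled cis, while a SNP 6 Mb away or on a different chromosome is labelled trans.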
While trans-eQTLs are useful for the identification of the downstream effects of a single variant, a
different approach is required to determine the combined consequences of trait-associated
variants. Polygenic scores (PGS) have been recently applied to sum genome-wide risk for several
diseases and likely will improve clinical care [9,10]. However, the exact consequences of different PGS
at the molecular level, and thus the contexts in which polygenic effects manifest themselves, are
largely unknown. Here, we systematically investigate trans-eQTLs as well as associations between
PGS and gene expression (expression quantitative trait score, eQTS) to determine how genetic
effects influence and converge on genes and pathways that are important for complex traits.
To maximize the statistical power to detect eQTL and eQTS effects, we performed a large-scale
meta-analysis in 31,684 blood samples from 37 cohorts (assayed using three gene expression
platforms) in the context of the eQTLGen Consortium. This allowed us to identify significant cis-
eQTLs for 16,989 genes, trans-eQTLs for 6,298 genes and eQTS effects for 2,568 genes (Figure
1A), revealing complex regulatory effects of trait-associated variants. We combine these results
with additional data layers and highlight a number of examples where we leverage this resource to
infer novel biological insights into mechanisms of complex traits. We hypothesize that analyses
identifying genes further downstream are more cell-type specific and more relevant for
understanding disease (Figure 1B).

Frequently Asked Questions (15)
Q1. What have the authors contributed in "Unraveling the polygenic architecture of complex traits using blood eQTL meta-analysis"?

Võsa et al. used a blood eQTL meta-analysis to investigate the polygenic architecture of complex traits.

To gain further insight into genes that are important in the biology of the trait, the authors used the combined cis-eQTL results to perform SMR [14] for 16 large GWAS studies (Supplementary Table 20).

Independent effect SNPs for each summary statistics file were identified by double-clumping: first using a 250 kb window and subsequently a 10 Mb window, with an LD threshold of R² = 0.1.
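
The double-clumping step can be sketched in Python as two passes of a greedy clumping routine. This is only an illustration under stated assumptions: the `clump` and `double_clump` functions, the tuple layout, the caller-supplied `r2` lookup and the toy LD values are all hypothetical, and the window is interpreted as the maximum distance from an index SNP. The authors' actual tooling is not described in this excerpt.

```python
from typing import Callable, List, Tuple

Snp = Tuple[str, str, int, float]  # (snp_id, chromosome, position, GWAS P-value)


def clump(snps: List[Snp], r2: Callable[[str, str], float],
          window_bp: int, r2_threshold: float = 0.1) -> List[Snp]:
    """Greedy clumping: walk SNPs from most to least significant and keep a SNP
    only if no already-kept SNP lies within the window and is in LD with it."""
    kept: List[Snp] = []
    for snp in sorted(snps, key=lambda s: s[3]):
        snp_id, chrom, pos, _ = snp
        linked = any(
            chrom == k_chrom
            and abs(pos - k_pos) <= window_bp
            and r2(snp_id, k_id) >= r2_threshold
            for k_id, k_chrom, k_pos, _ in kept
        )
        if not linked:
            kept.append(snp)
    return kept


def double_clump(snps: List[Snp], r2: Callable[[str, str], float]) -> List[Snp]:
    """First pass with a 250 kb window, second pass with a 10 Mb window."""
    first_pass = clump(snps, r2, window_bp=250_000)
    return clump(first_pass, r2, window_bp=10_000_000)


# Toy LD table (made-up values); unlisted pairs are treated as unlinked.
toy_ld = {frozenset(("rs1", "rs2")): 0.8, frozenset(("rs1", "rs3")): 0.3}


def toy_r2(a: str, b: str) -> float:
    return toy_ld.get(frozenset((a, b)), 0.0)


toy_snps = [
    ("rs1", "1", 1_000_000, 1e-12),
    ("rs2", "1", 1_100_000, 1e-9),   # close to rs1 and in LD: removed in pass 1
    ("rs3", "1", 3_000_000, 1e-8),   # 2 Mb from rs1, in long-range LD: removed in pass 2
    ("rs4", "2", 500_000, 1e-10),    # different chromosome: always kept
]
independent = double_clump(toy_snps, toy_r2)
```

In this toy example the second, wider pass removes a SNP that survives the 250 kb pass because of long-range LD with a stronger signal, which is the motivation for clumping twice.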

Four GWAS P-value thresholds (P < 5×10⁻⁸, 1×10⁻⁵, 1×10⁻⁴ and 1×10⁻³) were used for constructing PGS for each summary statistics file.
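
As a rough illustration of thresholded PGS construction, the Python sketch below computes one score per individual at each of the four thresholds. The weighting scheme (sum of risk-allele dosage times GWAS effect size), the function name and the toy numbers are assumptions for illustration; the excerpt does not state the exact formula the authors used.

```python
from typing import Dict, List, Sequence

# The four GWAS P-value thresholds named in the text.
P_THRESHOLDS = (5e-8, 1e-5, 1e-4, 1e-3)


def polygenic_scores(dosages: Sequence[Sequence[float]],
                     betas: Sequence[float],
                     pvalues: Sequence[float],
                     thresholds: Sequence[float] = P_THRESHOLDS) -> Dict[float, List[float]]:
    """Return {threshold: [score per individual]}.

    dosages[i][j] is the risk-allele count of individual i at SNP j;
    betas[j] and pvalues[j] are the GWAS effect size and P-value of SNP j.
    SNPs are assumed to be independent already (e.g. after clumping).
    """
    scores: Dict[float, List[float]] = {}
    for threshold in thresholds:
        selected = [j for j, p in enumerate(pvalues) if p < threshold]
        scores[threshold] = [
            sum(row[j] * betas[j] for j in selected) for row in dosages
        ]
    return scores


# Toy data: two individuals, three SNPs (made-up values).
toy_dosages = [[0, 1, 2], [2, 0, 1]]
toy_betas = [0.5, -0.2, 0.1]
toy_pvalues = [1e-9, 5e-5, 5e-4]
toy_scores = polygenic_scores(toy_dosages, toy_betas, toy_pvalues)
```

More lenient thresholds admit more SNPs into the score, so each individual ends up with four scores per trait rather than one.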

The reason the authors chose only SNPs associated to blood-related traits and immune-mediated diseases was to minimize potential confounding due to a subtle bias in the Epigenomics Roadmap Project towards blood cell-types: 29 of the 127 cell-types that the authors studied were blood cell types. 

The trans-eQTL genes were also assigned to 10kb blocks, and to multiple blocks if the gene was more than 10kb in length (length between TSS and TES, Ensembl v71). 
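
A small Python sketch of the block assignment follows, assuming that blocks are fixed, non-overlapping 10 kb bins indexed by integer division of the genomic coordinate. That binning scheme and the function name are assumptions; the excerpt only states the block size. Note that under fixed bins a gene shorter than 10 kb can still straddle a bin boundary and receive two blocks.

```python
from typing import List

BLOCK_SIZE = 10_000  # 10 kb blocks, as stated in the text


def gene_blocks(tss: int, tes: int, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return the indices of every fixed-size block overlapped by a gene.

    tss and tes are the transcription start and end coordinates; they are
    sorted first so that minus-strand genes (TES < TSS) are handled too.
    """
    start, end = sorted((tss, tes))
    return list(range(start // block_size, end // block_size + 1))
```

For example, a gene running from 95,000 to 123,000 would be assigned to blocks 9 through 12 under this scheme, while a gene from 12,000 to 18,000 falls entirely in block 1.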

Cis-eQTLs with a gene-level FDR < 0.05 (corresponding to P < 1.829×10⁻⁵) and tested in at least two cohorts were deemed significant.

To test the performance of the empirical probe-matching approach, the authors conducted discovery cis-, trans- and eQTS meta-analyses for each expression platform (RNA-seq, Illumina, Affymetrix U291 and Affymetrix Hu-Ex v1.0 ST arrays; array probes matched to 19,960 genes by empirical probe matching). 

FDR calculation for trans-eQTL and eQTS mapping: To determine nominal P-value thresholds corresponding to FDR = 0.05, the authors used the pruned set of SNPs for trans-eQTL mapping and permutation-based FDR calculation, as described previously [1].

Because these cohorts have comparatively small sample sizes and study different cell types, the authors observed limited replication: 10 eQTS showed a significant replication effect (FDR < 0.05) in the LCL dataset, with 9 of those (90%) showing the same effect direction as in the discovery set (Extended Data Figure 16A, Supplementary Table 17).

The authors here performed cis-eQTL, trans-eQTL and eQTS analyses in 31,684 blood samples, reflecting a six-fold increase over earlier large-scale studies [1,5].

Since only a few eQTS associations are significant in non-blood tissues and the majority of identified eQTS associations are for blood-related traits, the authors speculate these effects are likely to be highly cell-type specific. 

This indicates that large-scale eQTL meta-analyses in other tissues could uncover more genes on which trait-associated SNPs converge. 

The authors used the integrative trans-eQTL analysis results as input and restricted themselves to effects that were present in the datasets they had direct access to (BBMRI-BIOS+EGCUT; N = 4,339) and that showed nominal P < 8.3115×10⁻⁶ in the meta-analysis of those datasets.

The false discovery rate (FDR) was determined using 10 meta-analyzed permutations: for each gene in the real analysis, the most significant association was recorded, and the same was done for each of the permutations, resulting in a gene-level FDR.
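
One common way to turn such per-gene best P-values into a gene-level FDR is sketched below in Python. It is an assumption that the FDR at a given P-value is estimated as the average number of permuted genes at least that significant divided by the number of real genes at least that significant; the function names and toy values are hypothetical, and the authors' exact estimator may differ.

```python
from bisect import bisect_right
from typing import Dict, Iterable, List, Tuple


def best_p_per_gene(associations: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Record the most significant (smallest) P-value for each gene."""
    best: Dict[str, float] = {}
    for gene, p in associations:
        if gene not in best or p < best[gene]:
            best[gene] = p
    return best


def gene_level_fdr(real_best_p: Dict[str, float],
                   permuted_best_p: List[Dict[str, float]]) -> Dict[str, float]:
    """Estimate an FDR for each gene from its best real P-value.

    permuted_best_p holds one {gene: best P} mapping per permutation
    (the text describes 10 permutations; at least one is required here).
    """
    real_sorted = sorted(real_best_p.values())
    null_sorted = sorted(p for perm in permuted_best_p for p in perm.values())
    n_perm = len(permuted_best_p)
    fdr: Dict[str, float] = {}
    for gene, p in real_best_p.items():
        n_real = bisect_right(real_sorted, p)           # real genes with best P <= p
        n_null = bisect_right(null_sorted, p) / n_perm  # mean count per permutation
        fdr[gene] = min(1.0, n_null / n_real)
    return fdr


# Toy example with two permutations (made-up values).
toy_real = best_p_per_gene([("A", 1e-4), ("A", 1e-10), ("B", 1e-6), ("C", 0.02)])
toy_perms = [
    {"A": 0.01, "B": 0.3, "C": 0.5},
    {"A": 0.2, "B": 0.015, "C": 0.4},
]
toy_fdr = gene_level_fdr(toy_real, toy_perms)
```

Taking only the best association per gene, in both the real and the permuted data, keeps genes with many tested SNPs from dominating the null distribution.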