Identification of context-dependent expression quantitative trait loci in whole blood
Summary
Introduction
- The molecular mechanisms underlying the association of genetic risk factors with disease and complex traits are still largely elusive.
- Many disease-associated genetic variants are found in non-coding parts of the genome 1,2 and thus must have a regulatory effect on expression.
- Mapping single nucleotide polymorphisms (SNPs) with an effect on the regulation of gene expression (expression quantitative trait loci, eQTLs) helps to unravel the regulatory networks that underlie physiological traits and diseases 3–8.
- A subset of eQTLs in immune cells may only be observed after activation of these cells by immunological triggers 15–20.
- Additionally, insights into the activity of signaling pathways modifying eQTL effects help to unravel the regulatory networks underlying disease.
Results
- The authors generated a comprehensive set of cis-eQTLs by sequencing whole peripheral blood mRNA of 2,176 healthy adults from four Dutch cohorts 21–24; 2,116 individuals remained after stringent quality control (Table S1, Supplementary material).
- The authors quantified gene and exon expression, as well as exon ratios (the proportion of expression of an exon relative to the total expression of all exons of a gene) and polyA ratios (the ratio of the expression in upstream and downstream parts of the 3’-UTRs separated by annotated polyadenylation (polyA) sites), and performed cis-eQTL mapping for all of these.
- A complete catalogue of all their eQTLs can be downloaded and explored via a dedicated browser at http://genenetwork.nl/biosqtlbrowser.
- More than half of the cis-regulated genes showed evidence for multiple independent eQTL effects (Figure 1a, Figure S1).
- As expected, eQTL effects were predominantly found for SNPs associated with hematological, lipid or immune-related traits.
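The exon ratio defined above (expression of one exon divided by the total expression of all exons of the same gene) can be illustrated with a minimal sketch. The function name `exon_ratios` and the expression values are hypothetical and chosen only for illustration; they are not taken from the paper or its pipeline.

```python
def exon_ratios(exon_expression):
    """Return each exon's share of the gene's total exon expression."""
    total = sum(exon_expression.values())
    if total == 0:
        # No expression on any exon: the ratio is undefined, so report None.
        return {exon: None for exon in exon_expression}
    return {exon: value / total for exon, value in exon_expression.items()}


# Hypothetical expression values for a three-exon gene.
ratios = exon_ratios({"exon1": 30.0, "exon2": 50.0, "exon3": 20.0})
```

Because the ratios of one gene sum to 1, a QTL on an exon ratio reflects a shift in relative exon usage rather than a change in overall gene expression.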
Context-dependent eQTLs
- The effects of SNPs on gene expression often depend on the cell type or tissue under investigation 9–12, and may be modified by external and environmental factors 15–19.
- The authors first identified the proxy gene acting on the highest number of eQTLs.
- There was a significant imbalance in the direction of regulation within the T-cell cluster: 54 genes were up-regulated by the IBD risk allele whereas only 29 were down-regulated (binomial test p-value: 0.003), suggesting increased T-cell activity in IBD.
- Five of these eQTLs were strongest in neutrophils (positive interaction score for module 1) and the genes containing these eQTLs were present in the neutrophil cluster (Figure 3d).
- The authors therefore conclude that the effect of these 145 eQTL genes is dependent on stimulation with type I interferon.
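The imbalance of 54 up-regulated versus 29 down-regulated genes reported above can be checked with an exact binomial test against a null probability of 0.5. The sketch below is only an illustration of that kind of test: the summary does not state whether the reported p-value of 0.003 is one-sided or two-sided, or which implementation was used, so the value computed here may not match it exactly.

```python
from math import comb


def binomial_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


up, down = 54, 29  # counts quoted in the text
n = up + down

# Probability of seeing at least 54 up-regulated genes under the null.
one_sided = binomial_tail(up, n)

# With p = 0.5 the distribution is symmetric, so doubling the tail
# gives the two-sided p-value.
two_sided = min(1.0, 2 * one_sided)
```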
Regulatory network discovery
- Each of the aforementioned ten modules demonstrated effects on many (>120) eQTLs.
- To identify these, the authors first corrected the expression data for the 10 module interaction effects and then ascertained for each gene-level eQTL whether the eQTL effect size was significantly dependent on the expression of any other gene.
- The authors propose a model where extracellular (HDL) cholesterol levels modify SREBF2 binding to the FADS2 promoter, which, in turn has effects on the expression of FADS2 and the lipid unsaturase activity in the cell.
- This eQTL activating cluster was strongly enriched for “positive regulation of B cell proliferation” (p-value = 1 × 10^-7), and the strongest proxy gene in this cluster was FCRLA, which is known to be highly expressed in proliferating B-cells residing in the germinal center of the lymph nodes 44.
- EBF1 thus influences MYBL2 gene expression, and because EBF1 binds at SNP rs285205, this SNP likely affects the binding affinity of EBF1.
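Testing whether an eQTL effect size depends on the expression of another gene is commonly framed as a linear model with a genotype-by-proxy interaction term. The sketch below assumes that framing (expression ~ genotype + proxy + genotype × proxy) and fits it by ordinary least squares on synthetic, noise-free data; the function names, coefficients and data are hypothetical, and this is not the authors' pipeline, which also includes correction steps not shown here.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_interaction_model(genotype, proxy, expression):
    """Least-squares fit of expression ~ 1 + genotype + proxy + genotype*proxy."""
    rows = [[1.0, g, p, g * p] for g, p in zip(genotype, proxy)]
    k = 4
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, expression)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data: genotype dosage (0/1/2) and proxy-gene expression.
genotype = [0, 1, 2, 0, 1, 2, 0, 1, 2, 1]
proxy = [0.1, 0.4, 0.9, 1.3, 1.8, 2.2, 2.7, 3.1, 3.5, 0.6]
# Simulated with a true interaction coefficient of 0.8 and no noise.
expression = [1.0 + 0.5 * g + 0.2 * p + 0.8 * g * p for g, p in zip(genotype, proxy)]

intercept, beta_g, beta_p, beta_gxp = fit_interaction_model(genotype, proxy, expression)
```

A non-zero interaction coefficient (`beta_gxp` here) means the genotype effect on expression grows or shrinks with the proxy gene's expression, which is the pattern described for STX3 and the NOD2 eQTL. In a real analysis the coefficient would be accompanied by a significance test and multiple-testing correction.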
Discussion
- Using whole blood RNA-seq data the authors greatly expanded the catalog of SNPs that have a known regulatory function.
- To gain a better understanding of the biology behind these regulatory variants, the authors identified 2,743 context-dependent eQTLs (1,842 in the first 10 modules and 901 in the remainder) and identified many of the determinants that modify these eQTLs.
- These provide further insight into the cell types in which the genetic risk factors are regulating gene expression and the regulatory networks in which they participate, further refining their findings on GWAS risk loci.
- Unlike other approaches (15,16,20), their method does not rely on any prior knowledge or assumptions on differences in cell type composition or naturally occurring stimulations acting on their whole tissue data.
- As such, their approach complements perturbation experiments to gain better insight into regulatory networks and their stimuli.
Figures
- Over 20,000 genes are regulated by cis-eQTLs overlapping with 33% of the entries in the GWAS catalog.
- (B) Gene function enrichment per cluster showed T-cell biology for the yellow cluster and neutrophil biology for the blue cluster.
- (D) All positive eQTL interaction effects for IBD eQTLs.
- Genes positively correlated with the top covariate (SP140) are indicated in blue and those negatively correlated with SP140 in red.
Data availability
- All results can be queried using their dedicated QTL browser: http://genenetwork.nl/biosqtlbrowser/.
- Raw data was submitted to the European Genome-phenome Archive (EGA, accession number EGAS00001001077).
Author contributions
- BTH, PACtH, JBJvM, AI, RJ and LF formed the management team of the BIOS consortium.
- JBJvM, PMJ, MV, JvR and NL generated RNA-seq data.
- HM, MvI, MvG, WA, JB, DVZ, RJ, PvtH, PD, MV, IN, MaS, PACtH, BTH and MM were responsible for data management and the computational infrastructure.
- DVZ, PD, PACtH and LF drafted the manuscript.
Frequently Asked Questions (18)
Q2. What are the future works mentioned in the paper "Hypothesis-free identification of modulators of genetic risk factors" ?
These provide further insight into the cell types in which the genetic risk factors are regulating gene expression and the regulatory networks in which they participate, further refining their findings on GWAS risk loci.
Q3. What is the p-value of the positively correlated genes?
The positively correlated genes are enriched for up-regulated genes upon rhinovirus stimulation 16 (Fisher exact p-value 1.14 × 10^-9), in line with their involvement in the type I interferon response.
Q4. What is the effect of type I interferon on eQTLs?
In support of the modifying effects of viral cues on this set of eQTLs, eQTL genes that have recently been reported as rhinovirus-response QTLs 16 typically have higher interaction z-scores for module 7 than other eQTL genes (Wilcoxon p-value = 0.02).
Q5. What is the significance of the gene expression in the yellow cluster?
(C) Expression levels in the cell-sorted BLUEPRINT data show that the genes in the yellow cluster show higher expression in T-cells and the genes in the blue cluster show higher expression in neutrophils.
Q6. What was the effect size of the SNP rs1981760 on NOD2?
Samples with very low expression of STX3 showed only a very weak eQTL on NOD2, whereas samples with very high STX3 expression showed a stronger eQTL effect size.
Q7. What is the effect of the eQTL on MYBL2?
When also including these genes, the authors observed that this cluster of genes is strongly co-expressed with EBF1, a transcription factor that drives B-cell differentiation and proliferation, suggesting that EBF1 might drive the eQTL interaction effect for MYBL2.
Q8. What is the meaning of 'proxy genes'?
The authors expect that the genes whose expression levels modify eQTLs are proxies of cell types or other intrinsic or extrinsic factors, and they call these genes 'proxy genes'.
Q9. What is the significance of the cis-eQTL SNPs?
The gene cis-eQTL SNPs are strongly enriched for DNase I footprints, various histone marks and binding sites of multiple transcription factors 26 (Table S4), suggesting that the substantial sample size enabled the authors to pinpoint likely causal regulatory variants.
Q10. What is the effect of the SNP rs1981760 on NOD2?
In this example, the eQTL effect is found to be more prominent in neutrophils than in other blood cell types, and the expression of NOD2 is found to be lower in carriers of the risk allele compared to carriers of the protective allele.
Q11. What are the enriched sites for transcription factors involved in erythrocyte development?
They were also enriched in binding sites for transcription factors involved in erythrocyte development based on ENCODE ChIP-seq data (GATA1, TAL1, GATA2 and MafK, each with enrichment p-values ≤ 10^-5) 30–32.
Q12. What is the effect of the exons on the expression of the IBD gene?
The other exons showed downregulation by the risk allele, suggesting that a shift to the NMD isoform is lowering overall gene expression levels (Figure S4).
Q13. What is the median LD of the top eQTL SNP?
Of the 232 top SNPs reported in this meta-analysis, 95 loci (41%) are in strong LD (r² ≥ 0.8) with a top eQTL SNP (median r² = 0.96 and median D’ = 0.996).
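The two LD measures quoted in this answer, r² and D’, follow from standard definitions based on haplotype and allele frequencies. The sketch below computes both for a pair of biallelic SNPs; the function name `ld_statistics` and the input frequencies are hypothetical and not taken from the paper, and D’ is reported here as an absolute value.

```python
def ld_statistics(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both SNPs must be polymorphic")
    d = p_ab - p_a * p_b
    # D' rescales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies: allele A almost always occurs together with allele B.
d, d_prime, r2 = ld_statistics(p_ab=0.30, p_a=0.30, p_b=0.32)
```

D’ can reach 1 even when allele frequencies differ, whereas r² reaches 1 only when the two SNPs are perfect proxies for each other, which is why the summary reports both.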
Q14. What is the effect of rs1728801 on ZPF90?
The authors also observed negative interactions, where the effect becomes smaller in a specific module, e.g. the eQTL effect of rs1728801 regulating ZPF90 (Figure 3f), a gene that is known to be important in T-helper cells 36.
Q15. What is the effect of EBF1 on MYBL2?
EBF1 thus influences MYBL2 gene expression, and because EBF1 binds at SNP rs285205, this SNP likely affects the binding affinity of EBF1.
Q16. What is the significance of the eQTLs?
Gene function enrichment analysis on the exon-level and exon ratio QTLs showed results similar to that of eQTL genes (Table S8), indicating that the proxy genes do not solely represent the factors modulating gene-level eQTLs but also those that affect alternative splicing eQTLs.
Q17. Which eQTLs are associated with inflammatory bowel disease?
Figure 3. eQTLs associated with inflammatory bowel disease are predominantly active in neutrophils and T-cells (bioRxiv preprint, https://doi.org/10.1101/033217).
Q18. What is the correlation between EBF1 and MYBL2?
EBF1 is a known player in B-cell differentiation and proliferation and is positively correlated with both MYBL2 (r = 0.11, p-value = 6.99 × 10^-7) and FCRLA (r = 0.8, p-value ≤ 2.2 × 10^-16).