Limited statistical evidence for shared genetic effects of eQTLs and autoimmune-disease-associated loci in three major immune-cell types

doi:10.1038/NG.3795


Citation

Chun, Sung, Alexandra Casparino, Nikolaos A Patsopoulos, Damien C Croteau-Chonka, Benjamin A Raby, Philip L De Jager, Shamil R Sunyaev, and Chris Cotsapas. 2017. “Limited statistical evidence for shared genetic effects of eQTLs and autoimmune disease-associated loci in three major immune cell types.” Nature Genetics 49 (4): 600-605. doi:10.1038/ng.3795. http://dx.doi.org/10.1038/ng.3795.

Published Version

doi:10.1038/ng.3795

Permanent link

http://nrs.harvard.edu/urn-3:HUL.InstRepos:34375208

Terms of Use

This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA


Sung Chun (1,2,3), Alexandra Casparino (4), Nikolaos A Patsopoulos (3,5,6), Damien C Croteau-Chonka (2,7), Benjamin A Raby (2,7), Philip L De Jager (2,3,5,6), Shamil R Sunyaev (1,2,3,*), and Chris Cotsapas (3,4,8,*)

(1) Division of Genetics, Brigham and Women’s Hospital, Boston MA, USA

(2) Department of Medicine, Harvard School of Medicine, Boston MA, USA

(3) Broad Institute of Harvard and MIT, Cambridge MA, USA

(4) Department of Neurology, Yale School of Medicine, New Haven CT, USA

(5) Department of Neurology, Brigham and Women’s Hospital, Boston MA, USA

(6) Ann Romney Center for Neurological Diseases, Brigham and Women’s Hospital, Boston MA, USA

(7) Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston MA, USA

(8) Department of Genetics, Yale School of Medicine, New Haven CT, USA

Most autoimmune disease risk effects identified by genome-wide association studies (GWAS) localize to open chromatin with gene regulatory activity. GWAS loci are also enriched for expression quantitative trait loci (eQTLs), suggesting that most risk variants alter gene expression [1,2]. However, because causal variants are difficult to identify and cis-eQTLs occur frequently, it remains challenging to identify specific instances of disease-relevant changes to gene regulation. Here, we use a novel joint likelihood framework with higher resolution than previous methods to identify loci where autoimmune disease risk and an eQTL are driven by a single, shared genetic effect. Using eQTLs from three major immune subpopulations, we find shared effects in only ~25% of loci. Thus, we uncover a fraction of gene regulatory changes as strong mechanistic hypotheses for disease risk, but conclude that most risk mechanisms likely do not involve changes to basal gene expression.

HHS Public Access author manuscript. Nat Genet. Author manuscript; available in PMC 2017 August 20. Published in final edited form as: Nat Genet. 2017 April; 49(4): 600–605. doi:10.1038/ng.3795.

Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms

* Correspondence to: SRS (ssunyaev@rics.bwh.harvard.edu) and CC (cotsapas@broadinstitute.org).

Code availability

The current implementation of JLIM is available from the Cotsapas and Sunyaev labs: http://www.github.com/cotsapaslab/jlim and http://genetics.bwh.harvard.edu/wiki/sunyaevlab/jlim

Data availability

The publicly available 1000 Genomes genotype data were downloaded from: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/. The publicly available gEUVADIS LCL eQTL data were accessed via EBI ArrayExpress site under accession E-GEUV-1. Gene expression data for CD4+ T cell and CD14+ monocytes were accessed via NCBI Gene Expression Omnibus accession no. GSE56035. Immunochip GWAS summary statistics are available at http://www.immunobase.org.

Author Contributions

SC designed and performed research and authored the manuscript; AC performed research; NP contributed data and approved the manuscript; DCC contributed data and approved the manuscript; BR contributed data and approved the manuscript; PDJ contributed data and approved the manuscript; SRS designed and performed research and authored the manuscript; CC designed and performed research and authored the manuscript.

Competing Financial Interests statement

The authors declare no competing financial interests.

The autoimmune and inflammatory diseases (AID) are heritable, complex diseases where loss of tolerance to self-antigens results in either systemic or tissue-specific immune attack [3,4]. GWAS have identified hundreds of genomic regions mediating risk to several AID. These associations are primarily non-coding: lead GWAS SNPs are more likely to be associated with expression levels of neighboring genes than expected by chance [12,13], and the same lead SNPs are enriched in regulatory regions marked by chromatin accessibility and modification [1,14]. Fine-mapping reveals enrichment of AID-associated variants in enhancer elements active in stimulated T cell subpopulations [15], with heritability strongly enriched in such regulatory regions [16,17]. Collectively, these strands of evidence suggest that the majority of disease risk is mediated by changes to gene regulation in specific cell subpopulations. However, these bulk analyses do not formally assess whether expression levels and disease risk can be attributed to a single underlying variant or to independent effects in a locus [18,19].

Though several methods have been developed to assess these alternatives using eQTL data [20–23], they show limited resolution to detect cases where distinct disease and eQTL causal variants are in linkage disequilibrium. Here, we present an approach to test if a GWAS risk association and an eQTL are driven by the same underlying genetic effect, accounting for the LD between causal variants. Using data from ImmunoChip studies of seven AID comprising >180,000 samples in total (Supplementary Table 1), we test if associations in 272 known risk loci are consistent with cis-eQTL for genes in each region, measured in three relevant immune cell populations: lymphoblastoid cell lines (LCLs), CD4+ T cells and CD14+ monocytes [24,25].

When associations to two traits – here, disease trait and eQTL – are driven by the same underlying causal variant, the joint evidence of association should be maximized at the markers in tightest LD with the causal variant [19,26]. Here, we directly evaluate this joint likelihood (Supplementary Figure 1), unlike previous approaches that look for similarities in the shape of the association curve over multiple markers [20,21,27,28]. When the underlying causal effect is shared, joint likelihood is maximized when we model the same causal variant in both traits; conversely, when the underlying causal variants are different, we expect maximum joint likelihood when we model their closest proxies. We empirically derive the null distribution of the joint likelihood ratio statistic by comparing disease associations to permuted eQTL data (see Methods, Supplementary Figure 2 and Supplementary Notes). We thus directly evaluate whether two associations in the same locus, observed in different cohorts, are due to the same underlying effect.
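The permutation step described above can be illustrated with a minimal sketch. It is not the JLIM implementation: the statistic `toy_joint_stat` is a deliberately simple stand-in (the largest summed squared z-score at a single marker), and all function names, the z-score approximation, and the data layout are assumptions made for illustration. What the sketch does show is the general shape of the procedure: hold the disease statistics fixed, shuffle expression across individuals to break the genotype-expression link, and locate the observed statistic within the resulting null distribution.

```python
import math
import random


def assoc_z(dosages, phenotype):
    """Approximate association z-score: Pearson r scaled by sqrt(n)."""
    n = len(phenotype)
    mean_g = sum(dosages) / n
    mean_p = sum(phenotype) / n
    s_gp = sum((g - mean_g) * (p - mean_p) for g, p in zip(dosages, phenotype))
    s_gg = sum((g - mean_g) ** 2 for g in dosages)
    s_pp = sum((p - mean_p) ** 2 for p in phenotype)
    if s_gg == 0 or s_pp == 0:
        return 0.0  # monomorphic marker or constant phenotype
    return s_gp / math.sqrt(s_gg * s_pp) * math.sqrt(n)


def toy_joint_stat(z_gwas, z_eqtl):
    """Toy stand-in (NOT the JLIM statistic): best joint evidence at one marker."""
    return max(a * a + b * b for a, b in zip(z_gwas, z_eqtl))


def empirical_p(observed, null_stats):
    """Permutation p-value with the usual +1 correction."""
    exceed = sum(1 for s in null_stats if s >= observed)
    return (1 + exceed) / (1 + len(null_stats))


def permutation_test(z_gwas, snp_dosages, expression, n_perm=200, seed=0):
    """Compare the observed statistic with statistics from shuffled expression."""
    rng = random.Random(seed)
    observed = toy_joint_stat(
        z_gwas, [assoc_z(d, expression) for d in snp_dosages]
    )
    null_stats = []
    shuffled = list(expression)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the genotype-expression link
        z_perm = [assoc_z(d, shuffled) for d in snp_dosages]
        null_stats.append(toy_joint_stat(z_gwas, z_perm))
    return observed, empirical_p(observed, null_stats)
```

Here `snp_dosages` is assumed to be a list with one dosage vector per marker, in the same marker order as `z_gwas`. The +1 correction keeps the empirical p-value strictly positive, so its smallest attainable value is 1/(n_perm + 1).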

To assess the performance of our method, we benchmarked it against three recently reported methods: coloc [20], a well-calibrated Bayesian framework that considers spatial similarities in association data across sets of markers; gwas-pw [29], which extends this idea to hierarchical priors and optimizes model parameters; and HEIDI/SMR [22], which applies Mendelian randomization between traits. We simulated pairs of case-control cohorts with either the same or distinct causal variants driving association, and find that our approach shows the best overall performance (Supplementary Tables 2 and 3). When independent causal variants (i.e. not in LD) drive GWAS and eQTL associations, our own method, coloc and gwas-pw all had excellent performance. As the LD between the causal variants increases, our method shows the best performance, maintaining high resolution even when the underlying causal variants are in strong LD (AUC = 0.883 when 0.7 < r² < 0.8, Supplementary Figures 3 and 4), whereas the other methods show substantial false positive rates, reporting distinct effects as shared. We also found that our method is robust to within-continent levels of population structure (Supplementary Figures 5 and 6), and when limiting analysis to a subset of SNPs for computational efficiency (Supplementary Figure 7; coloc fares similarly, Supplementary Figure 8). Our method also performs well when multiple independent causal variants affect one or both traits (Supplementary Figures 9–11). In practice, our resolution becomes limited at high LD levels (r² > 0.8), where the false positive rate increases dramatically. We also have limited resolution when the eQTL effect is very weak (p > 0.01, Supplementary Figures 12–15). Thus, within these limits, we can accurately detect cases of shared genetic effects between two traits.
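For readers less familiar with the two quantities used in this benchmark, the sketch below shows one common way to compute them: r² as the squared Pearson correlation between the dosage vectors of two causal variants, and AUC as the probability that a simulation with a shared causal variant receives a higher score than one with distinct causal variants. The function names are hypothetical, and the assumption that a higher score indicates stronger evidence of sharing (for example, −log10 of a method's p-value) is ours, not the paper's.

```python
def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(dosages_a)
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    s_ab = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    s_aa = sum((a - mean_a) ** 2 for a in dosages_a)
    s_bb = sum((b - mean_b) ** 2 for b in dosages_b)
    if s_aa == 0 or s_bb == 0:
        return 0.0  # r2 is undefined for a monomorphic variant; report 0
    return (s_ab * s_ab) / (s_aa * s_bb)


def auc(shared_scores, distinct_scores):
    """Chance that a shared-effect simulation outscores a distinct-effect one.

    Ties count one half (Mann-Whitney formulation of the ROC area).
    """
    wins = 0.0
    for s in shared_scores:
        for d in distinct_scores:
            if s > d:
                wins += 1.0
            elif s == d:
                wins += 0.5
    return wins / (len(shared_scores) * len(distinct_scores))
```

In a benchmark of this kind, simulations with distinct causal variants would first be binned by `ld_r2` of the two causal variants (for example, 0.7 < r² < 0.8), and `auc` would then be evaluated within each bin.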

To dissect AID risk loci, we first identified densely genotyped ImmunoChip loci showing genome-wide significant association, excluding the Major Histocompatibility Locus due to the extensive LD structure in the region (immunobase.org; Table 1). We next identified genes in a 1Mb window centered on the most associated variant in each locus. Consistent with previous observations that eQTLs are frequently found in GWAS loci, we found that 260/272 loci had at least one gene with an eQTL (p < 0.01) in at least one cell type, with most such effects common across all three tissues (Table 1). We tested if any eQTLs in these loci appear driven by the same underlying effect as the disease associations. We find evidence for shared effects for only 77/5,749 pairs in 55/260 (21%) loci across all diseases, with the proportion varying from 4/34 (12%) for rheumatoid arthritis loci to 6/10 (60%) for ulcerative colitis loci (false discovery rate < 5%; Tables 1 and 2). Of these 77 shared effects, 45 pass even the more stringent family-wise multiple testing correction (Bonferroni corrected P < 0.05). Thus, our analysis reveals that in the majority of AID loci, variants causally involved in disease phenotypes do not overlap variants responsible for eQTL signals in the three broad cell populations we analyzed, which represent the major arms of the immune lineages. Overall, we find that >75% of tested disease-eQTL pairs appear associated to distinct genetic variants in the same locus (Figure 1).
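The difference between the two multiple-testing criteria mentioned above can be made concrete with a short sketch. The text does not state which false discovery rate procedure was used, so the Benjamini-Hochberg step-up rule below is an assumption chosen because it is the most common one; the function names are hypothetical.

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Family-wise control: reject only if p <= alpha / (number of tests)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg_reject(pvalues, fdr=0.05):
    """Benjamini-Hochberg step-up rule (assumed here, not confirmed by the text)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= k * fdr / m.
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * fdr / m:
            max_rank = rank
    # Reject every test whose rank is at or below that k.
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_rank:
            reject[i] = True
    return reject
```

With the three hypothetical p-values 0.01, 0.04 and 0.03, Bonferroni rejects only the first, while the step-up rule rejects all three. This is the same qualitative pattern as in the results above, where the Bonferroni-significant pairs form a subset of those passing the false discovery rate threshold.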

We sought to explain this lack of overlap between disease associations and eQTLs, despite their frequent co-occurrence in the same loci. In particular, although our method showed good performance in simulated data (Supplementary Figure 4), we remained concerned that this lack of overlap may be due to low statistical power in the eQTL data, which come from cohorts of limited sample size. However, we find that even amongst the most strongly supported eQTLs (nominal p < 10⁻⁵), <25% show evidence of shared effects with disease associations. Conversely, we find strong evidence for distinct effects for the majority of disease-eQTL pairs, with only a subset of comparisons being ambiguous, suggesting that our method is adequately powered to detect shared effects where they exist (Figure 1a and Supplementary Figures 16–18). To assess whether power affects the total number of loci, rather than eQTL, that can be resolved, we looked more deeply at our significance threshold settings. We find that more liberal thresholds do not increase the number of true positive results after adjusting for false positive rate, indicating that most loci do not contain any gene with an eQTL consistent with the disease association (Figure 1b and Supplementary Figure 19). Cumulatively, our results demonstrate that only a minority of AID risk effects drive eQTLs in the three cell populations we tested, which are drawn from diverse lineages of the immune system.
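One simple way to reason about the threshold analysis above is to subtract the number of false positives expected by chance from the number of tests passing a given threshold. The sketch below does this under the conservative assumption that every test is null, so that the expected number of false positives at threshold t among m tests is t × m. This is an illustrative estimator of our own choosing, with a hypothetical function name; the paper's actual adjustment is described in its Methods and may differ.

```python
def estimated_true_positives(pvalues, threshold):
    """Tests passing the threshold minus those expected to pass by chance.

    Assumes (conservatively) that all tests are null, so the expected number
    of false positives at this threshold is threshold * number_of_tests.
    """
    hits = sum(1 for p in pvalues if p <= threshold)
    expected_false = threshold * len(pvalues)
    return hits - expected_false
```

If this estimate stays flat or falls as the threshold is relaxed, the additional hits are roughly what chance alone would produce, which matches the interpretation given above that a more liberal threshold would not uncover many more shared effects.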

We next focused on the subset of 77 disease/eQTL pairs in 55 loci where we could detect strong evidence of a shared effect (Table 2). We find that 59/77 (77%) of effects are restricted to one cell population, indicating that tissue-specific eQTLs are important components of the molecular underpinnings of disease (Supplementary Figures 20 and 21). The remaining 18 effects are detected in multiple cell populations; for example, the multiple sclerosis association at rs10783847 on chromosome 12 is consistent with eQTLs for the transcript of methyltransferase-like 21B (METTL21B) in both CD4+ T cells and CD14+ monocytes, but not for the remaining 31 genes in the immediate locus (Figure 2). Although METTL21B is expressed in LCLs, there is no evidence of an eQTL in this tissue within 1Mb from rs10783847. Similarly, for the multiple sclerosis association at rs1966115 on chromosome 8 and eQTLs for ZC2HC1A, and for the inflammatory bowel disease association at rs55770741 on chromosome 5 and eQTLs for ERAP2, we detect a shared effect in all three cell populations. In several cases we find tissue-specific shared effects despite strong eQTLs for the same gene in other tissues: for ZFP90 and ulcerative colitis risk at rs889561 on chromosome 16, we find shared effects in CD4+ and CD14+ cells but not in LCLs, where we observe a ZFP90 eQTL at p = 0.005 that has a low likelihood of shared effect with GWAS (joint likelihood P = 0.85). Instead, we find evidence of sharing between disease risk and an eQTL for NFAT5 in LCLs. Thus, despite the presence of eQTLs for a gene in multiple tissues, not all these effects are consistent with disease associations, suggesting that disease-relevant eQTLs are tissue specific.
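The tally of tissue-restricted versus multi-tissue effects amounts to grouping shared effects by disease and gene, then counting the cell populations in which each is detected. The sketch below illustrates this using only the five examples named in the paragraph above. It is not the full set of 77 pairs from Table 2, so its counts do not reproduce the 59/77 figure. The record layout and function names are assumptions.

```python
from collections import defaultdict

# (disease, gene, cell population) triples, limited to examples named in the text.
EXAMPLE_SHARED_EFFECTS = [
    ("multiple sclerosis", "METTL21B", "CD4"),
    ("multiple sclerosis", "METTL21B", "CD14"),
    ("multiple sclerosis", "ZC2HC1A", "LCL"),
    ("multiple sclerosis", "ZC2HC1A", "CD4"),
    ("multiple sclerosis", "ZC2HC1A", "CD14"),
    ("inflammatory bowel disease", "ERAP2", "LCL"),
    ("inflammatory bowel disease", "ERAP2", "CD4"),
    ("inflammatory bowel disease", "ERAP2", "CD14"),
    ("ulcerative colitis", "ZFP90", "CD4"),
    ("ulcerative colitis", "ZFP90", "CD14"),
    ("ulcerative colitis", "NFAT5", "LCL"),
]


def cell_types_per_pair(records):
    """Map each (disease, gene) pair to the set of cell populations it is seen in."""
    grouped = defaultdict(set)
    for disease, gene, cell_type in records:
        grouped[(disease, gene)].add(cell_type)
    return dict(grouped)


def count_tissue_restricted(records):
    """Number of (disease, gene) pairs detected in exactly one cell population."""
    grouped = cell_types_per_pair(records)
    return sum(1 for cells in grouped.values() if len(cells) == 1)
```

Applied to the complete list of shared effects, the same grouping would give the split between effects restricted to one cell population and effects detected in several.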

We also find cases where an eQTL is consistent with associations to multiple diseases. The ankyrin repeat domain 55 (ANKRD55) transcript encoded on chromosome 5 has an eQTL in CD4+ T cells that is shared with associations to multiple sclerosis, Crohn disease and rheumatoid arthritis (Figure 3, all observations are significant after Bonferroni correction). We also find weaker evidence for shared effects between all three diseases and an eQTL for interleukin 6 signal transducer (IL6ST) in CD4+ T cells, which passes the false discovery rate threshold but not Bonferroni correction (Supplementary Figure 22). Similarly, a CD4+ eQTL for ELMO1 on chromosome 7 is consistent with associations to both celiac disease and multiple sclerosis (Supplementary Figure 23), a CD14+ eQTL for RGS1 on chromosome 1 is consistent with associations to both celiac disease and multiple sclerosis (Supplementary Figure 24), and three other eQTLs are consistent with associations in multiple diseases (Supplementary Figures 25–27). In all cases, these are the only genome-wide significant disease associations reported in these loci. As we consider each disease association independently, these results indicate that the same underlying risk variants drive risk to multiple diseases in these loci by altering gene expression, consistent with observations of shared effects across diseases [7].
