Large-scale association analysis identifies new risk loci for coronary artery disease

Panos Deloukas, +204 more
- 01 Jan 2013 - 
- Vol. 45, Iss: 1, pp 25-33
Abstract
Coronary artery disease (CAD) is the commonest cause of death. Here, we report an association analysis in 63,746 CAD cases and 130,681 controls identifying 15 loci reaching genome-wide significance, taking the number of susceptibility loci for CAD to 46, and a further 104 independent variants (r² < 0.2) strongly associated with CAD at a 5% false discovery rate (FDR). Together, these variants explain approximately 10.6% of CAD heritability. Of the 46 genome-wide significant lead SNPs, 12 show a significant association with a lipid trait, and 5 show a significant association with blood pressure, but none is significantly associated with diabetes. Network analysis with 233 candidate genes (loci at 10% FDR) generated 5 interaction networks comprising 85% of these putative genes involved in CAD. The four most significant pathways mapping to these networks are linked to lipid metabolism and inflammation, underscoring the causal role of these activities in the genetic etiology of CAD. Our study provides insights into the genetic basis of CAD and identifies key biological pathways.

Nature GeNetics VOLUME 45 | NUMBER 1 | JANUARY 2013 25
Coronary artery disease and its main complication, myocardial infarction, are the leading cause of death worldwide. Although epidemiological studies have identified many risk factors for CAD, including plasma lipid concentrations, blood pressure, smoking, diabetes and markers of inflammation, a causal role has been proven only for some (for example, low-density lipoprotein (LDL) cholesterol and blood pressure), primarily through randomized clinical trials of drug therapy directed at the risk factor (ref. 1). Twin and family studies have documented that a significant proportion (40–50%) of susceptibility to CAD is heritable (for a review, see ref. 2). Because genotypes are not confounded by environmental exposures, genetic analysis has the potential to define which risk factors are indeed causal and to identify pathways and therapeutic targets (refs. 3,4). To date, genome-wide association studies (GWAS) have collectively reported a total of 31 loci associated with CAD risk at genome-wide significance (P < 5 × 10⁻⁸) (refs. 5–13). However, variants at these loci explain less than 10% of the heritability of CAD. One likely reason for this is that, given the polygenic nature of complex traits and the relatively small observed effect sizes of the loci identified, many genuinely associated variants do not reach the stringent P-value threshold for genome-wide significance. Indeed, there is increasing evidence that the genetic architecture of common traits involves a large number of causative alleles with very small effects (ref. 14). Addressing this will require the discovery of additional loci while leveraging large-scale genomic data to identify the molecular pathways underlying the pathogenesis of CAD. Such discovery is facilitated by building molecular networks, on the basis of DNA, RNA and protein interactions, which have nodes of known biological function that also show evidence of association with risk variants for CAD and related metabolic traits.
In the largest GWAS meta-analysis of CAD undertaken to date by the Coronary ARtery DIsease Genome-wide Replication and Meta-analysis (CARDIoGRAM) Consortium (ref. 5), which involved 22,233 cases and 64,762 controls, in addition to loci reported at genome-wide significance, a linkage disequilibrium (LD)-pruned set of 6,222 variants achieved a nominal association P value of less than 0.01. Here, we test these 6,222 SNPs in a meta-analysis of over 190,000 individuals, with the primary aim of identifying additional susceptibility loci for CAD. To this end, we used the Metabochip array (ref. 15), which is a custom iSELECT chip (Illumina) containing 196,725 SNPs, designed to (i) follow up putative associations in several cardiometabolic traits, including CAD, and (ii) fine map confirmed loci for these traits. All SNPs on the array with data in the CARDIoGRAM study were considered for analysis (79,138 SNPs, of which 6,222 were the replication SNPs and 20,876 were fine-mapping SNPs in the 22 CAD susceptibility loci identified at the time at which the array was designed; the remaining SNPs were submitted by the other consortia contributing to the Metabochip array (ref. 15)). In addition, we assess whether the genome-wide significant CAD risk alleles act through traditional risk factors by considering the available large GWAS for these traits (refs. 16–20). Finally, we identify a broader set of SNPs passing a conservative FDR threshold for association with CAD and use this set to undertake network analysis to find key biological pathways underlying the pathogenesis of CAD.
The CARDIoGRAMplusC4D Consortium (a full list of authors and affiliations appears at the end of the paper).
Received 24 April; accepted 2 November; published online 2 December 2012; doi:10.1038/ng.2480
RESULTS
Study design
We expanded the CARDIoGRAM discovery data set (22,233 cases and 64,762 controls (ref. 5), stage 1) with 34 additional CAD sample collections (stage 2) of European or south Asian descent comprising 41,513 cases and 65,919 controls (study descriptions and sample characteristics are given in Supplementary Tables 1a and 2a, respectively) and undertook a 2-stage meta-analysis to test SNPs on the Metabochip array for disease association in a total of 63,746 cases and 130,681 controls. A further set of 3,630 cases and 11,983 controls from 4 independent studies was used for replication of SNPs that reached 5 × 10⁻⁸ < P < 1 × 10⁻⁶ in combined stage 1 and 2 analysis (stage 3; Supplementary Tables 1b and 2b). An overview of the study design is provided in Supplementary Figure 1. Cases were selected for inclusion following the standard criteria for CAD and myocardial infarction used in the CARDIoGRAM study (ref. 5) (details for the stage 2 and 3 cohorts are given in Supplementary Table 2). Collections were either typed with the Metabochip array (60% of samples) or provided GWAS data imputed using HapMap (Supplementary Table 3). We applied standard quality control criteria to each study and corrected for population stratification at a λGC threshold of 1.05 (λGC was estimated for samples typed on the Metabochip using 4,310 SNPs associated with long QT syndrome and located at least 5 Mb away from established CAD risk loci; Online Methods). Case-control association analyses were adjusted for sex and age. For the 79,138 SNPs on the Metabochip with both stage 1 and 2 data, we combined (2-sided) P values from stage 1 with their respective (1-sided) P values for stage 2 using Fisher's method (Online Methods). In stage 3, we validated SNPs at 5 × 10⁻⁸ < P < 1 × 10⁻⁶ and combined evidence across all stages (1–3) using a sample size–weighted meta-analysis.
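As a rough illustration of the stage 1 and stage 2 combination described above, the sketch below applies Fisher's method in plain Python. It is a minimal sketch under stated assumptions, not the consortium's pipeline (the details are in the Online Methods, which are not reproduced here): the function names `one_sided_p` and `fisher_combine` are hypothetical, and treating the stage 2 P value printed in Table 2 as the 1-sided value is an assumption.

```python
import math


def one_sided_p(z_stage2: float, stage1_sign: int) -> float:
    """1-sided P for stage 2, testing the effect direction seen in stage 1.

    z_stage2 is the stage 2 z score; stage1_sign is +1 or -1 (hypothetical inputs).
    """
    signed_z = z_stage2 * stage1_sign
    # Upper-tail probability of a standard normal, P(Z >= signed_z).
    return 0.5 * math.erfc(signed_z / math.sqrt(2.0))


def fisher_combine(p_values) -> float:
    """Fisher's method: X = -2 * sum(ln p) follows chi-square with 2k d.f."""
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2
    # Closed-form chi-square survival function for an even number of d.f.:
    # P(chi2_2k >= X) = exp(-X/2) * sum_{i<k} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Stage 1 and stage 2 P values for rs4845625 (IL6R) as printed in Table 2.
# Assumption: the printed stage 2 value is already the 1-sided P.
combined = fisher_combine([4.84e-5, 3.46e-5])
print(f"combined P = {combined:.2e}")
```

Under that assumption, the combined value comes out close to the 3.55 × 10⁻⁸ listed for this SNP in the combined stage 1 and 2 column of Table 2.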
Genome-wide significant loci
We first examined the 30 CAD risk loci previously reported in individuals of European ancestry at genome-wide significance (the ADTRP (C6orf105) locus has been reported only in Chinese) (ref. 12) in the stage 2 samples. For the 26 loci in which we could test the known lead SNP or a suitable proxy (r² > 0.8), we found highly significant associations in the stage 2 samples (Table 1). Notably, in four of these loci (CDKN2B-AS1, COL4A2, CXCL12 and APOE), we detected additional SNPs not in LD (r² < 0.5) with the lead SNP, which also reached genome-wide significance and were conditionally independent when analyzed with GCTA software (ref. 21). The additional SNP in the APOE locus, rs445925 (P = 9.42 × 10⁻¹¹; r² = 0.015 with rs207560 in 1000 Genomes Project data), is located near APOC1, a gene previously suggested to confer risk for CAD (ref. 22). The r² value between rs445925 (P = 9.42 × 10⁻¹¹; n = 31 studies) and rs7412 (P = 8.86 × 10⁻⁴; n = 21 studies), which tags the APOE ε2 allele, is 0.588. The LIPA locus also harbors a strong independent signal, which, however, did not reach genome-wide significance. Findings for the strongest associated variant available on the Metabochip for the other four loci (MIA3, 7q22, ZNF259-APOA5-APOA1 and ADAMTS7) for which we did not have a good proxy for the previously reported lead SNP are also given (Table 1). Notably, for ADAMTS7, rs7173743 (r² = 0.38 with rs3825807, the published lead SNP) also achieved genome-wide significance.
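The r² values quoted above measure LD between pairs of SNPs. As a minimal sketch of the idea, the code below computes r² as the squared Pearson correlation between two vectors of allele counts (0, 1 or 2 copies per individual). This genotype-based estimate is a common approximation and is an assumption here; the paper's values come from reference panels such as the 1000 Genomes Project and may have been computed from phased haplotypes. The function names and the toy genotypes are hypothetical.

```python
def ld_r2(genotypes_a, genotypes_b) -> float:
    """Squared Pearson correlation between two SNPs' allele counts (0, 1, 2)."""
    n = len(genotypes_a)
    if n == 0 or n != len(genotypes_b):
        raise ValueError("genotype vectors must be non-empty and of equal length")
    mean_a = sum(genotypes_a) / n
    mean_b = sum(genotypes_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(genotypes_a, genotypes_b)) / n
    var_a = sum((a - mean_a) ** 2 for a in genotypes_a) / n
    var_b = sum((b - mean_b) ** 2 for b in genotypes_b) / n
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


def is_proxy(r2: float, threshold: float = 0.8) -> bool:
    """True if a SNP qualifies as a proxy under the r2 > 0.8 rule used in the text."""
    return r2 > threshold


# Toy allele counts for six hypothetical individuals (illustrative only).
snp_a = [0, 1, 2, 1, 0, 2]
snp_b = [0, 1, 2, 2, 0, 1]
print(ld_r2(snp_a, snp_b), is_proxy(ld_r2(snp_a, snp_b)))
```

With this rule, an r² of 0.38 (as for rs7173743 and rs3825807) would not count as a proxy, which is why that SNP is reported as a separate signal in Table 1.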
We next examined the association of the 6,222 SNPs with P < 0.01 in CARDIoGRAM (we excluded SNPs in all loci listed in Table 1). Distribution of the absolute z scores for these SNPs in the stage 2 samples showed strong enrichment in positive scores, corresponding to SNPs with directionally consistent signals between stages 1 and 2, relative to the null distribution, which is defined by mean = 0 and s.d. = 1 (4,260 SNPs observed versus 3,111 SNPs expected; binomial 2-sided P = 7.5 × 10⁻¹⁸⁷) (Supplementary Fig. 2). In total, 19 loci showed association at P < 1 × 10⁻⁶ in the combined stage 1 and 2 analysis, with 13 of them reaching genome-wide significance, namely IL6R, APOB, VAMP5-VAMP8-GGCX, SLC22A4-SLC22A5, ZEB2-AC074093.1, GUCY1A3, KCNK5, LPL, PLG, TRIB1, ABCG5-ABCG8, FURIN-FES and FLT1 (Table 2; forest and regional association plots are given in Supplementary Figs. 3 and 4, respectively). The 6 loci with associations not reaching P < 5 × 10⁻⁸ were further validated (stage 3) in 4 independent studies (3,630 cases and 11,983 controls; Supplementary Table 1b). Two loci, EDNRA and HDAC9, replicated at P < 0.05 and reached genome-wide significance in a combined analysis of stages 1–3 (Table 2); findings for those SNPs not meeting the above criteria are shown in Supplementary Table 4.
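The combined analysis of stages 1–3 uses a sample size–weighted meta-analysis. A minimal sketch of one common form of that technique (a weighted sum of z scores, with weights equal to the square root of the sample size) follows. It is not the consortium's code: the function names are hypothetical, the effective sample size formula 4 / (1/cases + 1/controls) is a widely used convention assumed here, and the z scores in the usage lines are made up for illustration.

```python
import math
from statistics import NormalDist


def effective_n(cases: int, controls: int) -> float:
    """Effective sample size of a case-control study (assumed convention)."""
    return 4.0 / (1.0 / cases + 1.0 / controls)


def p_to_signed_z(p_two_sided: float, effect_sign: int) -> float:
    """Convert a 2-sided P value and an effect direction (+1/-1) to a z score."""
    # inv_cdf(p / 2) is the lower-tail quantile (negative), so flip its sign.
    z = -NormalDist().inv_cdf(p_two_sided / 2.0)
    return z * effect_sign


def weighted_z_meta(z_scores, sample_sizes):
    """Sample size-weighted z: Z = sum(w_i * z_i) / sqrt(sum(w_i^2)), w_i = sqrt(N_i)."""
    weights = [math.sqrt(n) for n in sample_sizes]
    z = sum(w * zi for w, zi in zip(weights, z_scores)) / math.sqrt(
        sum(w * w for w in weights)
    )
    p = math.erfc(abs(z) / math.sqrt(2.0))  # 2-sided P value
    return z, p


# Stage sizes are the case/control counts quoted in the text; z scores are hypothetical.
sizes = [effective_n(22233, 64762), effective_n(41513, 65919), effective_n(3630, 11983)]
z, p = weighted_z_meta([4.0, 4.1, 2.0], sizes)
print(f"combined z = {z:.2f}, 2-sided P = {p:.2e}")
```

Weighting by the square root of the sample size gives larger stages more influence without requiring effect sizes to be on the same scale, which is one reason this form is often chosen when cohorts were genotyped on different platforms.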
Of the newly associated loci reaching genome-wide significance, TRIB1 and ABCG5-ABCG8 were recently reported to reach study-wide significance (P < 3 × 10⁻⁶) in a large candidate gene (IBC array) study of CAD (ref. 13). The same study reported rs2706399 in the IL5 locus, which is located 200,349 bp away from the SNP we detected in the SLC22A4-SLC22A5 locus (rs273909; Table 2). Although located in the same recombination interval, these SNPs are not in LD (r² = 0.02), and conditional analysis in a subset of 85,136 samples (up to 19,200 cases) suggested that the 2 signals are conditionally independent; when conditioning on rs2706399 (IL5 locus), the P value for rs273909 (SLC22A4 locus) was 5.54 × 10⁻³ (1.33 × 10⁻³ initially), whereas the converse conditioning gave a P value of 3.34 × 10⁻² for rs2706399 (IL5; 7.55 × 10⁻³ initially). We also detected a second signal in the FES locus (rs2521501; P = 1.31 × 10⁻⁹); conditional analysis with rs17514846 and rs2521501 (r² = 0.43 in 1000 Genomes Project data) showed the two signals not only to be independent but also to increase in strength upon conditioning (rs17514846 associated at P = 1.07 × 10⁻²⁵ when conditioned on rs2521501; conversely, the P value for rs2521501 was 9.24 × 10⁻²⁶).
Subgroup analyses
Genetic risk of CAD could vary by age and gender and could also specifically influence the risk of its main adverse outcome, myocardial infarction (ref. 23). We therefore undertook exploratory association analyses in subgroups partitioned by either gender, age at event (with individuals of <50 years of age being defined as young cases) or history of myocardial infarction (Online Methods). For the 46 genome-wide significant CAD risk loci, we observed no trend for higher odds ratios (ORs) in any of the subgroup analyses (Supplementary Table 5). However, one new locus reached genome-wide significance in males and in young CAD cases (rs16986953; P = 1.89 × 10⁻⁸ and 1.67 × 10⁻⁸, respectively), which is located in a gene desert (with nearest transcript AK097927), 1.3 Mb away from the APOB gene. Interaction analysis conducted in a subset of studies (n = 12) where we had individual-level data provided suggestive evidence of an association with age (P = 0.033) but not with sex (P = 0.708); further studies are required to confirm this finding.
Wider Metabochip content
In addition to SNPs provided by the CARDIoGRAM Consortium,
the Metabochip array contains a further 113,248 SNPs submitted for
a range of cardiometabolic traits (ref. 15) other than CAD itself (associated at
P > 0.01 with CAD in CARDIoGRAM samples or not tested). For these
SNPs, we did not detect any new locus reaching genome-wide signi-
ficance in our data set (including stage 1 and 3 data, when available).
In total, therefore, we discovered 15 newly associated loci at genome-wide
significance, increasing the total number of genome-wide significant
loci to 45 in individuals of European and south Asian ancestry.
Localizing candidate CAD genes
To identify potential causal CAD-associated genes at the 15 new sus-
ceptibility loci identified in our study, we first analyzed genome-wide
expression quantitative trait locus (eQTL) data in multiple tissues
(circulating monocytes, liver, fat, skin, omentum, aortic media and
adventitia, mammary artery and lymphoblastoid cell lines (LCLs)).
We found that the lead SNP or a proxy in high LD (r² threshold of 0.8) in three
of the new loci was associated in cis with variable expression levels of
the GGCX-VAMP8, PLG and FES genes (Supplementary Table 6).
We then assessed allele-specific expression data in monocytes, fibroblasts and LCLs and found three loci where the lead SNP was associated with an imbalance in expression of either LPL, GGCX or FES; IL6R showed some evidence of allele-specific expression in the fibroblast sample (Supplementary Table 6). Finally, we examined the new CAD risk loci for genes with relevant disease trait associations in mouse knockout models; six loci harbor a gene for which a mouse knockout model has a relevant cardiovascular phenotype, namely ABCG8, APOB, GUCY1A3, PLG, LPL and FES (Supplementary Table 7). PLG is adjacent to LPA, and, although the PLG risk variant rs4252120[T] was strongly associated with elevated Lp(a) lipoprotein levels (P = 5 × 10⁻²⁴) in 3,698 PROCARDIS cases, it was associated with CAD independently of the LPA-linked variant at rs3798220. A detailed discussion of the genes in each locus is provided in the Supplementary Note. Of the 30 previously reported CAD susceptibility loci in individuals of European and south Asian ancestry, mouse knockout models for the candidate genes PEMT, APOE, LDLR, COL4A1, LIPA, APOA1-APOA5, PPAP2B and PCSK9 also show phenotypic characteristics directly relevant to disease (Supplementary Table 7). In total, approximately a third of the 45 CAD loci contain a known functionally relevant candidate gene.
Table 1 Association findings for known CAD susceptibility loci
Known loci (a) | Published lead SNP or proxy | New SNP (r² with lead SNP) | Chr. | Effect/non-effect allele (frequency) | Stage 2 OR | Stage 2 P | Combined P | Combined OR
SORT1 (b) | rs602633 (tagging rs599839; r² = 1.00) |  | 1 | C/A (0.77) | 1.13 | 2.19 × 10⁻¹⁸ | 1.47 × 10⁻²⁵ | 1.12
PCSK9 | rs11206510 |  | 1 | T/C (0.84) | 1.04 | 5.09 × 10⁻³ | 1.79 × 10⁻⁵ | 1.06
WDR12 | rs6725887 |  | 2 | C/T (0.11) | 1.10 | 5.29 × 10⁻⁸ | 1.16 × 10⁻¹⁵ | 1.12
MRAS | rs9818870 |  | 3 | T/C (0.14) | 1.05 | 1.83 × 10⁻³ | 2.62 × 10⁻⁹ | 1.07
TCF21 | rs12190287 |  | 6 | C/G (0.59) | 1.04 | 6.48 × 10⁻⁴ | 4.94 × 10⁻¹³ | 1.07
SLC22A3-LPAL2-LPA | rs3798220 |  | 6 | C/T (0.01) | 1.28 | 4.90 × 10⁻⁵ | N/A | N/A
 |  | rs2048327 (0.03) | 6 | G/A (0.35) | 1.05 | 1.09 × 10⁻⁵ | 6.86 × 10⁻¹¹ | 1.06
ZC3HC1 | rs11556924 |  | 7 | C/T (0.65) | 1.08 | 1.45 × 10⁻⁹ | 6.74 × 10⁻¹⁷ | 1.09
CDKN2BAS1 | rs1333049 |  | 9 | C/G (0.47) | 1.21 | 1.08 × 10⁻³⁴ | 1.39 × 10⁻⁵² | 1.23
 |  | rs3217992 (0.50) | 9 | A/G (0.38) | 1.14 | 7.27 × 10⁻³² | 7.75 × 10⁻⁵⁷ | 1.16
ABO | rs579459 |  | 9 | C/T (0.21) | 1.04 | 2.13 × 10⁻² | 2.66 × 10⁻⁸ | 1.07
CYP17A1-CNNM2-NT5C2 | rs12413409 |  | 10 | G/A (0.89) | 1.08 | 4.12 × 10⁻³ | 6.26 × 10⁻⁸ | 1.10
KIAA1462 | rs2505083 |  | 10 | C/T (0.42) | 1.06 | 2.82 × 10⁻⁷ | 1.35 × 10⁻¹¹ | 1.06
PDGFD | rs974819 |  | 11 | A/G (0.29) | 1.08 | 2.03 × 10⁻⁹ | 3.55 × 10⁻¹¹ | 1.07
SH2B3 | rs3184504 |  | 12 | T/C (0.40) | 1.07 | 6.13 × 10⁻⁷ | 5.44 × 10⁻¹¹ | 1.07
COL4A1-COL4A2 | rs4773144 |  | 13 | G/A (0.42) | 1.06 | 2.34 × 10⁻⁶ | 1.43 × 10⁻¹¹ | 1.07
 |  | rs9515203 (0.01) | 13 | T/C (0.74) | 1.08 | 1.13 × 10⁻⁸ | 5.85 × 10⁻¹² | 1.08
HHIPL1 | rs2895811 |  | 14 | C/T (0.43) | 1.04 | 1.18 × 10⁻⁴ | 4.08 × 10⁻¹⁰ | 1.06
RAI1-PEMT-RASD1 | rs12936587 |  | 17 | G/A (0.59) | 1.04 | 2.06 × 10⁻⁴ | 1.24 × 10⁻⁹ | 1.06
LDLR | rs1122608 |  | 19 | G/T (0.76) | 1.06 | 3.72 × 10⁻⁶ | 6.33 × 10⁻¹⁴ | 1.10
Gene desert (KCNE2) | rs9982601 |  | 21 | T/C (0.13) | 1.10 | 8.69 × 10⁻⁹ | 7.67 × 10⁻¹⁷ | 1.13
PPAP2B | rs17114036 |  | 1 | A/G (0.91) | 1.09 | 2.68 × 10⁻⁵ | 5.80 × 10⁻¹² | 1.11
ANKS1A | rs12205331 (tagging rs17609940; r² = 0.85) |  | 6 | C/T (0.81) | 1.01 | 4.36 × 10⁻¹ | 4.18 × 10⁻⁵ | 1.04
PHACTR1 | rs9369640 (tagging rs12526453; r² = 0.90) |  | 6 | A/C (0.65) | 1.09 | 1.11 × 10⁻¹² | 7.53 × 10⁻²² | 1.09
CXCL12 | rs501120 |  | 10 | A/G (0.83) | 1.06 | 7.13 × 10⁻⁵ | 1.79 × 10⁻⁹ | 1.07
 |  | rs2047009 (0.05) | 10 | C/A (0.48) | 1.05 | 9.66 × 10⁻⁶ | 1.59 × 10⁻⁹ | 1.05
LIPA | rs2246833 (tagging rs1412444; r² = 0.98) |  | 10 | T/C (0.38) | 1.04 | 2.76 × 10⁻² | 9.49 × 10⁻⁶ | 1.06
 |  | rs11203042 (0.39) | 10 | T/C (0.44) | 1.03 | 9.86 × 10⁻³ | 6.08 × 10⁻⁶ | 1.04
UBE2Z | rs15563 (tagging rs46522; r² = 0.93) |  | 17 | C/T (0.52) | 1.01 | 2.44 × 10⁻¹ | 9.37 × 10⁻⁶ | 1.04
SMG6 | rs2281727 (tagging rs216172; r² = 0.96) |  | 17 | C/T (0.36) | 1.04 | 8.46 × 10⁻⁴ | 7.83 × 10⁻⁹ | 1.05
ApoE-ApoC1 | rs2075650 |  | 19 | G/A (0.14) | 1.11 | 5.86 × 10⁻¹¹ | N/A | N/A
 |  | rs445925 (0.03) | 19 | C/T (0.90) | 1.13 | 8.76 × 10⁻⁹ | N/A | N/A
MIA3 | N/A | rs17464857 (0.18) | 1 | T/G (0.87) | 1.02 | 1.56 × 10⁻¹ | 6.06 × 10⁻⁵ | 1.05
7q22 | N/A | rs12539895 (0.64) | 7 | A/C (0.19) | 1.02 | 4.00 × 10⁻² | 5.33 × 10⁻⁴ | 1.08
ZNF259-APOA5-APOA1 | N/A | rs9326246 (0.63) | 11 | C/G (0.10) | 1.04 | 2.90 × 10⁻² | 1.51 × 10⁻⁷ | 1.09
ADAMTS7 | N/A | rs7173743 (0.38) | 15 | T/C (0.58) | 1.06 | 2.46 × 10⁻⁷ | 6.74 × 10⁻¹³ | 1.07
Chr., chromosome.
(a) Locus C6orf105 has been reported only in Chinese and has no good proxy SNP (Utah residents of Northern and Western European ancestry (CEU) or Han Chinese in Beijing, China (CHB)) on the Metabochip. The best available proxy is rs9348953 (r² = 0.01), with combined P = 2.81 × 10⁻³.
(b) rs12740374, which was reported as a functional variant in this locus and has r² = 0.895 with rs599839, has combined P = 8.25 × 10⁻¹⁸ (OR = 1.135) based on the random-effects model used (P in stage 2 alone was 6.48 × 10⁻²¹ under the fixed-effect model).
Overlap with traditional risk factors
We assessed both the known and new CAD susceptibility loci for overlap of associations with a number of relevant traits for which summary statistics have been made available: lipid levels (GLGC) (ref. 16), blood pressure (ICBPG) (ref. 17), diabetes (DIAGRAM) (ref. 18), glucometabolic traits (fasting insulin and fasting glucose concentrations, HOMA-B (homeostatic model assessment-β score) and HOMA-IR (insulin resistance); MAGIC) (ref. 19) and anthropometric traits (GIANT) (refs. 20,24).
After applying a Bonferroni correction for the 51 independent CAD-associated alleles tested (44 loci; no data available for rs16986953 and rs2521501), 12 loci showed evidence of association (P < 1 × 10⁻⁴) between the lead CAD risk SNP and 1 or more plasma lipid traits (total cholesterol, LDL cholesterol, high-density lipoprotein (HDL) cholesterol and triglyceride concentration) in the expected direction (the CAD risk allele was associated with higher total cholesterol, LDL cholesterol and triglyceride concentrations and lower HDL cholesterol concentration). These lead SNPs were most strongly associated with LDL cholesterol concentration at eight loci (APOB, ABCG5-ABCG8, PCSK9, SORT1, ABO, LDLR, APOE and LPA), with triglyceride concentration at two loci (TRIB1 and the APOA5 cluster) and with HDL cholesterol concentration at one locus (ANKS1A). There was near-equivalent association for triglyceride and HDL cholesterol concentrations at one locus (LPL). All loci except LPA and ANKS1A showed genome-wide significance for association with a lipid trait. These results underscore the importance of LDL cholesterol as a causal CAD risk factor (Supplementary Table 8). At the SH2B3 locus, the CAD risk allele for rs3184504 was associated with lower concentrations of both LDL cholesterol (P = 1.73 × 10⁻⁹) and HDL cholesterol (P = 4.97 × 10⁻⁶); one likely explanation is the presence of independent variants for CAD and LDL cholesterol. Two known CAD risk loci (CYP17A1-NT5C2 and SH2B3) and two of the new CAD susceptibility loci (GUCY1A3 and FES) have previously been associated with systolic (SBP) and diastolic (DBP) blood pressure (ref. 17). Significant evidence for association with DBP was also observed for ZC3HC1 (Supplementary Table 8). In contrast to the results for lipid concentration and blood pressure, there was no significant association of any of the loci tested with type 2 diabetes (T2D). Consistent with this observation, none of the assessed glucometabolic traits (fasting insulin and fasting glucose concentrations, HOMA-B and HOMA-IR) were related to these CAD variants (at the ANKS1A locus, it was not the CAD risk SNP that was associated with fasting insulin concentration and HOMA-IR). Suggestive associations (P < 1 × 10⁻⁴) with body mass index (BMI) and waist-hip ratio were observed in the CYP17A1-CNNM2-NT5C2 and RAI1-PEMT-RASD1 loci, respectively.
Additional suggestive associations
The genome-wide significance threshold we used, P < 5 × 10⁻⁸, is the accepted criterion for reporting individual association signals, as for each experiment it controls the error rate among common variants to less than 5%. However, SNPs showing suggestive association with a phenotype but not meeting this genome-wide threshold are likely to include additional true positive signals in well-powered studies (Supplementary Fig. 1). Such SNPs may also be informative in predicting CAD risk and in constructing CAD-associated biological networks. To identify such variants, we undertook an FDR analysis to assess the proportion of false positive signals in a set of (nominally) significant SNPs (ref. 25). The Metabochip array contains both SNPs with priors in terms of association to CAD (CARDIoGRAM study P < 0.01) and blocks of highly correlated SNPs in fine-mapping regions. Therefore, to normalize the distribution of SNPs considered for FDR analysis, we (i) removed all SNPs in the CAD fine-mapping regions and LD-pruned (r² < 0.2) SNPs in the non-CAD fine-mapping regions and (ii) adjusted the combined P values of all SNPs with priors in stage 1 (P < 0.01) using fixed-effect inverse variance–weighted meta-analysis P values for all other SNPs (Online Methods). In addition, we obtained 104 SNPs at an FDR threshold of 5% and LD threshold of r² < 0.2 (Supplementary Table 9). The median OR for CAD for these SNPs was 1.054 (interquartile range of 0.0199) per risk allele (Supplementary Fig. 5).
On the basis of a heritability estimate of 40% for CAD, the combination of the known and newly associated SNPs within the 45 susceptibility loci (Tables 1 and 2) explains approximately 6% of the additive genetic variance of CAD. The addition of the 104 SNPs from FDR analysis increased the fraction explained to 10.6% (Online Methods).
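The FDR analysis above selects SNPs at a 5% (and later 10%) false discovery rate. The exact procedure is the one cited as ref. 25 and detailed in the Online Methods, neither of which is reproduced here, so the sketch below swaps in the Benjamini-Hochberg step-up rule purely as an assumed, widely used stand-in. The function name and the P values in the usage lines are hypothetical.

```python
def benjamini_hochberg(p_values, fdr: float = 0.05):
    """Return the indices of tests declared significant at the given FDR.

    Step-up rule: find the largest rank k with p_(k) <= fdr * k / m and
    reject the k smallest P values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank  # keep the largest rank that passes
    return sorted(order[:cutoff_rank])


# Hypothetical P values for ten LD-pruned SNPs (illustrative only).
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p_vals, fdr=0.05))  # indices of the SNPs kept at 5% FDR
```

Raising `fdr` to 0.10 admits more SNPs at the cost of more expected false positives, which mirrors the trade-off the paper makes between heritability estimation (5% FDR) and network analysis (10% FDR).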
Table 2 Additional loci showing genome-wide significant association with CAD
Column groups: Stage 1 (18,014 cases and 40,925 controls) (a); Stage 2 (40,365 cases and 63,714 controls); Combined (stages 1 and 2); Stage 3 (5,055 cases and 5,617 controls); Combined (stages 1–3)
SNP | Chr. | Nearest gene(s) | Effect/non-effect allele (frequency) | Stage 1 OR | Stage 1 P | Stage 2 OR | Stage 2 P | Combined (stages 1 and 2) P | Stage 3 OR | Stage 3 P | Combined (stages 1–3) P | Biological relevance (b)
New
rs4845625 | 1 | IL6R | T/C (0.47) | 1.06 | 4.84 × 10⁻⁵ | 1.04 | 3.46 × 10⁻⁵ | 3.55 × 10⁻⁸ | 1.09 | 1.58 × 10⁻³ | 3.64 × 10⁻¹⁰ | 2
rs515135 | 2 | APOB | G/A (0.83) | 1.07 | 8.63 × 10⁻⁴ | 1.08 | 2.17 × 10⁻⁸ | 4.80 × 10⁻¹⁰ | 1.03 | 4.02 × 10⁻¹ | 2.56 × 10⁻¹⁰ | 1
rs2252641 | 2 | ZEB2-AC074093.1 | G/A (0.46) | 1.06 | 1.37 × 10⁻⁵ | 1.04 | 1.27 × 10⁻⁴ | 3.66 × 10⁻⁸ | 1.00 | 9.54 × 10⁻¹ | 5.30 × 10⁻⁸ | 
rs1561198 | 2 | VAMP5-VAMP8-GGCX | A/G (0.45) | 1.06 | 7.47 × 10⁻⁵ | 1.05 | 2.57 × 10⁻⁶ | 4.48 × 10⁻⁹ | 1.07 | 1.75 × 10⁻² | 1.22 × 10⁻¹⁰ | A,1
rs7692387 | 4 | GUCY1A3 | G/A (0.81) | 1.08 | 1.04 × 10⁻⁵ | 1.06 | 1.89 × 10⁻⁵ | 4.57 × 10⁻⁹ | 1.13 | 5.47 × 10⁻⁴ | 2.65 × 10⁻¹¹ | 1
rs273909 | 5 | SLC22A4-SLC22A5 | C/T (0.14) | 1.07 | 3.24 × 10⁻³ | 1.09 | 2.00 × 10⁻⁷ | 1.43 × 10⁻⁸ | 1.11 | 2.43 × 10⁻² | 9.62 × 10⁻¹⁰ | A,1
rs10947789 | 6 | KCNK5 | T/C (0.76) | 1.07 | 6.07 × 10⁻⁵ | 1.06 | 1.22 × 10⁻⁵ | 1.63 × 10⁻⁸ | 1.01 | 7.03 × 10⁻¹ | 9.81 × 10⁻⁹ | 3
rs4252120 | 6 | PLG | T/C (0.73) | 1.07 | 1.18 × 10⁻⁵ | 1.06 | 1.82 × 10⁻⁵ | 5.00 × 10⁻⁹ | 1.07 | 9.58 × 10⁻² | 4.88 × 10⁻¹⁰ | 1
rs264 | 8 | LPL | G/A (0.86) | 1.11 | 2.99 × 10⁻⁷ | 1.05 | 7.30 × 10⁻⁴ | 5.06 × 10⁻⁹ | 1.06 | 1.60 × 10⁻¹ | 2.88 × 10⁻⁹ | 1
rs9319428 | 13 | FLT1 | A/G (0.32) | 1.06 | 7.88 × 10⁻⁵ | 1.05 | 5.70 × 10⁻⁶ | 1.01 × 10⁻⁸ | 1.10 | 1.37 × 10⁻³ | 7.32 × 10⁻¹¹ | 1
rs17514846 | 15 | FURIN-FES | A/C (0.44) | 1.07 | 2.37 × 10⁻⁵ | 1.05 | 7.35 × 10⁻⁷ | 4.49 × 10⁻¹⁰ | 1.04 | 3.02 × 10⁻¹ | 9.33 × 10⁻¹¹ | A,1
Previously reported at array-wide level of significance (P < 3 × 10⁻⁶)
rs2954029 | 8 | TRIB1 | A/T (0.55) | 1.06 | 2.79 × 10⁻⁵ | 1.04 | 7.75 × 10⁻⁵ | 4.53 × 10⁻⁸ | 1.05 | 8.56 × 10⁻² | 4.75 × 10⁻⁹ | 4
rs6544713 | 2 | ABCG5-ABCG8 | T/C (0.30) | 1.06 | 2.22 × 10⁻⁴ | 1.06 | 1.57 × 10⁻⁷ | 8.72 × 10⁻¹⁰ | 0.96 | 3.56 × 10⁻¹ | 2.12 × 10⁻⁹ | 1
New (stage 3 replication)
rs1878406 | 4 | EDNRA | T/C (0.15) | 1.10 | 2.37 × 10⁻⁶ | 1.06 | 3.54 × 10⁻³ | 1.65 × 10⁻⁷ | 1.09 | 2.01 × 10⁻² | 2.54 × 10⁻⁸ | 1
rs2023938 | 7 | HDAC9 | G/A (0.10) | 1.08 | 6.81 × 10⁻⁴ | 1.07 | 5.25 × 10⁻⁵ | 6.49 × 10⁻⁷ | 1.13 | 4.09 × 10⁻² | 4.94 × 10⁻⁸ | 1
(a) Total sample sizes do not include the CHARGE sample sizes.
(b) A, cis eQTL in LCLs; 1, mouse model available with cardiovascular phenotype; 2, mouse model has homeostatic and immune phenotypes; 3, mouse model has respiratory, nervous system, mortality, aging, growth and renal phenotypes; 4, mouse model has growth and immune phenotypes.
Network analysis
In contrast to estimating heritability, where we want to keep the false positive rate as low as possible, in network analysis, we want to maximize the representation of potential network nodes in the gene set used. Thus, to perform network analysis, we selected the top 222 SNPs defined by the FDR analysis (10% FDR; final P < 6.6 × 10⁻⁴) at an LD threshold (r²) of 0.7 and assigned 239 candidate genes on the basis of either eQTL data or physical proximity (Supplementary Table 10). We mapped 238 of the 239 genes in the Ingenuity Knowledge Base and considered 233 for network construction (Online Methods) on the basis of available data on interactions in humans, mice and/or rats (51 genes within the 46 genome-wide significant loci (set A) and 182 genes within the loci selected at FDR < 10% (set B)). Including neighboring genes, Ingenuity generated 9 networks comprising 553 nodes; these included 48 (94.1%) of the genes in set A and 156 (85.7%) of those in set B (Supplementary Table 10). We obtained 2 overlapping networks: ON1, which included networks 1, 2, 6 and 8, comprising the majority of genes in both sets (33 and 83 in sets A and B, respectively), and ON2, which included networks 4 and 7 (Supplementary Table 10). The nine networks were strongly enriched for genes (query set) known to be involved in lipid metabolism (P = 1.48 × 10⁻⁹), cellular movement (blood and endothelial cells; P = 1.35 × 10⁻⁷) and processes such as tissue morphology (size and area of atherosclerotic lesion, quantity of leukocytes, macrophages and smooth muscle cells; P = 9.66 × 10⁻¹⁰) and immune cell trafficking (migration and adhesion; P = 1.12 × 10⁻⁷). As a negative control in the network analysis, we used a set of 368 genes selected from the least significant SNPs in the FDR analysis; the resulting networks showed no significant enrichment in relevant molecular functions and processes (results described in detail in the Supplementary Note).
We then assessed how genes in the networks overlap with canonical
pathways in the Ingenuity database. The four most significant canonical
pathways represented in these networks are shown in Figure 1a.
The top three pathways, atherosclerosis signaling, liver X receptor
(LXR)/retinoid X receptor (RXR) activation and farnesoid X receptor
(FXR)/RXR activation, all harbor genes involved in lipid metabolism,
including ten CAD risk loci (ABCG5-ABCG8, APOA1, APOA5, APOB,
APOE, CXCL12, LDLR, LPA, LPL and PDGFD). This is in agreement
with our finding that 12 CAD risk loci are associated with lipid levels
at P < 1 × 10⁻⁴ (Supplementary Table 8). Notably, three of the top four
pathways also contain genes involved in inflammation. In addition
to the atherosclerosis signaling and LXR/RXR activation pathways,
the acute phase response signaling (AAPRS) pathway, which includes
four CAD risk loci (APOA1, MRAS, IL6R and PLG), is involved in
inflammation and, more specifically, the rapid inflammatory response
that is triggered, among other factors, by tissue injury. Genes from
both the lipid metabolism and inflammation-related pathways map
to all networks, except network 9, which harbors only two genes
(Supplementary Table 10). As shown for overlapping network ON1
(Supplementary Fig. 6), genes in lipid metabolism and inflamma-
tion are interconnected and include both CAD-associated loci reach-
ing genome-wide significance and candidate loci at FDR < 10%. Key
interactions between CAD susceptibility genes (known, new and the
FDR set) involved in lipid metabolism and inflammation are shown
in Figure 1b; macrophages take up oxidized LDL (ox-LDL) through
their cell surface scavenger receptors to form foam cells. Foam cells
secrete proinflammatory cytokines, such as interleukin (IL)-1, IL-6
and matrix metalloproteinases, which can amplify the local inflammatory
response and stimulate smooth muscle cell proliferation and
initial migration toward the lesion (ref. 26). Collagen secretion
by smooth muscle cells in the extracellular matrix is regulated by
matrix metalloproteinases. Reduction of collagen in the extracellular
matrix will destabilize the plaque. Both COL4A1 and COL4A2 encode
subunits of type IV collagen, which is the major structural component
of basement membranes lining the inner surface of blood vessels.
Metalloproteinases have a role in the maintenance of the extracellular
matrix and remodeling, contributing to the transition of plaques from
stable to vulnerable states (Fig. 1b).
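The pathway P values reported for these networks express how surprising the overlap between a gene set and a canonical pathway is. How Ingenuity scores this is not described in this section, so the sketch below uses a right-tailed hypergeometric test, a common choice for over-representation analysis, as an assumption. The function name and every count in the usage lines are hypothetical and are not taken from the paper.

```python
from math import comb


def enrichment_p(population: int, pathway_genes: int, query: int, overlap: int) -> float:
    """Right-tailed hypergeometric P value, P(X >= overlap).

    population    : number of genes in the background set
    pathway_genes : genes in the background annotated to the pathway
    query         : size of the query gene set
    overlap       : query genes that fall in the pathway
    """
    upper = min(pathway_genes, query)
    total = comb(population, query)
    # Count the ways to draw k pathway genes and (query - k) non-pathway genes.
    tail = sum(
        comb(pathway_genes, k) * comb(population - pathway_genes, query - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, 120 in the pathway,
# a 233-gene query set, 10 of which are in the pathway.
print(f"{enrichment_p(20000, 120, 233, 10):.2e}")
```

Exact integer arithmetic via `math.comb` avoids underflow for the very small tail probabilities typical of enrichment results; only the final division produces a float.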
Figure 1a, canonical pathways:
Biological function | P value
Atherosclerosis signaling | 1.67 × 10⁻⁶
LXR/RXR activation | 2.14 × 10⁻⁶
FXR/RXR activation | 5.76 × 10⁻⁶
Acute phase response signaling | 4.36 × 10⁻⁵
[Figure 1b: pathway schematic with regions labeled Cytoplasm, Nucleus, Extracellular space and Cells involved in inflammatory response]
Figure 1 Canonical pathway analysis. (a) The four most significant canonical pathways represented in networks 3, 5 and 9, and overlapping networks ON1 (includes networks 1, 2, 6 and 8) and ON2 (includes networks 4 and 7); all molecules are listed by network in Supplementary Table 10. (b) Schematic showing parts of the atherosclerosis signaling, LXR/RXR activation and acute phase response signaling canonical pathways (Ingenuity) that are involved in both lipid metabolism and inflammation. Genes in confirmed CAD susceptibility loci (including both previously and newly reported) and in loci showing suggestive association with an FDR of <10% are depicted as black and gray ovals, respectively. Other key genes are depicted as white ovals; notably, some of them, such as IL1F10-IL1B, STAT3 and HMGCR, have SNPs ranking in the top 1,000 in the FDR analysis. The process leading to myocardial infarction involves multiple cell types that are depicted in this schematic as a composite cell (large oval) and its nucleus (inner oval) in the extracellular space; the smooth muscle cell is shown separately (SMC; red oval), whereas the blue oval depicts cell types involved in the inflammatory response.