
Genome-wide association and large-scale follow up identifies 16 new loci influencing lung function

María Soler Artigas, +192 more
- 01 Nov 2011 - 
- Vol. 43, Iss: 11, pp 1082-1090
TLDR
This article identified new regions showing association with pulmonary function in or near MFAP2, TGFB2, HDAC4, RARB, MECOM (also known as EVI1), SPATA9, ARMC2, NCR3, ZKSCAN3, CDC123, C10orf11, LRP1, CCDC38, MMP15, CFDP1 and KCNE2.
Abstract
Pulmonary function measures reflect respiratory health and are used in the diagnosis of chronic obstructive pulmonary disease. We tested genome-wide association with forced expiratory volume in 1 second and the ratio of forced expiratory volume in 1 second to forced vital capacity in 48,201 individuals of European ancestry with follow up of the top associations in up to an additional 46,411 individuals. We identified new regions showing association (combined P < 5 × 10⁻⁸) with pulmonary function in or near MFAP2, TGFB2, HDAC4, RARB, MECOM (also known as EVI1), SPATA9, ARMC2, NCR3, ZKSCAN3, CDC123, C10orf11, LRP1, CCDC38, MMP15, CFDP1 and KCNE2. Identification of these 16 new loci may provide insight into the molecular mechanisms regulating pulmonary function and into molecular targets for future therapy to alleviate reduced lung function.


© 2011 Nature America, Inc. All rights reserved.
1082 VOLUME 43 | NUMBER 11 | NOVEMBER 2011 Nature GeNetics
A R T I C L E S
by ever-smoking versus never-smoking status. We performed the meta-analyses of the smoking strata within each study and of the study-specific results using inverse-variance weighting (and used the inverse of the standard error squared as the weight). We applied genomic control twice at the study level (to each smoking stratum separately and to the study-level pooled estimates) and also at the meta-analysis level to avoid inflation of the test statistics caused by cryptic population structure or relatedness (see Supplementary Table 1a for study-level estimates). Our application of genomic control at the three stages is likely to be overly conservative because it has recently been shown that in large meta-analyses, test statistics are expected to be elevated under polygenic inheritance even when there is no population structure (ref. 12). The test statistic inflation (λGC) before applying genomic control at the meta-analysis level was 1.12 for FEV1 and 1.09 for FEV1/FVC. Genomic inflation estimates increase with sample size, as has been shown for other traits (refs. 13–15); the standardized estimates to a sample of 1,000 individuals (λGC_1,000) were 1.002 for FEV1 and 1.002 for FEV1/FVC. Plots of the meta-analysis P values for FEV1 and FEV1/FVC against a uniform distribution of P values expected under the null hypothesis showed deviations which were attenuated, but which persisted, after removal of SNPs in loci reported previously, consistent with additional loci being associated with lung function (Supplementary Fig. 1a).
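The genomic-control quantities described above can be illustrated with a minimal Python sketch. It assumes the usual definition of λGC as the median observed 1-d.f. χ² statistic divided by its expected median under the null (about 0.455), and a simple linear rescaling of λGC to a sample of 1,000 for a quantitative trait. Neither formula is spelled out in this article, and the function names are illustrative only.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Estimate lambda_GC as median(observed chi-square) / expected median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def apply_genomic_control(chi2_stats, lam):
    """Deflate test statistics by lambda_GC; leave them unchanged if lambda <= 1."""
    scale = max(lam, 1.0)
    return [x / scale for x in chi2_stats]


def lambda_1000(lam, n):
    """Rescale lambda_GC to a sample of 1,000 (assumed linear rescaling)."""
    return 1.0 + (lam - 1.0) * 1000.0 / n


# Reported stage 1 inflation factors and sample size.
for trait, lam in (("FEV1", 1.12), ("FEV1/FVC", 1.09)):
    print(trait, round(lambda_1000(lam, 48201), 3))
```

With the reported stage 1 values (λGC = 1.12 and 1.09, n = 48,201), this rescaling gives roughly 1.002 for both traits, in line with the figures quoted above.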
Follow-up analysis (stage 2)
Twenty-nine new loci showing evidence of association with lung function (P < 3 × 10⁻⁶) in stage 1 were followed up in stage 2 by using in silico data from seven studies and by undertaking additional genotyping in ten studies for the ten highest ranked SNPs (Fig. 1). Full details of the SNP selection are given in the Online Methods. We performed an inverse-variance–weighting meta-analysis across stages 1 and 2 and obtained two-sided P values for the pooled estimates. Sixteen new loci reached genome-wide significance (P < 5 × 10⁻⁸) and showed consistent direction of effects in both stages, comprising 12 new loci for FEV1/FVC, 3 new loci for FEV1 and 1 new locus reaching
Pulmonary function, reliably measurable by spirometry, is a heritable trait reflecting the physiological state of the airways and lungs (ref. 1). Pulmonary function measures are important predictors of population morbidity and mortality (refs. 2–4) and are used in the diagnosis of chronic obstructive pulmonary disease (COPD), which ranks among the leading causes of death in developed and developing countries (refs. 5,6). A reduced ratio of forced expiratory volume in 1 second (FEV1) to forced vital capacity (FVC) is used to define airway obstruction, and a reduced FEV1 is used to grade the severity of airway obstruction (ref. 7).
Recently, two large genome-wide association studies (GWAS), each comprising discovery sets of more than 20,000 individuals of European ancestry, identified new loci for lung function (refs. 8,9). Recognizing the need for larger data sets to increase the power to detect loci of individually modest effect size, we conducted a meta-analysis of 23 lung function GWAS comprising a total of 48,201 individuals of European ancestry (stage 1) and followed up potentially new loci in 17 further studies comprising up to 46,411 individuals (stage 2). We identified 16 additional new loci for lung function and provided evidence corroborating the association of loci previously associated with lung function (refs. 8–11). Our findings implicate a number of different mechanisms underlying regulation of lung function and highlight loci shared with complex traits and diseases, including height, lung cancer and myocardial infarction.
RESULTS
Genome-wide analysis (stage 1)
We undertook meta-analyses for cross-sectional lung function measures for approximately 2.5 million genotyped or imputed SNPs across 23 studies with a combined sample size of 48,201 adult individuals of European ancestry. Characteristics of the cohort participants and the genotyping are shown in Supplementary Table 1a and b. We adjusted FEV1 and FEV1/FVC measures for ancestry principal components, age, age², sex and height as covariates. Our association testing of the inverse-normal–transformed residuals for FEV1 and FEV1/FVC assumed an additive genetic model and was stratified
A full list of author affiliations appears at the end of the paper.
Received 20 April; accepted 19 August; published online 25 September 2011; doi:10.1038/ng.941

genome-wide significance for both traits (Fig. 2 and Table 1). To assess the heterogeneity across the studies included in stage 1 and 2, we performed χ² tests for all 16 SNPs, and none of these SNPs was statistically significant after applying a Bonferroni correction for 16 tests. The sentinel SNPs at these loci were in or near MFAP2 (1p36.13), TGFB2-LYPLAL1 (1q41), HDAC4-FLJ43879 (2q37.3), RARB (3p24.2), MECOM (also known as EVI1) (3q26.2), SPATA9-RHOBTB3 (5q15), ARMC2 (6q21), NCR3-AIF1 (6p21.33), ZKSCAN3 (6p22.1), CDC123 (10p13), C10orf11 (10q22.3), LRP1 (12q13.3), CCDC38 (12q22), MMP15 (16q13), CFDP1 (16q23.1) and KCNE2-LINC00310 (also known as C21orf82) (21q22.11) (Supplementary Fig. 1b,c). The strongest signals in AGER (rs2070600) (refs. 8,9) and two of the new signals (rs6903823 in ZKSCAN3 and rs2857595, upstream of NCR3) lie within a ~3.8-Mb interval at 6p21.32-22.1 that is characterized by long-range linkage disequilibrium (LD). Nevertheless, the leading SNPs in these regions, which are within the major histocompatibility complex (MHC), were statistically independent (Supplementary Note).
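As a rough illustration of a χ² heterogeneity test with a Bonferroni threshold for 16 tests, the sketch below computes Cochran's Q for two effect estimates. It is an assumption that Cochran's Q is the relevant statistic, since the article only says "χ² tests", and the example compares the rounded stage 1 and stage 2 FEV1/FVC estimates for rs7068966 from Table 1 rather than the per-study estimates the authors actually tested. The helper `chi2_sf_1df` is valid only for one degree of freedom, that is, for exactly two estimates.

```python
import math


def cochran_q(betas, ses):
    """Cochran's Q: weighted squared deviations from the pooled estimate."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    return sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# rs7068966, FEV1/FVC: stage 1 and stage 2 estimates (s.e.m.) from Table 1.
q = cochran_q([0.045, 0.023], [0.007, 0.006])
p = chi2_sf_1df(q)
bonferroni_threshold = 0.05 / 16  # 16 SNPs tested

print(f"Q = {q:.2f}, P = {p:.3f}, significant = {p < bonferroni_threshold}")
```

In this two-estimate simplification the P value is about 0.02, which does not pass the corrected threshold of 0.003125.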
Gene expression
We investigated mRNA expression of the nearest gene for each of the 16 new loci in human lung tissue and a range of human primary cells including lung, brain, airway smooth muscle cells and bronchial epithelial cells. We detected transcripts for all the selected genes in lung tissue except CCDC38, and we also detected transcripts for most genes in airway smooth muscle cells and in bronchial epithelial cells (Table 2). As we were unable to detect expression of CCDC38 in any tissue, we also examined expression of SNRPF, which is the gene adjacent to CCDC38 (Table 2), and found its expression in all four cell types. TGFB2, MFAP2, EVI1 and MMP15 were expressed in one or more lung cell types but not in peripheral blood mononuclear cells, providing evidence that these genes may show tissue-specific expression.
We assessed whether SNPs in these new regions or their proxies (r² > 0.6) were associated with gene expression using a database of expression-associated SNPs in lymphoblastoid cell lines (ref. 16). Four loci showed regional (cis) effects on expression (P < 1 × 10⁻⁷; Supplementary Note). A proxy for our sentinel SNP in CFDP1, rs2865531, coincided with the peak of the expression signal for CFDP1, and the strongest proxy for rs6903823 in ZKSCAN3 coincided with the peak of expression for ZSCAN12.
Plausible pathways for lung function involving new loci
The putative functions of the genes within, or closest to, the association peaks identify a range of plausible mechanisms for affecting lung function. The most statistically significant new signal for FEV1/FVC (P = 7.5 × 10⁻¹⁶) was in the gene encoding MFAP2, an antigen of elastin-associated microfibrils (ref. 17), although correlated SNPs in the region potentially implicate other genes that could plausibly influence lung function, such as CROCC, which encodes rootletin, a component of cilia (ref. 18). Our second strongest new signal, also for FEV1/FVC, was in RARB, the gene encoding the retinoic acid receptor β. Rarb-null knockout mice have premature alveolar septation (ref. 19). The third most statistically significant new signal for FEV1/FVC, and the most statistically significant new signal for FEV1, was in CDC123.
Figure 2 Manhattan plots of association results for FEV1/FVC and FEV1 (analysis stage 1). The Manhattan plots for FEV1/FVC (a) and FEV1 (b) are ordered by chromosome position, with −log10 P on the vertical axis and chromosome on the horizontal axis. SNPs for which −log10 P > 5 are indicated in red. Newly associated regions that reached genome-wide significance after meta-analysis of stages 1 and 2 are labeled.
Stage 1 (genome-wide association studies), n = 48,201: AGES (n = 1,689), ARIC (n = 9,078), B58C T1DGC (n = 2,343), B58C WTCCC (n = 1,372), BHS1 (n = 1,168), CHS (n = 3,140), CROATIA-Korcula (n = 825), CROATIA-Vis (n = 769), ECRHS (n = 1,594), EPIC obese cases (n = 1,104), EPIC population based (n = 2,336), FHS (n = 7,911), FTC (n = 134), Health ABC (n = 1,472), Health 2000 (n = 821), KORA F4 (n = 904), KORA S3 (n = 555), NFBC1966 (n = 4,556), ORCADES (n = 692), RS-I (n = 1,224), RS-II (n = 852), SHIP (n = 1,777), Twins UK-I (n = 1,885)
SNPs followed up: rs1036429, rs11001819, rs12477314, rs1529672, rs1551943, rs2284746, rs2857595, rs2865531, rs3743563, rs7068966 + rs10067603, rs11172113, rs12447804, rs12716852, rs12914385, rs1344555, rs153916, rs1541374, rs1878798, rs1928168, rs2036527, rs2544527, rs2647044, rs2798641, rs2855812, rs3094548, rs3734729, rs3769124, rs4762767, rs6903823, rs8040868, rs9310995, rs993925, rs9978142
Stage 2 (follow up of 10 SNPs only), n = 24,737: ADONIX (n = 1,410), BRHS (n = 3,862), BHS2 (n = 3,038), BWHHS (n = 3,635), Gedling (n = 1,266), HCS (n = 2,848), Nottingham Smokers (n = 521), NSHD (n = 2,511), SAPALDIA (n = 5,646)
Stage 2 (follow up of up to 34 SNPs), n = 21,674: CARDIA (n = 1,626)*, CROATIA-SPLIT (n = 491)*, GS:SFHS (n = 10,399)#, LBC1936 (n = 991)*, LifeLines (n = 3,078)*, MESA-Lung (n = 1,469)*, RS-III (n = 1,247)*, TwinsUK-II (n = 2,373)*
Figure 1 Study design. We followed up in stage 2 a total of 34 SNPs showing new evidence of association (P < 3 × 10⁻⁶) with FEV1 and/or FEV1/FVC in a meta-analysis of the stage 1 studies. Studies with a combined total of 24,737 individuals undertook genotyping and association testing of the top ten SNPs. Seven studies (marked with an asterisk) with a combined total of 11,275 individuals had genome-wide association data and provided results for up to 34 SNPs. Researchers from GS:SFHS (marked with #) undertook genotyping on a 32-SNP multiplex genotyping platform and so included the 32 top ranking SNPs (including proxies and both SNPs from regions that showed association with both FEV1 and FEV1/FVC). This assay failed for one SNP (rs3769124), which was subsequently replaced with the thirty-third SNP (rs4762767). We excluded rs2284746 because of poor clustering. Although rs3743563 was chosen as proxy for rs12447804, which had an effective N < 80% in the stage 1 meta-analysis, researchers from BHS2 were unable to genotype rs3743563 and so undertook genotyping for rs12447804 instead. See Table 1 for definitions of all study abbreviations.
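The stage 2 sample sizes quoted in the Figure 1 legend can be checked with a few lines of Python. The per-study numbers are those printed in the figure; the assignment of studies to the two stage 2 groups is an assumption inferred from the asterisk and # markers and from the legend, since the figure layout itself is not reproduced here.

```python
# Studies that genotyped only the top ten SNPs (assumed grouping).
stage2_top10_only = {
    "ADONIX": 1410, "BRHS": 3862, "BHS2": 3038, "BWHHS": 3635, "Gedling": 1266,
    "HCS": 2848, "Nottingham Smokers": 521, "NSHD": 2511, "SAPALDIA": 5646,
}
# Studies marked with an asterisk (genome-wide data, up to 34 SNPs).
stage2_gwas = {
    "CARDIA": 1626, "CROATIA-SPLIT": 491, "LBC1936": 991, "LifeLines": 3078,
    "MESA-Lung": 1469, "RS-III": 1247, "TwinsUK-II": 2373,
}
gs_sfhs = 10399  # marked with '#', 32-SNP multiplex platform

n_top10 = sum(stage2_top10_only.values())
n_gwas = sum(stage2_gwas.values())
n_up_to_34 = n_gwas + gs_sfhs
n_stage2 = n_top10 + n_up_to_34

print(n_top10, n_gwas, n_up_to_34, n_stage2)
```

Under this grouping the totals come to 24,737, 11,275, 21,674 and 46,411, matching the numbers given in the legend and in the main text.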

This was the only new region to show genome-wide association with both traits. CDC123 encodes a homolog of a yeast cell-division–cycle protein that plays a critical role in modulating eukaryotic initiation factor 2 in times of cell stress (ref. 20). The fourth signal for FEV1/FVC is downstream of HDAC4, which encodes a histone deacetylase; reductions in the expression of other histone deacetylases (specifically HDAC2, HDAC5 and HDAC8) have been noted in COPD (ref. 21). The regions we observed in the MHC are much more difficult to localize, with multiple genes being tagged by the top SNP, including non-synonymous SNPs in ZKSCAN3, PGBD1, ZSCAN12, ZNF323, TCF19, LTA, C6orf15 and GPANK1 (also known as BAT4) (Supplementary Table 2). At 6p21.33, we observed the strongest association with lung function for rs2857595, which is in LD (r² = 0.47) with a non-synonymous SNP in LTA (encoding lymphotoxin α) and with a SNP in the upstream promoter region of TNFA (encoding tumor necrosis factor α) (r² = 0.86), both of which are plausible candidates (refs. 22,23). Our top SNP in MMP15 is in strong LD (r² = 1) with a non-synonymous SNP (rs3743563, which has an association with FEV1/FVC at P = 1.8 × 10⁻⁷) within the same gene. Among the plausible mechanisms implicated by the other new signals of association with lung function reported here is TGF-β signaling; TGFB2 expression is upregulated in bronchial epithelial cells in asthma (ref. 24). The putative function of key genes (as defined by LD with the leading SNP) in each of the 16 loci, and relevant findings from animal models, are summarized in Table 2 and are detailed in Supplementary Table 2.
Associations with lung function in children
Alleles representing 11 of the 16 new loci showed directionally consistent effects on lung function in 6,281 children (7–9 years of age) (Supplementary Table 3a), suggesting that genetic determination of lung function in adults may in part act through effects on lung development, or alternatively, that some genetic determinants of lung growth and lung function decline are shared.
Association of lung function loci with other traits
Although we stratified for ever smoking versus never smoking, we did not adjust for the amount smoked. In order to investigate the possibility that the associations at any of our 16 new regions were driven by an effect of the SNP on smoking behavior, we evaluated in silico data for associations with smoking amount from the Oxford-GlaxoSmithKline (Ox-GSK) consortium (ref. 25) for the leading SNPs in these 16 regions. None of these 16 SNPs showed statistically significant association with the number of cigarettes smoked per day (Supplementary Table 3b).
Table 1 Loci associated with lung function
SNP ID | Chr. | NCBI36 position | Nearest gene | Coded allele | Measure | Stage 1 β (s.e.m.) | Stage 1 P | Stage 1 coded allele freq. | Stage 1 N | Stage 2 β (s.e.m.) | Stage 2 P | Stage 2 coded allele freq. | Stage 2 N | Joint β (s.e.m.) | Joint P
rs2284746 | 1 | 17,179,262 | MFAP2 (intron) | G | FEV1/FVC | −0.042 (0.007) | 2.47 × 10⁻⁹ | 0.516 | 45,944 | −0.038 (0.007) | 2.64 × 10⁻⁷ | 0.522 | 35,371 | −0.04 (0.005) | 7.50 × 10⁻¹⁶
rs2284746 | 1 | 17,179,262 | MFAP2 (intron) | G | FEV1 | 0.008 (0.007) | 2.78 × 10⁻¹ | 0.516 | 45,944 | 0.006 (0.007) | 3.70 × 10⁻¹ | 0.522 | 35,371 | 0.007 (0.005) | 1.48 × 10⁻¹
rs993925 | 1 | 216,926,691 | TGFB2 (downstream) | T | FEV1/FVC | 0.040 (0.007) | 2.54 × 10⁻⁷ | 0.308 | 42,402 | 0.023 (0.01) | 1.76 × 10⁻² | 0.348 | 21,414 | 0.034 (0.006) | 1.16 × 10⁻⁸
rs993925 | 1 | 216,926,691 | TGFB2 (downstream) | T | FEV1 | 0.025 (0.007) | 1.51 × 10⁻³ | 0.308 | 42,402 | 0.003 (0.007) | 7.29 × 10⁻¹ | 0.348 | 21,414 | 0.014 (0.005) | 8.71 × 10⁻³
rs12477314 | 2 | 239,542,085 | HDAC4 (downstream) | T | FEV1/FVC | 0.052 (0.008) | 4.48 × 10⁻⁹ | 0.202 | 45,585 | 0.031 (0.008) | 8.41 × 10⁻⁵ | 0.206 | 45,821 | 0.041 (0.006) | 1.68 × 10⁻¹²
rs12477314 | 2 | 239,542,085 | HDAC4 (downstream) | T | FEV1 | 0.032 (0.008) | 2.77 × 10⁻⁴ | 0.202 | 45,585 | 0.025 (0.007) | 1.82 × 10⁻⁴ | 0.206 | 45,821 | 0.028 (0.005) | 1.02 × 10⁻⁷
rs1529672 | 3 | 25,495,586 | RARB (intron) | C | FEV1/FVC | −0.060 (0.009) | 7.75 × 10⁻¹⁰ | 0.829 | 40,624 | −0.038 (0.009) | 1.16 × 10⁻⁵ | 0.831 | 45,466 | −0.048 (0.006) | 3.97 × 10⁻¹⁴
rs1529672 | 3 | 25,495,586 | RARB (intron) | C | FEV1 | −0.037 (0.009) | 1.78 × 10⁻⁴ | 0.829 | 40,624 | −0.011 (0.007) | 9.33 × 10⁻² | 0.831 | 45,466 | −0.020 (0.006) | 2.16 × 10⁻⁴
rs1344555 | 3 | 170,782,913 | MECOM (intron) | T | FEV1/FVC | −0.019 (0.008) | 2.61 × 10⁻² | 0.205 | 46,067 | −0.017 (0.012) | 1.55 × 10⁻¹ | 0.209 | 21,313 | −0.018 (0.007) | 6.65 × 10⁻³
rs1344555 | 3 | 170,782,913 | MECOM (intron) | T | FEV1 | −0.042 (0.008) | 1.91 × 10⁻⁶ | 0.205 | 46,067 | −0.025 (0.009) | 6.44 × 10⁻³ | 0.209 | 21,313 | −0.034 (0.006) | 2.65 × 10⁻⁸
rs153916 | 5 | 95,062,456 | SPATA9 (upstream) | T | FEV1/FVC | −0.033 (0.007) | 2.06 × 10⁻⁶ | 0.552 | 47,530 | −0.025 (0.009) | 6.67 × 10⁻³ | 0.535 | 21,647 | −0.031 (0.005) | 2.12 × 10⁻⁸
rs153916 | 5 | 95,062,456 | SPATA9 (upstream) | T | FEV1 | −0.001 (0.007) | 8.91 × 10⁻¹ | 0.552 | 47,530 | 0.004 (0.007) | 6.22 × 10⁻¹ | 0.535 | 21,647 | 0.001 (0.005) | 8.20 × 10⁻¹
rs6903823 | 6 | 28,430,275 | ZKSCAN3 (intron)/ZNF323 (intron) | G | FEV1/FVC | −0.027 (0.008) | 2.28 × 10⁻³ | 0.209 | 47,057 | −0.013 (0.011) | 2.34 × 10⁻¹ | 0.246 | 21,489 | −0.021 (0.007) | 1.19 × 10⁻³
rs6903823 | 6 | 28,430,275 | ZKSCAN3 (intron)/ZNF323 (intron) | G | FEV1 | −0.046 (0.008) | 2.00 × 10⁻⁷ | 0.209 | 47,057 | −0.029 (0.008) | 4.75 × 10⁻⁴ | 0.246 | 21,489 | −0.037 (0.006) | 2.18 × 10⁻¹⁰
rs2857595 | 6 | 31,676,448 | NCR3 (upstream) | G | FEV1/FVC | 0.049 (0.009) | 7.86 × 10⁻⁸ | 0.809 | 45,540 | 0.028 (0.008) | 5.36 × 10⁻⁴ | 0.796 | 46,107 | 0.037 (0.006) | 2.28 × 10⁻¹⁰
rs2857595 | 6 | 31,676,448 | NCR3 (upstream) | G | FEV1 | 0.040 (0.009) | 1.46 × 10⁻⁵ | 0.809 | 45,540 | 0.017 (0.007) | 9.41 × 10⁻³ | 0.796 | 46,107 | 0.025 (0.005) | 1.30 × 10⁻⁶
rs2798641 | 6 | 109,374,743 | ARMC2 (intron) | T | FEV1/FVC | −0.047 (0.009) | 2.81 × 10⁻⁷ | 0.183 | 46,369 | −0.030 (0.012) | 1.57 × 10⁻² | 0.179 | 21,173 | −0.041 (0.007) | 8.35 × 10⁻⁹
rs2798641 | 6 | 109,374,743 | ARMC2 (intron) | T | FEV1 | −0.046 (0.009) | 5.39 × 10⁻⁷ | 0.183 | 46,369 | −0.009 (0.01) | 3.35 × 10⁻¹ | 0.179 | 21,173 | −0.030 (0.006) | 4.69 × 10⁻⁶
rs7068966 | 10 | 12,317,998 | CDC123 (intron) | T | FEV1/FVC | 0.045 (0.007) | 1.28 × 10⁻¹⁰ | 0.519 | 47,085 | 0.023 (0.006) | 3.86 × 10⁻⁴ | 0.518 | 46,067 | 0.033 (0.005) | 6.13 × 10⁻¹³
rs7068966 | 10 | 12,317,998 | CDC123 (intron) | T | FEV1 | 0.040 (0.007) | 1.19 × 10⁻⁸ | 0.519 | 47,085 | 0.022 (0.005) | 3.56 × 10⁻⁵ | 0.518 | 46,067 | 0.029 (0.004) | 2.82 × 10⁻¹²
rs11001819 | 10 | 77,985,230 | C10orf11 (intron) | G | FEV1/FVC | −0.019 (0.007) | 6.50 × 10⁻³ | 0.522 | 45,546 | −0.006 (0.006) | 3.17 × 10⁻¹ | 0.506 | 45,932 | −0.012 (0.005) | 7.58 × 10⁻³
rs11001819 | 10 | 77,985,230 | C10orf11 (intron) | G | FEV1 | −0.041 (0.007) | 1.42 × 10⁻⁸ | 0.522 | 45,546 | −0.022 (0.005) | 3.10 × 10⁻⁵ | 0.506 | 45,932 | −0.029 (0.004) | 2.98 × 10⁻¹²
rs11172113 | 12 | 55,813,550 | LRP1 (intron) | T | FEV1/FVC | −0.035 (0.007) | 1.36 × 10⁻⁶ | 0.607 | 45,387 | −0.026 (0.01) | 5.83 × 10⁻³ | 0.590 | 20,509 | −0.032 (0.006) | 1.24 × 10⁻⁸
rs11172113 | 12 | 55,813,550 | LRP1 (intron) | T | FEV1 | −0.021 (0.007) | 3.55 × 10⁻³ | 0.607 | 45,387 | −0.003 (0.007) | 6.94 × 10⁻¹ | 0.590 | 20,509 | −0.013 (0.005) | 1.19 × 10⁻²
rs1036429 | 12 | 94,795,559 | CCDC38 (intron) | T | FEV1/FVC | 0.049 (0.008) | 1.24 × 10⁻⁸ | 0.200 | 47,814 | 0.028 (0.008) | 3.35 × 10⁻⁴ | 0.214 | 46,311 | 0.038 (0.006) | 2.30 × 10⁻¹¹
rs1036429 | 12 | 94,795,559 | CCDC38 (intron) | T | FEV1 | 0.010 (0.008) | 2.67 × 10⁻¹ | 0.200 | 47,814 | 0.004 (0.006) | 5.38 × 10⁻¹ | 0.214 | 46,311 | 0.006 (0.005) | 2.26 × 10⁻¹
rs12447804 | 16 | 56,632,783 | MMP15 (intron) | T | FEV1/FVC | −0.053 (0.009) | 7.12 × 10⁻⁸ | 0.208 | 35,123 | −0.021 (0.01) | 4.20 × 10⁻² | 0.222 | 24,398 | −0.038 (0.007) | 3.59 × 10⁻⁸
rs12447804 | 16 | 56,632,783 | MMP15 (intron) | T | FEV1 | −0.017 (0.009) | 8.02 × 10⁻² | 0.208 | 35,123 | 0.004 (0.007) | 5.71 × 10⁻¹ | 0.222 | 24,398 | −0.004 (0.006) | 4.73 × 10⁻¹
rs2865531 | 16 | 73,947,817 | CFDP1 (intron) | T | FEV1/FVC | 0.039 (0.007) | 2.30 × 10⁻⁸ | 0.418 | 47,594 | 0.024 (0.006) | 1.94 × 10⁻⁴ | 0.409 | 46,304 | 0.031 (0.005) | 1.77 × 10⁻¹¹
rs2865531 | 16 | 73,947,817 | CFDP1 (intron) | T | FEV1 | 0.024 (0.007) | 6.30 × 10⁻⁴ | 0.418 | 47,594 | 0.011 (0.005) | 3.89 × 10⁻² | 0.409 | 46,304 | 0.016 (0.004) | 1.09 × 10⁻⁴
rs9978142 | 21 | 34,574,109 | KCNE2 (upstream) | T | FEV1/FVC | −0.048 (0.009) | 8.23 × 10⁻⁷ | 0.156 | 44,577 | −0.031 (0.013) | 1.75 × 10⁻² | 0.149 | 20,944 | −0.043 (0.008) | 2.65 × 10⁻⁸
rs9978142 | 21 | 34,574,109 | KCNE2 (upstream) | T | FEV1 | −0.012 (0.009) | 2.47 × 10⁻¹ | 0.156 | 44,577 | −0.015 (0.01) | 1.35 × 10⁻¹ | 0.149 | 20,944 | −0.013 (0.007) | 5.57 × 10⁻²
Shown are FEV1 and FEV1/FVC results for the leading SNPs, ordered by chromosome and position for each independent locus associated (P < 5 × 10⁻⁸) with FEV1 or FEV1/FVC in a joint analysis of up to 94,612 individuals of European ancestry from the SpiroMeta-CHARGE GWAS (stage 1) and follow up (stage 2). Two-sided P values are given for stage 1, stage 2 and the joint meta-analysis of all stages. P values reaching genome-wide significance (P < 5 × 10⁻⁸) in the joint meta-analysis of all stages are indicated in bold. SNPs reaching independent replication in stage 2 (P = 0.05/34 = 1.47 × 10⁻³) are indicated with their stage 2 P value in bold. The sample sizes (N) shown are the effective sample sizes. The effective sample size within each study is the product of sample size and the imputation quality metric. The joint meta-analysis includes data from stage 1 and stage 2. β values reflect effect-size estimates on an inverse-normal transformed scale after adjustments for age, age², sex, height and ancestry principal components. The estimated proportion of the variance explained by each SNP can be found in Supplementary Table 6.
Chr., chromosome; freq., frequency.
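The joint estimates in Table 1 come from inverse-variance weighting, which the following minimal Python sketch illustrates using the rounded stage 1 and stage 2 values for rs2284746 (FEV1/FVC). It assumes a fixed-effects model with weights 1/s.e.m.² and a two-sided normal P value; the article's own analysis also applied genomic control and worked from unrounded study-level data, so the result should only approximate the published figure. The names `ivw_meta` and `effective_n` are illustrative.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effects inverse-variance-weighted pooled estimate, s.e. and two-sided P."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    p = math.erfc(abs(beta / se) / math.sqrt(2.0))  # two-sided normal tail
    return beta, se, p


def effective_n(sample_size, imputation_quality):
    """Effective sample size as defined in the Table 1 footnote."""
    return sample_size * imputation_quality


# rs2284746, FEV1/FVC: stage 1 and stage 2 estimates (s.e.m.) from Table 1.
beta, se, p = ivw_meta([-0.042, -0.038], [0.007, 0.007])
replication_threshold = 0.05 / 34  # Bonferroni threshold for 34 followed-up SNPs

print(f"beta = {beta:.3f} (s.e. {se:.3f}), P = {p:.1e}")
print(f"stage 2 replication threshold = {replication_threshold:.2e}")
```

The pooled estimate of −0.040 (0.005) agrees with the joint column of Table 1, and the P value is of the same order of magnitude as the published 7.50 × 10⁻¹⁶.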

In addition, in our stage 1 and 2 datasets combined, we assessed
whether the estimated effect sizes of the variants on lung function
phenotypes differed substantially between ever smokers and never
smokers (Supplementary Table 4) across the 16 loci. For the most
strongly associated trait at each locus, we tested the SNP interaction
with ever smoking versus never smoking. None of the 16 new loci
showed a significant interaction (Bonferroni-corrected threshold for
16 independent SNPs P = 0.003125). These analyses suggest that the
genetic effects we have identified underlie lung function variability
irrespective of smoking exposure.
We adjusted our lung function associations for height, but there are some overlaps between loci associated with height and those associated with lung function. Therefore, we evaluated in silico data for height associations of our new regions in the GIANT consortium (ref. 14) dataset. The G allele of rs2284746 (in an intron of MFAP2), which was associated with decreased FEV1/FVC, was associated with increased height (Supplementary Table 3c).
Given reported associations between lung cancer and either COPD or lung function decline, we also assessed in silico data for sentinel or proxy SNPs in these 16 regions for associations with lung cancer in the International Lung Cancer Consortium (ILCCO) GWAS meta-analysis (ref. 26). Alleles associated with reduced lung function were associated with risk of lung cancer at the strongest available proxy SNP for rs2857595 (upstream of NCR3) at 6p21.33 (rs3099844, r² = 0.67) and for the strongest proxy SNP for rs6903823 (a SNP in an intron of ZKSCAN3 and ZNF323) at 6p22.1 (rs209181, r² = 0.69) (lung cancer associations at P = 2.2 × 10⁻⁷ and P = 3.4 × 10⁻⁵, respectively; Supplementary Table 3d). We saw no significant associations with lung cancer at the other new loci (proxy SNPs were available for 15 of the 16 loci, Bonferroni-corrected P < 0.0033).
In addition to the effects on height, smoking and lung cancer described above, we examined the literature for evidence of associations with other traits for each of the 16 new loci (detailed in Supplementary Table 2). Genome-wide significant associations (P < 5 × 10⁻⁸) have been reported in KCNE2 with myocardial infarction (ref. 27) and at 6p21.33 near NCR3-AIF1 with neonatal lupus (ref. 28) and systemic lupus erythematosus (ref. 29). Other significant complex disease associations have also been noted in the regions of CDC123 (type 2 diabetes; ref. 30), CFDP1 (type 1 diabetes; ref. 31) and MECOM (blood pressure; refs. 32,33), but with weaker LD (r² < 0.3) being seen between the reported SNP and the sentinel SNP for lung function in the region (Supplementary Table 2).
Proportion of variance explained by loci discovered to date
Associations in ten loci previously reported for lung function (refs. 8,9) reached genome-wide significance (P < 5 × 10⁻⁸) in our stage 1 data, namely loci in or near TNS1, FAM13A, GSTCD-NPNT, HHIP, HTR4, ADAM19, AGER, GPR126, PTCH1 and THSD4 (Supplementary Table 5a). Thus, a total of 26 regions showed genome-wide significant association with lung function in our study. In aggregate, variants at these 26 regions explain approximately 3.2% of the additive polygenic variance for FEV1/FVC and 1.5% of the variance for FEV1 (Supplementary Note). Following the approach previously described (ref. 34), we estimated that there are a total of 102 (95% confidence interval 57–155) independent variants with similar effect sizes to the 26 variants we report here. In combination, these 102 variants, comprising 26 discovered variants and 76 putative undiscovered variants, collectively explain around 7.5% of the additive polygenic variance for FEV1/FVC and 3.4% of the variance for FEV1 (Online Methods, Supplementary Table 6 and Supplementary Note).
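To give a sense of scale for the per-variant contributions, the sketch below applies the common additive-model approximation 2f(1 − f)β² for the fraction of phenotypic variance explained by one SNP. This is an assumption: it presumes Hardy-Weinberg proportions and a phenotype with unit variance (plausible here because the traits were inverse-normal transformed), and it is not necessarily the calculation behind Supplementary Table 6 or the 3.2% aggregate figure, which refers to additive polygenic variance. The inputs are the rounded stage 1 allele frequency and joint β for rs2284746 from Table 1.

```python
def variance_explained(coded_allele_freq, beta):
    """Approximate fraction of phenotypic variance explained by one SNP.

    Assumes an additive model, Hardy-Weinberg proportions and a phenotype
    standardized to unit variance.
    """
    return 2.0 * coded_allele_freq * (1.0 - coded_allele_freq) * beta ** 2


# rs2284746 (MFAP2), FEV1/FVC: stage 1 coded allele frequency and joint beta.
v = variance_explained(0.516, -0.04)
print(f"{100 * v:.2f}% of variance")
```

Under these assumptions the strongest new FEV1/FVC signal accounts for less than 0.1% of the trait variance, which is consistent with the modest aggregate proportions reported above.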
Table 2 Expression profiling of candidate genes in the lung and periphery
Sentinel SNP (relationship to gene) | Chr. | Gene | Putative function of encoded protein | Tissue (Lung, HASM, HBEC, PBMC)
rs993925 (intron) | 1 | TGFB2 | Cytokine with roles in pro-fibrotic cytokine modulating epithelial repair mechanisms and extracellular matrix homeostasis including collagen deposition (ref. 40). | + +
rs2284746 (intron) | 1 | MFAP2 | Major antigen of elastin-associated microfibrils (ref. 17) and a candidate for involvement in the etiology of inherited connective tissue diseases. | + + +
rs12477314 (downstream) | 2 | HDAC4 | Deacetylase of histone surrounding DNA thus influencing transcription factor access to the DNA and possibly repressing gene transcription. | + + + +
rs1344555 (intron) | 3 | EVI1 | Zinc finger transcription factor, encoded as part of MECOM (MDS1-EVI1 complex locus). | + + +
rs1529672 (intron) | 3 | RARB | Nuclear retinoic acid receptor responsive to retinoic acid, a vitamin A derivative and which also controls cell proliferation and differentiation. | + + + +
rs153916 (intron) | 5 | SPATA9 | Initially identified as a mediator of spermatogenesis, other family members may have a role in pancreatic development and β-cell proliferation (ref. 41). | + + + +
rs2798641 (intron) | 6 | ARMC2 | Function unknown, although other family members have been identified as having roles in cell signaling, protein degradation and cytoskeleton functions (ref. 42). | + + + +
rs2857595 (upstream) | 6 | NCR3 | Required for efficient cytotoxicity responses by natural killer cells against normal cells and tumors (ref. 43). | + +
rs6903823 (intron) | 6 | ZKSCAN3 | Transcription factor involved in cell growth, cell cycle and signal transduction. | + + + +
rs7068966 (intron) | 10 | CDC123 | Homolog in yeast shown to be a critical control protein modulating eukaryotic initiation factor 2 in times of cell stress. | + + + +
rs11001819 (intron) | 10 | C10orf11 | Function unknown. | + + + +
rs11172113 (intron) | 12 | LRP1 | Potentially diverse roles including cell signaling and migration (ref. 44). | + + + +
rs1036429 (intron) | 12 | CCDC38 | Function unknown, although other family members involved in a diverse array of functions skeletal and motor function (ref. 45). |
rs1036429 (r² = 0.96 with rs4762633 in SNRPF) | 12 | SNRPF | Small nuclear ribonucleoprotein F. | + + + +
rs12447804 (intron) | 16 | MMP15 | Member of a large protease family with diverse functional roles via protease activity and specificity including tissue remodeling, wound healing, angiogenesis and tumor invasion. | + + +
rs2865531 (intron) | 16 | CFDP1 | Craniofacial development protein 1. | + + + +
rs9978142 (upstream) | 21 | KCNE2 | KCNQ1-KCNE2 K+ channels may modulate transepithelial anion secretion in Calu3 airway epithelial cells (ref. 46). | + +
Reference gene | 12 | GAPDH | | + + + +
+ indicates the gene is expressed in the cell type used, and − indicates that we did not detect the gene expression at the mRNA level following 40 cycles of PCR. PCR profiling of gene transcripts in the human lung showed expression of all candidates except CCDC38, for which two sets of primers were designed and tested under different optimization conditions. None of these assays detected expression of CCDC38 in the cell types analyzed. We instead assayed SNRPF, which neighbors CCDC38 and harbors SNPs in strong LD with CCDC38's sentinel SNP. All PCR products were sequence verified. We used GAPDH (encoding glyceraldehyde-3-phosphate dehydrogenase) as a positive control for the complementary DNA, and this gene was expressed in all tissues.
Chr., chromosome; HASM, human airway smooth muscle; HBEC, human bronchial epithelial cells; PBMC, peripheral blood mononuclear cells.

DISCUSSION
In meta-analysis of 23 studies comprising 48,201 individuals of European ancestry and follow up in 17 studies comprising up to 46,411 individuals, we report genome-wide significant associations with an additional 12 regions for FEV1/FVC, an additional 3 regions for FEV1 and 1 additional region associated with both FEV1 and FEV1/FVC. We also confirmed genome-wide association with ten regions previously associated with lung function, bringing to 26 the total number of loci associated with lung function from analyses of these datasets. Most of the new loci are in regions not previously suspected to have been involved in lung development, the control of pulmonary function or the risk of developing COPD. Elucidating the mechanisms through which these regions influence lung function should lead to a more complete understanding of lung function regulation and the pathogenesis of COPD. Four of the new loci (MFAP2, ZKSCAN3, near NCR3 and near KCNE2) that we showed to be associated with lung function are also associated with other complex traits and diseases (with P < 5 × 10⁻⁸ for the other trait at a SNP having r² > 0.3 with the top lung function SNP in the region). Understanding the intermediates underlying these pleiotropic effects could also lead to crucial insights into the pathophysiology of lung disease. One potential explanation is that these loci underlie control of the mechanisms regulating the development and resolution of inflammation and subsequent tissue remodeling in a range of tissues.
The effect sizes of the variants in the 26 loci associated with lung function collectively explain a modest proportion of the additive genetic variance in FEV1/FVC and in FEV1, even after accounting for putative undetected variants with a similar distribution of effect sizes (ref. 34). Our findings are consistent with those from other common complex traits, where it is thought that many as yet unidentified common and rare sequence variants, and potentially structural variants, could explain the remaining heritability (ref. 35). That our study more than doubled the number of loci known to be associated with lung function underlines the utility of large sample sizes to achieve the power to detect common variants associated with complex traits. Nevertheless, it is likely that additional variants with similar effect sizes remain undiscovered (ref. 14). In addition, our study was not designed to detect rare variants or structural variants associated with lung function. Identification of rare variants associated with lung function could be helpful in narrowing the scope of ongoing functional work to those genes most likely to be causally related to the association signals we detected.
Our study focused on cross-sectional measures of lung function.
Adult lung function at a particular time point is influenced by the
peak lung function achieved by 25–35 years of age as well as the rate of
decline of lung function after that peak (ref. 36). The 26 loci now confirmed
to be associated with lung function could affect either pre- or postnatal
lung development and growth or decline in lung function during
adulthood, or both. We showed consistent directions of estimated
effects on lung function between adults and children 7–9 years of age
for SNPs at 11 of the 16 new loci and 8 of the 10 previously reported
loci (Supplementary Table 3a). The results we show for lung function
in children provide some indication that these loci affect lung function
development, although studies in larger populations of children would
provide greater clarity for SNPs in the new loci. Further investigations
will be required in large populations with longitudinal data to delineate
the influence of these variants on the rates of development of, and
decline in, lung function and on the risk of developing COPD.
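One informal way to read the direction-consistency counts above is a one-sided sign test, which asks how often at least that many loci would agree in direction if agreement were a coin flip. The sketch below is an illustration only and is not an analysis reported in the paper. It assumes that loci are independent and that chance agreement is 50%, and the function name `sign_test_p` is hypothetical.

```python
from math import comb


def sign_test_p(concordant: int, total: int) -> float:
    """One-sided binomial P(X >= concordant) for X ~ Binomial(total, 0.5)."""
    if not 0 <= concordant <= total:
        raise ValueError("concordant must lie between 0 and total")
    # Sum the upper tail of the binomial distribution exactly with integers.
    tail = sum(comb(total, k) for k in range(concordant, total + 1))
    return tail / 2 ** total


# Counts taken from the text: 11 of 16 new loci and 8 of 10 known loci
# showed consistent directions of effect in adults and children.
for label, k, n in [("new loci", 11, 16), ("known loci", 8, 10), ("all loci", 19, 26)]:
    print(f"{label}: {k}/{n} concordant, one-sided P = {sign_test_p(k, n):.3f}")
```

A test of this kind ignores effect sizes and their standard errors, so it can only complement, not replace, a formal comparison of the adult and child estimates.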
Of the sentinel SNPs at the 16 new loci associated with lung function,
only rs2284746 (MFAP2) was associated with height in the GIANT
consortium (ref. 14) dataset. The G allele of rs2284746 was associated with
both increased height and reduced lung function. A similar relationship
between lung function and height was previously reported for the
G allele of rs3817928 in GPR126 (refs. 8,14), which is associated with
decreased height but with increased FEV₁/FVC. A further 3 of the
180 loci found to be associated with height (ref. 14) showed association (for
the 180 loci, we used a Bonferroni-corrected threshold of P = 2.8 × 10⁻⁴)
with either FEV₁ (CLIC4 and BMP6) or FEV₁/FVC (PIP4K2B)
(Supplementary Table 3e). In each case, the allele associated with an
increase in height was associated with a decrease in lung function. This
is not the case for the association of rs1032296 near HHIP, which has
shown consistent directions of effects on lung function and height (refs. 11,14).
However, the strongest SNP associated with height in the HHIP region
lies within an intron of HHIP but shows no association with FEV₁ or
FEV₁/FVC. Furthermore, although height is an important predictor of
FEV₁, this is not true for its ratio to FVC (ref. 37). These observations argue
against the associations with lung function at these loci being simply
caused by incomplete adjustment for height.
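The Bonferroni-corrected threshold quoted above for the 180 height loci can be reproduced with a short calculation. The sketch assumes a family-wise significance level of 0.05, which the text does not state explicitly but which matches the reported value; the function name `bonferroni_threshold` is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumption: family-wise alpha of 0.05 (not stated explicitly in the text).
height_loci = 180
threshold = bonferroni_threshold(0.05, height_loci)
print(f"{threshold:.1e}")  # 2.8e-04
```

Dividing by the number of loci tested keeps the chance of any false positive across all 180 look-ups at or below the chosen family-wise level.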
We stratified by ever- and never-smoker status in our analyses, and
in our investigation of amount smoked in the Ox-GSK consortium (ref. 25),
none of the sentinel SNPs in the 16 new regions showed association
with the number of cigarettes smoked per day. Additionally, none of
these regions was associated with ever smoking in the Ox-GSK consortium
data (Supplementary Table 3b). Thus, the SNP associations
with lung function we observed are unlikely to have arisen simply as
a consequence of inadequate adjustment for smoking.
We did not observe any interactions with ever smoking for any of
the sentinel SNPs in the 16 new regions that exceeded a Bonferroni-
corrected significance level (for 16 SNPs). Thus, the effects on lung
function of the newly associated variants we identified are apparent in
both ever smokers and in never smokers, and the effects of smoking
and of these genetic variants may be independent and additive.
In other common complex diseases, follow-up studies that incorporate
common genetic risk variants into models to predict disease
have not been shown to add substantially to existing risk models,
particularly when such models already include family history (refs. 38,39). The
same may also prove to be true for the 26 genetic variants described
in this paper, as the effect size of any individual variant is small, but
further work is required in this area. The major utility of our findings
will be in the knowledge they provide about previously unknown
pathways underlying lung function. Elucidating the mechanisms that
these genes are involved in will lead to improved understanding of
the regulation of lung function and potentially to new therapeutic
targets for COPD.
URLs. R, http://www.r-project.org/.
METHODS
Methods and any associated references are available in the online
version of the paper at http://www.nature.com/naturegenetics/.
Note: Supplementary information is available on the Nature Genetics website.
ACKNOWLEDGMENTS
We thank the many colleagues who contributed to collection and phenotypic
characterization of the clinical sampling, genotyping and analysis of the data. We
especially thank those who kindly agreed to participate in the studies.
Major funding for this work is from the following sources (alphabetical): Academy
of Finland (project grants 104781, 120315, 129269, 1114194, Center of Excellence
in Complex Disease Genetics (213506 and 129680) and SALVE); Althingi
(Icelandic Parliament); Arthritis Research Campaign; Asthma UK; AstraZeneca;
AXA Research Fund; Biotechnology and Biological Sciences Research Council
(BBSRC) (BB/F019394/1, G20234); British Heart Foundation (PG/97012,
PG/06/154/22043, FS05/125); British Lung Foundation; Canadian Institutes of
Health Research (Grant ID MOP-82893); Cancer Research United Kingdom;
