Combining clinical and polygenic risk improves stroke
prediction among individuals with atrial fibrillation.
The integration of genomic and clinical risk
Jack W. O’Sullivan, MBBS, DPhil (a); Anna Shcherbina, MS (a,b); Johanne M Justesen, PhD (b);
Mintu Turakhia, MD (a,c,d); Marco Perez, MD (a); Hannah Wand, MS (a);
Catherine Tcheandjieu, PhD (a); Shoa L. Clarke, MD, PhD (a); Robert A. Harrington, MD (a);
Manuel A. Rivas, DPhil (b); Euan A Ashley, MB, ChB, DPhil (a,e)
a. Division of Cardiology, Department of Medicine, Stanford University School of Medicine, Stanford,
California, USA.
b. Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
c. Center for Digital Health, Stanford University School of Medicine, Stanford, California, USA
d. Veterans Affairs Palo Alto Health Care System, Palo Alto, California, USA
e. Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
Address for correspondence: Dr Jack O’Sullivan or Professor Euan Ashley
Division of Cardiology
Department of Medicine
Stanford University, California, USA, 94304
jackos@stanford.edu or euan@stanford.edu
(650) 736-7878
@DrJackOSullivan or @euanashley
Funding: The lead author (JOS) was supported by an NIH T32 grant; otherwise, there is no specific
funding.
Disclosures: EA (founder, advisor Personalis; founder, advisor Deepcell; advisor SequenceBio; advisor
Foresite Labs; advisor Apple)
Word count: 4974
Number of references: 30
medRxiv preprint doi: https://doi.org/10.1101/2020.06.17.20134163; this version posted June 20, 2020. The copyright holder for this
preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in
perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.
Abstract
Background
Atrial fibrillation (AF) is associated with a five-fold increased risk of ischemic stroke. A portion
of this risk is heritable; however, current risk stratification tools (CHA2DS2-VASc) do not include
family history or genetic risk. We hypothesized that we could improve ischemic stroke
prediction in patients with AF by incorporating polygenic risk scores (PRS).
Objectives
To construct and test a PRS to predict ischemic stroke in patients with AF, both independently
and integrated with clinical risk factors.
Methods
Using data from the largest available GWAS in Europeans, we combined over half a million
genetic variants to construct a PRS to predict ischemic stroke in patients with AF. We externally
validated this PRS in independent data from the UK Biobank, both independently
and integrated with clinical risk factors.
Results
The tool integrating PRS with clinical risk factors had the greatest predictive ability.
Compared with the currently recommended risk tool (CHA2DS2-VASc), the integrated tool
significantly improved net reclassification (NRI: 2.3% (95%CI: 1.3% to 3.0%)) and fit
(χ² P = 0.002). Using this improved tool, >115,000 people with AF would have improved risk
classification in the US. Independently, PRS was a significant predictor of ischemic stroke in
patients with AF prospectively (Hazard Ratio: 1.13 per 1 SD (95%CI: 1.06 to 1.23)). Lastly,
polygenic risk scores were uncorrelated with clinical risk factors (Pearson’s correlation
coefficient: -0.018).
Conclusions
In patients with AF, there appears to be a significant association between PRS and risk of
ischemic stroke. The greatest predictive ability was found with the integration of PRS and
clinical risk factors; however, the prediction of stroke remains challenging.
Key words: Atrial fibrillation, stroke, genetics, cardiology, prediction, clinical risk tool.
Abbreviations
1. CHA2DS2-VASc: Acronym of the currently recommended tool for the risk stratification
of ischemic stroke in patients with AF. C = Congestive Heart Failure, H = Hypertension,
A2 = Age (over 65 or over 75), D = Diabetes Mellitus, S = Stroke, V = Vascular Disease,
S = Sex
2. CHA2DS2-VASc-G: A proposed term for the integrated genetic and clinical risk
stratification tool, where G = Polygenic risk score.
3. GWAS: Genome-wide association study
4. AF: Atrial Fibrillation
5. SNV: Single nucleotide variant (polymorphism)
6. PRS: Polygenic risk score
7. SD: Standard deviation
8. NRI: Net reclassification index
Genomic and clinical risk score for the prediction of
ischemic stroke in atrial fibrillation.
Jack W O’Sullivan, MBBS, DPhil (a); Anna Shcherbina, MS (a,b); Johanne M Justesen, PhD (b);
Mintu Turakhia, MD (a,c,d); Marco Perez, MD (a); Hannah Wand, MS (a);
Shoa Clarke, MD, PhD (a); Robert A. Harrington, MD (a); Manuel A. Rivas, DPhil (b);
Euan A Ashley, MB, ChB, DPhil (a,e)
Introduction
Atrial fibrillation (AF) is the most common cardiac arrhythmia and its prevalence is increasing
(1). Atrial fibrillation itself can cause substantial morbidity, including a 5-fold increased risk of
ischemic stroke (2).
To help prevent the thromboembolic complications of AF, selected patients are offered
prophylactic anticoagulation. This prophylaxis is highly effective in the right patient (3–5), but
the selection of these patients remains difficult (6, 7). The current gold standard risk stratification
tool is an amalgamation of clinical risk factors (CHA2DS2-VASc) (4). However, there are
limitations in the development, validation and performance of CHA2DS2-VASc. Most notably,
the development cohort included only a small number of AF patients (n=1,084) (6), and
validation studies have had short follow-up, small numbers, and conflicting performance (8).
Additionally, the CHA2DS2-VASc tool does not include family history or genetic risk of
ischemic stroke, despite evidence suggesting the risk of ischemic stroke is heritable (~40%
heritability) (9). Previous research has shown that polygenic risk scores are comparable to
clinical risk factors in the prediction of ischemic stroke in the general population (10).
However, that work was not extended to patients with AF, nor did it examine CHA2DS2-VASc.
Given the known heritability of ischemic stroke, and the apparent need to improve the existing
gold standard risk tool (CHA2DS2-VASc), we set out to construct a polygenic risk score (PRS),
and then an integrated genetic and clinical risk tool (CHA2DS2-VASc + PRS) to help predict
which patients with AF will go on to develop ischemic stroke.
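For readers unfamiliar with the clinical component, the following is a minimal sketch of how a CHA2DS2-VASc score is tallied. The point values follow the standard published definition of the score and are not taken from this manuscript; the `Patient` class and its field names are hypothetical, introduced only for illustration. How the PRS is combined with this score is described later in the paper and is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record layout, used only for this illustration.
    age: int
    female: bool
    heart_failure: bool
    hypertension: bool
    diabetes: bool
    prior_stroke_tia: bool
    vascular_disease: bool


def cha2ds2_vasc(p: Patient) -> int:
    """Tally CHA2DS2-VASc points using the standard point values."""
    score = 0
    score += 1 if p.heart_failure else 0      # C: congestive heart failure
    score += 1 if p.hypertension else 0       # H: hypertension
    if p.age >= 75:                           # A2: age 75 or older
        score += 2
    elif p.age >= 65:                         # A: age 65 to 74
        score += 1
    score += 1 if p.diabetes else 0           # D: diabetes mellitus
    score += 2 if p.prior_stroke_tia else 0   # S2: prior stroke / TIA
    score += 1 if p.vascular_disease else 0   # V: vascular disease
    score += 1 if p.female else 0             # Sc: sex category (female)
    return score


example = Patient(age=78, female=True, heart_failure=False, hypertension=True,
                  diabetes=False, prior_stroke_tia=True, vascular_disease=False)
example_score = cha2ds2_vasc(example)
print(example_score)
```

The example patient scores 2 points for age, 1 for sex, 1 for hypertension and 2 for prior stroke.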
Methods
Study design
We followed a study design similar to previously published PRS papers (10–14), in line with
recommended methodological (15) and reporting guidance (16). We briefly describe the five
broad steps here (Figure 1) and elaborate on each in the paragraphs below. The five steps were:
1. Curation of previously published GWAS summary statistics.
2. Accounting for linkage disequilibrium (LD) in the GWAS summary statistics, using the R
package lassosum (17).
3. Construction of PRS (see eMethods) in our UK Biobank prevalent cohort. Eighty different PRS
were constructed across the lassosum hyperparameters (λ and s).
4. Determining the most accurate PRS in the UK Biobank prevalent cohort.
5. Validation of the PRS with the greatest predictive accuracy (from step 4) in the
UK Biobank incident cohort.
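Steps 3 and 4 can be illustrated with a small sketch. It assumes that each candidate PRS is a weighted sum of effect-allele dosages and that "most accurate" is judged by the area under the ROC curve (AUC); the text above does not name the accuracy metric, so AUC is an assumption here. The weights, genotypes, and (λ, s) values are toy placeholders, not lassosum output, and the function names are hypothetical.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2 copies per variant)."""
    return sum(d * w for d, w in zip(dosages, weights))


def auc(scores, labels):
    """Probability that a random case outscores a random non-case (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


def select_best(candidate_weights, genotypes, labels):
    """Return the hyperparameter key whose PRS has the highest AUC, with that AUC."""
    best_key, best_auc = None, -1.0
    for key, weights in candidate_weights.items():
        scores = [polygenic_score(g, weights) for g in genotypes]
        a = auc(scores, labels)
        if a > best_auc:
            best_key, best_auc = key, a
    return best_key, best_auc


# Toy data: 4 individuals x 3 variants; 1 = stroke case, 0 = non-case.
genotypes = [[2, 0, 1], [1, 1, 0], [0, 2, 1], [0, 0, 0]]
labels = [1, 1, 0, 0]

# Two hypothetical (lambda, s) settings standing in for the 80 candidates.
candidates = {
    (0.001, 0.2): [0.5, 0.0, 0.1],
    (0.01, 0.9): [0.0, 0.4, 0.0],
}

best_key, best_auc = select_best(candidates, genotypes, labels)
print(best_key, best_auc)
```

Selecting hyperparameters in one cohort (prevalent) and validating the chosen score in another (incident), as in step 5, keeps the final performance estimate from being inflated by the selection itself.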
We obtained GWAS summary statistics from the MEGASTROKE consortium
(http://www.megastroke.org/). This GWAS was performed on 446,696 participants (40,585 cases
(stroke); 406,111 noncases (no stroke)) and stratified results by ancestry and stroke sub-type