
A phenome-wide association study (PheWAS) of COVID-19 outcomes by race using the electronic health records data in Michigan Medicine

TL;DR: A disease-disease phenome-wide association study using data representing 5,698 COVID-19 patients from a large academic medical center, stratified by race to explore the association of 1,043 pre-occurring conditions with several COVID-19 outcomes: testing positive, hospitalization, ICU admission, and mortality.
Abstract: Blacks/African Americans are overrepresented in the number of hospitalizations and deaths from COVID-19 in the United States, which could be explained through differences in the prevalence of existing comorbidities. We performed a disease-disease phenome-wide association study (PheWAS) using data representing 5,698 COVID-19 patients from a large academic medical center, stratified by race. We explore the association of 1,043 pre-occurring conditions with several COVID-19 outcomes: testing positive, hospitalization, ICU admission, and mortality. Obesity, iron deficiency anemia and type II diabetes were associated with susceptibility in the full cohort, while ill-defined descriptions/complications of heart disease and stage III chronic kidney disease were associated among non-Hispanic White (NHW) and non-Hispanic Black/African American (NHAA) patients, respectively. The top phenotype hits in the full, NHW, and NHAA cohorts for hospitalization were acute renal failure, hypertension, and insufficiency/arrest respiratory failure, respectively. Suggestive relationships between respiratory issues and COVID-19-related ICU admission and mortality were observed, while circulatory system diseases showed stronger association in NHAA patients. We were able to replicate some known comorbidities related to COVID-19 outcomes while discovering potentially unknown associations, such as endocrine/metabolic conditions related to hospitalization and mental disorders related to mortality, for future validation. We provide interactive PheWAS visualization for broader exploration.

Summary (3 min read)

1. Introduction

  • The emergence of electronic health records (EHR) and rise of EHR-linked biobanks have made it possible for researchers to explore omics-based relationships agnostically on a large scale instead of targeted hypothesis testing.
  • These pre-existing conditions include liver, kidney, heart, and respiratory disease [2,13–17].
  • There have been a number of studies examining differences across racial groups for an ensemble of COVID-19-associated conditions and outcomes in US patient cohorts [13,19–25].
  • A naïve comparison of the positive versus negative test results is highly biased [13].

2.1. Study Design

  • The authors extracted the EHR for patients who were tested for COVID-19 at the University of Michigan Health System, also known as Michigan Medicine (MM), from 10 March 2020 to 2 September 2020.
  • The authors' analytic cohort was restricted to individuals for whom EHR data were available for at least 14 days prior to the first COVID-19 test.
  • The resulting analytic cohort comprised 47,862 tested/diagnosed patients, of whom 2133 tested positive.
  • Since the testing protocol in MM [26] focused on prioritized testing based on symptoms, exposure, occupation and other patient level factors, this is a non-representative sample of the population.

2.2. Data Processing

  • 2.2.1. Classifying Patients Who Were Still in Hospital and ICU to Define COVID-19 Outcomes.
  • The authors categorized COVID-19-positive patients into non-hospitalized, hospitalized (includes ICU stays), and hospitalized with ICU stay based on the admission and discharge data.
  • Each of these traits (PheWAS codes) was coded as a binary risk factor (present/absent) and used as a predictor in the association models with COVID-19 outcomes.
  • While the PheWAS is performed on PheWAS codes, one can view the mapping of ICD-to-PheWAS code relationships on this website: https://prsweb.sph.umich.edu:8443/phecodeData/searchPhecode (accessed on 22 September 2020).
  • A summary data dictionary is available with the source and definition of each variable used in their analysis (Table S1A in Supplement).

2.3. Statistical Analysis

  • Full models were adjusted for age, sex, race, and the neighborhood deprivation index (NDI).
  • PheWAS adjusting for an additional comorbidity score (indicating whether the patient was diagnosed with conditions across seven disease categories associated with COVID-19 susceptibility and adverse outcomes: respiratory, circulatory, any cancer, type II diabetes, kidney, liver, and autoimmune; ranges from 0 to 7) is included as a sensitivity analysis on their accompanying website: https://cphds.sph.umich.edu/covidphewas/ (accessed on 10 March 2021).
  • Results for PheWAS analysis are easier to visualize when −log10(p-values) corresponding to each of the 1363 tests are plotted against the disease codes grouped into disease categories.
  • Table S2 contains descriptive statistics stratified by race.

3. Results

  • There were 53,853 patients who were either tested for or diagnosed with COVID-19 who were eligible for inclusion in this study.
  • Of those eligible for inclusion, their study population comprised 47,862 individuals (n_tested = 47,862 [n_positive = 2133]) who had available ICD code data after applying the 14-day-prior to testing restriction to the EHR.
  • The authors note that the Black cohort is both younger and more female than the White cohort, a trend that also appears in the tested and hospitalized cohorts (Table S2).

3.1. Phenome-Wide Comorbidity Association Analysis

  • The association results for the top 50 traits from the comorbidity PheWAS can be found in Tables S3–S6 for the full cohort, Whites, and Blacks, side-by-side.
  • Models are adjusted for age, sex, race (full cohort only), and three census tract-level socioeconomic indicators: proportion with less than high school education, proportion unemployed, and proportion with annual income below the federal poverty level.
  • The y-axis represents the −log10 transformed p-value of the association.
  • These traits exhibited consistent risks among races for hospitalization and for ICU admission/mortality.

3.1.4. Summary Takeaways

  • When comparing the top 50 traits between Whites and Blacks, acidosis, pulmonary, acute/chronic renal diseases showed an association with hospitalization and ICU admission/mortality in both races, while acute renal consistently stood out as well as in mortality (Table 2).
  • Effect estimates and corresponding confidence intervals for traits that were phenome-wide significant in the overall cohort are presented, by outcome and by cohort, in forest plots of parent phecodes and child phecodes.
  • These results enable us to understand the risk profiles that are associated with poor COVID-19 prognosis.
  • It will be interesting to study the association of these pre-existing conditions with post-covid acute complications or “long COVID” syndrome [31,32].

4. Discussion

  • Using data from a cohort of tested/diagnosed COVID-19 patients at MM, the authors performed what they believe is the first PheWAS looking at multiple COVID-19 outcomes stratified by race.
  • This technique allowed us to explore and identify potentially associated conditions across the medical phenome that are associated with susceptibility, hospitalization, ICU admission or mortality.
  • The authors' results can inform targeted prevention across racial groups.
  • Third, the sample size for a PheWAS is still rather small to be able to identify statistically significant associations—particularly for mortality.
  • While potentially relevant, the authors did not explore past medication data in their analyses because the available EHR data did not provide comprehensive medication coverage but predominantly medication orders and administrations for hospitalized patients.

5. Conclusions

  • This work contributes to a new area of COVID-19 research that rigorously examines racial differences in disease prognosis with pre-existing conditions captured across the medical phenome.
  • Moreover, the authors incorporated a census tract-level socioeconomic status (SES) covariate, which is important to consider when comparing races [37].
  • Odds ratios and 95% confidence intervals are shown for each trait whose PheWAS code is given in parentheses.
  • Ethical review and approval were waived for this study, due to its qualification for a federal exemption as secondary research for which consent is not required.


Journal of Clinical Medicine
Article
A Phenome-Wide Association Study (PheWAS) of COVID-19 Outcomes by Race Using the Electronic Health Records Data in Michigan Medicine
Maxwell Salvatore 1,2,3, Tian Gu 1,2, Jasmine A. Mack 1, Swaraaj Prabhu Sankar 2,4,5, Snehal Patil 1,2, Thomas S. Valley 6,7,8, Karandeep Singh 8,9, Brahmajee K. Nallamothu 7,10, Sachin Kheterpal 8,11, Lynda Lisabeth 3, Lars G. Fritsche 1,2,4,12 and Bhramar Mukherjee 1,2,3,*


Citation: Salvatore, M.; Gu, T.; Mack, J.A.; Prabhu Sankar, S.; Patil, S.; Valley, T.S.; Singh, K.; Nallamothu, B.K.; Kheterpal, S.; Lisabeth, L.; et al. A Phenome-Wide Association Study (PheWAS) of COVID-19 Outcomes by Race Using the Electronic Health Records Data in Michigan Medicine. J. Clin. Med. 2021, 10, 1351. https://doi.org/10.3390/jcm10071351
Academic Editor: Alessandra Falchi
Received: 10 February 2021
Accepted: 17 March 2021
Published: 25 March 2021
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1 Department of Biostatistics, University of Michigan School of Public Health, 1415 Washington Heights, Ann Arbor, MI 48109, USA; mmsalva@umich.edu (M.S.); gutian@umich.edu (T.G.); jasamack@umich.edu (J.A.M.); snehal@umich.edu (S.P.); larsf@umich.edu (L.G.F.)
2 Center for Precision Health Data Science, University of Michigan, Ann Arbor, MI 48109, USA; swarsank@umich.edu
3 Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; llisabet@umich.edu
4 Rogel Cancer Center, Michigan Medicine, Ann Arbor, MI 48109, USA
5 Data Office for Clinical and Translational Research, University of Michigan, Ann Arbor, MI 41809, USA
6 Division of Pulmonary and Critical Care Medicine, University of Michigan Medicine, Ann Arbor, MI 48109, USA; valleyt@med.umich.edu
7 Department of Internal Medicine, Michigan Medicine, Ann Arbor, MI 48109, USA; bnallamo@med.umich.edu
8 Institute for Healthcare Policy and Innovation, University of Michigan, Ann Arbor, MI 48109, USA; kdpsingh@med.umich.edu (K.S.); sachinkh@med.umich.edu (S.K.)
9 Department of Learning Health Sciences, University of Michigan, Ann Arbor, MI 48109, USA
10 Division of Cardiovascular Medicine, Michigan Medicine, Ann Arbor, MI 48109, USA
11 Department of Anesthesiology, Michigan Medicine, Ann Arbor, MI 48109, USA
12 Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
* Correspondence: bhramar@umich.edu; Tel.: +1-(734)-764-6544
Abstract: Background: We performed a phenome-wide association study to identify pre-existing
conditions related to Coronavirus disease 2019 (COVID-19) prognosis across the medical phenome
and how they vary by race. Methods: The study comprised 53,853 patients who were
tested/diagnosed for COVID-19 between 10 March and 2 September 2020 at a large academic medical
center. Results: Pre-existing conditions strongly associated with hospitalization were renal failure,
pulmonary heart disease, and respiratory failure. Hematopoietic conditions were associated with
intensive care unit (ICU) admission/mortality and mental disorders were associated with mortality
in non-Hispanic Whites. Circulatory system and genitourinary conditions were associated with
ICU admission/mortality in non-Hispanic Blacks. Conclusions: Understanding pre-existing clinical
diagnoses related to COVID-19 outcomes informs the need for targeted screening to support specific
vulnerable populations to improve disease prevention and healthcare delivery.
Keywords: biobank; health disparities; EHR; phenome; odds ratio; risk profile
1. Introduction
The emergence of electronic health records (EHR) and rise of EHR-linked biobanks
have made it possible for researchers to explore omics-based relationships agnostically
on a large scale instead of targeted hypothesis testing. Introduced by Denny et al. in
2010, a phenome-wide association study (PheWAS) is an omnibus scan to identify gene–disease
associations across the medical phenome [1]. PheWAS typically associates a genetic
variant ($G$) with hundreds of disease codes ($D_j$, $j = 1, \ldots, J$) with association models
of the structure
$\operatorname{logit} P(D_j = 1 \mid G, \text{Confounders}) = \beta_{0j} + \beta_{Gj} G + \beta_{C} \, \text{Confounders}$. Due to
computational advances and development of widely available analytic frameworks [2–6],
PheWAS is now relatively easy to implement. The main goal of a PheWAS is to replicate
known gene–disease relationships and to search for hidden and unanticipated associations
(for example, Li et al. found that there is a strong negative association between the tag single
nucleotide polymorphism (SNP) for blood group O antigen and arm impedance) [2,7–10].
As of 15 January 2021, there were 23,759,743 confirmed COVID-19 cases in the US [11],
representing approximately 25% of all global cases. Because COVID-19 is a respiratory disease
and produces flu-like symptoms, the testing strategies in the US initially focused on those
with symptoms, the elderly, and those with pre-existing conditions [12], i.e., populations who
are at risk of severe disease and complications. Only a handful of pre-existing comorbidities
are known to be associated with experiencing adverse COVID-19-related outcomes. These
pre-existing conditions include liver, kidney, heart, and respiratory disease [2,13–17].
There has been a remarkable surge within the academic and medical communities to
conduct rapid research on COVID-19 [18]. There have been a number of studies examining
differences across racial groups for an ensemble of COVID-19-associated conditions and outcomes
in US patient cohorts [13,19–25]. Instead of a hypothesis-driven approach a priori restricted to
certain disease categories, this study applied an agnostic disease–disease PheWAS framework
to COVID-19 outcomes in a cohort of 53,853 patients who were tested or diagnosed with
COVID-19 at a large academic medical center. We looked at correlates of disease prognosis
among all COVID-19 patients as well as separately among non-Hispanic White (White) and
non-Hispanic Black/African American (Black) patients. The primary objective of this study was
to agnostically identify pre-existing conditions present in an individual's medical record that
may be associated with hospitalization, intensive care unit (ICU) admission, and mortality. We
also present the results from race-stratified susceptibility PheWAS to predict who tests positive
for COVID-19 in the Supplementary Materials. We downplay the outcome of who
gets COVID-19 or who tests positive for COVID-19 because the prioritized testing strategy
makes this tested sample highly non-representative of the population. A naïve comparison of
the positive versus negative test results is highly biased [13]. However, conditional on testing or
being diagnosed positive, downstream prognostic outcomes are less prone to such selection
biases and we primarily focus on these outcomes.
2. Materials and Methods
2.1. Study Design
COVID-19 Cohort
We extracted the EHR for patients who were tested for COVID-19 at the University of
Michigan Health System, also known as Michigan Medicine (MM), from 10 March 2020 to
2 September 2020. A total of 53,260 patients (98.9%) who were tested at MM and 593 patients
(1.1%) who were treated for COVID-19 in MM, but tested elsewhere, constituted our initial
study cohort of 53,853 patients, of whom 2582 tested positive. Our analytic cohort was
restricted to those individuals on whom we have EHR data for at least 14 days prior
to the first COVID-19 test. This restriction is used to eliminate symptoms which may
indicate manifestation of underlying COVID-19 disease or symptoms, whereas our goal
was to search for truly “pre-existing” conditions prior to COVID-19 testing/diagnosis. Our
resulting analytic cohort comprised 47,862 tested/diagnosed patients, of whom 2133 tested
positive. Since the testing protocol in MM [26] focused on prioritized testing based on
symptoms, exposure, occupation and other patient level factors, this is a non-representative
sample of the population. Study protocols were reviewed and approved by the University
of Michigan Medical School Institutional Review Board (IRB ID HUM00180294).
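The cohort counts quoted in this subsection can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check on the reported numbers; the variable names are illustrative, and the count of excluded patients is derived here rather than stated in the text.

```python
# Counts quoted in the Study Design paragraph.
tested_at_mm = 53_260
treated_tested_elsewhere = 593
initial_positive = 2_582
analytic_cohort = 47_862

initial_cohort = tested_at_mm + treated_tested_elsewhere


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


share_tested_at_mm = pct(tested_at_mm, initial_cohort)
share_tested_elsewhere = pct(treated_tested_elsewhere, initial_cohort)
positivity_initial = pct(initial_positive, initial_cohort)

# Derived, not reported in the text: patients dropped by the
# 14-day-prior EHR restriction.
excluded_by_restriction = initial_cohort - analytic_cohort

print(initial_cohort, share_tested_at_mm, share_tested_elsewhere)
print(positivity_initial, excluded_by_restriction)
```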
2.2. Data Processing
2.2.1. Classifying Patients Who Were Still in Hospital and ICU to Define
COVID-19 Outcomes
We categorized COVID-19-positive patients into non-hospitalized, hospitalized (includes
ICU stays), and hospitalized with ICU stay based on the admission and discharge
data. A total of 22 patients were still admitted in the hospital at the time of data extraction
(17 had at least one ICU stay and five had no ICU stay).
2.2.2. Generation of the Medical Phenome
We constructed the medical phenome by extracting available International Classification
of Diseases (ICD; ninth and tenth editions) codes from EHR and grouping them
into 1813 traits using the PheWAS R package (as described in [1]). Each of these traits
(PheWAS codes) was coded as a binary risk factor (present/absent) and used as a predictor
in the association models with COVID-19 outcomes. As mentioned before, to differentiate
pre-existing conditions from phenotypes related to COVID-19 testing/treatment, we
applied a 14-day-prior restriction on the tested cohort by removing diagnoses that first
appeared within the 14 days before the first test or diagnosis date, whichever was earlier.
The analyses in this study were restricted to the 1363 traits that appeared in the
14-day-prior EHR of at least ten COVID-19-positive patients. While the PheWAS is performed
on PheWAS codes, one can view the mapping of ICD-to-PheWAS code relationships on
this website: https://prsweb.sph.umich.edu:8443/phecodeData/searchPhecode (accessed
on 22 September 2020).
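The phenome construction described above (binary present/absent traits, the 14-day-prior restriction, and the minimum of ten COVID-19-positive carriers) can be sketched as follows. This is an illustrative sketch rather than the authors' pipeline, which used the PheWAS R package. It assumes ICD codes have already been mapped to PheWAS codes, treats a diagnosis first seen exactly 14 days before the index date as pre-existing (a boundary choice the text does not specify), and uses hypothetical names (`preexisting_phecodes`, `build_indicator_matrix`, `phe_A`, `phe_B`) and toy data throughout.

```python
from collections import defaultdict
from datetime import date, timedelta

LOOKBACK = timedelta(days=14)
MIN_POSITIVE_CARRIERS = 10  # threshold stated in the text


def preexisting_phecodes(diagnoses, index_date):
    """Keep phecodes whose FIRST appearance is at least 14 days before index_date.

    diagnoses: iterable of (phecode, diagnosis_date) pairs.
    index_date: first test or diagnosis date, whichever was earlier.
    """
    first_seen = {}
    for phecode, dx_date in diagnoses:
        if phecode not in first_seen or dx_date < first_seen[phecode]:
            first_seen[phecode] = dx_date
    cutoff = index_date - LOOKBACK
    return {code for code, seen in first_seen.items() if seen <= cutoff}


def build_indicator_matrix(patients, min_carriers=MIN_POSITIVE_CARRIERS):
    """Return (kept phecodes, {patient_id: {phecode: 0 or 1}})."""
    per_patient = {
        pid: preexisting_phecodes(p["diagnoses"], p["index_date"])
        for pid, p in patients.items()
    }
    # Count carriers among COVID-19-positive patients only.
    carriers = defaultdict(int)
    for pid, codes in per_patient.items():
        if patients[pid]["positive"]:
            for code in codes:
                carriers[code] += 1
    kept = sorted(code for code, n in carriers.items() if n >= min_carriers)
    matrix = {
        pid: {code: int(code in codes) for code in kept}
        for pid, codes in per_patient.items()
    }
    return kept, matrix


# Toy data; the demo lowers the carrier threshold to 2 so something survives.
patients = {
    "p1": {"index_date": date(2020, 4, 1), "positive": True,
           "diagnoses": [("phe_A", date(2019, 1, 10)), ("phe_B", date(2020, 3, 25))]},
    "p2": {"index_date": date(2020, 5, 1), "positive": True,
           "diagnoses": [("phe_A", date(2020, 2, 1)), ("phe_B", date(2020, 1, 1))]},
    "p3": {"index_date": date(2020, 6, 1), "positive": False,
           "diagnoses": [("phe_B", date(2018, 1, 1))]},
}
kept, matrix = build_indicator_matrix(patients, min_carriers=2)
print(kept, matrix)
```

In the toy data, `phe_B` for patient `p1` first appears seven days before the index date, so it is dropped as potentially COVID-19-related, which leaves `phe_B` with a single positive carrier and removes it from the analysis set.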
2.2.3. Description of Variables
A summary data dictionary is available with the source and definition of each variable
used in our analysis (Table S1A in Supplement).
2.3. Statistical Analysis
We performed PheWAS to identify predictors of three COVID-19 prognostic outcomes
in this study (detailed definition in Table S1B in the Supplement), among those who were
diagnosed/tested positive, comparing:
(i) those who were hospitalized with those who were not;
(ii) those who were admitted to ICU or died with those who were not;
(iii) those who died with those who were alive at the time of data extraction.
We also present results from the susceptibility PheWAS (comparing those who were
diagnosed with COVID-19 with those who were not tested at all [matched controls]) in
the Supplementary Materials. All COVID-19 outcomes of interest are binary; thus, logistic
regression was our primary tool for association analysis. All logistic regression models were of
the following form:
$$\operatorname{logit} P(Y_{\text{COVID}} = 1 \mid \text{Covariates}, \text{PheCode}_j) = \beta_0 + \beta_{\text{Cov}}^{\top} \, \text{Covariates} + \beta_j \, I[\text{PheCode}_j = 1], \quad j = 1, \ldots, 1363. \tag{1}$$
Here $Y_{\text{COVID}}$ denotes the COVID-19-related outcome under consideration
(e.g., COVID-19 hospitalization, ICU admission, or mortality). The Firth correction
was used to address potential separation issues in logistic regression models. For all
models, adjusted odds ratios (OR), 95% Wald-type confidence intervals, and p-values were
presented [27–29]. Full models were adjusted for age, sex, race, and the neighborhood
deprivation index (NDI). The NDI is defined by US census tract (corresponding to the
residential address available in each patient's EHR) for the year 2010 and is taken from the
National Neighborhood Data Archive (NaNDA) [30]. PheWAS adjusting for an additional
comorbidity score (indicating whether the patient was diagnosed with conditions
across seven disease categories associated with COVID-19 susceptibility and adverse outcomes:
respiratory, circulatory, any cancer, type II diabetes, kidney, liver, and autoimmune;
ranges from 0 to 7) is included as a sensitivity analysis on our accompanying website:
https://cphds.sph.umich.edu/covidphewas/ (accessed on 10 March 2021). Results for
PheWAS analysis are easier to visualize when $-\log_{10}(p\text{-values})$ corresponding to each of
the 1363 tests are plotted against the disease codes grouped into disease categories. We
use this visualization tool to present our analysis while all detailed summary results are
available at the website above and in the online supplement.
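To make the reported quantities concrete, the sketch below turns one fitted coefficient $\beta_j$ and its standard error into an odds ratio, a 95% Wald-type confidence interval, a two-sided p-value, and the $-\log_{10}(p)$ value used on the PheWAS plot axis. It does not fit the model or apply the Firth correction; it assumes a coefficient and standard error are already available from such a fit and uses a normal approximation for the p-value. The function name and the example inputs are hypothetical.

```python
import math

Z_975 = 1.959963984540054  # 97.5th percentile of the standard normal


def wald_summary(beta, se):
    """Summarise one log-odds coefficient for a single PheWAS code.

    beta: estimated log odds ratio for the trait indicator.
    se:   its standard error.
    """
    z = beta / se
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return {
        "or": math.exp(beta),
        "ci_low": math.exp(beta - Z_975 * se),
        "ci_high": math.exp(beta + Z_975 * se),
        "p": p,
        # Guard against underflow of p to exactly 0 for extreme z.
        "neg_log10_p": -math.log10(p) if p > 0 else math.inf,
    }


# Hypothetical inputs: a trait that doubles the odds, SE of 0.2 on the log scale.
summary = wald_summary(math.log(2.0), 0.2)
print(summary)
```

Plotting `neg_log10_p` for each of the 1363 traits against the trait's disease category gives the Manhattan-style PheWAS plot described above.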
Race-Stratified Analysis
Since the prognostic factors could potentially be different across races, we repeated the
entire analysis stratified by race. We restricted our attention to Whites and Blacks due to
limitations of sample size for other racial groups. Table S2 contains descriptive statistics
stratified by race. We checked for the equality of the log(OR) corresponding to Whites
and Blacks through a Wald test for the difference of the log(OR). A conservative Bonferroni
multiple testing correction was implemented to conclude statistically significant results
(p = 0.05/number of tests in analysis), and p < 0.05 was used as a threshold for suggestive traits.
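The two decision rules in this paragraph can be sketched as follows. The sketch assumes the Bonferroni denominator equals the 1363 traits analyzed (individual analyses may have used a different number of tests) and that the White and Black estimates come from independent strata, so the variance of the difference is the sum of the two variances. Function names and the example inputs are hypothetical.

```python
import math

N_TESTS = 1363  # assumption: one test per analyzed trait
ALPHA = 0.05


def classify(p, n_tests=N_TESTS):
    """Label a p-value using the Bonferroni and suggestive thresholds."""
    if p < ALPHA / n_tests:
        return "phenome-wide significant"
    if p < ALPHA:
        return "suggestive"
    return "not significant"


def log_or_difference_test(beta_a, se_a, beta_b, se_b):
    """Wald test of H0: log(OR_a) == log(OR_b) for two independent strata."""
    diff = beta_a - beta_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return z, p


bonferroni_threshold = ALPHA / N_TESTS
print(bonferroni_threshold)

# Hypothetical stratum-specific estimates on the log-odds scale.
z, p = log_or_difference_test(beta_a=0.9, se_a=0.2, beta_b=0.3, se_b=0.25)
print(z, p, classify(p))
```

With 1363 tests the phenome-wide threshold is roughly 3.7 × 10^-5, so a trait with p = 0.01 would be labelled suggestive rather than significant.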
3. Results
There were 53,853 patients who were either tested for or diagnosed with COVID-19
who were eligible for inclusion in this study. Of those eligible for inclusion, our
study population comprised 47,862 individuals ($n_{\text{tested}}$ = 47,862 [$n_{\text{positive}}$ = 2133]) who
had available ICD code data after applying the 14-day-prior to testing restriction to the
EHR. Furthermore, a total of 1813 qualified ICD-code-based phenotypes, referred to as
PheWAS codes, were initially screened, of which 1363 had at least 10 occurrences in our
COVID-19-positive cohort and were included in the analysis.
Of those 53,853 who were tested for COVID-19, 44.2% (23,814) were males and the
median age was 47 years. The majority were White (72.4% (38,977)), while 10.7% were
Black (5763). We note that the Black cohort is both younger and more female than the
White cohort, a trend that also appears in the tested and hospitalized cohorts (Table S2).
Similarly, Blacks tend to have more autoimmune disease, kidney disease, type 2 diabetes,
and circulatory disease diagnoses, while Whites tend to have more cancer diagnoses (note
that our definition of cancer includes skin cancer) in our tested cohort (Table S2). Out of the
study cohort, 4.8% (2582) tested positive (Table 1). Among the 2582 positive patients,
54.6% (1411) were White, 25.0% (646) were Black, 27.8% (719) were hospitalized, 14.6%
(377) were admitted to ICU and 5.0% (129) died. A flowchart describing the sample sizes of
the overall cohort and race-specific cohorts by COVID-19 outcome is included in Figure S1.
3.1. Phenome-Wide Comorbidity Association Analysis
The association results for the top 50 traits from the comorbidity PheWAS can be found
in Tables S3–S6 for the full cohort, Whites, and Blacks, side-by-side. Interactive versions of
the PheWAS plots are online at https://cphds.sph.umich.edu/covidphewas/ (accessed on
10 March 2021). This online resource also provides tables with the adjusted odds ratios,
95% confidence intervals, p-values, and counts of occurrence in cases and controls for all
traits included in the PheWAS performed.
3.1.1. Full Cohort Prognostic Associations
As the disease outcome progresses (from hospitalized to ICU, and to deceased),
stronger associations with circulatory system, genitourinary (renal diseases in particular)
and respiratory diseases were observed. Forty-four traits, including 12 circulatory
system and 11 respiratory diseases, were phenome-wide significantly associated with
hospitalization, as well as an additional 263 suggestive traits under the threshold of p < 0.05
(Figure 1A). Hits included respiratory failure, insufficiency, arrest ($p = 3.98 \times 10^{-20}$),
acute renal failure ($p = 6.31 \times 10^{-13}$), viral pneumonia ($p = 2.51 \times 10^{-11}$), and
acid-base balance disorder ($p = 2.40 \times 10^{-10}$). Moreover, 58 phenome-wide significant hits
(e.g., respiratory failure, insufficiency, arrest [$p = 1.58 \times 10^{-15}$], acid-base balance
disorder [$p = 3.98 \times 10^{-14}$], and hypotension [$p = 1.58 \times 10^{-11}$]) as well as
286 suggestive hits were noted for association with ICU admission/mortality (Figure 2A),
including 77 circulatory system, 36 endocrine/metabolic, 35 genitourinary, and 31 respiratory
diseases. There were 22 phenome-wide significant traits associated with COVID-19 mortality
(Figure 3A), along with an additional 227 suggestive traits under the threshold p < 0.05.
In addition to 64 circulatory system and 31 endocrine/metabolic diseases, 23 mental disorders
stood out as the third largest disease group associated with mortality, including delirium
due to conditions classified elsewhere ($p = 9.33 \times 10^{-7}$), memory loss
($p = 3.98 \times 10^{-4}$) and aphasia ($p = 5.37 \times 10^{-4}$).
Table 1. Descriptive Characteristics of the COVID-19 Tested/Diagnosed cohort at Michigan Medicine (10 March–2 September 2020).
Entries are individuals, No. (%) a, unless noted otherwise. The first two columns are all tested patients and those with negative results; the last four columns are patients with positive results.

| Variable | Tested, overall (n = 53,853) | Negative results (n = 51,271) | Positive, overall (n = 2582) | Positive, hospitalized (n = 719) | Positive, ICU (n = 377) | Positive, deceased (n = 129) |
| --- | --- | --- | --- | --- | --- | --- |
| Age, y: Mean (SD) | 44.8 (23.1) | 44.7 (23.2) | 47.4 (20) | 58.5 (17.6) | 58.6 (17.5) | 69 (14.3) |
| Age, y: Median (IQR) | 47 (38) | 46 (38) | 49 (31) | 61 (23) | 61 (22) | 71 (22) |
| Age, y: <18 | 6895 (12.8) | 6768 (13.2) | 127 (4.9) | 14 (1.9) | 10 (2.7) | 0 (0) |
| Age, y: [18,35) | 12,652 (23.5) | 12,017 (23.4) | 635 (24.6) | 65 (9) | 33 (8.8) | 3 (2.3) |
| Age, y: [35,50) | 9273 (17.2) | 8697 (17) | 576 (22.3) | 125 (17.4) | 56 (14.9) | 11 (8.5) |
| Age, y: [50,65) | 12,116 (22.5) | 11,440 (22.3) | 676 (26.2) | 224 (31.2) | 120 (31.8) | 33 (25.6) |
| Age, y: [65,80) | 10,257 (19) | 9825 (19.2) | 432 (16.7) | 209 (29.1) | 124 (32.9) | 43 (33.3) |
| Age, y: ≥80 | 2660 (4.9) | 2524 (4.9) | 136 (5.3) | 82 (11.4) | 34 (9) | 39 (30.2) |
| Male Gender | 23,814 (44.2) | 22,651 (44.2) | 1163 (45) | 403 (56.1) | 233 (61.8) | 80 (62) |
| Primary Care in MM | 31,357 (58.2) | 29,969 (58.5) | 1388 (53.8) | 253 (35.2) | 128 (34) | 35 (27.1) |
| BMI: Mean (SD) | 29.1 (7.6) | 29.1 (7.6) | 30.9 (8.4) | 32.6 (10.1) | 32.9 (11.5) | 31.3 (6.9) |
| BMI: <18.5 | 826 (1.9) | 804 (2) | 22 (1) | 9 (1.3) | 4 (1.1) | 1 (0.8) |
| BMI: [18.5,25) | 12,857 (29.7) | 12,357 (30) | 500 (22.9) | 102 (14.9) | 61 (16.9) | 17 (13.7) |
| BMI: [25,30) | 13,371 (30.8) | 12,723 (30.9) | 648 (29.7) | 211 (30.9) | 110 (30.5) | 45 (36.3) |
| BMI: ≥30 | 16,291 (37.6) | 15,281 (37.1) | 1010 (46.3) | 361 (52.9) | 186 (51.5) | 61 (49.2) |
| Smoking Status: Never | 31,041 (63.2) | 29,549 (63) | 1492 (68.7) | 368 (60.2) | 159 (54.6) | 30 (39) |
| Smoking Status: Past | 13,725 (28) | 13,145 (28) | 580 (26.7) | 219 (35.8) | 120 (41.2) | 44 (57.1) |
| Smoking Status: Current | 4314 (8.8) | 4215 (9) | 99 (4.6) | 24 (3.9) | 12 (4.1) | 3 (3.9) |
| Smoking Status: Ever | 18,039 (36.8) | 17,360 (37) | 679 (31.3) | 243 (39.8) | 132 (45.4) | 47 (61) |
| Alcohol consumption | 25,894 (68.4) | 24,768 (68.6) | 1126 (66.2) | 261 (63.2) | 128 (63.7) | 35 (61.4) |
| Race/ethnicity: White | 38,977 (72.4) | 37,566 (73.3) | 1411 (54.6) | 326 (45.3) | 172 (45.6) | 56 (43.4) |
| Race/ethnicity: Black | 5763 (10.7) | 5117 (10) | 646 (25) | 265 (36.9) | 139 (36.9) | 42 (32.6) |
| Race/ethnicity: Other b | 4869 (9) | 4616 (9) | 253 (9.8) | 63 (8.8) | 21 (5.6) | 6 (4.7) |
| Race/ethnicity: Unknown c | 4244 (7.9) | 3972 (7.7) | 272 (10.5) | 65 (9) | 45 (11.9) | 25 (19.4) |
| NDI, mean (SD) | 0.1 (0.07) | 0.1 (0.07) | 0.12 (0.09) | 0.15 (0.1) | 0.16 (0.11) | 0.16 (0.11) |
| Population density, persons/mile² | 2375.8 (2422.1) | 2343.2 (2412.8) | 2997.3 (2512.8) | 3658.7 (2635) | 3826.4 (2675.2) | 4128.4 (2770.3) |
| Respiratory Diseases | 34,471 (72) | 32,850 (71.8) | 1621 (76) | 399 (79.6) | 205 (81.7) | 82 (90.1) |
| Circulatory Diseases | 32,419 (67.7) | 30,940 (67.7) | 1479 (69.3) | 428 (85.4) | 218 (86.9) | 87 (95.6) |
| Any Cancer | 13,831 (28.9) | 13,344 (29.2) | 487 (22.8) | 164 (32.7) | 88 (35.1) | 42 (46.2) |
| Type 2 Diabetes | 7841 (16.4) | 7409 (16.2) | 432 (20.3) | 191 (38.1) | 107 (42.6) | 57 (62.6) |
| Kidney Diseases | 7206 (15.1) | 6867 (15) | 339 (15.9) | 194 (38.7) | 119 (47.4) | 56 (61.5) |
| Liver Diseases | 4406 (9.2) | 4234 (9.3) | 172 (8.1) | 58 (11.6) | 32 (12.7) | 14 (15.4) |
| Autoimmune Diseases | 7544 (15.8) | 7163 (15.7) | 381 (17.9) | 109 (21.8) | 61 (24.3) | 19 (20.9) |
| Comorbidity score, mean (SD) | 2.3 (1.5) | 2.2 (1.5) | 2.3 (1.5) | 3.1 (1.6) | 3.3 (1.6) | 3.9 (1.5) |

Abbreviations: BMI, body mass index (calculated as weight in kilograms divided by height in meters squared); COVID-19, coronavirus disease 2019; ICU, intensive care unit; IQR, interquartile range; NDI, 2010 Neighborhood Socioeconomic Disadvantage Index; MM, Michigan Medicine.
a Percentages are reported as fraction of column totals excluding missing entries.
b Includes White Hispanic or unknown; Black Hispanic or unknown; Asian Hispanic, non-Hispanic, or unknown; Native American Hispanic, non-Hispanic, or unknown; Pacific Islander Hispanic, non-Hispanic, or unknown; and other Hispanic, non-Hispanic, or unknown.
c Includes missing race and/or ethnicity.

Citations
More filters
Posted ContentDOI
21 Jul 2022-medRxiv
TL;DR: Despite reductions in mortality, Covid-19 mortality remained elevated in nonmetro areas and increased for some racial/ethnic groups, highlighting the need for increased vaccination delivery and equitable public health measures especially in rural communities.
Abstract: Prior research has established that American Indian, Alaska Native, Black, Hispanic, and Pacific Islander populations in the United States have experienced substantially higher mortality rates from Covid-19 compared to non-Hispanic white residents during the first year of the pandemic. What remains less clear is how mortality rates have changed for each of these racial/ethnic groups during 2021, given the increasing prevalence of vaccination. In particular, it is unknown how these changes in mortality have varied geographically. In this study, we used provisional data from the National Center for Health Statistics (NCHS) to produce age-standardized estimates of Covid-19 mortality by race/ethnicity in the United States from March 2020 to February 2022 in each metro-nonmetro category, Census region, and Census division. We calculated changes in mortality rates between the first and second years of the pandemic and examined mortality changes by month. We found that when Covid-19 first affected a geographic area, non-Hispanic Black and Hispanic populations experienced extremely high levels of Covid-19 mortality and racial/ethnic inequity that were not repeated at any other time during the pandemic. Between the first and second year of the pandemic, racial/ethnic inequities in Covid-19 mortality decreased but were not eliminated for Hispanic, non-Hispanic Black, and non-Hispanic AIAN residents. These inequities decreased due to reductions in mortality for these populations alongside increases in non-Hispanic white mortality. Though racial/ethnic inequities in Covid-19 mortality decreased, substantial inequities still existed in most geographic areas during the pandemic's second year: Non-Hispanic Black, non-Hispanic AIAN, and Hispanic residents reported higher Covid-19 death rates in rural areas than in urban areas, indicating that these communities are facing serious public health challenges. 
At the same time, the non-Hispanic white mortality rate worsened in rural areas during the second year of the pandemic, suggesting there may be unique factors driving mortality in this population. Finally, vaccination rates were associated with reductions in Covid-19 mortality for Hispanic, non-Hispanic Black, and non-Hispanic white residents, and increased vaccination may have contributed to the decreases in racial/ethnic inequities in Covid-19 mortality observed during the second year of the pandemic. Despite reductions in mortality, Covid-19 mortality remained elevated in nonmetro areas and increased for some racial/ethnic groups, highlighting the need for increased vaccination delivery and equitable public health measures especially in rural communities. Taken together, these findings highlight the continued need to prioritize health equity in the pandemic response and to modify the structures and policies through which systemic racism operates and has generated racial health inequities.
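The age-standardized estimates mentioned in this abstract can be illustrated with a minimal sketch of direct age standardization. The abstract does not state which standard population or age bands the authors used, so the function name, weights, and counts below are hypothetical and chosen only to show the arithmetic.

```python
def age_standardized_rate(deaths, population, standard_weights, per=100_000):
    """Directly age-standardized rate per `per` people.

    deaths, population: counts for each age band in the study group.
    standard_weights: size (or share) of each age band in a reference
    population; normalized here so they sum to 1.
    All inputs are illustrative assumptions, not values from the paper.
    """
    if not (len(deaths) == len(population) == len(standard_weights)):
        raise ValueError("all inputs must cover the same age bands")
    total_weight = sum(standard_weights)
    rate = 0.0
    for d, n, w in zip(deaths, population, standard_weights):
        # age-specific rate, weighted by the reference population share
        rate += (d / n) * (w / total_weight)
    return rate * per


# Hypothetical two-band example (younger band, older band).
example = age_standardized_rate(
    deaths=[10, 50],
    population=[100_000, 50_000],
    standard_weights=[0.8, 0.2],
)
```

Weighting every group's age-specific rates by the same reference population is what makes rates comparable across groups with different age structures.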

12 citations

31 Jan 2019
TL;DR: In this paper, associations between 4956 GWAS catalog reported SNPs and 67 traits were examined among 7726 African Americans from the Reasons for Geographic and Racial Differences in Stroke (REGARDS) study, which is focused on identifying factors that increase stroke risk.
Abstract: Cardiovascular disease, diabetes, and kidney disease are among the leading causes of death and disability worldwide. However, knowledge of genetic determinants of those diseases in African Americans remains limited. In our study, associations between 4956 GWAS catalog reported SNPs and 67 traits were examined among 7726 African Americans from the REasons for Geographic and Racial Differences in Stroke (REGARDS) study, which is focused on identifying factors that increase stroke risk. The prevalent and incident phenotypes studied included inflammation, kidney traits, cardiovascular traits and cognition. Our results validated 29 known associations, of which eight associations were reported for the first time in African Americans. Our cross-racial validation of GWAS findings provides additional evidence for the important roles of these loci in the disease process and may help identify genes especially important for future functional validation.

5 citations

Posted ContentDOI
22 Oct 2022-medRxiv
TL;DR: The authors' agnostic screen of time stamped EHR data uncovered a plethora of PASC-associated diagnoses across many categories and highlighted a complex arrangement of presenting and likely pre-disposing features -- the latter with a potential for risk stratification approaches.
Abstract: Objective: The growing number of Coronavirus Disease-2019 (COVID-19) survivors who are affected by Post-Acute Sequelae of SARS CoV-2 infection (PASC) represent a worldwide public health challenge. Yet, the novelty of this condition and the resulting limited data on underlying pathomechanisms have so far hampered the advancement of effective therapies. Using electronic health records (EHR) data, we aimed to characterize PASC-associated diagnoses and to develop risk prediction models. Methods: In our cohort of 63,675 COVID-19 positive patients seen at Michigan Medicine, 1,724 (2.7%) had a recorded PASC diagnosis. We used a case control study design comparing PASC cases with 17,205 matched controls and performed phenome-wide association studies (PheWASs) to characterize enriched phenotypes of the post-COVID-19 period and potential PASC pre-disposing phenotypes of the pre-, and acute-COVID-19 periods. We also integrated PASC-associated phenotypes into Phenotype Risk Scores (PheRSs) and evaluated their predictive performance. Results: In the post-COVID-19 period, cases were significantly enriched for known PASC symptoms (e.g., shortness of breath, malaise/fatigue, and cardiac dysrhythmias) but also many musculoskeletal, infectious, and digestive disorders. We found seven phenotypes in the pre-COVID-19 period (irritable bowel syndrome, concussion, nausea/vomiting, shortness of breath, respiratory abnormalities, allergic reaction to food, and circulatory disease) and 69 phenotypes in the acute-COVID-19 period (predominantly respiratory, circulatory, neurological, digestive, and mental health phenotypes) that were significantly associated with PASC.
The derived pre-COVID-19 PheRS and acute-COVID-19 PheRS had low accuracy to differentiate cases from controls; however, they stratified risk well, e.g., a combination of the two PheRSs identified a quarter of the COVID-19 positive cohort at a 3.5-fold increased risk for PASC compared to the bottom 50% of their distributions. Conclusions: Our agnostic screen of time stamped EHR data uncovered a plethora of PASC-associated diagnoses across many categories and highlighted a complex arrangement of presenting and likely pre-disposing features -- the latter with a potential for risk stratification approaches. Yet, considerably more work will need to be done to better characterize PASC and its subtypes, especially long-term consequences, and to consider more comprehensive risk models.

3 citations

Posted ContentDOI
09 Jan 2022-medRxiv
TL;DR: The results indicate that it is plausible that specific diagnoses and treatments are associated with worse COVID-19 outcomes, and those with cancer continued to have elevated risk of severe COVID [cancer OR (95% CI) among those fully vaccinated: 1.69 (1.10, 2.62)] relative to those without cancer even among vaccinated.
Abstract: Background: Observational studies have identified patients with cancer as a potential subgroup of individuals at elevated risk of severe SARS-CoV-2 (COVID-19) disease and mortality. Early studies showed an increased risk of COVID-19 mortality for cancer patients, but it is not well understood how this association varies by cancer site, cancer treatment, and vaccination status. Methods: Using electronic health record data from an academic medical center, we identified 259,893 individuals who were tested for or diagnosed with COVID-19 from March 10, 2020, to February 2, 2022. Of these, 41,218 tested positive for COVID-19 of whom 10,266 had a past or current cancer diagnosis. We conducted Firth-corrected, covariate-adjusted logistic regression to assess the association of cancer status, cancer type, and cancer treatment with four COVID-19 outcomes: hospitalization, intensive care unit (ICU) admission, mortality, and a composite "severe COVID-19" outcome which is the union of the first three outcomes. We examine the effect of the timing of cancer diagnosis and treatment relative to COVID diagnosis, and the effect of vaccination. Results: Cancer status was associated with higher rates of severe COVID-19 infection [OR (95% CI): 1.18 (1.08, 1.29)], hospitalization [OR (95% CI): 1.18 (1.06, 1.28)], and mortality [OR (95% CI): 1.22 (1.00, 1.48)]. These associations were driven by patients whose most recent initial cancer diagnosis was within the past three years. Chemotherapy receipt was positively associated with all four COVID-19 outcomes (e.g., severe COVID [OR (95% CI): 1.96 (1.73, 2.22)]), while receipt of either radiation or surgery alone was not associated with worse COVID-19 outcomes. Among cancer types, hematologic malignancies [OR (95% CI): 1.62 (1.39, 1.88)] and lung cancer [OR (95% CI): 1.81 (1.34, 2.43)] were significantly associated with higher odds of hospitalization.
Hematologic malignancies were associated with ICU admission [OR (95% CI): 1.49 (1.11, 1.97)] and mortality [OR (95% CI): 1.57 (1.15, 2.11)], while melanoma and breast cancer were not associated with worse COVID-19 outcomes. Vaccinations were found to reduce the frequency of occurrence for the four COVID-19 outcomes across cancer status but those with cancer continued to have elevated risk of severe COVID [cancer OR (95% CI) among those fully vaccinated: 1.69 (1.10, 2.62)] relative to those without cancer even among vaccinated. Conclusion: Our study provides insight to the relationship between cancer diagnosis, treatment, cancer type, vaccination, and COVID-19 outcomes. Our results indicate that it is plausible that specific diagnoses (e.g., hematologic malignancies, lung cancer) and treatments (e.g., chemotherapy) are associated with worse COVID-19 outcomes. Vaccines significantly reduce the risk of severe COVID-19 outcomes in individuals with cancer and those without, but cancer patients are still at higher risk of breakthrough infections and more severe COVID outcomes even after vaccination. These findings provide actionable insights for risk identification and targeted treatment and prevention strategies.
References
Journal ArticleDOI
TL;DR: In this paper, the first-order term is removed from the asymptotic bias of maximum likelihood estimates by a suitable modification of the score function, and the effect is to penalize the likelihood by the Jeffreys invariant prior.
Abstract: SUMMARY It is shown how, in regular parametric problems, the first-order term is removed from the asymptotic bias of maximum likelihood estimates by a suitable modification of the score function. In exponential families with canonical parameterization the effect is to penalize the likelihood by the Jeffreys invariant prior. In binomial logistic models, Poisson log linear models and certain other generalized linear models, the Jeffreys prior penalty function can be imposed in standard regression software using a scheme of iterative adjustments to the data.
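As a worked illustration of the summary above, the penalized likelihood and modified score can be written out as follows. This is a sketch in standard notation that the abstract itself does not define: $\ell$ is the log-likelihood, $U_r$ the $r$-th score component, $I(\theta)$ the Fisher information, and, for the logistic case, $\pi_i$ the fitted probability and $h_i$ the $i$-th diagonal element of the hat matrix $W^{1/2}X(X^\top W X)^{-1}X^\top W^{1/2}$ with $W = \operatorname{diag}\{\pi_i(1-\pi_i)\}$.

```latex
% Jeffreys-prior penalty (exponential family, canonical parameterization)
\ell^{*}(\theta) = \ell(\theta) + \tfrac{1}{2}\,\log\det I(\theta)

% Corresponding modified score, component r
U_r^{*}(\theta) = U_r(\theta)
  + \tfrac{1}{2}\,\operatorname{tr}\!\left\{ I(\theta)^{-1}\,
    \frac{\partial I(\theta)}{\partial \theta_r} \right\}

% Special case: binomial logistic regression with binary responses y_i
U_r^{*}(\beta) = \sum_{i=1}^{n}
  \left\{ y_i - \pi_i + h_i\left(\tfrac{1}{2} - \pi_i\right) \right\} x_{ir}
```

The last line shows why the penalty can be imposed by iteratively adjusting the data in standard software: each response $y_i$ is effectively replaced by $y_i + h_i(\tfrac{1}{2} - \pi_i)$ at every iteration.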

3,362 citations

Journal ArticleDOI
TL;DR: Black race was not associated with higher in-hospital mortality than white race, after adjustment for differences in sociodemographic and clinical characteristics on admission, and black patients had higher prevalences of obesity, diabetes, hypertension, and chronic kidney disease than white patients.
Abstract: BACKGROUND: Many reports on coronavirus disease 2019 (Covid-19) have highlighted age- and sex-related differences in health outcomes. More information is needed about racial and ethnic differences in outcomes from Covid-19. METHODS: In this retrospective cohort study, we analyzed data from patients seen within an integrated-delivery health system (Ochsner Health) in Louisiana between March 1 and April 11, 2020, who tested positive for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2, the virus that causes Covid-19) on qualitative polymerase-chain-reaction assay. The Ochsner Health population is 31% black non-Hispanic and 65% white non-Hispanic. The primary outcomes were hospitalization and in-hospital death. RESULTS: A total of 3626 patients tested positive, of whom 145 were excluded (84 had missing data on race or ethnic group, 9 were Hispanic, and 52 were Asian or of another race or ethnic group). Of the 3481 Covid-19-positive patients included in our analyses, 60.0% were female, 70.4% were black non-Hispanic, and 29.6% were white non-Hispanic. Black patients had higher prevalences of obesity, diabetes, hypertension, and chronic kidney disease than white patients. A total of 39.7% of Covid-19-positive patients (1382 patients) were hospitalized, 76.9% of whom were black. In multivariable analyses, black race, increasing age, a higher score on the Charlson Comorbidity Index (indicating a greater burden of illness), public insurance (Medicare or Medicaid), residence in a low-income area, and obesity were associated with increased odds of hospital admission. Among the 326 patients who died from Covid-19, 70.6% were black. In adjusted time-to-event analyses, variables that were associated with higher in-hospital mortality were increasing age and presentation with an elevated respiratory rate; elevated levels of venous lactate, creatinine, or procalcitonin; or low platelet or lymphocyte counts. 
However, black race was not independently associated with higher mortality (hazard ratio for death vs. white race, 0.89; 95% confidence interval, 0.68 to 1.17). CONCLUSIONS: In a large cohort in Louisiana, 76.9% of the patients who were hospitalized with Covid-19 and 70.6% of those who died were black, whereas blacks comprise only 31% of the Ochsner Health population. Black race was not associated with higher in-hospital mortality than white race, after adjustment for differences in sociodemographic and clinical characteristics on admission.

1,348 citations

Journal ArticleDOI

996 citations


"A phenome-wide association study (P..." refers background or result in this paper

  • ...For example, we provide additional evidence on the previously reported concern that patients with mental health disorders are at higher risk of infection and experience barriers in seeking treatment leading to poor prognosis [23]....

  • ...recently published a PheWAS looking at phenotypes associated with COVID-19-related hospitalization, which is consistent with our results.[23] This technique allowed us to explore and identify potentially associated conditions across the medical phenome that are associated with susceptibility, hospitalization, ICU admission or mortality....

Journal ArticleDOI
TL;DR: A novel method to scan phenomic data for genetic associations using International Classification of Disease billing codes, which are available in most EMR systems, and develops a code translation table to automatically define 776 different disease populations and their controls using prevalent ICD9 codes derived from EMR data.
Abstract: Motivation: Emergence of genetic data coupled to longitudinal electronic medical records (EMRs) offers the possibility of phenome-wide association scans (PheWAS) for disease–gene associations. We propose a novel method to scan phenomic data for genetic associations using International Classification of Disease (ICD9) billing codes, which are available in most EMR systems. We have developed a code translation table to automatically define 776 different disease populations and their controls using prevalent ICD9 codes derived from EMR data. As a proof of concept of this algorithm, we genotyped the first 6005 European–Americans accrued into BioVU, Vanderbilt’s DNA biobank, at five single nucleotide polymorphisms (SNPs) with previously reported disease associations: atrial fibrillation, Crohn’s disease, carotid artery stenosis, coronary artery disease, multiple sclerosis, systemic lupus erythematosus and rheumatoid arthritis. The PheWAS software generated cases and control populations across all ICD9 code groups for each of these five SNPs, and disease-SNP associations were analyzed. The primary outcome of this study was replication of seven previously known SNP–disease associations for these SNPs. Results: Four of seven known SNP–disease associations using the PheWAS algorithm were replicated with P-values between 2.8 × 10⁻⁶ and 0.011. The PheWAS algorithm also identified 19 previously unknown statistical associations between these SNPs and diseases at P < 0.01. This study indicates that PheWAS analysis is a feasible method to investigate SNP–disease associations. Further evaluation is needed to determine the validity of these associations and the appropriate statistical thresholds for clinical significance. Availability: The PheWAS software and code translation table are freely available at http://knowledgemap.mc.vanderbilt.edu/research.
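The scanning idea described in this abstract can be sketched in a few lines of Python. This is a simplified illustration, not the authors' PheWAS software: it assumes a binary exposure (for example, carrier versus non-carrier of a variant), tests each phenotype with a two-sided Fisher exact test, and applies a Bonferroni threshold. The function names, the Haldane-Anscombe 0.5 correction for the odds ratio, and the toy data are all hypothetical choices made for this example.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # hypergeometric probability of x exposed cases given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # sum over all tables that are at most as likely as the observed one
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def phewas_scan(exposure, phenotypes, alpha=0.05):
    """Test one binary exposure against every phenotype.

    exposure: list of bools, one per person.
    phenotypes: dict mapping a phenotype code to a list of bools
    (case status per person, same order as `exposure`).
    Returns one result dict per phenotype, sorted by p-value.
    """
    threshold = alpha / len(phenotypes)  # Bonferroni correction
    results = []
    for code, is_case in phenotypes.items():
        a = sum(e and y for e, y in zip(exposure, is_case))          # exposed cases
        b = sum(e and not y for e, y in zip(exposure, is_case))      # exposed controls
        c = sum((not e) and y for e, y in zip(exposure, is_case))    # unexposed cases
        d = sum((not e) and (not y) for e, y in zip(exposure, is_case))
        # Haldane-Anscombe correction keeps the odds ratio finite
        odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
        p = fisher_exact_p(a, b, c, d)
        results.append({"code": code, "odds_ratio": odds_ratio,
                        "p": p, "significant": p < threshold})
    return sorted(results, key=lambda r: r["p"])


# Toy data: 10 exposed and 10 unexposed people, two hypothetical phenotypes.
exposure = [True] * 10 + [False] * 10
phenotypes = {
    "A": [True] * 9 + [False] + [True] + [False] * 9,  # enriched in exposed
    "B": [True, False] * 10,                           # unrelated to exposure
}
results = phewas_scan(exposure, phenotypes)
```

A real scan would also need the step the abstract emphasizes, namely mapping billing codes to case and control definitions for each phenotype, which this sketch takes as given.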

958 citations

Journal ArticleDOI
TL;DR: There is a need for greater attention to understanding how risks and resources in the social environment are systematically patterned by race, ethnicity and SES, and how they combine to influence cardiovascular disease and other health outcomes.
Abstract: Race/ethnicity and socioeconomic status (SES) are social categories that capture differential exposure to conditions of life that have health consequences. Race/ethnicity and SES are linked to each other, but race matters for health even after SES is considered. This commentary considers the complex ways in which race combines with SES to affect health. There is a need for greater attention to understanding how risks and resources in the social environment are systematically patterned by race, ethnicity and SES, and how they combine to influence cardiovascular disease and other health outcomes. Future research needs to examine how the levels, timing and accumulation of institutional and interpersonal racism combine with other toxic exposures, over the life-course, to influence the onset and course of illness. There is also an urgent need for research that seeks to build the science base that will identify the multilevel interventions that are likely to enhance the health of all, even while they improve the health of disadvantaged groups more rapidly than the rest of the population so that inequities in health can be reduced and ultimately eliminated. We also need sustained research attention to identifying how to build the political support to reduce the large shortfalls in health.

761 citations


"A phenome-wide association study (P..." refers methods in this paper

  • ...Moreover, we incorporated a census tract-level SES covariate, which is important to consider when comparing races [22]....

Frequently Asked Questions (2)
Q1. What contributions have the authors mentioned in the paper "A phenome-wide association study (PheWAS) of COVID-19 outcomes by race using the electronic health records data in Michigan Medicine"?

The authors performed a phenome-wide association study to identify pre-existing conditions related to Coronavirus disease 2019 (COVID-19) prognosis across the medical phenome and how they vary by race. The study comprises 53,853 patients who were tested for or diagnosed with COVID-19 between 10 March and 2 September 2020 at a large academic medical center.

The authors hope this exploratory effort will inspire hypothesis generation for future research that might result in targeted prevention and care while the pandemic is still being combatted.