
Personalized survival probabilities for SARS-CoV-2 positive patients by explainable machine learning

TL;DR: In this paper, a machine learning model was trained to predict mortality within 12 weeks of the first positive SARS-CoV-2 test, which can aid clinicians to implement precision medicine.
Abstract: Interpretable risk assessment of SARS-CoV-2 positive patients can aid clinicians to implement precision medicine. Here we trained a machine learning model to predict mortality within 12 weeks of a first positive SARS-CoV-2 test. By leveraging data on 33,928 confirmed SARS-CoV-2 cases in eastern Denmark, we considered 2,723 variables extracted from electronic health records (EHR) including demographics, diagnoses, medications, laboratory test results and vital parameters. A discrete-time framework for survival modelling enabled us to predict personalized survival curves and explain individual risk factors. Performances of weighted concordance index 0.95 and precision-recall area under the curve 0.71 were measured on the test set. Age, sex, number of medications, previous hospitalizations and lymphocyte counts were identified as top mortality risk factors. Our explainable survival model developed on EHR data also revealed temporal dynamics of the 22 selected risk factors. Upon further validation, this model may allow direct reporting of personalized survival probabilities in routine care.

Summary


INTRODUCTION

  • Coronavirus disease 2019 (COVID-19), caused by infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has by October 2021 claimed almost 5 million lives since its outbreak in late 2019 [1].
  • Both vaccinated and unvaccinated people continue to develop critical COVID-19 disease [11].
  • Among hospitalized patients, risk factors for severe disease or death include low lymphocyte counts, elevated inflammatory markers and elevated kidney and liver parameters indicating organ dysfunction [6].
  • While great efforts have been put into providing prognostic models based on data collected from health systems, traditional modelling approaches solely based on domain knowledge may fail.
  • Furthermore, ML models facilitate clinical insights [21] when coupled with methods for model explainability such as SHapley Additive exPlanations (SHAP) values [22].

Patient cohort

  • Based on centralized EHR and SARS-CoV-2 test results from test centers in eastern Denmark, the authors identified 33,938 patients who had at least one SARS-CoV-2 RT-PCR positive test from 963,265 individuals who had a test performed between 17th of March 2020 and 2nd of March 2021 (Fig. 1).
  • The median of the predicted cumulative death probabilities by survival status reflected the discriminative performance of the individual survival predictions (Fig. 3a).
  • From the original set of 2,723 features generated from routine EHR data (Supplementary Table 2), 22 features were selected.
  • As expected, patients with more hospitalizations and longer cumulative admission days prior to FPT exhibited a higher risk of death (Fig. 5e-f).

DISCUSSION

  • The authors developed an explainable machine learning model for predicting the risk of death within the first 12 weeks from a positive SARS-CoV-2 PCR test.
  • Additionally, instead of characterizing patients’ relevant history using a limited set of preselected variables, the set of 22 features in the final model was derived using a data-driven approach from an initial set of 2,723 features that encoded available demographics, laboratory test results, hospitalizations, vital parameters, diagnoses and medicines.
  • This has been the predominant modelling approach in COVID-19 [18,34] related outcomes.
  • Multiple approaches have been proposed to open “black-box” models and allow explainability by, for example, removing features and measuring their impact on the model [43].
  • This suggests that predicting late deaths requires a different set of risk factors and consideration of their interactions than predicting early death.

CONCLUSION

  • The authors developed a data-driven machine learning model to identify SARS-CoV-2 positive patients with a high risk of death within 12 weeks from the first positive test.
  • The discrete-time modelling approach not only allowed the authors to train survival models with high performance but also enabled model explainability through SHAP values.
  • By learning temporal dynamics and interactions between clinical features, the model was able to identify personalized risk factors and high-risk patients for early interventions while improving the understanding of the disease.
  • At the same time, the authors demonstrate that leveraging electronic health records with explainable ML models provides a framework for the implementation of precision medicine in routine care, which can be adapted to other diseases.
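
The SHAP values mentioned in the points above build on Shapley values from game theory: a feature's contribution is its average marginal effect over all subsets of the other features. The brute-force sketch below only illustrates that definition. It is not the authors' implementation and not the tree-specific algorithm used by SHAP; `shapley_values`, `toy_risk`, the baseline input and all coefficients are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x relative to a baseline input.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only usable for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without)
                with_i[i] = x[i]
                phi[i] += weight * (f(with_i) - f(without))
    return phi


# Hypothetical risk score with an age-by-sex interaction (not the paper's model).
def toy_risk(v):
    age, male = v
    return 0.01 * age + 0.1 * male + 0.002 * age * male


x, baseline = [80, 1], [50, 0]
phi = shapley_values(toy_risk, x, baseline)
```

By construction the contributions add up to the difference between the prediction for the patient and the prediction for the baseline, which is what makes per-patient attributions readable as a decomposition of risk.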

Adrian G. Zucco (1), Rudi Agius (2), Rebecka Svanberg (2), Kasper S. Moestrup (1), Ramtin Z. Marandi (1), Cameron Ross MacPherson (1), Jens Lundgren (1,4), Sisse R. Ostrowski (3,4)*, Carsten U. Niemann (2,4)*

(1) PERSIMUNE Center of Excellence, Rigshospitalet, Copenhagen, Denmark.
(2) Department of Hematology, Rigshospitalet, Copenhagen, Denmark.
(3) Department of Clinical Immunology, Rigshospitalet, Copenhagen, Denmark.
(4) Department of Clinical Medicine, University of Copenhagen, Denmark.
*Co-senior authors.
Correspondence should be addressed to: A.G.Z. (adrian.gabriel.zucco@regionh.dk), S.R.O. (Sisse.Rye.Ostrowski@regionh.dk) or C.U.N. (Carsten.Utoft.Niemann@regionh.dk).
medRxiv preprint doi: https://doi.org/10.1101/2021.10.28.21265598; this version posted October 29, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.

INTRODUCTION
Coronavirus disease 2019 (COVID-19), caused by infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has by October 2021 claimed almost 5 million lives since its outbreak in late 2019 [1]. Infected individuals present a variety of symptoms, ranging from asymptomatic to life-threatening disease [2]. Although the majority of cases experience mild to moderate disease, approximately 15% of confirmed SARS-CoV-2 positive cases are estimated to develop severe disease [3]. Progression to severe disease seems to occur within 1-2 weeks from symptom onset, and is characterized by clinical signs of pneumonia with dyspnea, increased respiratory rate, and decreased blood oxygen saturation requiring supplemental oxygen [3-7]. Development of critical illness is driven by systemic inflammation, leading to acute respiratory distress syndrome (ARDS), respiratory failure, septic shock, multi-organ failure, and/or disseminated coagulopathy [4,5,8]. The majority of these patients require mechanical ventilation, and mortality for patients admitted to an Intensive Care Unit (ICU) is reported to be 32-50% [3,8-10]. Despite the current vaccination program, both vaccinated and unvaccinated people continue to develop critical COVID-19 disease [11]. Thus, the pandemic still poses a great burden on health care systems worldwide, locally approaching the limit of capacity due to high patient burden and challenging clinical management.
Several factors associated with increased risk of severe disease course have been established, including old age, male gender, and lifestyle factors such as smoking and obesity [12,13]. Comorbidities including hypertension, type 2 diabetes, renal disease, as well as pre-existing conditions of immune dysfunction and cancer, are also associated with a higher risk of severe disease and COVID-19 related death [12,14-16]. Among hospitalized patients, risk factors for severe disease or death include low lymphocyte counts, elevated inflammatory markers and elevated kidney and liver parameters indicating organ dysfunction [6]. However, many of these factors likely reflect an ongoing progression of COVID-19. Thus, identification of high-risk patients at or prior to hospital admission is warranted to facilitate personalized interventions.
Multiple COVID-19 prognostic models have been built on reduced sets of predictive features from demographics, patient history, physical examination, and laboratory results [17], processed by traditional statistical frameworks or machine learning (ML) algorithms. A systematic review of 50 prognostic models has concluded that overall such models have been poorly reported and are at a high risk of bias [18]. While great efforts have been put into providing prognostic models based on data collected from health systems, traditional modelling approaches solely based on domain knowledge may fail. This represents a risk of missing novel markers and insights about the disease that could come from data-driven models in a hypothesis-free manner [19], which have been reported to outperform models based on curated variables from domain experts [20].
Furthermore, ML models facilitate clinical insights [21] when coupled with methods for model explainability such as SHapley Additive exPlanations (SHAP) values [22]. Model explainability has been developed mainly in the context of regression and binary classification, but in clinical research where censored observations are common, explainable time-to-event modelling is required to avoid selection bias [23,24]. Multiple ML algorithms have been developed for time-to-event modelling, either by building on top of existing models such as Cox proportional hazards or by defining new loss functions that model time as continuous [25]. Here we used an alternative approach that considered time in discrete intervals and performed binary classification at such time intervals [26]. This allowed us to implement gradient boosting decision trees for binary classification to predict personalized survival probabilities [27] and allow explainability at the individual patient level using SHAP values [22], including temporal dynamics of risk factors over the course of the disease. This approach not only allows the prediction of personalized survival probabilities and risk factors for SARS-CoV-2 positive patients but also provides a framework for precision medicine that can be applied to other diseases based on routine electronic health records.
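
The discrete-time idea described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `expand_person_period` and `survival_curve`, the toy cohort and the weekly hazards are assumptions made for illustration. Each patient is expanded into one row per week at risk, with a binary label marking the week of death. A binary classifier trained on such rows (gradient boosting decision trees in the paper) outputs a weekly hazard h_k, and the personalized survival probability follows as S(t) = (1 - h_1)(1 - h_2)...(1 - h_t).

```python
import numpy as np

N_WEEKS = 12  # prediction horizon used in the paper


def expand_person_period(durations, events, n_intervals=N_WEEKS):
    """Expand (duration, event) pairs into person-period rows.

    durations: last observed week for each patient (1-based).
    events:    1 if the patient died in that week, 0 if censored or alive.
    Returns an integer array with columns (patient_id, week, label).
    """
    rows = []
    for pid, (d, e) in enumerate(zip(durations, events)):
        last = min(int(d), n_intervals)
        for week in range(1, last + 1):
            # Label is 1 only in the week of death; deaths after the
            # horizon never produce a positive label.
            label = int(bool(e) and week == last and d <= n_intervals)
            rows.append((pid, week, label))
    return np.array(rows, dtype=int)


def survival_curve(hazards):
    """Turn weekly hazards h_k into S(t) = prod_{k<=t} (1 - h_k)."""
    hazards = np.asarray(hazards, dtype=float)
    return np.cumprod(1.0 - hazards)


# Toy cohort: patient 0 died in week 3, patient 1 was censored after week 2,
# patient 2 survived all 12 weeks.
rows = expand_person_period(durations=[3, 2, 12], events=[1, 0, 0])

# Hypothetical weekly hazards that a trained classifier might output.
curve = survival_curve([0.10, 0.05, 0.02])
```

Because censored patients contribute rows only for the weeks in which they were observed, they inform the model without being assigned an outcome they were never followed up for.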
RESULTS
Patient cohort
Based on centralized EHR and SARS-CoV-2 test results from test centers in eastern Denmark, we identified 33,938 patients who had at least one SARS-CoV-2 RT-PCR positive test from 963,265 individuals who had a test performed between 17th of March 2020 and 2nd of March 2021 (Fig. 1). In this cohort, 5,077 patients were hospitalized, of whom 502 were admitted to the ICU (Supplementary Fig. 1). Overall, 1,803 (5.34%) deaths occurred among all individuals with a positive SARS-CoV-2 RT-PCR test, of whom 141 died later than 12 weeks from the first positive test (FPT) and were hence considered alive for this analysis. Right-censoring was only observed for patients tested after the 8th of December 2020, who had less than 12 weeks of follow-up available, while deaths that occurred on the same day as the FPT were not considered for training. For the initial model, demographics, laboratory test results, hospitalizations, vital parameters, diagnoses, medicines (ordered and administered) and summary features were included. Feature encoding resulted in 2,723 features (Supplementary Table 2), which after feature selection were reduced to 23 features. A summary of the cohort based on the final feature set can be found in Table 1. This cohort represents an updated subset of individuals residing in Denmark characterized in a previous publication [28].
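
The outcome definition in this paragraph can be sketched as follows. This is a hedged illustration, not the authors' code: `label_outcome` and its arguments are hypothetical, the horizon is taken as 84 days, and the end of follow-up is assumed to be 2 March 2021, which is consistent with censoring starting for patients tested after 8 December 2020. Deaths later than 12 weeks count as alive, and patients without a recorded death and with less than 12 weeks of follow-up are right-censored.

```python
from datetime import date

HORIZON_DAYS = 12 * 7  # 12 weeks from the first positive test (FPT)


def label_outcome(fpt, death, follow_up_end):
    """Return (days_observed, died_within_horizon, right_censored).

    fpt:           date of the first positive test
    death:         date of death, or None if no death was recorded
    follow_up_end: last date with available follow-up (assumed input)
    """
    if death is not None and (death - fpt).days <= HORIZON_DAYS:
        return (death - fpt).days, True, False
    # No death within 12 weeks: alive if fully followed, otherwise censored.
    followed = min((follow_up_end - fpt).days, HORIZON_DAYS)
    return followed, False, followed < HORIZON_DAYS


END = date(2021, 3, 2)  # assumed end of follow-up
early_death = label_outcome(date(2020, 4, 1), date(2020, 4, 15), END)
late_death = label_outcome(date(2020, 4, 1), date(2020, 8, 1), END)
censored = label_outcome(date(2021, 1, 15), None, END)
```

A same-day death would come out with zero observed days; the paper excludes such cases from training, which a caller could implement by filtering on the first element of the returned tuple.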
Survival modelling with machine learning achieves high discriminative performance
To predict the risk of death within 12 weeks from FPT, we trained gradient boosting decision trees considering time as discrete in a time-to-event framework. Performance was measured on 20% of the data (test set), unblinded only for performance assessment. The weighted concordance index (C-index) for predicting risk of death for all 12 weeks with 95% confidence intervals (CI) was 0.946 (0.941-0.950). Binary metrics were calculated for each predicted week by excluding censored individuals (Fig. 2). At week 12, the precision-recall area under the curve (PR-AUC) and Matthews correlation coefficient (MCC) with 95% CI were 0.686 (0.651-0.720) and 0.580 (0.562-0.597), respectively. The sensitivity was 99.3% and the specificity was 86.4%. The performance for subgroups of patients displayed some differences. In patients tested outside the hospital (Fig. 2b), the C-index was 0.955 (0.950-0.960), and the PR-AUC and MCC were 0.675 (0.632-0.719) and 0.585 (0.562-0.605), respectively. A sensitivity of 98.9% and a specificity of 89.9% were measured in this group. For patients previously admitted to the hospital at the time of test (Fig. 2c), the C-index was 0.809 (0.787-0.829), and the PR-AUC and MCC were 0.705 (0.640-0.760) and 0.357 (0.325-0.387), respectively. The sensitivity was 100% and the specificity 31.0%, indicating a higher number of false positives when using a 0.5 probability threshold for this group (Supplementary Table 1).
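
The thresholded metrics reported above can be reproduced from a confusion matrix. The sketch below is an illustration under stated assumptions rather than the authors' evaluation code: `binary_metrics` is a hypothetical helper, the toy labels and probabilities are invented, and treating a probability equal to the threshold as positive is an assumption. It shows how perfect sensitivity combined with false positives lowers the MCC.

```python
import numpy as np


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Sensitivity, specificity and MCC at a probability threshold."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_prob) >= threshold
    tp = int(np.sum(y_pred & y_true))
    tn = int(np.sum(~y_pred & ~y_true))
    fp = int(np.sum(y_pred & ~y_true))
    fn = int(np.sum(~y_pred & y_true))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # MCC is set to 0 when any margin of the confusion matrix is empty.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Toy example: both deaths are caught, one survivor is flagged by mistake.
y_true = [1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.7, 0.2, 0.1, 0.4]
metrics = binary_metrics(y_true, y_prob)
```

Raising `threshold` trades sensitivity for specificity, which is one way to explore the false-positive behaviour described for patients tested while admitted.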

Citations
Journal Article
19 May 2022-Blood
TL;DR: Patients with CLL with close hospital contacts, and in particular those above 70 years of age with one or more comorbidities, should be considered for closer monitoring and pre-emptive antiviral therapy upon a positive SARS-CoV-2 test.

32 citations

References
Journal Article
TL;DR: An explanation method for trees is presented that enables the computation of optimal local explanations for individual predictions, and the authors demonstrate their method on three medical datasets.
Abstract: Tree-based machine learning models such as random forests, decision trees and gradient boosted trees are popular nonlinear predictive models, yet comparatively little attention has been paid to explaining their predictions. Here we improve the interpretability of tree-based models through three main contributions. (1) A polynomial time algorithm to compute optimal explanations based on game theory. (2) A new type of explanation that directly measures local feature interaction effects. (3) A new set of tools for understanding global model structure based on combining many local explanations of each prediction. We apply these tools to three medical machine learning problems and show how combining many high-quality local explanations allows us to represent global structure while retaining local faithfulness to the original model. These tools enable us to (1) identify high-magnitude but low-frequency nonlinear mortality risk factors in the US population, (2) highlight distinct population subgroups with shared risk characteristics, (3) identify nonlinear interaction effects among risk factors for chronic kidney disease and (4) monitor a machine learning model deployed in a hospital by identifying which features are degrading the model’s performance over time. Given the popularity of tree-based machine learning models, these improvements to their interpretability have implications across a broad set of domains. Tree-based machine learning models are widely used in domains such as healthcare, finance and public services. The authors present an explanation method for trees that enables the computation of optimal local explanations for individual predictions, and demonstrate their method on three medical datasets.

2,548 citations

Journal Article
22 May 2020-BMJ
TL;DR: In study participants, mortality was high, independent risk factors were increasing age, male sex, and chronic comorbidity, including obesity, and the importance of pandemic preparedness and the need to maintain readiness to launch research studies in response to outbreaks is shown.
Abstract: Objective To characterise the clinical features of patients admitted to hospital with coronavirus disease 2019 (covid-19) in the United Kingdom during the growth phase of the first wave of this outbreak who were enrolled in the International Severe Acute Respiratory and emerging Infections Consortium (ISARIC) World Health Organization (WHO) Clinical Characterisation Protocol UK (CCP-UK) study, and to explore risk factors associated with mortality in hospital. Design Prospective observational cohort study with rapid data gathering and near real time analysis. Setting 208 acute care hospitals in England, Wales, and Scotland between 6 February and 19 April 2020. A case report form developed by ISARIC and WHO was used to collect clinical data. A minimal follow-up time of two weeks (to 3 May 2020) allowed most patients to complete their hospital admission. Participants 20 133 hospital inpatients with covid-19. Main outcome measures Admission to critical care (high dependency unit or intensive care unit) and mortality in hospital. Results The median age of patients admitted to hospital with covid-19, or with a diagnosis of covid-19 made in hospital, was 73 years (interquartile range 58-82, range 0-104). More men were admitted than women (men 60%, n=12 068; women 40%, n=8065). The median duration of symptoms before admission was 4 days (interquartile range 1-8). The commonest comorbidities were chronic cardiac disease (31%, 5469/17 702), uncomplicated diabetes (21%, 3650/17 599), non-asthmatic chronic pulmonary disease (18%, 3128/17 634), and chronic kidney disease (16%, 2830/17 506); 23% (4161/18 525) had no reported major comorbidity. Overall, 41% (8199/20 133) of patients were discharged alive, 26% (5165/20 133) died, and 34% (6769/20 133) continued to receive care at the reporting date. 
17% (3001/18 183) required admission to high dependency or intensive care units; of these, 28% (826/3001) were discharged alive, 32% (958/3001) died, and 41% (1217/3001) continued to receive care at the reporting date. Of those receiving mechanical ventilation, 17% (276/1658) were discharged alive, 37% (618/1658) died, and 46% (764/1658) remained in hospital. Increasing age, male sex, and comorbidities including chronic cardiac disease, non-asthmatic chronic pulmonary disease, chronic kidney disease, liver disease and obesity were associated with higher mortality in hospital. Conclusions ISARIC WHO CCP-UK is a large prospective cohort study of patients in hospital with covid-19. The study continues to enrol at the time of this report. In study participants, mortality was high, independent risk factors were increasing age, male sex, and chronic comorbidity, including obesity. This study has shown the importance of pandemic preparedness and the need to maintain readiness to launch research studies in response to outbreaks. Study registration ISRCTN66726260.

2,459 citations

Journal Article
TL;DR: This article shows how MCC produces a more informative and truthful score in evaluating binary classifications than accuracy and F1 score, by first explaining the mathematical properties, and then the asset of MCC in six synthetic use cases and in a real genomics scenario.
Abstract: To evaluate binary classifications and their confusion matrices, scientific researchers can employ several statistical rates, accordingly to the goal of the experiment they are investigating. Despite being a crucial issue in machine learning, no widespread consensus has been reached on a unified elective chosen measure yet. Accuracy and F1 score computed on confusion matrices have been (and still are) among the most popular adopted metrics in binary classification tasks. However, these statistical measures can dangerously show overoptimistic inflated results, especially on imbalanced datasets. The Matthews correlation coefficient (MCC), instead, is a more reliable statistical rate which produces a high score only if the prediction obtained good results in all of the four confusion matrix categories (true positives, false negatives, true negatives, and false positives), proportionally both to the size of positive elements and the size of negative elements in the dataset. In this article, we show how MCC produces a more informative and truthful score in evaluating binary classifications than accuracy and F1 score, by first explaining the mathematical properties, and then the asset of MCC in six synthetic use cases and in a real genomics scenario. We believe that the Matthews correlation coefficient should be preferred to accuracy and F1 score in evaluating binary classification tasks by all scientific communities.

2,358 citations

Journal Article
07 Apr 2020-BMJ
TL;DR: Proposed models for covid-19 are poorly reported, at high risk of bias, and their reported performance is probably optimistic, according to a review of published and preprint reports.
Abstract: Objective To review and appraise the validity and usefulness of published and preprint reports of prediction models for diagnosing coronavirus disease 2019 (covid-19) in patients with suspected infection, for prognosis of patients with covid-19, and for detecting people in the general population at increased risk of covid-19 infection or being admitted to hospital with the disease. Design Living systematic review and critical appraisal by the COVID-PRECISE (Precise Risk Estimation to optimise covid-19 Care for Infected or Suspected patients in diverse sEttings) group. Data sources PubMed and Embase through Ovid, up to 1 July 2020, supplemented with arXiv, medRxiv, and bioRxiv up to 5 May 2020. Study selection Studies that developed or validated a multivariable covid-19 related prediction model. Data extraction At least two authors independently extracted data using the CHARMS (critical appraisal and data extraction for systematic reviews of prediction modelling studies) checklist; risk of bias was assessed using PROBAST (prediction model risk of bias assessment tool). Results 37 421 titles were screened, and 169 studies describing 232 prediction models were included. The review identified seven models for identifying people at risk in the general population; 118 diagnostic models for detecting covid-19 (75 were based on medical imaging, 10 to diagnose disease severity); and 107 prognostic models for predicting mortality risk, progression to severe disease, intensive care unit admission, ventilation, intubation, or length of hospital stay. The most frequent types of predictors included in the covid-19 prediction models are vital signs, age, comorbidities, and image features. Flu-like symptoms are frequently predictive in diagnostic models, while sex, C reactive protein, and lymphocyte counts are frequent prognostic factors. 
Reported C index estimates from the strongest form of validation available per model ranged from 0.71 to 0.99 in prediction models for the general population, from 0.65 to more than 0.99 in diagnostic models, and from 0.54 to 0.99 in prognostic models. All models were rated at high or unclear risk of bias, mostly because of non-representative selection of control patients, exclusion of patients who had not experienced the event of interest by the end of the study, high risk of model overfitting, and unclear reporting. Many models did not include a description of the target population (n=27, 12%) or care setting (n=75, 32%), and only 11 (5%) were externally validated by a calibration plot. The Jehi diagnostic model and the 4C mortality score were identified as promising models. Conclusion Prediction models for covid-19 are quickly entering the academic literature to support medical decision making at a time when they are urgently needed. This review indicates that almost all published prediction models are poorly reported, and at high risk of bias such that their reported predictive performance is probably optimistic. However, we have identified two (one diagnostic and one prognostic) promising models that should soon be validated in multiple cohorts, preferably through collaborative efforts and data sharing to also allow an investigation of the stability and heterogeneity in their performance across populations and settings. Details on all reviewed models are publicly available at https://www.covprecise.org/. Methodological guidance as provided in this paper should be followed because unreliable predictions could cause more harm than benefit in guiding clinical decisions. Finally, prediction model authors should adhere to the TRIPOD (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) reporting guideline. Systematic review registration Protocol https://osf.io/ehc47/, registration https://osf.io/wy245.
Readers’ note This article is a living systematic review that will be updated to reflect emerging evidence. Updates may occur for up to two years from the date of original publication. This version is update 3 of the original article published on 7 April 2020 (BMJ 2020;369:m1328). Previous updates can be found as data supplements (https://www.bmj.com/content/369/bmj.m1328/related#datasupp). When citing this paper please consider adding the update number and date of access for clarity.

2,183 citations

Journal Article
TL;DR: The potential for furthering medical research and clinical care using EHR data and the challenges that must be overcome before this is a reality are considered.
Abstract: The adoption of electronic health records will provide a rich resource for biomedical researchers. This Review discusses the potential for their use in informed decision making in the clinic, for a finer understanding of genotype–phenotype relationships and for selection of research cohorts, along with the current challenges for their mining and use. Clinical data describing the phenotypes and treatment of patients represents an underused data source that has much greater research potential than is currently realized. Mining of electronic health records (EHRs) has the potential for establishing new patient-stratification principles and for revealing unknown disease correlations. Integrating EHR data with genetic data will also give a finer understanding of genotype–phenotype relationships. However, a broad range of ethical, legal and technical reasons currently hinder the systematic deposition of these data in EHRs and their mining. Here, we consider the potential for furthering medical research and clinical care using EHR data and the challenges that must be overcome before this is a reality.

1,376 citations
