A Fully Automatic Deep Learning System for COVID-19 Diagnostic and Prognostic Analysis

Abstract
Coronavirus disease 2019 (COVID-19) has spread globally, and medical resources have become insufficient in many regions. Fast diagnosis of COVID-19, and finding high-risk patients with worse prognosis for early prevention and medical resource optimization, are important. Here, we proposed a fully automatic deep learning system for COVID-19 diagnostic and prognostic analysis using routinely acquired computed tomography. We retrospectively collected 5372 patients with computed tomography images from 7 cities or provinces. Firstly, 4106 patients with computed tomography images and gene information were used to pre-train the deep learning system, making it learn lung features. Afterwards, 1266 patients (924 with COVID-19, of whom 471 had follow-up for 5+ days; 342 with other pneumonia) from 6 cities or provinces were enrolled to train and externally validate the deep learning system. In the 4 external validation sets, the deep learning system achieved good performance in identifying COVID-19 from other pneumonia (AUC=0.87 and 0.88) and viral pneumonia (AUC=0.86). Moreover, the deep learning system successfully stratified patients into high-risk and low-risk groups whose hospital-stay times differed significantly (p=0.013 and 0.014). Without human assistance, the deep learning system automatically focused on abnormal areas that showed characteristics consistent with reported radiological findings. Deep learning provides a convenient tool for fast screening of COVID-19 and finding potential high-risk patients, which may be helpful for medical resource optimization and early prevention before patients show severe symptoms. Take-home message: A fully automatic deep learning system provides a convenient method for COVID-19 diagnostic and prognostic analysis, which can help COVID-19 screening and finding potential high-risk patients with worse prognosis.


A fully automatic deep learning system for COVID-19 diagnostic and prognostic analysis
Shuo Wang¹,¹², Yunfei Zha²,¹², Weimin Li³,¹², Qingxia Wu⁴,¹², Xiaohu Li⁵,¹², Meng Niu⁶,¹², Meiyun Wang⁷,¹², Xiaoming Qiu⁸,¹², Hongjun Li⁹,¹², He Yu³, Wei Gong², Yan Bai⁷, Li Li⁹, Yongbei Zhu¹, Liusu Wang¹ and Jie Tian¹,¹⁰,¹¹
@ERSpublications
A fully automatic deep learning system provides a convenient method for COVID-19 diagnostic and
prognostic analysis, which can help COVID-19 screening and finding potential high-risk patients with
worse prognosis https://bit.ly/3bRaxGw
Cite this article as: Wang S, Zha Y, Li W, et al. A fully automatic deep learning system for COVID-19
diagnostic and prognostic analysis. Eur Respir J 2020; 56: 2000775
[https://doi.org/10.1183/13993003.00775-2020].
ABSTRACT Coronavirus disease 2019 (COVID-19) has spread globally, and medical resources have become
insufficient in many regions. Fast diagnosis of COVID-19 and finding high-risk patients with worse
prognosis for early prevention and medical resource optimisation are important. Here, we proposed a fully
automatic deep learning system for COVID-19 diagnostic and prognostic analysis using routinely acquired
computed tomography.
We retrospectively collected 5372 patients with computed tomography images from seven cities or
provinces. Firstly, 4106 patients with computed tomography images were used to pre-train the deep
learning system, making it learn lung features. Following this, 1266 patients (924 with COVID-19 (471 had
follow-up for >5 days) and 342 with other pneumonia) from six cities or provinces were enrolled to train
and externally validate the performance of the deep learning system.
In the four external validation sets, the deep learning system achieved good performance in identifying
COVID-19 from other pneumonia (AUC 0.87 and 0.88, respectively) and viral pneumonia (AUC 0.86).
Moreover, the deep learning system successfully stratified patients into high- and low-risk groups whose
hospital-stay times differed significantly (p=0.013 and p=0.014, respectively). Without human
assistance, the deep learning system automatically focused on abnormal areas that showed consistent
characteristics with reported radiological findings.
Deep learning provides a convenient tool for fast screening of COVID-19 and identifying potential
high-risk patients, which may be helpful for medical resource optimisation and early prevention before
patients show severe symptoms.
This article has supplementary material available from erj.ersjournals.com
Received: 19 March 2020 | Accepted after revision: 16 May 2020
Copyright ©ERS 2020. This version is distributed under the terms of the Creative Commons Attribution Non-
Commercial Licence 4.0.
https://doi.org/10.1183/13993003.00775-2020 Eur Respir J 2020; 56: 2000775
ORIGINAL ARTICLE
INFECTIOUS DISEASE

Introduction
In December 2019, the novel coronavirus disease 2019 (COVID-19) emerged in Wuhan, China and
rapidly became a global health emergency, with >170 000 people infected [1–3]. Due to its high infection
rate, fast diagnosis and optimised medical resource assignment in epidemic areas are urgently needed.
Accurate and fast diagnosis of COVID-19 can help to isolate infected patients and slow the spread of the
disease. However, in epidemic areas insufficient medical resources have become a big challenge [4].
Therefore, finding high-risk patients with worse prognosis, so that they can be prioritised for medical
resources and special care, is crucial in the treatment of COVID-19.
Currently, reverse transcription (RT)-PCR is used as the gold standard for diagnosing COVID-19. However,
the limited sensitivity of RT-PCR and the shortage of testing kits in epidemic areas increase the screening
burden, and many infected people are therefore not isolated immediately [5, 6]. This accelerates the spread
of COVID-19. In addition, due to the lack of medical resources, many infected patients cannot receive
immediate treatment. In this situation, finding high-risk patients with worse prognosis for prioritised
treatment and early prevention is important. Consequently, fast diagnosis and identification of high-risk
patients with worse prognosis are very helpful for the control and management of COVID-19.
In recent studies, radiological findings demonstrated that computed tomography (CT) has great diagnostic and
prognostic value for COVID-19. For example, CT showed much higher sensitivity than RT-PCR in diagnosing
COVID-19 [5, 6]. For patients with COVID-19, bilateral lung lesions consisting of ground-glass opacities were
frequently observed in CT images [6–8]. Even in asymptomatic patients, abnormalities and changes were
observed in serial CT [9, 10]. As a common diagnostic tool, CT is easy and fast to acquire without adding
much cost. Building a sensitive diagnostic tool using CT imaging can accelerate the diagnostic process and is
complementary to RT-PCR. Moreover, predicting personalised prognosis using CT imaging can identify the
potential high-risk patients who are more likely to become severe and need urgent medical resources.
Deep learning (DL), an artificial intelligence method, has shown promising results in assisting lung disease
analysis using CT images [11–15]. Benefiting from its strong feature learning ability, DL can automatically
mine features that are related to clinical outcomes from CT images. Features learned by DL models can
reflect high-dimensional abstract mappings which are difficult for humans to sense but are strongly
associated with clinical outcomes. In contrast to the published DL models [16, 17], we aim to provide a fully
automatic DL system for COVID-19 diagnostic and prognostic analysis. Without requiring any
human-assisted annotation, this novel DL system is fast and robust in clinical use. Moreover, we collected a
large multi-regional dataset for training and validating the proposed DL system, including 1266 patients (471
of whom had follow-up) from six cities or provinces. Notably, unlike many studies that use transfer learning
from natural images, we collected a large auxiliary dataset including 4106 patients with chest CT images and
gene information to pre-train the DL system, aiming to make the DL system learn lung features that can
reflect the association between micro-level lung functional abnormalities and chest CT images.
Methods
Study design and participants
The institutional review boards of the seven hospitals (supplementary methods S1) approved this
multi-regional retrospective study and waived the need to obtain informed consent from the patients. In
this study, we collected two datasets: the COVID-19 dataset (n=1266) and the CT-epidermal growth factor
receptor (EGFR) dataset (n=4106). In the COVID-19 dataset, 1266 patients were finally included who met
the following inclusion criteria: 1) RT-PCR-confirmed COVID-19; 2) laboratory-confirmed other types of
pneumonia before December 2019; 3) non-contrast enhanced chest CT available at diagnosis time. Since
RT-PCR has a relatively high false-negative rate, we collected cases of other types of pneumonia diagnosed
before December 2019, when COVID-19 had not yet appeared, to guarantee that the diagnoses of typical
pneumonia were correct. In the COVID-19 dataset, patients from Wuhan city and Henan province formed the
training set; patients from Anhui province formed external validation set 1; patients from Heilongjiang
province formed validation set 2; patients from Beijing formed validation set 3; and patients from
Huangshi city formed validation set 4 (figure 1).

Affiliations:
¹Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Medicine and Engineering, Beihang University, Beijing, China.
²Dept of Radiology, Renmin Hospital of Wuhan University, Wuhan, China.
³Dept of Respiratory and Critical Care Medicine, West China Hospital, Sichuan University, Chengdu, China.
⁴College of Medicine and Biomedical Information Engineering, Northeastern University, Shenyang, China.
⁵Dept of Radiology, the First Affiliated Hospital of Anhui Medical University, Hefei, China.
⁶Dept of Interventional Radiology, the First Hospital of China Medical University, Shenyang, China.
⁷Dept of Medical Imaging, Henan Provincial People's Hospital and the People's Hospital of Zhengzhou University, Zhengzhou, China.
⁸Dept of Radiology, Huangshi Central Hospital, Affiliated Hospital of Hubei Polytechnic University, Edong Healthcare Group, Huangshi, China.
⁹Dept of Radiology, Beijing Youan Hospital of Capital Medical University, Beijing, China.
¹⁰CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, China.
¹¹Engineering Research Center of Molecular and Neuro Imaging of Ministry of Education, School of Life Science and Technology, Xidian University, Xi'an, China.
¹²Contributed equally.
Correspondence: Jie Tian, CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy
of Sciences, Beijing 100190, China. E-mail: jie.tian@ia.ac.cn
In the CT-EGFR dataset, 4106 patients with lung cancer were finally included who met the following
criteria: 1) EGFR gene sequencing was obtained; and 2) non-contrast enhanced chest CT data obtained
within 4 weeks before EGFR gene sequencing. The CT-EGFR dataset was used for auxiliary training of the
DL system, making the DL system learn lung features automatically. CT scanning parameters for the
COVID-19 and CT-EGFR datasets are available in supplementary methods S1.
For prognostic analysis, 471 patients with COVID-19 and regular follow-up for at least 5 days were used.
We defined the prognostic end event as the hospital stay time which was determined from the diagnosis of
COVID-19 to the time when the patient was discharged from hospital (supplementary methods S2).
A short hospital stay time corresponds to good prognosis, and a long hospital stay time to worse
prognosis. Patients with a long hospital stay time might take longer to recover and are defined as
high-risk patients in this study. These patients need prioritised medical resources and special care since
they are more likely to become severe.
Figure 1 panels:
Auxiliary training: Sichuan, n=4106; CT image and EGFR gene mutation status (mutant: n=2115; wild type: n=1991)
Training: Wuhan and Henan, n=709; CT image (COVID-19: n=560; other pneumonia: n=149; follow-up >5 days: n=301)
External validation 1: Anhui, n=226; CT image (COVID-19: n=102; other pneumonia: n=124)
External validation 2: Heilongjiang, n=161; CT image (COVID-19: n=92; other pneumonia: n=69)
External validation 3: Beijing, n=53; CT image (all with COVID-19 and follow-up >5 days)
External validation 4: Huangshi, n=117; CT image (all with COVID-19 and follow-up >5 days)
FIGURE 1 Datasets used in this study. A total of 5372 patients with computed tomography (CT) images from seven cities or provinces were
enrolled in this study. The auxiliary training set included 4106 patients with lung cancer and epidermal growth factor receptor (EGFR) gene
mutation status information, and is used to pre-train the COVID-19Net to learn lung features from CT images. The training set includes 709
patients from Wuhan city and Henan province. The external validation set 1 (226 patients) from Anhui province, and the external validation set 2
(161 patients) from Heilongjiang province are used to assess the diagnostic performance of the deep learning (DL) system. The external validation
set 3 (53 patients with COVID-19) from Beijing, and the external validation set 4 (117 patients with COVID-19) from Huangshi city are used to
evaluate the prognostic performance of the DL system.
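The cohort sizes quoted in the figure 1 caption and in the abstract can be cross-checked with simple arithmetic. The sketch below transcribes the per-cohort counts reported for figure 1; the dictionary layout and the function name `summarise` are illustrative choices, not part of the original study code.

```python
# Per-cohort counts as reported for figure 1 (COVID-19 dataset only).
# "followup" counts patients with follow-up of more than 5 days; validation
# sets 3 and 4 consist entirely of such patients.
COHORTS = {
    "training":     {"covid": 560, "other": 149, "followup": 301},
    "validation_1": {"covid": 102, "other": 124, "followup": 0},
    "validation_2": {"covid": 92,  "other": 69,  "followup": 0},
    "validation_3": {"covid": 53,  "other": 0,   "followup": 53},
    "validation_4": {"covid": 117, "other": 0,   "followup": 117},
}
AUXILIARY_CT_EGFR = 4106  # lung cancer patients used for pre-training


def summarise(cohorts, auxiliary):
    """Add up the cohort counts so they can be compared with the text."""
    covid = sum(c["covid"] for c in cohorts.values())
    other = sum(c["other"] for c in cohorts.values())
    followup = sum(c["followup"] for c in cohorts.values())
    return {
        "covid": covid,
        "other": other,
        "followup": followup,
        "covid_dataset": covid + other,
        "all_patients": covid + other + auxiliary,
    }


if __name__ == "__main__":
    for name, value in summarise(COHORTS, AUXILIARY_CT_EGFR).items():
        print(f"{name}: {value}")
```

The totals agree with the figures stated in the text: 924 patients with COVID-19, 342 with other pneumonia, 471 with follow-up, 1266 in the COVID-19 dataset and 5372 overall.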
The training set was used to train the proposed DL system; validation sets 1 and 2 were used to evaluate
the diagnostic performance of the DL system; and validation sets 3 and 4 were used for evaluating the
prognostic performance of the DL system.
The fully automatic DL system for COVID-19 diagnostic and prognostic analysis
The proposed DL system includes three parts: automatic lung segmentation, non-lung area suppression,
and COVID-19 diagnostic and prognostic analysis. In this DL system, two DL networks were involved:
DenseNet121-FPN for lung segmentation in chest CT image, and the proposed novel COVID-19Net for
COVID-19 diagnostic and prognostic analysis. DL is a family of hierarchical neural networks that aim to
learn the abstract mapping from raw data to the desired clinical outcome. The computational units
in the DL model are defined as layers and are integrated to simulate the inference process of the human
brain. The main computational operations are convolution, pooling, activation and batch normalisation, as
defined in supplementary methods S3.
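Since supplementary methods S3 is not reproduced here, the following is a minimal NumPy sketch of the four operations using their standard textbook definitions, which are assumed (not confirmed) to match the supplementary formulas. All function names are illustrative, and the loops favour readability over speed.

```python
import numpy as np


def conv3d_valid(volume, kernel):
    """Naive 'valid' 3D cross-correlation of one volume with one kernel."""
    kd, kh, kw = kernel.shape
    d, h, w = volume.shape
    out = np.zeros((d - kd + 1, h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                patch = volume[i:i + kd, j:j + kh, k:k + kw]
                out[i, j, k] = np.sum(patch * kernel)
    return out


def relu(x):
    """ReLU activation: negative values are set to zero."""
    return np.maximum(x, 0.0)


def max_pool3d(x, window=2):
    """Non-overlapping max pooling (stride equals window); odd edges are trimmed."""
    d, h, w = (s // window for s in x.shape)
    x = x[:d * window, :h * window, :w * window]
    x = x.reshape(d, window, h, window, w, window)
    return x.max(axis=(1, 3, 5))


def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Per-channel normalisation for input shaped (batch, D, H, W, channels)."""
    axes = tuple(range(x.ndim - 1))  # every axis except the channel axis
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    volume = rng.normal(size=(6, 6, 6))
    features = max_pool3d(relu(conv3d_valid(volume, np.ones((3, 3, 3)) / 27.0)))
    print(features.shape)
```

In a real network these operations are applied to many channels at once and the kernel weights, `gamma` and `beta` are learned; the sketch fixes them only to keep the example self-contained.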
Automatic lung segmentation
Routinely acquired chest CT images include some non-lung areas (muscle, heart, etc.) and blank space
outside the body. To focus the analysis on the lung area, we used a fully automatic DL model
(DenseNet121-FPN) [18, 19] to segment lung areas in chest CT images. This model was pre-trained using the
ImageNet dataset and fine-tuned on the VESSEL12 dataset (supplementary methods S4) [20].
Through this automatic lung segmentation procedure, we acquired the lung mask on CT images. However,
some inflammatory tissues attached to the lung wall may be falsely excluded by the DenseNet121-FPN
model. To increase the robustness of the DL system, we used the cubic bounding box of the segmented
lung mask to crop lung areas in CT images, and defined this cubic lung area as the lung-region of interest
(ROI) (figure 2). In this lung-ROI, all inflammatory tissues and the whole lung were retained,
and most areas outside the lung were eliminated.
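The bounding-box cropping step can be sketched as follows. This is an illustrative NumPy version that assumes the CT volume and the binary lung mask are arrays of the same shape; the function name `crop_to_lung_roi` is hypothetical, and the authors' exact implementation (for example, any margin added around the box) is not described in this section.

```python
import numpy as np


def crop_to_lung_roi(ct_volume, lung_mask):
    """Crop a CT volume to the tight bounding box of a binary lung mask.

    Returns the cropped volume and the slices used, so the same crop can be
    applied to other arrays aligned with the CT volume.
    """
    if ct_volume.shape != lung_mask.shape:
        raise ValueError("CT volume and lung mask must have the same shape")
    if not lung_mask.any():
        raise ValueError("lung mask is empty")
    coords = np.argwhere(lung_mask)   # (n_voxels, 3) indices of mask voxels
    lower = coords.min(axis=0)
    upper = coords.max(axis=0) + 1    # exclusive upper bound
    slices = tuple(slice(lo, hi) for lo, hi in zip(lower, upper))
    return ct_volume[slices], slices


if __name__ == "__main__":
    ct = np.arange(6 * 6 * 6, dtype=float).reshape(6, 6, 6)
    mask = np.zeros_like(ct, dtype=bool)
    mask[1, 2, 3] = True
    mask[3, 4, 4] = True
    roi, used = crop_to_lung_roi(ct, mask)
    print(roi.shape, used)
```

Because the box is taken around the whole mask rather than the mask itself, tissue lying between or next to the segmented lungs stays inside the crop, which is the property the text relies on to keep lesions attached to the lung wall.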
Figure 2 panels: CT image input; automatic lung segmentation (DenseNet121-FPN) giving the lung mask and lung-ROI; non-lung area suppression; COVID-19Net, the COVID-19 prognostic and diagnostic analysis model (3D convolution with 3×3×3 and 1×1×1 kernels, max pooling with window and stride of 2, batch normalisation, dense connection, global average pooling), producing the deep learning feature and the COVID-19 probability; clinical features, stepwise feature selection and multivariate Cox regression, producing the prognostic outcome; auxiliary training process, which uses CT images and EGFR gene mutation status of 4106 patients with lung cancer to pre-train the COVID-19Net so that it learns lung features that can reflect micro-level lung functional abnormality.
FIGURE 2 Illustration of the proposed deep learning (DL) system. Using the chest computed tomography (CT) scan of a patient, the DL system
predicts the probability that the patient has COVID-19 and the prognosis of this patient directly, without any human annotation. The DL system includes
three parts: automatic lung segmentation (DenseNet121-FPN), non-lung area suppression, and COVID-19 diagnostic and prognostic analysis
(COVID-19Net). To let the COVID-19Net learn lung features from the large dataset we used the auxiliary training process for pre-training, which
trained the DL network to predict epidermal growth factor receptor (EGFR) gene mutation status using CT images of 4106 patients. The dense
connection in this figure means each convolutional layer is connected to all of its previous convolutional layers inside the same dense block.
Non-lung area suppression
After the above processing, some non-lung tissues or organs (e.g. spine and heart) may still remain inside
the lung-ROI. Consequently, we proposed a non-lung area suppression operation to suppress the
intensities of non-lung areas inside the lung-ROI (supplementary methods S4). Finally, the lung-ROI was
standardised by z-score normalisation and resized to 48×240×360 voxels for further processing.
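The final standardisation and resizing steps can be illustrated with a short NumPy sketch. The target shape of 48×240×360 comes from the text; the nearest-neighbour interpolation used here is an assumption made to keep the example dependency-free, since this section does not state which interpolation the authors used. The non-lung suppression itself is defined in supplementary methods S4 and is not reproduced.

```python
import numpy as np

TARGET_SHAPE = (48, 240, 360)  # voxel size stated in the text


def zscore(volume, eps=1e-8):
    """Standardise intensities to zero mean and unit variance."""
    volume = volume.astype(np.float64)
    return (volume - volume.mean()) / (volume.std() + eps)


def resize_nearest(volume, target_shape=TARGET_SHAPE):
    """Resize a 3D volume by nearest-neighbour index sampling (assumed method)."""
    index_grids = [
        np.minimum((np.arange(target) * source / target).astype(int), source - 1)
        for source, target in zip(volume.shape, target_shape)
    ]
    return volume[np.ix_(*index_grids)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lung_roi = rng.normal(-600.0, 300.0, size=(30, 120, 150))  # synthetic intensities
    prepared = resize_nearest(zscore(lung_roi))
    print(prepared.shape)
```

A fixed input size lets every patient's lung-ROI pass through the same network regardless of scanner field of view or slice count.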
DL model for COVID-19 diagnosis and prognosis
After the non-lung area suppression operation, the standardised lung-ROI was sent to the COVID-19Net
for diagnostic and prognostic analysis. Figure 2 illustrates the topological structure of the proposed novel
COVID-19Net (table S1). This DL model used a DenseNet-like structure [18], consisting of four dense
blocks, where each dense block was multiple stacks of convolution, batch normalisation and ReLU
activation layers. Inside each dense block, we used dense connection to consider multi-level image
information. At the end of the last convolutional layer, we used global average pooling to generate the
64-dimensional DL features. Finally, the output neuron was fully connected to the DL features to predict
the probability the input patient had COVID-19.
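The last two stages described above, global average pooling to a 64-dimensional feature and a single fully connected output neuron, can be sketched in NumPy. The sigmoid output activation, the feature-map shape and the random weights below are assumptions for illustration only; the trained weights and the exact layer configuration are given in table S1 and the authors' repository, not here.

```python
import numpy as np


def global_average_pool(feature_maps):
    """Average each channel over all spatial positions: (D, H, W, C) -> (C,)."""
    return feature_maps.mean(axis=(0, 1, 2))


def sigmoid(z):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def covid_probability(dl_features, weights, bias):
    """Single output neuron fully connected to the DL features."""
    return float(sigmoid(dl_features @ weights + bias))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical output of the last convolutional layer with 64 channels.
    feature_maps = rng.normal(size=(3, 15, 23, 64))
    dl_features = global_average_pool(feature_maps)   # 64-dimensional DL feature
    weights = rng.normal(scale=0.1, size=64)          # placeholder, not trained
    print(dl_features.shape, covid_probability(dl_features, weights, bias=0.0))
```

The same 64-dimensional vector is what the prognostic analysis later reuses, so the diagnostic head and the prognostic model share one learned representation.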
To enable the COVID-19Net to learn discriminative features associated with COVID-19, a large training
set was needed. Consequently, we proposed a two-step transfer learning process. Firstly, we proposed an
auxiliary training process using the large CT-EGFR dataset (4106 patients) as illustrated in figure 2. In this
auxiliary training process, we trained the COVID-19Net to predict EGFR mutation status (EGFR-mutant
or EGFR wild-type) using the lung-ROI [11]. Benefitting from the large CT-EGFR dataset, the
COVID-19Net learned CT features that can reflect the associations between micro-level lung functional
abnormality and macro-level CT images.
In the second training process, we transferred the pre-trained COVID-19Net to the COVID-19 dataset to
specifically mine lung characteristics associated with COVID-19. After an iterative training process in the
COVID-19 dataset (supplementary methods S5), the COVID-19Net can predict the probability of the
input patient being infected with COVID-19; this probability was defined as DL score in this study.
To explore the prognostic value of the DL features, we extracted the 64-dimensional DL feature from the
COVID-19Net for prognostic analysis. Firstly, we combined the 64-dimensional DL feature and clinical
features (age, sex and comorbidity) to construct a combined feature vector. Afterwards, we used a stepwise
method to select prognostic features. These selected features were then used to build a multivariate Cox
proportional hazard model [21] to predict the risk of the patient needing a long hospital stay time to
recover.
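To make the Cox proportional hazard step concrete, the sketch below computes the linear predictor and the negative partial log-likelihood that such a model minimises, and then splits patients into two groups by score. It is a NumPy illustration under stated assumptions: the Breslow-style handling of ties and the median cutoff are choices made for this example, and the section does not specify how the authors fitted the model or chose their high-risk threshold. All names are hypothetical.

```python
import numpy as np


def linear_predictor(features, coefficients):
    """Cox linear predictor x·beta; exp of it is the hazard relative to baseline."""
    return features @ coefficients


def neg_partial_log_likelihood(features, coefficients, times, events):
    """Negative Cox partial log-likelihood (Breslow-style risk sets).

    times:  observed time for each patient (e.g. hospital stay in days)
    events: 1 if the end event was observed, 0 if the patient was censored
    """
    eta = linear_predictor(features, coefficients)
    total = 0.0
    for i in np.flatnonzero(events):
        at_risk = times >= times[i]   # patients still under observation
        total += eta[i] - np.log(np.sum(np.exp(eta[at_risk])))
    return -total


def split_by_median(scores):
    """Boolean mask of patients whose score exceeds the median (assumed cutoff)."""
    return scores > np.median(scores)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(8, 3))        # selected DL and clinical features
    coefficients = np.array([0.5, -0.2, 0.1])  # placeholder, not fitted
    times = rng.integers(5, 30, size=8).astype(float)
    events = np.ones(8, dtype=int)
    print(neg_partial_log_likelihood(features, coefficients, times, events))
    print(split_by_median(linear_predictor(features, coefficients)))
```

In practice the coefficients would be obtained by minimising this loss with a statistics package, and the resulting two groups would then be compared with Kaplan–Meier curves and a log-rank test.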
Visualisation of lung features learnt by the DL system
Through the two-step transfer learning technique, the DL system learnt lung features from CT images of
4815 patients. To further understand the inference process of the DL system, we used a DL visualisation
algorithm to analyse features learnt by the COVID-19Net from two perspectives: 1) visualising the
DL-discovered suspicious lung area that contributes most to identifying COVID-19 for the DL system; 2)
visualising the feature patterns extracted by hierarchical convolutional layers in the COVID-19Net
(supplementary methods S6 and S7).
Statistical analysis
Area under the receiver operating characteristic (ROC) curve, accuracy, sensitivity, specificity, F1-score,
calibration curves and Hosmer-Lemeshow test were used to assess the performance of the DL system in
diagnosing COVID-19. Kaplan–Meier analysis and log-rank test were used to evaluate the performance of
the DL system for prognostic analysis. The implementation of the DL system used the Keras 2.3.1 toolkit
and Python 3.7 (https://github.com/wangshuocas/COVID-19).
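Three of the diagnostic metrics listed above can be computed directly from labels and DL scores. The sketch below is a NumPy illustration, not the authors' evaluation code: AUC is computed through its Mann–Whitney interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one), and the 0.5 decision threshold is an assumption, since the cutoff used in the study is not stated in this section.

```python
import numpy as np


def auc_mann_whitney(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    positives = scores[labels == 1]
    negatives = scores[labels == 0]
    diff = positives[:, None] - negatives[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return wins / (positives.size * negatives.size)


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    predicted = scores >= threshold
    true_pos = np.sum(predicted & (labels == 1))
    true_neg = np.sum(~predicted & (labels == 0))
    sensitivity = true_pos / np.sum(labels == 1)
    specificity = true_neg / np.sum(labels == 0)
    return sensitivity, specificity


if __name__ == "__main__":
    labels = np.array([1, 1, 0, 0])            # 1 = COVID-19, 0 = other pneumonia
    scores = np.array([0.9, 0.4, 0.6, 0.1])    # toy DL scores
    print(auc_mann_whitney(labels, scores))
    print(sensitivity_specificity(labels, scores))
```

AUC summarises ranking quality over all thresholds, whereas sensitivity and specificity describe one operating point, which is why the paper reports both.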
Results
Clinical characteristics of patients in the COVID-19 dataset are presented in table 1. This dataset was
collected from six cities or provinces including Wuhan city in China.
Diagnostic performance of the DL system
Table 2 and figure 3 illustrate the diagnostic performance of the DL system. In the training set, the DL
system showed good diagnostic performance (AUC: 0.90, sensitivity: 78.93%, specificity: 89.93%). This
performance was further confirmed in the two external validation sets (AUC: 0.87 and 0.88; sensitivity:
80.39% and 79.35%; specificity: 76.61% and 81.16%, respectively). The DL score revealed a significant
difference between COVID-19 and other pneumonia groups in the three datasets (p<0.0001). The good
performance in the validation sets indicated that the DL system generalised well on diagnosing COVID-19