Journal ArticleDOI

Machine learning-based prediction of COVID-19 diagnosis based on symptoms

04 Jan 2021 · Vol. 4, Iss. 1, pp. 1-5
TL;DR: In this paper, a machine learning approach was used to detect COVID-19 cases from simple features obtainable by asking basic questions: sex, age ≥60 years, known contact with an infected individual, and the appearance of five initial clinical symptoms.
Abstract: Effective screening of SARS-CoV-2 enables quick and efficient diagnosis of COVID-19 and can mitigate the burden on healthcare systems. Prediction models that combine several features to estimate the risk of infection have been developed. These aim to assist medical staff worldwide in triaging patients, especially in the context of limited healthcare resources. We established a machine-learning approach that trained on records from 51,831 tested individuals (of whom 4769 were confirmed to have COVID-19). The test set contained data from the subsequent week (47,401 tested individuals of whom 3624 were confirmed to have COVID-19). Our model predicted COVID-19 test results with high accuracy using only eight binary features: sex, age ≥60 years, known contact with an infected individual, and the appearance of five initial clinical symptoms. Overall, based on the nationwide data publicly reported by the Israeli Ministry of Health, we developed a model that detects COVID-19 cases by simple features accessed by asking basic questions. Our framework can be used, among other considerations, to prioritize testing for COVID-19 when testing resources are limited.
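A classifier of this kind can be sketched in a few lines. The following is an illustrative sketch only, not the authors' pipeline: the synthetic data, feature encoding, and choice of scikit-learn's gradient boosting are assumptions standing in for the paper's model trained on the Israeli Ministry of Health records.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Eight binary features mirroring the paper's inputs: sex, age >= 60,
# known contact with a confirmed case, and five initial clinical symptoms.
n = 5000
X = rng.integers(0, 2, size=(n, 8))

# Synthetic labels for illustration only: risk rises with known contact
# (column 2) and with the number of reported symptoms (columns 3-7).
logit = -3.0 + 1.5 * X[:, 2] + 0.8 * X[:, 3:].sum(axis=1)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
risk = clf.predict_proba(X_te)[:, 1]  # per-patient risk score in [0, 1]
```

Such per-patient risk scores are what would drive the test-prioritization use case the abstract describes: rank patients by `risk` and test the highest-scoring first.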


Citations
Journal ArticleDOI
TL;DR: A comprehensive review of currently available COVID-19 diagnostics, exploring their pros and cons as well as appropriate indications; several sample-to-answer platforms, including high-throughput systems and point-of-care (PoC) assays, have been developed to increase testing capacity and decrease technical errors.
Abstract: Diagnostic testing plays a critical role in addressing the coronavirus disease 2019 (COVID-19) pandemic, caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). Rapid and accurate diagnostic tests are imperative for identifying and managing infected individuals, contact tracing, epidemiologic characterization, and public health decision making. Laboratory testing may be performed based on symptomatic presentation or for screening of asymptomatic people. Confirmation of SARS-CoV-2 infection is typically by nucleic acid amplification tests (NAAT), which require specialized equipment and training and may be particularly challenging in resource-limited settings. NAAT may give false-negative results due to timing of sample collection relative to infection, improper sampling of respiratory specimens, inadequate preservation of samples, and technical limitations; false-positives may occur due to technical errors, particularly contamination during the manual reverse transcription-polymerase chain reaction (RT-PCR) process. Thus, clinical presentation, contact history and contemporary phyloepidemiology must be considered when interpreting results. Several sample-to-answer platforms, including high-throughput systems and point-of-care (PoC) assays, have been developed to increase testing capacity and decrease technical errors. Alternatives to the RT-PCR assay, such as other RNA detection methods and antigen tests, may be appropriate for certain situations, such as resource-limited settings. While sequencing is important to monitor the ongoing evolution of the SARS-CoV-2 genome, antibody assays are useful for epidemiologic purposes. The ever-expanding assortment of tests, with varying clinical utility, performance requirements, and limitations, merits comparative evaluation. We herein provide a comprehensive review of currently available COVID-19 diagnostics, exploring their pros and cons as well as appropriate indications. Strategies to further optimize safety, speed, and ease of SARS-CoV-2 testing without compromising accuracy are suggested. Access to scalable diagnostic tools and continued technologic advances, including machine learning and smartphone integration, will facilitate control of the current pandemic as well as preparedness for the next one.

83 citations

Journal ArticleDOI
01 Apr 2021
TL;DR: In this paper, a deep neural network-based model was used to detect symptomatic and asymptomatic COVID-19 cases from breath and cough audio recordings, achieving an area under the receiver operating characteristic curve (AUC) of 0.846.
Abstract: Background Since the emergence of COVID-19 in December 2019, multidisciplinary research teams have wrestled with how best to control the pandemic in light of its considerable physical, psychological and economic damage. Mass testing has been advocated as a potential remedy; however, mass testing using physical tests is a costly and hard-to-scale solution. Methods This study demonstrates the feasibility of an alternative form of COVID-19 detection, harnessing digital technology through the use of audio biomarkers and deep learning. Specifically, we show that a deep neural network-based model can be trained to detect symptomatic and asymptomatic COVID-19 cases using breath and cough audio recordings. Results Our model, a custom convolutional neural network, demonstrates strong empirical performance on a data set consisting of 355 crowdsourced participants, achieving an area under the receiver operating characteristic curve (AUC) of 0.846 on the task of COVID-19 classification. Conclusion This study offers a proof of concept for diagnosing COVID-19 using cough and breath audio signals and motivates a comprehensive follow-up research study on a wider data sample, given the evident advantages of a low-cost, highly scalable digital COVID-19 diagnostic tool.
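The 0.846 figure is an ROC AUC: the probability that a randomly chosen positive recording scores higher than a randomly chosen negative one. A minimal sketch of how such a score is computed from model outputs, using invented labels and scores purely for illustration:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical per-recording COVID-19 probabilities from a classifier,
# paired with ground-truth labels (1 = positive, 0 = negative).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_score = np.array([0.9, 0.2, 0.7, 0.4, 0.3, 0.1, 0.8, 0.6])

# AUC = fraction of (positive, negative) pairs ranked correctly.
auc = roc_auc_score(y_true, y_score)  # 15 of 16 pairs correct -> 0.9375
```

AUC is threshold-free, which is why papers in this space report it rather than accuracy: it summarizes ranking quality before any operating point is chosen.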

69 citations

Journal ArticleDOI
01 Mar 2021
TL;DR: This article highlights well-known ML algorithms for classification and prediction and demonstrates how they have been used in the healthcare sector and provides some examples of IoT and machine learning to predict future healthcare system trends.
Abstract: Machine learning (ML) is a powerful tool that delivers insights hidden in Internet of Things (IoT) data. These hybrid technologies work smartly to improve the decision-making process in different areas such as education, security, business, and the healthcare industry. ML empowers the IoT to demystify hidden patterns in bulk data for optimal prediction and recommendation systems. Healthcare has embraced IoT and ML so that automated machines make medical records, predict disease diagnoses, and, most importantly, conduct real-time monitoring of patients. Individual ML algorithms perform differently on different datasets. Due to the predictive results varying, this might impact the overall results. The variation in prediction results looms large in the clinical decision-making process. Therefore, it is essential to understand the different ML algorithms used to handle IoT data in the healthcare sector. This article highlights well-known ML algorithms for classification and prediction and demonstrates how they have been used in the healthcare sector. The aim of this paper is to present a comprehensive overview of existing ML approaches and their application in IoT medical data. In a thorough analysis, we observe that different ML prediction algorithms have various shortcomings. Depending on the type of IoT dataset, we need to choose an optimal method to predict critical healthcare data. The paper also provides some examples of IoT and machine learning to predict future healthcare system trends.

60 citations

Journal ArticleDOI
TL;DR: In this paper, the authors address the clinical applications of machine learning and deep learning to COVID-19 diagnosis, including clinical characteristics, electronic medical records, and medical images (CT, X-ray, ultrasound, etc.); the current challenges and future perspectives provided in the review can be used to direct an ideal deployment of AI technology in a pandemic.
Abstract: Artificial intelligence (AI) is being used to aid in various aspects of the COVID-19 crisis, including epidemiology, molecular research and drug development, medical diagnosis and treatment, and socioeconomics. The association of AI and COVID-19 can accelerate to rapidly diagnose positive patients. To learn the dynamics of a pandemic with relevance to AI, we search the literature using the different academic databases (PubMed, PubMed Central, Scopus, Google Scholar) and preprint servers (bioRxiv, medRxiv, arXiv). In the present review, we address the clinical applications of machine learning and deep learning, including clinical characteristics, electronic medical records, medical images (CT, X-ray, ultrasound images, etc.) in the COVID-19 diagnosis. The current challenges and future perspectives provided in this review can be used to direct an ideal deployment of AI technology in a pandemic.

56 citations

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a COVID-19 detection system with the potential to detect COVID-19 in its initial stage by employing deep learning models on patients’ symptoms and chest X-ray images, obtaining an average accuracy of 78.88%, specificity of 94%, and sensitivity of 77% on a testing dataset containing 800 patients’ X-ray images and 800 patients’ symptoms.
Abstract: The accurate diagnosis of initial-stage COVID-19 is necessary for minimizing its spreading rate. Physicians most often recommend RT-PCR tests; these are invasive, time-consuming, and ineffective in reducing the spread rate of COVID-19. However, this can be mitigated by using noninvasive and fast machine learning methods trained either on labeled patients’ symptoms or on medical images. Machine learning methods trained on labeled patients’ symptoms cannot differentiate between different types of pneumonia, such as COVID-19, viral pneumonia, and bacterial pneumonia, because of similar symptoms, i.e., cough, fever, headache, sore throat, and shortness of breath. Machine learning methods trained on labeled patients’ medical images have the potential to overcome the limitation of the symptom-based method; however, these methods are incapable of detecting COVID-19 in the initial stage because the infection takes 3 to 12 days to appear. This research proposes a COVID-19 detection system with the potential to detect COVID-19 in the initial stage by employing deep learning models over patients’ symptoms and chest X-ray images. The proposed system obtained an average accuracy of 78.88%, specificity of 94%, and sensitivity of 77% on a testing dataset containing 800 patients’ X-ray images and 800 patients’ symptoms, outperforming existing COVID-19 detection methods.
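Combining a symptom model with an X-ray model requires some fusion rule. One common and simple choice is late fusion, averaging the two models' predicted probabilities; this is a hedged sketch of that idea, not the paper's actual rule, and all arrays below are invented:

```python
import numpy as np

# Hypothetical per-patient COVID-19 probabilities from two models.
p_symptoms = np.array([0.80, 0.30, 0.55, 0.10])  # symptom-based model
p_xray = np.array([0.60, 0.20, 0.70, 0.15])      # chest X-ray model

# Equal-weight late fusion; the weights could instead be tuned on
# validation data to favor the stronger modality.
p_fused = 0.5 * p_symptoms + 0.5 * p_xray
pred = (p_fused >= 0.5).astype(int)  # final positive/negative call
```

Late fusion keeps the two modalities' models independent, which matters here: a symptom model can fire early, while the imaging model contributes once radiological signs appear.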

45 citations

References
Proceedings ArticleDOI
13 Aug 2016
TL;DR: XGBoost, as discussed by the authors, proposes a sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning, achieving state-of-the-art results on many machine learning challenges.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.
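The weighted quantile sketch mentioned above chooses candidate split points so that each bucket between candidates carries roughly equal total example weight. The toy NumPy function below illustrates that idea on in-memory data; it is not XGBoost's actual mergeable streaming sketch, and the random inputs are invented:

```python
import numpy as np

def weighted_quantile_candidates(values, weights, n_bins):
    """Pick split candidates so each bucket holds ~equal total weight."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cum = np.cumsum(w) / w.sum()             # weighted rank in (0, 1]
    targets = np.arange(1, n_bins) / n_bins  # interior quantile levels
    idx = np.searchsorted(cum, targets)
    return v[np.minimum(idx, len(v) - 1)]

rng = np.random.default_rng(0)
vals = rng.normal(size=1000)
wts = rng.random(1000)  # e.g. per-example second-order gradients (hessians)
cands = weighted_quantile_candidates(vals, wts, n_bins=8)
```

Weighting by second-order gradients is what distinguishes this from a plain quantile grid: examples the current model is most uncertain about attract more candidate splits.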

14,872 citations

Journal ArticleDOI
TL;DR: The outbreak of the 2019 novel coronavirus disease (COVID-19) has induced a considerable degree of fear, emotional stress and anxiety among individuals around the world.
Abstract: The outbreak of the 2019 novel coronavirus disease (COVID-19) has induced a considerable degree of fear, emotional stress and anxiety among individuals around the world.

8,336 citations

Journal ArticleDOI
TL;DR: The authors' review found the average R0 for 2019-nCoV to be 3.28, which exceeds WHO estimates of 1.4 to 2.5.
Abstract: Our review found the average R0 for 2019-nCoV to be 3.28, which exceeds WHO estimates of 1.4 to 2.5.

2,664 citations

Journal ArticleDOI
TL;DR: The random forest is clearly the best family of classifiers (3 of the 5 best classifiers are RF), followed by SVM (4 classifiers in the top 10), neural networks and boosting ensembles (5 and 3 members in the top 20, respectively).
Abstract: We evaluate 179 classifiers arising from 17 families (discriminant analysis, Bayesian, neural networks, support vector machines, decision trees, rule-based classifiers, boosting, bagging, stacking, random forests and other ensembles, generalized linear models, nearest-neighbors, partial least squares and principal component regression, logistic and multinomial regression, multiple adaptive regression splines and other methods), implemented in Weka, R (with and without the caret package), C and Matlab, including all the relevant classifiers available today. We use 121 data sets, representing the whole UCI database (excluding the large-scale problems) plus other real problems of our own, in order to reach significant conclusions about classifier behavior that do not depend on the data set collection. The classifiers most likely to be the best are the random forest (RF) versions, the best of which (implemented in R and accessed via caret) achieves 94.1% of the maximum accuracy, exceeding 90% in 84.3% of the data sets. However, the difference is not statistically significant with the second best, the SVM with Gaussian kernel implemented in C using LibSVM, which achieves 92.3% of the maximum accuracy. A few models are clearly better than the remaining ones: random forest, SVM with Gaussian and polynomial kernels, extreme learning machine with Gaussian kernel, C5.0 and avNNet (a committee of multi-layer perceptrons implemented in R with the caret package). The random forest is clearly the best family of classifiers (3 of the 5 best classifiers are RF), followed by SVM (4 classifiers in the top 10), neural networks and boosting ensembles (5 and 3 members in the top 20, respectively).
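A miniature version of such a comparison can be run with scikit-learn: two of the paper's top families (random forest and Gaussian-kernel SVM) evaluated by cross-validation on a single toy dataset. This sketch illustrates the methodology only; nothing here reproduces the 121-dataset study, and the dataset choice is an assumption.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    # Gaussian (RBF) kernel SVM; feature scaling matters for SVMs,
    # hence the StandardScaler in the pipeline.
    "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

# Mean 5-fold cross-validated accuracy per model.
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in models.items()}
```

The paper's central caution applies even here: a ranking on one dataset says little in general, which is why the authors aggregate over 121 data sets before drawing conclusions.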

2,616 citations
