Journal ArticleDOI

Machine learning-based prediction of COVID-19 diagnosis based on symptoms

04 Jan 2021 · Vol. 4, Iss. 1, pp. 1-5
TL;DR: In this paper, a machine learning approach was used to detect COVID-19 cases from simple features obtainable by asking basic questions: sex, age ≥60 years, known contact with an infected individual, and the appearance of five initial clinical symptoms.
Abstract: Effective screening of SARS-CoV-2 enables quick and efficient diagnosis of COVID-19 and can mitigate the burden on healthcare systems. Prediction models that combine several features to estimate the risk of infection have been developed. These aim to assist medical staff worldwide in triaging patients, especially in the context of limited healthcare resources. We established a machine-learning approach that trained on records from 51,831 tested individuals (of whom 4769 were confirmed to have COVID-19). The test set contained data from the subsequent week (47,401 tested individuals of whom 3624 were confirmed to have COVID-19). Our model predicted COVID-19 test results with high accuracy using only eight binary features: sex, age ≥60 years, known contact with an infected individual, and the appearance of five initial clinical symptoms. Overall, based on the nationwide data publicly reported by the Israeli Ministry of Health, we developed a model that detects COVID-19 cases by simple features accessed by asking basic questions. Our framework can be used, among other considerations, to prioritize testing for COVID-19 when testing resources are limited.
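A classifier of this kind can be sketched in a few lines. The following is an illustrative sketch only, not the authors' pipeline: the synthetic data, feature encoding, and choice of scikit-learn's gradient boosting are assumptions standing in for the paper's model trained on the Israeli Ministry of Health records.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Eight binary features mirroring the paper's inputs: sex, age >= 60,
# known contact with a confirmed case, and five initial clinical symptoms.
n = 5000
X = rng.integers(0, 2, size=(n, 8))

# Synthetic labels for illustration only: risk rises with known contact
# (column 2) and with the number of reported symptoms (columns 3-7).
logit = -3.0 + 1.5 * X[:, 2] + 0.8 * X[:, 3:].sum(axis=1)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
risk = clf.predict_proba(X_te)[:, 1]  # per-patient risk score in [0, 1]
```

Such per-patient risk scores are what would drive the test-prioritization use case the abstract describes: rank patients by `risk` and test the highest-scoring first.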


Citations
Journal ArticleDOI
TL;DR: A comprehensive review of currently available COVID-19 diagnostics, exploring their pros and cons as well as appropriate indications; several sample-to-answer platforms, including high-throughput systems and point-of-care (PoC) assays, have been developed to increase testing capacity and decrease technical errors.
Abstract: Diagnostic testing plays a critical role in addressing the coronavirus disease 2019 (COVID-19) pandemic, caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). Rapid and accurate diagnostic tests are imperative for identifying and managing infected individuals, contact tracing, epidemiologic characterization, and public health decision making. Laboratory testing may be performed based on symptomatic presentation or for screening of asymptomatic people. Confirmation of SARS-CoV-2 infection is typically by nucleic acid amplification tests (NAAT), which require specialized equipment and training and may be particularly challenging in resource-limited settings. NAAT may give false-negative results due to timing of sample collection relative to infection, improper sampling of respiratory specimens, inadequate preservation of samples, and technical limitations; false-positives may occur due to technical errors, particularly contamination during the manual reverse transcription-polymerase chain reaction (RT-PCR) process. Thus, clinical presentation, contact history and contemporary phyloepidemiology must be considered when interpreting results. Several sample-to-answer platforms, including high-throughput systems and point-of-care (PoC) assays, have been developed to increase testing capacity and decrease technical errors. Alternatives to the RT-PCR assay, such as other RNA detection methods and antigen tests, may be appropriate for certain situations, such as resource-limited settings. While sequencing is important to monitor the ongoing evolution of the SARS-CoV-2 genome, antibody assays are useful for epidemiologic purposes. The ever-expanding assortment of tests, with varying clinical utility, performance requirements, and limitations, merits comparative evaluation. We herein provide a comprehensive review of currently available COVID-19 diagnostics, exploring their pros and cons as well as appropriate indications. Strategies to further optimize safety, speed, and ease of SARS-CoV-2 testing without compromising accuracy are suggested. Access to scalable diagnostic tools and continued technologic advances, including machine learning and smartphone integration, will facilitate control of the current pandemic as well as preparedness for the next one.

83 citations

Journal ArticleDOI
01 Apr 2021
TL;DR: In this paper, a deep neural network-based model was used to detect symptomatic and asymptomatic COVID-19 cases from breath and cough audio recordings, achieving an area under the receiver operating characteristic curve (AUC) of 0.846.
Abstract: Background Since the emergence of COVID-19 in December 2019, multidisciplinary research teams have wrestled with how best to control the pandemic in light of its considerable physical, psychological and economic damage. Mass testing has been advocated as a potential remedy; however, mass testing using physical tests is a costly and hard-to-scale solution. Methods This study demonstrates the feasibility of an alternative form of COVID-19 detection, harnessing digital technology through the use of audio biomarkers and deep learning. Specifically, we show that a deep neural network-based model can be trained to detect symptomatic and asymptomatic COVID-19 cases using breath and cough audio recordings. Results Our model, a custom convolutional neural network, demonstrates strong empirical performance on a data set consisting of 355 crowdsourced participants, achieving an area under the receiver operating characteristic curve (AUC) of 0.846 on the task of COVID-19 classification. Conclusion This study offers a proof of concept for diagnosing COVID-19 using cough and breath audio signals and motivates a comprehensive follow-up research study on a wider data sample, given the evident advantages of a low-cost, highly scalable digital COVID-19 diagnostic tool.
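The 0.846 figure is an ROC AUC: the probability that a randomly chosen positive recording scores higher than a randomly chosen negative one. A minimal sketch of how such a score is computed from model outputs, using invented labels and scores purely for illustration:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical per-recording COVID-19 probabilities from a classifier,
# paired with ground-truth labels (1 = positive, 0 = negative).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_score = np.array([0.9, 0.2, 0.7, 0.4, 0.3, 0.1, 0.8, 0.6])

# AUC = fraction of (positive, negative) pairs ranked correctly.
auc = roc_auc_score(y_true, y_score)  # 15 of 16 pairs correct -> 0.9375
```

AUC is threshold-free, which is why papers in this space report it rather than accuracy: it summarizes ranking quality before any operating point is chosen.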

69 citations

Journal ArticleDOI
01 Mar 2021
TL;DR: This article highlights well-known ML algorithms for classification and prediction and demonstrates how they have been used in the healthcare sector and provides some examples of IoT and machine learning to predict future healthcare system trends.
Abstract: Machine learning (ML) is a powerful tool that delivers insights hidden in Internet of Things (IoT) data. These hybrid technologies work smartly to improve the decision-making process in different areas such as education, security, business, and the healthcare industry. ML empowers the IoT to demystify hidden patterns in bulk data for optimal prediction and recommendation systems. Healthcare has embraced IoT and ML so that automated machines make medical records, predict disease diagnoses, and, most importantly, conduct real-time monitoring of patients. Individual ML algorithms perform differently on different datasets. Due to the predictive results varying, this might impact the overall results. The variation in prediction results looms large in the clinical decision-making process. Therefore, it is essential to understand the different ML algorithms used to handle IoT data in the healthcare sector. This article highlights well-known ML algorithms for classification and prediction and demonstrates how they have been used in the healthcare sector. The aim of this paper is to present a comprehensive overview of existing ML approaches and their application in IoT medical data. In a thorough analysis, we observe that different ML prediction algorithms have various shortcomings. Depending on the type of IoT dataset, we need to choose an optimal method to predict critical healthcare data. The paper also provides some examples of IoT and machine learning to predict future healthcare system trends.

60 citations

Journal ArticleDOI
TL;DR: In this paper, the authors address the clinical applications of machine learning and deep learning to COVID-19 diagnosis, including clinical characteristics, electronic medical records, and medical images (CT, X-ray, ultrasound, etc.); the current challenges and future perspectives provided in the review can be used to direct an ideal deployment of AI technology in a pandemic.
Abstract: Artificial intelligence (AI) is being used to aid in various aspects of the COVID-19 crisis, including epidemiology, molecular research and drug development, medical diagnosis and treatment, and socioeconomics. The association of AI and COVID-19 can accelerate to rapidly diagnose positive patients. To learn the dynamics of a pandemic with relevance to AI, we search the literature using the different academic databases (PubMed, PubMed Central, Scopus, Google Scholar) and preprint servers (bioRxiv, medRxiv, arXiv). In the present review, we address the clinical applications of machine learning and deep learning, including clinical characteristics, electronic medical records, medical images (CT, X-ray, ultrasound images, etc.) in the COVID-19 diagnosis. The current challenges and future perspectives provided in this review can be used to direct an ideal deployment of AI technology in a pandemic.

56 citations

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a COVID-19 detection system with the potential to detect COVID-19 in its initial stage by employing deep learning models on patients’ symptoms and chest X-ray images, obtaining an average accuracy of 78.88%, specificity of 94%, and sensitivity of 77% on a testing dataset containing 800 patients’ X-ray images and 800 patients’ symptoms.
Abstract: The accurate diagnosis of initial-stage COVID-19 is necessary for minimizing its spreading rate. Physicians most often recommend RT-PCR tests; these are invasive, time-consuming, and ineffective in reducing the spread rate of COVID-19. However, this can be mitigated by using noninvasive and fast machine learning methods trained either on labeled patients’ symptoms or on medical images. Machine learning methods trained on labeled patients’ symptoms cannot differentiate between different types of pneumonia, such as COVID-19, viral pneumonia, and bacterial pneumonia, because of similar symptoms, i.e., cough, fever, headache, sore throat, and shortness of breath. Machine learning methods trained on labeled patients’ medical images have the potential to overcome the limitation of the symptom-based method; however, these methods are incapable of detecting COVID-19 in the initial stage because the infection takes 3 to 12 days to appear. This research proposes a COVID-19 detection system with the potential to detect COVID-19 in the initial stage by employing deep learning models over patients’ symptoms and chest X-ray images. The proposed system obtained an average accuracy of 78.88%, specificity of 94%, and sensitivity of 77% on a testing dataset containing 800 patients’ X-ray images and 800 patients’ symptoms, outperforming existing COVID-19 detection methods.
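Combining a symptom model with an X-ray model requires some fusion rule. One common and simple choice is late fusion, averaging the two models' predicted probabilities; this is a hedged sketch of that idea, not the paper's actual rule, and all arrays below are invented:

```python
import numpy as np

# Hypothetical per-patient COVID-19 probabilities from two models.
p_symptoms = np.array([0.80, 0.30, 0.55, 0.10])  # symptom-based model
p_xray = np.array([0.60, 0.20, 0.70, 0.15])      # chest X-ray model

# Equal-weight late fusion; the weights could instead be tuned on
# validation data to favor the stronger modality.
p_fused = 0.5 * p_symptoms + 0.5 * p_xray
pred = (p_fused >= 0.5).astype(int)  # final positive/negative call
```

Late fusion keeps the two modalities' models independent, which matters here: a symptom model can fire early, while the imaging model contributes once radiological signs appear.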

45 citations

References
Proceedings ArticleDOI
13 Aug 2016
TL;DR: XGBoost, as discussed by the authors, proposes a sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning, achieving state-of-the-art results on many machine learning challenges.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.
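The weighted quantile sketch mentioned above chooses candidate split points so that each bucket between candidates carries roughly equal total example weight. The toy NumPy function below illustrates that idea on in-memory data; it is not XGBoost's actual mergeable streaming sketch, and the random inputs are invented:

```python
import numpy as np

def weighted_quantile_candidates(values, weights, n_bins):
    """Pick split candidates so each bucket holds ~equal total weight."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cum = np.cumsum(w) / w.sum()             # weighted rank in (0, 1]
    targets = np.arange(1, n_bins) / n_bins  # interior quantile levels
    idx = np.searchsorted(cum, targets)
    return v[np.minimum(idx, len(v) - 1)]

rng = np.random.default_rng(0)
vals = rng.normal(size=1000)
wts = rng.random(1000)  # e.g. per-example second-order gradients (hessians)
cands = weighted_quantile_candidates(vals, wts, n_bins=8)
```

Weighting by second-order gradients is what distinguishes this from a plain quantile grid: examples the current model is most uncertain about attract more candidate splits.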

14,872 citations

Journal ArticleDOI
TL;DR: The outbreak of the 2019 novel coronavirus disease (COVID-19) has induced a considerable degree of fear, emotional stress and anxiety among individuals around the world.
Abstract: The outbreak of the 2019 novel coronavirus disease (COVID-19) has induced a considerable degree of fear, emotional stress and anxiety among individuals around the world.

8,336 citations

Journal ArticleDOI
TL;DR: The authors' review found the average R0 for 2019-nCoV to be 3.28, which exceeds WHO estimates of 1.4 to 2.5.
Abstract: Our review found the average R0 for 2019-nCoV to be 3.28, which exceeds WHO estimates of 1.4 to 2.5.

2,664 citations

Journal ArticleDOI
TL;DR: The random forest is clearly the best family of classifiers (3 of the 5 best classifiers are RF), followed by SVM (4 classifiers in the top 10), neural networks and boosting ensembles (5 and 3 members in the top 20, respectively).
Abstract: We evaluate 179 classifiers arising from 17 families (discriminant analysis, Bayesian, neural networks, support vector machines, decision trees, rule-based classifiers, boosting, bagging, stacking, random forests and other ensembles, generalized linear models, nearest-neighbors, partial least squares and principal component regression, logistic and multinomial regression, multiple adaptive regression splines and other methods), implemented in Weka, R (with and without the caret package), C and Matlab, including all the relevant classifiers available today. We use 121 data sets, representing the whole UCI database (excluding the large-scale problems) plus other real problems of our own, in order to reach significant conclusions about classifier behavior that do not depend on the data set collection. The classifiers most likely to be the best are the random forest (RF) versions, the best of which (implemented in R and accessed via caret) achieves 94.1% of the maximum accuracy, exceeding 90% in 84.3% of the data sets. However, the difference is not statistically significant with the second best, the SVM with Gaussian kernel implemented in C using LibSVM, which achieves 92.3% of the maximum accuracy. A few models are clearly better than the remaining ones: random forest, SVM with Gaussian and polynomial kernels, extreme learning machine with Gaussian kernel, C5.0 and avNNet (a committee of multi-layer perceptrons implemented in R with the caret package). The random forest is clearly the best family of classifiers (3 of the 5 best classifiers are RF), followed by SVM (4 classifiers in the top 10), neural networks and boosting ensembles (5 and 3 members in the top 20, respectively).
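A miniature version of such a comparison can be run with scikit-learn: two of the paper's top families (random forest and Gaussian-kernel SVM) evaluated by cross-validation on a single toy dataset. This sketch illustrates the methodology only; nothing here reproduces the 121-dataset study, and the dataset choice is an assumption.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    # Gaussian (RBF) kernel SVM; feature scaling matters for SVMs,
    # hence the StandardScaler in the pipeline.
    "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

# Mean 5-fold cross-validated accuracy per model.
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in models.items()}
```

The paper's central caution applies even here: a ranking on one dataset says little in general, which is why the authors aggregate over 121 data sets before drawing conclusions.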

2,616 citations
