scispace - formally typeset
Search or ask a question
Author

Dijia Wu

Other affiliations: ShanghaiTech University, Princeton University, Microsoft  ...read more
Bio: Dijia Wu is an academic researcher from Siemens. The author has contributed to research in topics: Segmentation & Image segmentation. The author has an hindex of 19, co-authored 67 publications receiving 1221 citations. Previous affiliations of Dijia Wu include ShanghaiTech University & Princeton University.

Papers published on a yearly basis

Papers
More filters
Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper developed a dual-sampling attention network to automatically diagnose COVID-19 from the community acquired pneumonia (CAP) in chest computed tomography (CT), and proposed a novel online attention module with a 3D convolutional network (CNN) to focus on the infection regions in lungs when making decisions of diagnoses.
Abstract: The coronavirus disease (COVID-19) is rapidly spreading all over the world, and has infected more than 1,436,000 people in more than 200 countries and territories as of April 9, 2020. Detecting COVID-19 at early stage is essential to deliver proper healthcare to the patients and also to protect the uninfected population. To this end, we develop a dual-sampling attention network to automatically diagnose COVID-19 from the community acquired pneumonia (CAP) in chest computed tomography (CT). In particular, we propose a novel online attention module with a 3D convolutional network (CNN) to focus on the infection regions in lungs when making decisions of diagnoses. Note that there exists imbalanced distribution of the sizes of the infection regions between COVID-19 and CAP, partially due to fast progress of COVID-19 after symptom onset. Therefore, we develop a dual-sampling strategy to mitigate the imbalanced learning. Our method is evaluated (to our best knowledge) upon the largest multi-center CT data for COVID-19 from 8 hospitals. In the training-validation stage, we collect 2186 CT scans from 1588 patients for a 5-fold cross-validation. In the testing stage, we employ another independent large-scale testing dataset including 2796 CT scans from 2057 patients. Results show that our algorithm can identify the COVID-19 images with the area under the receiver operating characteristic curve (AUC) value of 0.944, accuracy of 87.5%, sensitivity of 86.9%, specificity of 90.1%, and F1-score of 82.0%. With this performance, the proposed algorithm could potentially aid radiologists with COVID-19 diagnosis from CAP, especially in the early stage of the COVID-19 outbreak.

256 citations

Journal ArticleDOI
TL;DR: An infection size-aware random forest method (iSARF) was proposed for discriminating COVID-19 from CAP and yielded its best performance when using the handcrafted features, with a sensitivity and accuracy of 90.7%, a specificity and an accuracy of 89.4% over state-of-the-art classifiers.
Abstract: The worldwide spread of coronavirus disease (COVID-19) has become a threatening risk for global public health. It is of great importance to rapidly and accurately screen patients with COVID-19 from community acquired pneumonia (CAP). In this study, a total of 1658 patients with COVID-19 and 1027 patients of CAP underwent thin-section CT. All images were preprocessed to obtain the segmentations of both infections and lung fields, which were used to extract location-specific features. An infection Size Aware Random Forest method (iSARF) was proposed, in which subjects were automated categorized into groups with different ranges of infected lesion sizes, followed by random forests in each group for classification. Experimental results show that the proposed method yielded sensitivity of 0.907, specificity of 0.833, and accuracy of 0.879 under five-fold cross-validation. Large performance margins against comparison methods were achieved especially for the cases with infection size in the medium range, from 0.01% to 10%. The further inclusion of Radiomics features show slightly improvement. It is anticipated that our proposed framework could assist clinical decision making.

225 citations

Journal ArticleDOI
TL;DR: In this article, a unified latent representation is learned which can completely encode information from different aspects of features and is endowed with promising class structure for separability, while the completeness is guaranteed with a group of backward neural networks (each for one type of features), while by using class labels the representation is enforced to be compact within COVID-19/community-acquired pneumonia (CAP).
Abstract: Recently, the outbreak of Coronavirus Disease 2019 (COVID-19) has spread rapidly across the world. Due to the large number of infected patients and heavy labor for doctors, computer-aided diagnosis with machine learning algorithm is urgently needed, and could largely reduce the efforts of clinicians and accelerate the diagnosis process. Chest computed tomography (CT) has been recognized as an informative tool for diagnosis of the disease. In this study, we propose to conduct the diagnosis of COVID-19 with a series of features extracted from CT images. To fully explore multiple features describing CT images from different views, a unified latent representation is learned which can completely encode information from different aspects of features and is endowed with promising class structure for separability. Specifically, the completeness is guaranteed with a group of backward neural networks (each for one type of features), while by using class labels the representation is enforced to be compact within COVID-19/community-acquired pneumonia (CAP) and also a large margin is guaranteed between different types of pneumonia. In this way, our model can well avoid overfitting compared to the case of directly projecting high-dimensional features into classes. Extensive experimental results show that the proposed method outperforms all comparison methods, and rather stable performances are observed when varying the number of training data.

170 citations

Journal ArticleDOI
TL;DR: In this article, a set of handcrafted location-specific features was proposed to best capture the COVID-19 distribution pattern, in comparison to conventional CT severity score (CT-SS) and Radiomics features.
Abstract: The worldwide spread of coronavirus disease (COVID-19) has become a threatening risk for global public health. It is of great importance to rapidly and accurately screen patients with COVID-19 from community acquired pneumonia (CAP). In this study, a total of 1658 patients with COVID-19 and 1027 CAP patients underwent thin-section CT. All images were preprocessed to obtain the segmentations of infections and lung fields. A set of handcrafted location-specific features was proposed to best capture the COVID-19 distribution pattern, in comparison to conventional CT severity score (CT-SS) and Radiomics features. An infection Size Aware Random Forest method (iSARF) was used for classification. Experimental results show that the proposed method yielded best performance when using the handcrafted features with sensitivity of 91.6%, specificity of 86.8%, and accuracy of 89.8% over state-of-the-art classifiers. Additional test on 734 subjects with thick slice images demonstrates great generalizability. It is anticipated that our proposed framework could assist clinical decision making. Furthermore, the data of extracted features will be made available after the review process.

158 citations

Journal ArticleDOI
TL;DR: Experimental results on the CO VID-19 dataset suggest that the proposed AFS-DF achieves superior performance in COVID-19 vs. CAP classification, compared with 4 widely used machine learning methods.
Abstract: Chest computed tomography (CT) becomes an effective tool to assist the diagnosis of coronavirus disease-19 (COVID-19). Due to the outbreak of COVID-19 worldwide, using the computed-aided diagnosis technique for COVID-19 classification based on CT images could largely alleviate the burden of clinicians. In this paper, we propose an A daptive F eature S election guided D eep F orest (AFS-DF) for COVID-19 classification based on chest CT images. Specifically, we first extract location-specific features from CT images. Then, in order to capture the high-level representation of these features with the relatively small-scale data, we leverage a deep forest model to learn high-level representation of the features. Moreover, we propose a feature selection method based on the trained deep forest model to reduce the redundancy of features, where the feature selection could be adaptively incorporated with the COVID-19 classification model. We evaluated our proposed AFS-DF on COVID-19 dataset with 1495 patients of COVID-19 and 1027 patients of community acquired pneumonia (CAP). The accuracy (ACC), sensitivity (SEN), specificity (SPE), AUC, precision and F1-score achieved by our method are 91.79%, 93.05%, 89.95%, 96.35%, 93.10% and 93.07%, respectively. Experimental results on the COVID-19 dataset suggest that the proposed AFS-DF achieves superior performance in COVID-19 vs. CAP classification, compared with 4 widely used machine learning methods.

156 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: Two specific computer-aided detection problems, namely thoraco-abdominal lymph node (LN) detection and interstitial lung disease (ILD) classification are studied, achieving the state-of-the-art performance on the mediastinal LN detection, and the first five-fold cross-validation classification results are reported.
Abstract: Remarkable progress has been made in image recognition, primarily due to the availability of large-scale annotated datasets and deep convolutional neural networks (CNNs). CNNs enable learning data-driven, highly representative, hierarchical image features from sufficient training data. However, obtaining datasets as comprehensively annotated as ImageNet in the medical imaging domain remains a challenge. There are currently three major techniques that successfully employ CNNs to medical image classification: training the CNN from scratch, using off-the-shelf pre-trained CNN features, and conducting unsupervised CNN pre-training with supervised fine-tuning. Another effective method is transfer learning, i.e., fine-tuning CNN models pre-trained from natural image dataset to medical image tasks. In this paper, we exploit three important, but previously understudied factors of employing deep convolutional neural networks to computer-aided detection problems. We first explore and evaluate different CNN architectures. The studied models contain 5 thousand to 160 million parameters, and vary in numbers of layers. We then evaluate the influence of dataset scale and spatial image context on performance. Finally, we examine when and why transfer learning from pre-trained ImageNet (via fine-tuning) can be useful. We study two specific computer-aided detection (CADe) problems, namely thoraco-abdominal lymph node (LN) detection and interstitial lung disease (ILD) classification. We achieve the state-of-the-art performance on the mediastinal LN detection, and report the first five-fold cross-validation classification results on predicting axial CT slices with ILD categories. Our extensive empirical evaluation, CNN model analysis and valuable insights can be extended to the design of high performance CAD systems for other medical imaging tasks.

4,249 citations

Journal Article
TL;DR: In this article, a fast Fourier transform method of topography and interferometry is proposed to discriminate between elevation and depression of the object or wave-front form, which has not been possible by the fringe-contour generation techniques.
Abstract: A fast-Fourier-transform method of topography and interferometry is proposed. By computer processing of a noncontour type of fringe pattern, automatic discrimination is achieved between elevation and depression of the object or wave-front form, which has not been possible by the fringe-contour-generation techniques. The method has advantages over moire topography and conventional fringe-contour interferometry in both accuracy and sensitivity. Unlike fringe-scanning techniques, the method is easy to apply because it uses no moving components.

3,742 citations

Journal ArticleDOI
TL;DR: COVID-Net is introduced, a deep convolutional neural network design tailored for the detection of COVID-19 cases from chest X-ray (CXR) images that is open source and available to the general public, and COVIDx, an open access benchmark dataset comprising of 13,975 CXR images across 13,870 patient patient cases.
Abstract: The Coronavirus Disease 2019 (COVID-19) pandemic continues to have a devastating effect on the health and well-being of the global population. A critical step in the fight against COVID-19 is effective screening of infected patients, with one of the key screening approaches being radiology examination using chest radiography. It was found in early studies that patients present abnormalities in chest radiography images that are characteristic of those infected with COVID-19. Motivated by this and inspired by the open source efforts of the research community, in this study we introduce COVID-Net, a deep convolutional neural network design tailored for the detection of COVID-19 cases from chest X-ray (CXR) images that is open source and available to the general public. To the best of the authors' knowledge, COVID-Net is one of the first open source network designs for COVID-19 detection from CXR images at the time of initial release. We also introduce COVIDx, an open access benchmark dataset that we generated comprising of 13,975 CXR images across 13,870 patient patient cases, with the largest number of publicly available COVID-19 positive cases to the best of the authors' knowledge. Furthermore, we investigate how COVID-Net makes predictions using an explainability method in an attempt to not only gain deeper insights into critical factors associated with COVID cases, which can aid clinicians in improved screening, but also audit COVID-Net in a responsible and transparent manner to validate that it is making decisions based on relevant information from the CXR images. By no means a production-ready solution, the hope is that the open access COVID-Net, along with the description on constructing the open source COVIDx dataset, will be leveraged and build upon by both researchers and citizen data scientists alike to accelerate the development of highly accurate yet practical deep learning solutions for detecting COVID-19 cases and accelerate treatment of those who need it the most.

2,193 citations

Journal ArticleDOI
07 Apr 2020-BMJ
TL;DR: Proposed models for covid-19 are poorly reported, at high risk of bias, and their reported performance is probably optimistic, according to a review of published and preprint reports.
Abstract: Objective To review and appraise the validity and usefulness of published and preprint reports of prediction models for diagnosing coronavirus disease 2019 (covid-19) in patients with suspected infection, for prognosis of patients with covid-19, and for detecting people in the general population at increased risk of covid-19 infection or being admitted to hospital with the disease. Design Living systematic review and critical appraisal by the COVID-PRECISE (Precise Risk Estimation to optimise covid-19 Care for Infected or Suspected patients in diverse sEttings) group. Data sources PubMed and Embase through Ovid, up to 1 July 2020, supplemented with arXiv, medRxiv, and bioRxiv up to 5 May 2020. Study selection Studies that developed or validated a multivariable covid-19 related prediction model. Data extraction At least two authors independently extracted data using the CHARMS (critical appraisal and data extraction for systematic reviews of prediction modelling studies) checklist; risk of bias was assessed using PROBAST (prediction model risk of bias assessment tool). Results 37 421 titles were screened, and 169 studies describing 232 prediction models were included. The review identified seven models for identifying people at risk in the general population; 118 diagnostic models for detecting covid-19 (75 were based on medical imaging, 10 to diagnose disease severity); and 107 prognostic models for predicting mortality risk, progression to severe disease, intensive care unit admission, ventilation, intubation, or length of hospital stay. The most frequent types of predictors included in the covid-19 prediction models are vital signs, age, comorbidities, and image features. Flu-like symptoms are frequently predictive in diagnostic models, while sex, C reactive protein, and lymphocyte counts are frequent prognostic factors. Reported C index estimates from the strongest form of validation available per model ranged from 0.71 to 0.99 in prediction models for the general population, from 0.65 to more than 0.99 in diagnostic models, and from 0.54 to 0.99 in prognostic models. All models were rated at high or unclear risk of bias, mostly because of non-representative selection of control patients, exclusion of patients who had not experienced the event of interest by the end of the study, high risk of model overfitting, and unclear reporting. Many models did not include a description of the target population (n=27, 12%) or care setting (n=75, 32%), and only 11 (5%) were externally validated by a calibration plot. The Jehi diagnostic model and the 4C mortality score were identified as promising models. Conclusion Prediction models for covid-19 are quickly entering the academic literature to support medical decision making at a time when they are urgently needed. This review indicates that almost all pubished prediction models are poorly reported, and at high risk of bias such that their reported predictive performance is probably optimistic. However, we have identified two (one diagnostic and one prognostic) promising models that should soon be validated in multiple cohorts, preferably through collaborative efforts and data sharing to also allow an investigation of the stability and heterogeneity in their performance across populations and settings. Details on all reviewed models are publicly available at https://www.covprecise.org/. Methodological guidance as provided in this paper should be followed because unreliable predictions could cause more harm than benefit in guiding clinical decisions. Finally, prediction model authors should adhere to the TRIPOD (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) reporting guideline. Systematic review registration Protocol https://osf.io/ehc47/, registration https://osf.io/wy245. Readers’ note This article is a living systematic review that will be updated to reflect emerging evidence. Updates may occur for up to two years from the date of original publication. This version is update 3 of the original article published on 7 April 2020 (BMJ 2020;369:m1328). Previous updates can be found as data supplements (https://www.bmj.com/content/369/bmj.m1328/related#datasupp). When citing this paper please consider adding the update number and date of access for clarity.

2,183 citations

Reference EntryDOI
15 Oct 2004

2,118 citations