
Showing papers in "IEEE Journal of Biomedical and Health Informatics in 2020"


Journal ArticleDOI
TL;DR: A modified convolutional neural network architecture termed fully dense UNet (FD-UNet) is proposed for removing artifacts from two-dimensional PAT images reconstructed from sparse data and the proposed CNN is compared with the standard UNet in terms of reconstructed image quality.
Abstract: Photoacoustic imaging is an emerging imaging modality that is based upon the photoacoustic effect. In photoacoustic tomography (PAT), the induced acoustic pressure waves are measured by an array of detectors and used to reconstruct an image of the initial pressure distribution. A common challenge faced in PAT is that the measured acoustic waves can only be sparsely sampled. Reconstructing sparsely sampled data using standard methods results in severe artifacts that obscure information within the image. We propose a modified convolutional neural network (CNN) architecture termed fully dense UNet (FD-UNet) for removing artifacts from two-dimensional PAT images reconstructed from sparse data and compare the proposed CNN with the standard UNet in terms of reconstructed image quality.
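To make the architecture concrete, here is a minimal PyTorch sketch of the fully dense connectivity the paper builds on: every convolution receives the concatenation of all earlier feature maps in the block. Channel counts, depth, and the toy input are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Fully dense connectivity: each conv sees all previous feature maps."""
    def __init__(self, in_ch, growth=16, layers=4):
        super().__init__()
        self.convs = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch + i * growth, growth, 3, padding=1),
                nn.BatchNorm2d(growth),
                nn.ReLU(inplace=True),
            )
            for i in range(layers)
        ])

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(conv(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)  # all maps forwarded to the next stage

# In an FD-UNet-style model, blocks like this would replace the plain conv
# pairs at every encoder and decoder resolution of a standard UNet.
x = torch.randn(1, 32, 128, 128)
print(DenseBlock(32)(x).shape)  # torch.Size([1, 96, 128, 128])
```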

293 citations


Journal ArticleDOI
TL;DR: In this paper, a novel electrocardiogram (ECG) classification algorithm is proposed for continuous cardiac monitoring on wearable devices with limited processing capacity, which employs a novel architecture consisting of wavelet transform and multiple long short-term memory (LSTM) recurrent neural networks.
Abstract: Objective : A novel electrocardiogram (ECG) classification algorithm is proposed for continuous cardiac monitoring on wearable devices with limited processing capacity. Methods : The proposed solution employs a novel architecture consisting of wavelet transform and multiple long short-term memory (LSTM) recurrent neural networks (see Fig. 1). Results : Experimental evaluations show superior ECG classification performance compared to previous works. Measurements on different hardware platforms show the proposed algorithm meets timing requirements for continuous and real-time execution on wearable devices. Conclusion : In contrast to many compute-intensive deep-learning based approaches, the proposed algorithm is lightweight, and therefore, brings continuous monitoring with accurate LSTM-based ECG classification to wearable devices. Significance : The proposed algorithm is both accurate and lightweight. The source code is available online at http://lis.ee.sharif.edu.
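A minimal sketch of the wavelet-plus-LSTM idea, assuming PyTorch and PyWavelets; the subband count, hidden size, and fusion by summing per-subband logits are illustrative choices, not the authors' exact design:

```python
import numpy as np
import pywt
import torch
import torch.nn as nn

def wavelet_features(beat, wavelet="db4", level=4):
    """Decompose one ECG beat into subband coefficient vectors."""
    coeffs = pywt.wavedec(beat, wavelet, level=level)
    return [torch.tensor(c, dtype=torch.float32) for c in coeffs]

class SubbandLSTM(nn.Module):
    """One small LSTM per wavelet subband; logits are summed for the final vote."""
    def __init__(self, n_subbands=5, hidden=32, n_classes=5):
        super().__init__()
        self.lstms = nn.ModuleList(
            nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
            for _ in range(n_subbands))
        self.heads = nn.ModuleList(
            nn.Linear(hidden, n_classes) for _ in range(n_subbands))

    def forward(self, subbands):
        logits = 0
        for x, lstm, head in zip(subbands, self.lstms, self.heads):
            _, (h, _) = lstm(x.view(1, -1, 1))  # coefficients as a sequence
            logits = logits + head(h[-1])
        return logits

beat = np.random.randn(360)  # one beat at 360 Hz (placeholder data)
print(SubbandLSTM()(wavelet_features(beat)).shape)  # torch.Size([1, 5])
```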

223 citations


Journal ArticleDOI
TL;DR: In this article, the authors used automated extraction of COVID-19-related discussions from social media and a natural language processing (NLP) method based on topic modeling to uncover various issues related to the disease from public opinions.
Abstract: Internet forums and public social media, such as online healthcare forums, provide a convenient channel for users (people/patients) concerned about health issues to discuss and share information with each other. In late December 2019, an outbreak of a novel coronavirus (infection from which results in the disease named COVID-19) was reported, and, due to the rapid spread of the virus in other parts of the world, the World Health Organization declared a state of emergency. In this paper, we used automated extraction of COVID-19-related discussions from social media and a natural language processing (NLP) method based on topic modeling to uncover various issues related to COVID-19 from public opinions. Moreover, we also investigated how to use an LSTM recurrent neural network for sentiment classification of COVID-19 comments. Our findings shed light on the importance of using public opinions and suitable computational techniques to understand issues surrounding COVID-19 and to guide related decision-making. In addition, experiments demonstrated that the research model achieved an accuracy of 81.15%, higher than that of several other well-known machine-learning algorithms for COVID-19 sentiment classification.
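The topic-modeling step can be illustrated with scikit-learn's LatentDirichletAllocation; the toy posts and the two-topic setting below are placeholders, not the paper's corpus or configuration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

posts = [
    "lockdown makes working from home stressful",
    "vaccine trials show promising early results",
    "hospitals report shortage of masks and beds",
]  # placeholder comments; the paper mines real social-media discussions

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(posts)

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [terms[i] for i in topic.argsort()[-3:]]  # highest-weight words
    print(f"topic {k}: {top}")
```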

200 citations


Journal ArticleDOI
TL;DR: The high sensitivities achieved by most recent COVID-19 classification models are demystified, a homogeneous and balanced database (COVIDGR-1.0) that includes all levels of severity, from Normal with positive RT-PCR, through Mild and Moderate, to Severe, is built, and the COVID Smart Data based Network (COVID-SDNet) methodology is proposed for improving the generalization capacity of COVID classification models.
Abstract: Currently, Coronavirus disease (COVID-19), one of the most infectious diseases in the 21st century, is diagnosed using RT-PCR testing, CT scans and/or Chest X-Ray (CXR) images. CT (Computed Tomography) scanners and RT-PCR testing are not available in most medical centers and hence in many cases CXR images become the most time/cost effective tool for assisting clinicians in making decisions. Deep learning neural networks have a great potential for building COVID-19 triage systems and detecting COVID-19 patients, especially patients with low severity. Unfortunately, current databases do not allow building such systems as they are highly heterogeneous and biased towards severe cases. The contribution of this article is three-fold: (i) we demystify the high sensitivities achieved by most recent COVID-19 classification models, (ii) under a close collaboration with Hospital Universitario Clinico San Cecilio, Granada, Spain, we built COVIDGR-1.0, a homogeneous and balanced database that includes all levels of severity, from Normal with positive RT-PCR, through Mild and Moderate, to Severe. COVIDGR-1.0 contains 426 positive and 426 negative PA (PosteroAnterior) CXR views and (iii) we propose the COVID Smart Data based Network (COVID-SDNet) methodology for improving the generalization capacity of COVID classification models. Our approach reaches good and stable results with an accuracy of 97.72% ± 0.95%, 86.90% ± 3.20%, and 61.80% ± 5.49% in severe, moderate and mild COVID-19 severity levels, respectively. Our approach could help in the early detection of COVID-19. COVIDGR-1.0 along with the severity level labels are available to the scientific community through this link https://dasci.es/es/transferencia/open-data/covidgr/ .

193 citations


Journal ArticleDOI
TL;DR: Integrative analytics approaches driven by the associated research branches highlighted in this study promise to revolutionize imaging informatics as known today across the healthcare continuum for both radiology and digital pathology applications.
Abstract: This paper reviews state-of-the-art research solutions across the spectrum of medical imaging informatics, discusses clinical translation, and provides future directions for advancing clinical practice. More specifically, it summarizes advances in medical imaging acquisition technologies for different modalities, highlighting the necessity for efficient medical data management strategies in the context of AI in big healthcare data analytics. It then provides a synopsis of contemporary and emerging algorithmic methods for disease classification and organ/tissue segmentation, focusing on AI and deep learning architectures that have already become the de facto approach. The clinical benefits of in-silico modelling advances linked with evolving 3D reconstruction and visualization applications are further documented. In conclusion, integrative analytics approaches driven by the associated research branches highlighted in this study promise to revolutionize imaging informatics as known today across the healthcare continuum for both radiology and digital pathology applications. The latter is projected to enable informed, more accurate diagnosis, timely prognosis, and effective treatment planning, underpinning precision medicine.

173 citations


Journal ArticleDOI
TL;DR: A new unsupervised learning method using convolutional neural networks under an end-to-end framework, Volume Tweening Network (VTN), for 3D medical image registration is proposed, which is 880x faster (or 3.3x faster without GPU acceleration) than traditional optimization-based methods and achieves state-of-the-art performance in medical image registration.
Abstract: 3D medical image registration is of great clinical importance. However, supervised learning methods require a large amount of accurately annotated corresponding control points (or morphing), which are very difficult to obtain. Unsupervised learning methods ease the burden of manual annotation by exploiting unlabeled data without supervision. In this article, we propose a new unsupervised learning method using convolutional neural networks under an end-to-end framework, Volume Tweening Network (VTN), for 3D medical image registration. We propose three innovative technical components: (1) An end-to-end cascading scheme that resolves large displacement; (2) An efficient integration of an affine registration network; and (3) An additional invertibility loss that encourages backward consistency. Experiments demonstrate that our algorithm is 880x faster (or 3.3x faster without GPU acceleration) than traditional optimization-based methods and achieves state-of-the-art performance in medical image registration.
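The core differentiable-warping operation behind such registration networks can be sketched with PyTorch's affine_grid/grid_sample; the toy volume and hand-set affine matrix below are illustrative, not VTN's learned parameters:

```python
import torch
import torch.nn.functional as F

# moving: a batch of 3D volumes (N, C, D, H, W); theta: an affine (N, 3, 4)
moving = torch.randn(1, 1, 32, 64, 64)
theta = torch.tensor([[[1.0, 0, 0, 0.05],
                       [0, 1.0, 0, 0.00],
                       [0, 0, 1.0, 0.00]]])  # small translation, for illustration

grid = F.affine_grid(theta, moving.shape, align_corners=False)
warped = F.grid_sample(moving, grid, align_corners=False)

# A dense-deformation stage adds a predicted flow field to this grid; cascading
# several such stages, as VTN does, resolves progressively larger displacements.
print(warped.shape)  # torch.Size([1, 1, 32, 64, 64])
```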

165 citations


Journal ArticleDOI
TL;DR: A deep learning model capable of forecasting glucose levels with leading accuracy for simulated patient cases and with minimal time lag both in a simulated patient dataset and in a real patient dataset is presented.
Abstract: Control of blood glucose is essential for diabetes management. Current digital therapeutic approaches for subjects with type 1 diabetes mellitus such as the artificial pancreas and insulin bolus calculators leverage machine learning techniques for predicting subcutaneous glucose for improved control. Deep learning has recently been applied in healthcare and medical research to achieve state-of-the-art results in a range of tasks including disease diagnosis, and patient state prediction among others. In this paper, we present a deep learning model that is capable of forecasting glucose levels with leading accuracy for simulated patient cases (root-mean-square error (RMSE) = 9.38 ± 0.71 mg/dL over a 30-min horizon, RMSE = 18.87 ± 2.25 mg/dL over a 60-min horizon) and real patient cases (RMSE = 21.07 ± 2.35 mg/dL for 30 min, RMSE = 33.27 ± 4.79 mg/dL for 60 min). In addition, the model provides competitive performance in providing an effective prediction horizon (PH_eff) with minimal time lag both in a simulated patient dataset (PH_eff = 29.0 ± 0.7 for 30 min and PH_eff = 49.8 ± 2.9 for 60 min) and in a real patient dataset (PH_eff = 19.3 ± 3.1 for 30 min and PH_eff = 29.3 ± 9.4 for 60 min). This approach is evaluated on a dataset of ten simulated cases generated from the UVA/Padova simulator and a clinical dataset of ten real cases each containing glucose readings, insulin bolus, and meal (carbohydrate) data. Performance of the recurrent convolutional neural network is benchmarked against four algorithms. The proposed algorithm is implemented on an Android mobile phone, with an execution time of 6 ms on a phone compared to an execution time of 780 ms on a laptop.
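A minimal sketch of a recurrent convolutional forecaster of this general shape, assuming PyTorch; the channel layout (CGM, insulin, meal), layer sizes, and single-step output are illustrative assumptions, not the paper's exact network:

```python
import torch
import torch.nn as nn

class GlucoseForecaster(nn.Module):
    """Convolutional front-end over CGM/insulin/meal channels, LSTM on top."""
    def __init__(self, n_channels=3, hidden=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=5, padding=2), nn.ReLU())
        self.lstm = nn.LSTM(32, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)  # glucose at the chosen horizon

    def forward(self, x):                 # x: (batch, channels, time)
        z = self.conv(x).transpose(1, 2)  # (batch, time, features)
        _, (h, _) = self.lstm(z)
        return self.out(h[-1])

x = torch.randn(8, 3, 60)  # 8 windows of 60 five-minute samples (placeholder)
print(GlucoseForecaster()(x).shape)  # torch.Size([8, 1])
```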

165 citations


Journal ArticleDOI
TL;DR: Experimental results on the COVID-19 dataset suggest that the proposed AFS-DF achieves superior performance in COVID-19 vs. CAP classification, compared with 4 widely used machine learning methods.
Abstract: Chest computed tomography (CT) has become an effective tool to assist the diagnosis of coronavirus disease-19 (COVID-19). Due to the outbreak of COVID-19 worldwide, using the computer-aided diagnosis technique for COVID-19 classification based on CT images could largely alleviate the burden of clinicians. In this paper, we propose an Adaptive Feature Selection guided Deep Forest (AFS-DF) for COVID-19 classification based on chest CT images. Specifically, we first extract location-specific features from CT images. Then, in order to capture the high-level representation of these features with the relatively small-scale data, we leverage a deep forest model to learn high-level representation of the features. Moreover, we propose a feature selection method based on the trained deep forest model to reduce the redundancy of features, where the feature selection could be adaptively incorporated with the COVID-19 classification model. We evaluated our proposed AFS-DF on a COVID-19 dataset with 1495 patients of COVID-19 and 1027 patients of community acquired pneumonia (CAP). The accuracy (ACC), sensitivity (SEN), specificity (SPE), AUC, precision and F1-score achieved by our method are 91.79%, 93.05%, 89.95%, 96.35%, 93.10% and 93.07%, respectively. Experimental results on the COVID-19 dataset suggest that the proposed AFS-DF achieves superior performance in COVID-19 vs. CAP classification, compared with 4 widely used machine learning methods.
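The cascade-plus-selection idea can be sketched with scikit-learn random forests: each level appends class probabilities to its inputs (the deep-forest cascade) and drops features the trained forest found uninformative (the adaptive selection). The data, depth, and median threshold are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 50))            # placeholder CT-derived features
y = (X[:, :5].sum(axis=1) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

cur_tr, cur_te = X_tr, X_te
for level in range(2):                    # a two-level cascade, for illustration
    rf = RandomForestClassifier(n_estimators=100, random_state=level)
    rf.fit(cur_tr, y_tr)
    # Feature selection guided by the trained forest: drop low-importance inputs.
    keep = rf.feature_importances_ >= np.median(rf.feature_importances_)
    # Deep-forest cascading: augment the kept features with class probabilities.
    cur_tr = np.hstack([cur_tr[:, keep], rf.predict_proba(cur_tr)])
    cur_te = np.hstack([cur_te[:, keep], rf.predict_proba(cur_te)])

final = RandomForestClassifier(n_estimators=100, random_state=9).fit(cur_tr, y_tr)
print("test accuracy:", final.score(cur_te, y_te))
```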

156 citations


Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed approach, which combines common spatial pattern (CSP) and a convolutional neural network (CNN), outperforms most state-of-the-art methods on seizure prediction.
Abstract: Epilepsy seizure prediction paves the way for timely warnings that allow patients to take more active and effective intervention measures. Compared to seizure detection, which only identifies the inter-ictal state and the ictal state, far less research has been conducted on seizure prediction, because the high similarity between the pre-ictal state and the inter-ictal state makes them challenging to distinguish. In this paper, a novel solution for seizure prediction is proposed using common spatial pattern (CSP) and a convolutional neural network (CNN). Firstly, artificial pre-ictal EEG signals based on the original ones are generated by combining the segmented pre-ictal signals to solve the trial imbalance problem between the two states. Secondly, a feature extractor employing wavelet packet decomposition and CSP is designed to extract the distinguishing features in both the time domain and the frequency domain. It can improve overall accuracy while reducing the training time. Finally, a shallow CNN is applied to discriminate between the pre-ictal state and the inter-ictal state. Our proposed solution is evaluated on 23 patients’ data from the Boston Children's Hospital-MIT scalp EEG dataset by employing leave-one-out cross-validation, and it achieves a sensitivity of 92.2% and a false prediction rate of 0.12/h. Experimental results demonstrate that the proposed approach outperforms most state-of-the-art methods.
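The CSP step reduces to a generalized eigendecomposition of the two classes' spatial covariance matrices; a minimal NumPy/SciPy sketch, with random arrays standing in for real EEG trials:

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(trials_a, trials_b, n_pairs=3):
    """trials_*: arrays of shape (n_trials, n_channels, n_samples)."""
    cov = lambda T: np.mean([t @ t.T / np.trace(t @ t.T) for t in T], axis=0)
    Ca, Cb = cov(trials_a), cov(trials_b)
    # Generalized eigenproblem: directions maximizing the variance ratio of
    # class A against the pooled covariance.
    vals, vecs = eigh(Ca, Ca + Cb)
    order = np.argsort(vals)
    picks = np.r_[order[:n_pairs], order[-n_pairs:]]  # both extremes discriminate
    return vecs[:, picks].T

pre = np.random.randn(20, 18, 512)    # placeholder pre-ictal EEG trials
inter = np.random.randn(20, 18, 512)  # placeholder inter-ictal trials
W = csp_filters(pre, inter)
features = np.log(np.var(W @ pre[0], axis=1))  # log-variance features per trial
print(W.shape, features.shape)        # (6, 18) (6,)
```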

155 citations


Journal ArticleDOI
TL;DR: It is shown that the proposed architecture's decentralized authentication among a distributed affiliated hospital network does not require re-authentication, which will have a considerable impact on increasing throughput, reducing overhead, improving response time, and decreasing energy consumption in the network.
Abstract: In any interconnected healthcare system (e.g., those that are part of a smart city), interactions between patients, medical doctors, nurses and other healthcare practitioners need to be secure and efficient. For example, all members must be authenticated and securely interconnected to minimize security and privacy breaches from within a given network. However, introducing security and privacy-preserving solutions can also incur delays in processing and other related services, potentially threatening patients’ lives in critical situations. A considerable number of authentication and security systems presented in the literature are centralized, and frequently need to rely on some secure and trusted third-party entity to facilitate secure communications. This, in turn, increases the time required for authentication and decreases throughput due to known overhead, for patients and inter-hospital communications. In this paper, we propose a novel decentralized authentication of patients in a distributed hospital network, by leveraging blockchain. Our notion of a healthcare setting includes patients and allied health professionals (medical doctors, nurses, technicians, etc.), and the health information of patients. Findings from our in-depth simulations demonstrate the potential utility of the proposed architecture. For example, it is shown that the proposed architecture's decentralized authentication among a distributed affiliated hospital network does not require re-authentication. This improvement will have a considerable impact on increasing throughput, reducing overhead, improving response time, and decreasing energy consumption in the network. We also provide a comparative analysis of our model in relation to a base model of the network without blockchain to show the overall effectiveness of our proposed solution.
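The no-re-authentication idea can be illustrated with a toy hash-chained ledger: one hospital appends a signed authentication record, and any peer verifies the replicated record instead of re-running authentication. The shared HMAC key below stands in for the paper's actual blockchain consensus and key infrastructure, purely for illustration:

```python
import hashlib, hmac, json, time

NETWORK_KEY = b"shared-secret-of-the-hospital-consortium"  # stand-in for PKI

def make_block(ledger, payload):
    prev = ledger[-1]["hash"] if ledger else "0" * 64
    body = json.dumps({"prev": prev, "payload": payload, "ts": time.time()})
    block = {"body": body,
             "hash": hashlib.sha256(body.encode()).hexdigest(),
             "sig": hmac.new(NETWORK_KEY, body.encode(), hashlib.sha256).hexdigest()}
    ledger.append(block)
    return block

def verify(block):
    ok_hash = hashlib.sha256(block["body"].encode()).hexdigest() == block["hash"]
    ok_sig = hmac.compare_digest(
        block["sig"],
        hmac.new(NETWORK_KEY, block["body"].encode(), hashlib.sha256).hexdigest())
    return ok_hash and ok_sig

ledger = []
make_block(ledger, {"patient": "P-001", "authenticated_by": "hospital-A"})
# Hospital B checks the replicated record instead of re-authenticating P-001.
print(verify(ledger[-1]))  # True
```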

144 citations


Journal ArticleDOI
TL;DR: This paper proposes a secure data storage and sharing method consisting of a selective encryption algorithm combined with fragmentation and dispersion to protect data safety and privacy even when both transmission media and keys are compromised.
Abstract: The recent spate of cyberattacks has compromised end-users’ data security and privacy in Medical Cyber-Physical Systems (MCPS) in the era of Health 4.0. Traditional standard encryption algorithms for data protection are designed based on a viewpoint of system architecture rather than a viewpoint of end-users. Because such encryption algorithms transfer the protection of the data to the protection of the keys, data safety and privacy will be compromised once the key is exposed. In this paper, we propose a secure data storage and sharing method consisting of a selective encryption algorithm combined with fragmentation and dispersion to protect data safety and privacy even when both transmission media (e.g. cloud servers) and keys are compromised. This method is based on a user-centric design that protects the data on a trusted device such as the end-user's smartphone and lets the end-user control the access for data sharing. We also evaluate the performance of the algorithm on a smartphone platform to demonstrate its efficiency.
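A toy sketch of selective encryption with fragmentation and dispersion, using the cryptography library's Fernet; the 24-byte "sensitive" slice and the three named shard locations are arbitrary placeholders, not the paper's scheme:

```python
from cryptography.fernet import Fernet

record = b"patient:P-001|dx:hypertension|note:responding well to treatment"

# Selective encryption: only the sensitive slice is encrypted on the device.
sensitive, rest = record[:24], record[24:]
key = Fernet.generate_key()          # stays on the user's trusted device
token = Fernet(key).encrypt(sensitive)

# Fragmentation and dispersion: ciphertext and plaintext remainder are split
# across different storage locations, so no single server holds usable data.
shards = {"cloud-a": token[: len(token) // 2],
          "cloud-b": token[len(token) // 2:],
          "cloud-c": rest}

recovered = Fernet(key).decrypt(shards["cloud-a"] + shards["cloud-b"]) + shards["cloud-c"]
assert recovered == record
```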

Journal ArticleDOI
TL;DR: This work aims at finding links between cognitive symptoms and the underlying neurodegeneration process by fusing the information of neuropsychological test outcomes, diagnoses, and other clinical data with the imaging features extracted solely via a data-driven decomposition of MRI.
Abstract: Many classical machine learning techniques have been used to explore Alzheimer's disease (AD), evolving from image decomposition techniques such as principal component analysis toward higher complexity, non-linear decomposition algorithms. With the arrival of the deep learning paradigm, it has become possible to extract high-level abstract features directly from MRI images that internally describe the distribution of data in low-dimensional manifolds. In this work, we present a new exploratory data analysis of AD based on deep convolutional autoencoders. We aim at finding links between cognitive symptoms and the underlying neurodegeneration process by fusing the information of neuropsychological test outcomes, diagnoses, and other clinical data with the imaging features extracted solely via a data-driven decomposition of MRI. The distribution of the extracted features in different combinations is then analyzed and visualized using regression and classification analysis, and the influence of each coordinate of the autoencoder manifold over the brain is estimated. The imaging-derived markers could then predict clinical variables with correlations above 0.6 in the case of neuropsychological evaluation variables such as the MMSE or the ADAS11 scores, achieving a classification accuracy over 80% for the diagnosis of AD.
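A minimal PyTorch sketch of such a convolutional autoencoder; the 64x64 input, layer sizes, and 16-dimensional latent code are illustrative assumptions, with the latent coordinates playing the role of the imaging-derived markers regressed against clinical scores:

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Compress an image slice to a low-dimensional manifold coordinate z."""
    def __init__(self, z_dim=16):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(), nn.Linear(32 * 16 * 16, z_dim))
        self.dec = nn.Sequential(
            nn.Linear(z_dim, 32 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1))

    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z  # z would be regressed against MMSE/ADAS scores

x = torch.randn(4, 1, 64, 64)   # placeholder 64x64 MRI slices
recon, z = ConvAutoencoder()(x)
print(recon.shape, z.shape)     # torch.Size([4, 1, 64, 64]) torch.Size([4, 16])
```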

Journal ArticleDOI
TL;DR: The proposed MTL method for segmenting and classifying tongue images is the first attempt to manage both tasks simultaneously with MTL, and shows that the joint method outperforms the existing tongue characterization methods.
Abstract: Automatic tongue image segmentation and tongue image classification are two crucial tongue characterization tasks in traditional Chinese medicine (TCM). Due to the complexity of tongue segmentation and fine-grained traits of tongue image classification, both tasks are challenging. Fortunately, from the perspective of computer vision, these two tasks are highly interrelated, making them compatible with the idea of Multi-Task Joint learning (MTL). By sharing the underlying parameters and adding two different task loss functions, an MTL method for segmenting and classifying tongue images is proposed in this paper. Moreover, two state-of-the-art deep neural network variants (UNET and Discriminative Filter Learning (DFL)) are fused into the MTL to perform these two tasks. To the best of our knowledge, our method is the first attempt to manage both tasks simultaneously with MTL. We conducted extensive experiments with the proposed method. The experimental results show that our joint method outperforms the existing tongue characterization methods. Besides, visualizations and ablation studies are provided to aid in understanding our approach, which suggest that our method is highly consistent with human perception.
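The parameter-sharing scheme can be sketched in PyTorch as one encoder feeding a segmentation head and a classification head, with the two task losses summed; the tiny encoder and unweighted loss sum below are illustrative, not the paper's UNET/DFL fusion:

```python
import torch
import torch.nn as nn

class TongueMTL(nn.Module):
    """Shared encoder with a segmentation head and a classification head."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.seg_head = nn.Conv2d(64, 1, 1)               # pixel-wise mask logits
        self.cls_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, n_classes))

    def forward(self, x):
        f = self.encoder(x)
        return self.seg_head(f), self.cls_head(f)

model = TongueMTL()
img = torch.randn(2, 3, 128, 128)                 # placeholder tongue images
mask_gt = torch.rand(2, 1, 128, 128).round()
label_gt = torch.randint(0, 4, (2,))
seg, cls = model(img)
# Joint objective: the two task losses share gradients through the encoder.
loss = (nn.functional.binary_cross_entropy_with_logits(seg, mask_gt)
        + nn.functional.cross_entropy(cls, label_gt))
loss.backward()
```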

Journal ArticleDOI
TL;DR: The first deep learning method that can detect malaria parasites in thick blood smear images and can run on smartphones is developed, a promising alternative to manual parasite counting for malaria diagnosis, especially in areas lacking experienced parasitologists.
Abstract: Objective: This work investigates the possibility of automated malaria parasite detection in thick blood smears with smartphones. Methods: We have developed the first deep learning method that can detect malaria parasites in thick blood smear images and can run on smartphones. Our method consists of two processing steps. First, we apply an intensity-based Iterative Global Minimum Screening (IGMS), which performs a fast screening of a thick smear image to find parasite candidates. Then, a customized Convolutional Neural Network (CNN) classifies each candidate as either parasite or background. Together with this paper, we make a dataset of 1819 thick smear images from 150 patients publicly available to the research community. We used this dataset to train and test our deep learning method, as described in this paper. Results: A patient-level five-fold cross-evaluation demonstrates the effectiveness of the customized CNN model in discriminating between positive (parasitic) and negative image patches in terms of the following performance indicators: accuracy (93.46% ± 0.32%), AUC (98.39% ± 0.18%), sensitivity (92.59% ± 1.27%), specificity (94.33% ± 1.25%), precision (94.25% ± 1.13%), and negative predictive value (92.74% ± 1.09%). High correlation coefficients (>0.98) between automatically detected parasites and ground truth, on both image level and patient level, demonstrate the practicality of our method. Conclusion: Promising results are obtained for parasite detection in thick blood smears for a smartphone application using deep learning methods. Significance: Automated parasite detection running on smartphones is a promising alternative to manual parasite counting for malaria diagnosis, especially in areas lacking experienced parasitologists.

Journal ArticleDOI
TL;DR: A blockchain model is developed to protect data security and patients’ privacy, ensure data provenance, and provide patients full control of their health records, achieving patient-centric HIE.
Abstract: Health Information Exchange (HIE) exhibits remarkable benefits for patient care such as improving healthcare quality and expediting coordinated care. The Office of the National Coordinator (ONC) for Health Information Technology is seeking patient-centric HIE designs that shift data ownership from providers to patients. There are multiple barriers to patient-centric HIE in the current system, such as security and privacy concerns, data inconsistency, and timely access to the right records across multiple healthcare facilities. After investigating the current workflow of HIE, this paper provides a feasible solution to these challenges by utilizing the unique features of blockchain, a distributed ledger technology which is considered “unhackable”. Utilizing the smart contract feature, which is a programmable self-executing protocol running on a blockchain, we developed a blockchain model to protect data security and patients’ privacy, ensure data provenance, and provide patients full control of their health records. By personalizing data segmentation and an “allowed list” for clinicians to access their data, this design achieves patient-centric HIE. We conducted a large-scale simulation of this patient-centric HIE process and quantitatively evaluated the model's feasibility, stability, security, and robustness.

Journal ArticleDOI
TL;DR: A novel joint learning framework to perform accurate COVID-19 identification by effectively learning with heterogeneous datasets with distribution discrepancy is proposed, and a powerful backbone is built by redesigning the recently proposed COVID-Net in aspects of network architecture and learning strategy to improve the prediction accuracy and learning efficiency.
Abstract: The pandemic of coronavirus disease 2019 (COVID-19) has led to a global public health crisis spreading across hundreds of countries. With the continuous growth of new infections, developing automated tools for COVID-19 identification with CT images is highly desired to assist the clinical diagnosis and reduce the tedious workload of image interpretation. To enlarge the datasets for developing machine learning methods, it is especially helpful to aggregate the cases from different medical systems for learning robust and generalizable models. This paper proposes a novel joint learning framework to perform accurate COVID-19 identification by effectively learning with heterogeneous datasets with distribution discrepancy. We build a powerful backbone by redesigning the recently proposed COVID-Net in aspects of network architecture and learning strategy to improve the prediction accuracy and learning efficiency. On top of our improved backbone, we further explicitly tackle the cross-site domain shift by conducting separate feature normalization in latent space. Moreover, we propose to use a contrastive training objective to enhance the domain invariance of semantic embeddings for boosting the classification performance on each dataset. We develop and evaluate our method with two public large-scale COVID-19 diagnosis datasets made up of CT images. Extensive experiments show that our approach consistently improves the performances on both datasets, outperforming the original COVID-Net trained on each dataset by 12.16% and 14.23% in AUC respectively, also exceeding existing state-of-the-art multi-site learning methods.
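The separate-feature-normalization idea can be sketched as one BatchNorm module per site behind a shared backbone; the sizes and the linear stand-in backbone are illustrative assumptions:

```python
import torch
import torch.nn as nn

class SiteSpecificBN(nn.Module):
    """One BatchNorm per site: shared weights upstream, separate statistics."""
    def __init__(self, n_features, n_sites=2):
        super().__init__()
        self.bns = nn.ModuleList(nn.BatchNorm1d(n_features) for _ in range(n_sites))

    def forward(self, x, site):
        return self.bns[site](x)

backbone = nn.Linear(128, 64)          # stand-in for the shared CNN backbone
norm = SiteSpecificBN(64, n_sites=2)

feats_site0 = norm(backbone(torch.randn(16, 128)), site=0)
feats_site1 = norm(backbone(torch.randn(16, 128)), site=1)
# Each site's batch is normalized with its own statistics, so a site-specific
# intensity shift no longer moves both sites' features in a shared direction.
print(feats_site0.shape, feats_site1.shape)
```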

Journal ArticleDOI
Kezhi Li, Chengyuan Liu, Taiyu Zhu, Pau Herrero, Pantelis Georgiou
TL;DR: GluNet is introduced, a framework that leverages a personalized deep neural network to predict the probabilistic distribution of short-term CGM measurements for subjects with T1D based on their historical data including glucose measurements, meal information, insulin doses, and other factors.
Abstract: For people with Type 1 diabetes (T1D), forecasting of blood glucose (BG) can be used to effectively avoid hyperglycemia, hypoglycemia and associated complications. The latest continuous glucose monitoring (CGM) technology allows people to observe glucose in real-time. However, an accurate glucose forecast remains a challenge. In this work, we introduce GluNet, a framework that leverages a personalized deep neural network to predict the probabilistic distribution of short-term (30–60 minutes) future CGM measurements for subjects with T1D based on their historical data including glucose measurements, meal information, insulin doses, and other factors. It adopts the latest deep learning techniques consisting of four components: data pre-processing, label transform/recover, multiple layers of dilated convolutional neural network (CNN), and post-processing. The method is evaluated in-silico for both adult and adolescent subjects. The results show significant improvements over existing methods in the literature through a comprehensive comparison in terms of root mean square error (RMSE) (8.88 ± 0.77 mg/dL) with short time lag (0.83 ± 0.40 minutes) for prediction horizons (PH) = 30 mins (minutes), and RMSE (19.90 ± 3.17 mg/dL) with time lag (16.43 ± 4.07 mins) for PH = 60 mins for virtual adult subjects. In addition, GluNet is also tested on two clinical data sets. Results show that it achieves an RMSE (19.28 ± 2.76 mg/dL) with time lag (8.03 ± 4.07 mins) for PH = 30 mins and an RMSE (31.83 ± 3.49 mg/dL) with time lag (17.78 ± 8.00 mins) for PH = 60 mins. These are the best reported results for glucose forecasting when compared with other methods including the neural network for predicting glucose (NNPG), the support vector regression (SVR), the latent variable with exogenous input (LVX), and the auto regression with exogenous input (ARX) algorithm.
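The dilated-CNN component can be sketched in PyTorch as a stack of causal convolutions with exponentially growing dilation (WaveNet-style); channel counts, depth, and the regression head are illustrative assumptions, not GluNet's exact configuration:

```python
import torch
import torch.nn as nn

class DilatedGlucoseNet(nn.Module):
    """Stacked causal convolutions with doubling dilation, WaveNet-style."""
    def __init__(self, n_channels=4, width=32, n_layers=5):
        super().__init__()
        layers, in_ch = [], n_channels
        for i in range(n_layers):
            d = 2 ** i                                 # receptive field doubles
            layers += [nn.ConstantPad1d((d, 0), 0.0),  # left-pad => causal
                       nn.Conv1d(in_ch, width, kernel_size=2, dilation=d),
                       nn.ReLU()]
            in_ch = width
        self.net = nn.Sequential(*layers)
        self.head = nn.Conv1d(width, 1, 1)             # per-step forecast

    def forward(self, x):                              # x: (batch, channels, time)
        return self.head(self.net(x))

x = torch.randn(8, 4, 96)  # glucose, meal, insulin, time-of-day channels
print(DilatedGlucoseNet()(x).shape)  # torch.Size([8, 1, 96])
```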

Journal ArticleDOI
TL;DR: A novel framework for generating synthetic data that closely approximates the joint distribution of variables in an original EHR dataset is proposed, providing a readily accessible, legally and ethically appropriate solution to support more open data sharing and enable the development of AI solutions.
Abstract: The medical and machine learning communities are relying on the promise of artificial intelligence (AI) to transform medicine through enabling more accurate decisions and personalized treatment. However, progress is slow. Legal and ethical issues around unconsented patient data and privacy are among the limiting factors in data sharing, resulting in a significant barrier to accessing routinely collected electronic health records (EHR) by the machine learning community. We propose a novel framework for generating synthetic data that closely approximates the joint distribution of variables in an original EHR dataset, providing a readily accessible, legally and ethically appropriate solution to support more open data sharing, enabling the development of AI solutions. In order to address issues around lack of clarity in defining sufficient anonymization, we created a quantifiable, mathematical definition for “identifiability”. We used a conditional generative adversarial networks (GAN) framework to generate synthetic data while minimizing patient identifiability, which is defined based on the probability of re-identification given the combination of all data on any individual patient. We compared models fitted to our synthetically generated data to those fitted to the real data across four independent datasets to evaluate similarity in model performance, while assessing the extent to which original observations can be identified from the synthetic data. Our model, ADS-GAN, consistently outperformed state-of-the-art methods, and demonstrated reliability in the joint distributions. We propose that this method could be used to develop datasets that can be made publicly available while considerably lowering the risk of breaching patient confidentiality.

Journal ArticleDOI
TL;DR: The study shows that well-designed and properly trained deep learning models can achieve PCa Gleason grading accuracy that is comparable to an expert pathologist.
Abstract: Visual inspection of histopathology images of stained biopsy tissue by expert pathologists is the standard method for grading of prostate cancer (PCa). However, this process is time-consuming and subject to high inter-observer variability. Machine learning-based methods have the potential to improve efficient throughput of large volumes of slides while decreasing variability, but they are not easy to develop because they require substantial amounts of labeled training data. In this paper, we propose a deep learning-based classification technique and data augmentation methods for accurate grading of PCa in histopathology images in the presence of limited data. Our method combines the predictions of three separate convolutional neural networks (CNNs) that work with different patch sizes. This enables our method to take advantage of the greater amount of contextual information in larger patches as well as greater quantity of smaller patches in the labeled training data. The predictions produced by the three CNNs are combined using a logistic regression model, which is trained separately after the CNN training. To effectively train our models, we propose new data augmentation methods and empirically study their effects on the classification accuracy. The proposed method achieves an accuracy of 92% in classifying cancerous patches versus benign patches and an accuracy of 86% in classifying low-grade (i.e., Gleason grade 3) from high-grade (i.e., Gleason grades 4 and 5) patches. The agreement level of our automatic grading method with expert pathologists is within the range of agreement between pathologists. Our experiments indicate that data augmentation is necessary for achieving expert-level performance with deep learning-based methods. A combination of image-space augmentation and feature-space augmentation leads to the best results. Our study shows that well-designed and properly trained deep learning models can achieve PCa Gleason grading accuracy that is comparable to an expert pathologist.
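The fusion step, training a logistic regression on the three CNNs' outputs after CNN training, can be sketched with scikit-learn; the synthetic probabilities below merely stand in for real per-patch CNN predictions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder probabilities standing in for the outputs of three CNNs that
# classify the same slide region at three different patch sizes.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=300)                  # benign vs. cancerous
p_small = np.clip(y + rng.normal(0, 0.35, 300), 0, 1)
p_medium = np.clip(y + rng.normal(0, 0.30, 300), 0, 1)
p_large = np.clip(y + rng.normal(0, 0.40, 300), 0, 1)

stacked = np.column_stack([p_small, p_medium, p_large])
fuser = LogisticRegression().fit(stacked[:200], y[:200])  # trained after the CNNs
print("fused accuracy:", fuser.score(stacked[200:], y[200:]))
```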

Journal ArticleDOI
TL;DR: A new end-to-end deep learning architecture for retinal vessel segmentation, HAnet, is composed of three decoder networks, one of which dynamically locates which image regions are “hard” or “easy” to analyze, and introduces attention mechanisms to reinforce focus on image features in the “hard” regions.
Abstract: Automated retinal vessel segmentation is among the most significant application and research topics in ophthalmologic image analysis. Deep learning based retinal vessel segmentation models have attracted much attention in recent years. However, current deep network designs tend to predominantly focus on vessels which are easy to segment, while overlooking vessels which are more difficult to segment, such as thin vessels or those with uncertain boundaries. To address this critical gap, we propose a new end-to-end deep learning architecture for retinal vessel segmentation: hard attention net (HAnet). Our design is composed of three decoder networks: the first dynamically locates which image regions are “hard” or “easy” to analyze, while the other two aim to segment retinal vessels in these “hard” and “easy” regions independently. We introduce attention mechanisms in the network to reinforce focus on image features in the “hard” regions. Finally, the final vessel segmentation map is generated by fusing all decoder outputs. To quantify the network's performance, we evaluate our model on four public fundus photography datasets (DRIVE, STARE, CHASE_DB1, HRF), two recently published color scanning laser ophthalmoscopy image datasets (IOSTAR, RC-SLO), and a self-collected indocyanine green angiography dataset. Compared to existing state-of-the-art models, the proposed architecture achieves better/comparable performances in segmentation accuracy, area under the receiver operating characteristic curve (AUC), and F1-score. To further gauge the ability to generalize our model, cross-dataset and cross-modality evaluations are conducted and demonstrate promising extendibility of our proposed network architecture.

Journal ArticleDOI
TL;DR: This work proposes a medical image synthesis model named abnormal-to-normal translation generative adversarial network (ANT-GAN) to generate a normal-looking medical image based on its abnormal-looking counterpart, without the need for paired training data.
Abstract: The identification of lesions within medical image data is necessary for diagnosis, treatment and prognosis. Segmentation and classification approaches are mainly based on supervised learning with well-paired image-level or voxel-level labels. However, labeling lesions in medical images is laborious, requiring highly specialized knowledge. We propose a medical image synthesis model named abnormal-to-normal translation generative adversarial network (ANT-GAN) to generate a normal-looking medical image based on its abnormal-looking counterpart without the need for paired training data. Unlike typical GANs, whose aim is to generate realistic samples with variations, our more restrictive model aims at producing a normal-looking image corresponding to one containing lesions, and thus requires a special design. Being able to provide a “normal” counterpart to a medical image can provide useful side information for medical imaging tasks like lesion segmentation or classification, as validated by our experiments. In another aspect, the ANT-GAN model is also capable of producing a highly realistic lesion-containing image corresponding to the healthy one, which shows its potential in data augmentation, as verified in our experiments.

Journal ArticleDOI
TL;DR: A multi-domain connectome CNN (MDC-CNN) based on a parallel ensemble of 1D and 2D CNNs is proposed to integrate the features from various domains and dimensions using different fusion strategies.
Abstract: Objective: We exploit altered patterns in brain functional connectivity as features for automatic discriminative analysis of neuropsychiatric patients. Deep learning methods have been introduced to functional network classification only very recently for fMRI, and the proposed architectures essentially focused on a single type of connectivity measure. Methods: We propose a deep convolutional neural network (CNN) framework for classification of electroencephalogram (EEG)-derived brain connectome in schizophrenia (SZ). To capture complementary aspects of disrupted connectivity in SZ, we explore combination of various connectivity features consisting of time and frequency-domain metrics of effective connectivity based on vector autoregressive model and partial directed coherence, and complex network measures of network topology. We design a novel multi-domain connectome CNN (MDC-CNN) based on a parallel ensemble of 1D and 2D CNNs to integrate the features from various domains and dimensions using different fusion strategies. We also consider an extension to dynamic brain connectivity using the recurrent neural networks. Results: Hierarchical latent representations learned by the multiple convolutional layers from EEG connectivity reveals apparent group differences between SZ and healthy controls (HC). Results on a large resting-state EEG dataset show that the proposed CNNs significantly outperform traditional support vector machine classifier. The MDC-CNN with combined connectivity features further improves performance over single-domain CNNs using individual features, achieving remarkable accuracy of 91.69% with a decision-level fusion. Conclusion: The proposed MDC-CNN by integrating information from diverse brain connectivity descriptors is able to accurately discriminate SZ from HC. Significance: The new framework is potentially useful for developing diagnostic tools for SZ and other disorders.

Journal ArticleDOI
TL;DR: The method achieves a competitive inter-patient heartbeat classification performance without complex handcrafted features or the intervention of a human expert, and it can also be adjusted to handle various other tasks related to ECG classification.
Abstract: This paper presents a novel deep learning framework for the inter-patient electrocardiogram (ECG) heartbeat classification. A symbolization approach especially designed for ECG is introduced, which can jointly represent the morphology and rhythm of the heartbeat and alleviate the influence of inter-patient variation through baseline correction. The symbolic representation of the heartbeat is used by a multi-perspective convolutional neural network (MPCNN) to learn features automatically and classify the heartbeat. We evaluate our method for the detection of the supraventricular ectopic beat (SVEB) and ventricular ectopic beat (VEB) on the MIT-BIH arrhythmia dataset. Compared with the state-of-the-art methods based on manual features or deep learning models, our method shows superior performance: an overall accuracy of 96.4%, and F1 scores for SVEB and VEB of 76.6% and 89.7%, respectively. The ablation study on our method validates the effectiveness of the proposed symbolization approach and joint representation architecture, which can help the deep learning model to learn more general features and improve the ability of generalization for unseen patients. Because our method achieves a competitive inter-patient heartbeat classification performance without complex handcrafted features or the intervention of a human expert, it can also be adjusted to handle various other tasks related to ECG classification.

Journal ArticleDOI
TL;DR: A multi-scale deep architecture is proposed that decomposes an EEG signal into different frequency bands as input to CNNs, and utilizes the multi-head self-attention module of the transformer model to capture global temporal context, which not only improves performance but also shortens training time.
Abstract: Sleep staging is to score the sleep state of a subject into different sleep stages such as Wake and Rapid Eye Movement (REM). It plays an indispensable role in the diagnosis and treatment of sleep disorders. As manual sleep staging through well-trained sleep experts is time-consuming, tedious, and subjective, many automatic methods have been developed for accurate, efficient, and objective sleep staging. Recently, deep learning based methods have been successfully proposed for electroencephalogram (EEG) based sleep staging with promising results. However, most of these methods directly take EEG raw signals as input of convolutional neural networks (CNNs) without considering the domain knowledge of EEG staging. Apart from that, to capture temporal information, most of the existing methods utilize recurrent neural networks such as LSTM (Long Short Term Memory), which are not effective for modelling global temporal context and are difficult to train. Therefore, inspired by the clinical guidelines of sleep staging such as the AASM (American Academy of Sleep Medicine) rules, where different stages are generally characterized by EEG waveforms of various frequencies, we propose a multi-scale deep architecture that decomposes an EEG signal into different frequency bands as input to CNNs. To model global temporal context, we utilize the multi-head self-attention module of the transformer model to not only improve performance, but also shorten the training time. In addition, we choose a residual based architecture, which makes training end-to-end. Experimental results on two widely used sleep staging datasets, the Montreal Archive of Sleep Studies (MASS) and sleep-EDF datasets, demonstrate the effectiveness and significant efficiency (up to 12 times less training time) of our proposed method over the state-of-the-art.
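The self-attention component can be sketched with PyTorch's nn.MultiheadAttention applied across a night's sequence of per-epoch features; the dimensions and five-stage head are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Per-epoch features from band-wise CNNs: (batch, n_epochs, feature_dim).
feats = torch.randn(4, 20, 128)

attn = nn.MultiheadAttention(embed_dim=128, num_heads=8, batch_first=True)
ctx, weights = attn(feats, feats, feats)  # every epoch attends to every other

classifier = nn.Linear(128, 5)            # W, N1, N2, N3, REM
logits = classifier(ctx)                  # one stage prediction per epoch
print(logits.shape, weights.shape)        # (4, 20, 5) (4, 20, 20)
```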

Journal ArticleDOI
TL;DR: A novel end-to-end Deep Multi-Scale Fusion convolutional neural network (DMSFNet) architecture for multi-class arrhythmia detection that can effectively capture abnormal patterns of diseases and suppress noise interference by multi-scale feature extraction and cross-scale information complementarity of ECG signals.
Abstract: Automated electrocardiogram (ECG) analysis for arrhythmia detection plays a critical role in early prevention and diagnosis of cardiovascular diseases. Extracting powerful features from raw ECG signals for fine-grained diseases classification is still a challenging problem today due to variable abnormal rhythms and noise distribution. For ECG analysis, previous research works depend mostly on heartbeats or single-scale signal segments, which ignores underlying complementary information of different scales. In this paper, we formulate a novel end-to-end Deep Multi-Scale Fusion convolutional neural network (DMSFNet) architecture for multi-class arrhythmia detection. Our proposed approach can effectively capture abnormal patterns of diseases and suppress noise interference by multi-scale feature extraction and cross-scale information complementarity of ECG signals. The proposed method implements feature extraction for signal segments with different sizes by integrating multiple convolution kernels with different receptive fields. Meanwhile, a joint optimization strategy with multiple losses of different scales is designed, which not only learns scale-specific features, but also realizes cumulatively multi-scale complementary feature learning during the learning process. In our work, we demonstrate our DMSFNet on two open datasets (CPSC_2018 and PhysioNet/CinC_2017) and deliver state-of-the-art performance on them. Among them, CPSC_2018 is a 12-lead ECG dataset and CinC_2017 is a single-lead dataset. For these two datasets, we achieve F1 scores of 82.8% and 84.1% respectively, higher than previous state-of-the-art approaches. The results demonstrate that our end-to-end DMSFNet has outstanding performance for feature extraction from a broad range of distinct arrhythmias and elegant generalization ability for effectively handling ECG signals with different leads.
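The multi-receptive-field idea can be sketched as parallel 1D convolution branches with different kernel sizes whose outputs are concatenated; the kernel sizes and channel counts here are illustrative, not DMSFNet's exact configuration:

```python
import torch
import torch.nn as nn

class MultiScaleBlock(nn.Module):
    """Parallel conv branches with different receptive fields, concatenated."""
    def __init__(self, in_ch=1, branch_ch=16, kernel_sizes=(3, 7, 15)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Conv1d(in_ch, branch_ch, k, padding=k // 2),
                          nn.BatchNorm1d(branch_ch), nn.ReLU())
            for k in kernel_sizes)

    def forward(self, x):              # x: (batch, leads, samples)
        return torch.cat([b(x) for b in self.branches], dim=1)

ecg = torch.randn(8, 1, 3000)          # single-lead 10 s segments at 300 Hz
out = MultiScaleBlock()(ecg)
# In a DMSFNet-style model, each scale would also carry its own auxiliary loss
# so that scale-specific and fused features are optimized jointly.
print(out.shape)                       # torch.Size([8, 48, 3000])
```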

Journal ArticleDOI
TL;DR: The new visions and features of the CPS-based HRS are proposed, the latest progress in related enabling technologies is reviewed, including artificial intelligence, sensing fundamentals, materials and machines, cloud computing and communication, as well as motion capture and mapping.
Abstract: Powered by the technologies that have originated from manufacturing, the fourth revolution of healthcare technologies is happening (Healthcare 4.0). As an example of such revolution, new generation homecare robotic systems (HRS) based on the cyber-physical systems (CPS) with higher speed and more intelligent execution are emerging. In this article, the new visions and features of the CPS-based HRS are proposed. The latest progress in related enabling technologies is reviewed, including artificial intelligence, sensing fundamentals, materials and machines, cloud computing and communication, as well as motion capture and mapping. Finally, the future perspectives of the CPS-based HRS and the technical challenges faced in each technical area are discussed.

Journal ArticleDOI
Serap Aydin
TL;DR: The results reveal that emotion formation is mostly influenced by individual experiences rather than gender; local neuronal complexity is mostly sensitive to the affective valence rating, while regional neuro-cortical connectivity levels are mostly sensitive to the affective arousal ratings.
Abstract: In the present article, a novel emotional complexity marker is proposed for classification of discrete emotions induced by affective video film clips. Principal Component Analysis (PCA) is applied to the full-band specific phase space trajectory matrix (PSTM) extracted from a short emotional EEG segment of 6 s, then the first principal component is used to measure the level of local neuronal complexity. As well, the Phase Locking Value (PLV) between the right and left hemispheres is estimated in order to observe the superiority of local neuronal complexity estimation to regional neuro-cortical connectivity measurements in clustering nine discrete emotions (fear, anger, happiness, sadness, amusement, surprise, excitement, calmness, disgust) by using Long-Short-Term-Memory Networks as deep learning applications. In tests, two groups (healthy females and males aged between 22 and 33 years old) are classified with accuracy levels of 68.52% and 79.36% through the proposed emotional complexity markers and connectivity levels in terms of PLV in amusement. The groups are found to be statistically different (p ≪ 0.05) in amusement with respect to both metrics, even if gender difference does not lead to different neuro-cortical functions in any of the other discrete emotional states. A high deep learning classification accuracy of 98.00% is commonly obtained for discrimination of positive emotions from negative emotions through the proposed new complexity markers. Besides, considerable useful classification performance is obtained in discriminating mixed emotions from each other through full-band connectivity features. The results reveal that emotion formation is mostly influenced by individual experiences rather than gender. In detail, local neuronal complexity is mostly sensitive to the affective valence rating, while regional neuro-cortical connectivity levels are mostly sensitive to the affective arousal ratings.
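The PLV between hemispheres reduces to the consistency of the phase difference between the two analytic signals; a short NumPy/SciPy sketch on synthetic sinusoids (real use would take EEG channels from the right and left hemispheres):

```python
import numpy as np
from scipy.signal import hilbert

def plv(x, y):
    """Phase locking value between two equal-length signals."""
    phase_x = np.angle(hilbert(x))
    phase_y = np.angle(hilbert(y))
    return np.abs(np.mean(np.exp(1j * (phase_x - phase_y))))

fs = 128
t = np.arange(0, 6, 1 / fs)           # a 6 s segment, as in the paper
left = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)
right = np.sin(2 * np.pi * 10 * t + 0.3) + 0.5 * np.random.randn(t.size)
print(round(plv(left, right), 3))     # near 1 for strongly phase-locked channels
```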

Journal ArticleDOI
TL;DR: Deep learning enables automatic sleep staging for suspected OSA patients with high accuracy and expectedly, the accuracy decreased with increasing OSA severity.
Abstract: The identification of sleep stages is essential in the diagnostics of sleep disorders, among which obstructive sleep apnea (OSA) is one of the most prevalent. However, manual scoring of sleep stages is time-consuming, subjective, and costly. To overcome this shortcoming, we aimed to develop an accurate deep learning approach for automatic classification of sleep stages and to study the effect of OSA severity on the classification accuracy. Overnight polysomnographic recordings from a public dataset of healthy individuals (Sleep-EDF, n = 153) and from a clinical dataset ( n = 891) of patients with suspected OSA were used to develop a combined convolutional and long short-term memory neural network. On the public dataset, the model achieved sleep staging accuracy of 83.7% (κ = 0.77) with a single frontal EEG channel and 83.9% (κ = 0.78) when supplemented with EOG. For the clinical dataset, the model achieved accuracies of 82.9% (κ = 0.77) and 83.8% (κ = 0.78) with a single EEG channel and two channels (EEG+EOG), respectively. The sleep staging accuracy decreased with increasing OSA severity. The single-channel accuracy ranged from 84.5% (κ = 0.79) for individuals without OSA diagnosis to 76.5% (κ = 0.68) for patients with severe OSA. In conclusion, deep learning enables automatic sleep staging for suspected OSA patients with high accuracy and expectedly, the accuracy decreased with increasing OSA severity. Furthermore, the accuracies achieved in the public dataset were superior to previously published state-of-the-art methods. Adding an EOG channel did not significantly increase the accuracy. The automatic, single-channel-based sleep staging could enable easy, accurate, and cost-efficient integration of EEG recording into diagnostic ambulatory recordings.

Journal ArticleDOI
TL;DR: This method provides an overall performance improvement in terms of sensitivity, precision, and specificity compared to the conventional false positive learning method, and thus achieves state-of-the-art results on the CVC-ClinicVideoDB video dataset.
Abstract: Automatic polyp detection has been shown to be difficult due to various polyp-like structures in the colon and high interclass variations in polyp size, color, shape, and texture. An efficient method should not only have a high correct detection rate (high sensitivity) but also a low false detection rate (high precision and specificity). The state-of-the-art detection methods include convolutional neural networks (CNN). However, CNNs have shown to be vulnerable to small perturbations and noise; they sometimes miss the same polyp appearing in neighboring frames and produce a high number of false positives. We aim to tackle this problem and improve the overall performance of the CNN-based object detectors for polyp detection in colonoscopy videos. Our method consists of two stages: a region of interest (RoI) proposal by CNN-based object detector networks and a false positive (FP) reduction unit. The FP reduction unit exploits the temporal dependencies among image frames in video by integrating the bidirectional temporal information obtained by RoIs in a set of consecutive frames. This information is used to make the final decision. The experimental results show that the bidirectional temporal information has been helpful in estimating polyp positions and accurately predicting the FPs. This provides an overall performance improvement in terms of sensitivity, precision, and specificity compared to the conventional false positive learning method, and thus achieves state-of-the-art results on the CVC-ClinicVideoDB video dataset.
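The temporal-integration idea can be illustrated with a toy consistency filter over per-frame detector confidences: a detection survives only if confident detections also appear in neighboring frames on both sides. The threshold and window size are illustrative assumptions, far simpler than the paper's learned FP-reduction unit:

```python
import numpy as np

def temporal_consistency_filter(confidences, window=1, min_hits=2):
    """Keep a detection only if enough frames in a window around it (looking
    both backward and forward) also contain a confident detection."""
    hits = confidences > 0.5
    kept = np.zeros_like(hits)
    for i in range(len(hits)):
        lo, hi = max(0, i - window), min(len(hits), i + window + 1)
        kept[i] = hits[i] and hits[lo:hi].sum() >= min_hits
    return kept

# Per-frame max detector confidence for a short clip (placeholder values):
conf = np.array([0.1, 0.9, 0.8, 0.9, 0.2, 0.7, 0.1, 0.1])
print(temporal_consistency_filter(conf))
# The isolated hit at index 5 is rejected; the consistent run at 1-3 survives.
```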

Journal ArticleDOI
Peng Tang, Qiaokang Liang, Xintong Yan, Shao Xiang, Dan Zhang
TL;DR: A Global-Part Convolutional Neural Network (GP-CNN) model, which treats the fine-grained local information and global context information with equal importance, and a data-transformed ensemble learning strategy, which can boost the classification performance by integrating the different discriminant information from GP-CNNs that are trained with original images, color constancy transformed images, and feature saliency transformed images.
Abstract: Precise skin lesion classification is still challenging due to two problems, i.e., (1) inter-class similarity and intra-class variation of skin lesion images, and (2) the weak generalization ability of a single Deep Convolutional Neural Network trained with limited data. Therefore, we propose a Global-Part Convolutional Neural Network (GP-CNN) model, which treats the fine-grained local information and global context information with equal importance. The Global-Part model consists of a Global Convolutional Neural Network (G-CNN) and a Part Convolutional Neural Network (P-CNN). Specifically, the G-CNN is trained with downscaled dermoscopy images, and is used to extract the global-scale information of dermoscopy images and produce the Classification Activation Map (CAM). The P-CNN is trained with the CAM-guided cropped image patches and is used to capture local-scale information of skin lesion regions. Additionally, we present a data-transformed ensemble learning strategy, which can further boost the classification performance by integrating the different discriminant information from GP-CNNs that are trained with original images, color constancy transformed images, and feature saliency transformed images, respectively. The proposed method is evaluated on the ISIC 2016 and ISIC 2017 Skin Lesion Challenge (SLC) classification datasets. Experimental results indicate that the proposed method can achieve state-of-the-art skin lesion classification performance (i.e., an AP value of 0.718 on the ISIC 2016 SLC dataset and an average AUC value of 0.926 on the ISIC 2017 SLC dataset) without any external data, compared with other current methods which need to use external data.
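The CAM-guided cropping hinges on reading a class activation map off a global-average-pooled CNN; a minimal PyTorch sketch (layer sizes illustrative, and the crop step only indicated in a comment):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGCNN(nn.Module):
    """Global CNN with GAP, so a class activation map can be read off directly."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.fc = nn.Linear(64, n_classes)

    def forward(self, x):
        f = self.features(x)                       # (B, 64, H, W)
        logits = self.fc(f.mean(dim=(2, 3)))       # global average pooling
        return logits, f

model = TinyGCNN()
img = torch.randn(1, 3, 224, 224)                  # placeholder dermoscopy image
logits, fmaps = model(img)
cls = logits.argmax(1).item()
# CAM: project the feature maps through the chosen class's FC weights.
cam = torch.einsum("c,chw->hw", model.fc.weight[cls], fmaps[0])
cam = F.interpolate(cam[None, None], size=img.shape[-2:], mode="bilinear")[0, 0]
# A P-CNN would then be trained on a crop around cam's maximum response.
print(cam.shape)  # torch.Size([224, 224])
```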