scispace - formally typeset
Author

M. M. Al Rahhal

Bio: M. M. Al Rahhal is an academic researcher from King Saud University. The author has contributed to research in the topics of computer science and artificial intelligence, has an h-index of 2, and has co-authored 4 publications receiving 423 citations.

Papers
Journal ArticleDOI
TL;DR: A novel deep learning approach for active classification of electrocardiogram (ECG) signals that learns a suitable feature representation from the raw ECG data in an unsupervised way using stacked denoising autoencoders (SDAEs) with a sparsity constraint.

507 citations
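As an illustration of the technique the TL;DR names (not the paper's actual code), the core of one denoising-autoencoder layer with a KL-divergence sparsity penalty can be sketched in NumPy; all shapes, names, and constants below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dae_loss(x, W, b, b_prime, noise=0.3, rho=0.05, beta=0.1):
    """Loss of one denoising-autoencoder layer with a sparsity penalty.

    The input is corrupted with masking noise, encoded, reconstructed
    with tied weights, and penalised (KL term) when the mean hidden
    activation deviates from the sparsity target `rho`.
    """
    x_tilde = x * (rng.random(x.shape) > noise)      # masking corruption
    h = sigmoid(x_tilde @ W + b)                     # encoder
    x_hat = sigmoid(h @ W.T + b_prime)               # decoder (tied weights)
    recon = np.mean((x - x_hat) ** 2)                # reconstruction error
    rho_hat = h.mean(axis=0).clip(1e-6, 1 - 1e-6)    # mean hidden activation
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    return recon + beta * kl

# Toy stand-in for raw ECG windows: 8 windows of 16 samples, 6 hidden units.
x = rng.random((8, 16))
W = rng.normal(scale=0.1, size=(16, 6))
loss = dae_loss(x, W, np.zeros(6), np.zeros(16))
```

Stacking such layers (feeding each layer's hidden code to the next) gives the SDAE feature representation that the paper then classifies actively.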

Journal ArticleDOI
TL;DR: An asymmetric adaptation neural network (AANN) method for cross-domain classification in remote sensing images by feeding features obtained from a pretrained convolutional neural network to a denoising autoencoder to perform dimensionality reduction.
Abstract: In this letter, we introduce an asymmetric adaptation neural network (AANN) method for cross-domain classification in remote sensing images. Before the adaptation process, we feed the features obtained from a pretrained convolutional neural network to a denoising autoencoder (DAE) to perform dimensionality reduction. Then the first hidden layer of AANN (placed on the top of DAE) maps the labeled source data to the target space, while the subsequent layers control the separation between the available land-cover classes. To learn its weights, the network minimizes an objective function composed of two losses related to the distance between the source and target data distributions and class separation. The results of experiments conducted on six scenarios built from three benchmark scene remote sensing data sets (i.e., Merced, KSA, and AID data sets) are reported and discussed.

49 citations
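The AANN objective described above combines a domain-distance term with a class-separation term. A minimal NumPy sketch of such a two-term objective, using a simple moment-matching distance and source-side cross-entropy (the functional form and weight are illustrative assumptions, not the paper's exact losses):

```python
import numpy as np

rng = np.random.default_rng(1)

def aann_objective(src_feats, tgt_feats, logits, labels, lam=0.5):
    """Two-term objective sketch: distribution distance + class separation.

    The distance term pulls the source and target feature means together;
    a softmax cross-entropy term on the labeled source samples keeps the
    land-cover classes separated.
    """
    dist = np.sum((src_feats.mean(axis=0) - tgt_feats.mean(axis=0)) ** 2)
    z = logits - logits.max(axis=1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    ce = -log_p[np.arange(len(labels)), labels].mean()
    return ce + lam * dist

# Illustrative features after the DAE dimensionality reduction.
src = rng.normal(size=(32, 8))
tgt = rng.normal(loc=0.5, size=(32, 8))
logits = rng.normal(size=(32, 3))          # 3 land-cover classes
labels = rng.integers(0, 3, size=32)
obj = aann_objective(src, tgt, logits, labels)
```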

Journal ArticleDOI
TL;DR: This article proposes a visual question answering (VQA) approach for remote sensing images based on vision-language models based on transformers and demonstrates that this approach can achieve better results with reduced training size compared with the recent state-of-the-art.
Abstract: Recently, vision-language models based on transformers are gaining popularity for joint modeling of visual and textual modalities. In particular, they show impressive results when transferred to several downstream tasks such as zero- and few-shot classification. In this article, we propose a visual question answering (VQA) approach for remote sensing images based on these models. The VQA task attempts to provide answers to image-related questions. While VQA has gained popularity in computer vision, it is not yet widespread in remote sensing. First, we use the contrastive language image pretraining (CLIP) network for embedding the image patches and question words into a sequence of visual and textual representations. Then, we learn attention mechanisms to capture the intradependencies and interdependencies within and between these representations. Afterward, we generate the final answer by averaging the predictions of two classifiers mounted on the top of the resulting contextual representations. In the experiments, we study the performance of the proposed approach on two datasets acquired with Sentinel-2 and aerial sensors. In particular, we demonstrate that our approach can achieve better results with reduced training size compared with the recent state-of-the-art.

7 citations
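The interdependency and classifier-averaging steps of the abstract above can be sketched with plain scaled dot-product attention in NumPy; the CLIP embeddings are replaced with random stand-ins, and all dimensions and heads are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(q_tokens, kv_tokens):
    """Scaled dot-product attention of one token sequence over another."""
    d = q_tokens.shape[-1]
    attn = softmax(q_tokens @ kv_tokens.T / np.sqrt(d))
    return attn @ kv_tokens

# Stand-ins for CLIP embeddings: 9 image patches, 6 question words, dim 16.
vis = rng.normal(size=(9, 16))
txt = rng.normal(size=(6, 16))

# Interdependencies: the question attends to the image and vice versa.
txt_ctx = cross_attend(txt, vis)
vis_ctx = cross_attend(vis, txt)

# Two classifier heads over pooled contextual representations, averaged
# to produce the final answer distribution (5 candidate answers here).
W1, W2 = rng.normal(size=(16, 5)), rng.normal(size=(16, 5))
p1 = softmax(txt_ctx.mean(axis=0) @ W1)
p2 = softmax(vis_ctx.mean(axis=0) @ W2)
answer_probs = (p1 + p2) / 2
```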

Journal ArticleDOI
TL;DR: A new convolutional neural network framework for COVID-19 detection using computed tomography (CT) images is proposed and validates its superiority over the state-of-the-art methods with values exceeding 99.10% in terms of several metrics, such as accuracy, precision, recall, and F1.
Abstract: With the rapid spread of the coronavirus disease 2019 (COVID-19) worldwide, the establishment of an accurate and fast process to diagnose the disease is important. The routine real-time reverse transcription-polymerase chain reaction (rRT-PCR) test that is currently used does not provide such high accuracy or speed in the screening process. Among the good choices for an accurate and fast test to screen COVID-19 are deep learning techniques. In this study, a new convolutional neural network (CNN) framework for COVID-19 detection using computed tomography (CT) images is proposed. The EfficientNet architecture is applied as the backbone structure of the proposed network, in which feature maps with different scales are extracted from the input CT scan images. In addition, atrous convolution at different rates is applied to these multi-scale feature maps to generate denser features, which facilitates obtaining COVID-19 findings in CT scan images. The proposed framework is also evaluated in this study using a public CT dataset containing 2482 CT scan images from patients of both classes (i.e., COVID-19 and non-COVID-19). To augment the dataset with additional training examples, adversarial example generation is performed. The proposed system validates its superiority over the state-of-the-art methods with values exceeding 99.10% in terms of several metrics, such as accuracy, precision, recall, and F1. The proposed system also exhibits good robustness when it is trained using a small portion of the data (20%), with an accuracy of 96.16%.

6 citations
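The key idea behind atrous convolution in the abstract above (spacing kernel taps `rate` samples apart to enlarge the receptive field without extra parameters) can be shown in one dimension; the filter and feature values are illustrative, not taken from the paper:

```python
import numpy as np

def dilated_conv1d(x, kernel, rate):
    """1-D atrous (dilated) convolution, 'valid' padding: kernel taps are
    spaced `rate` samples apart, widening the receptive field for free."""
    k = len(kernel)
    span = (k - 1) * rate + 1          # receptive field of one output
    out = np.empty(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = sum(kernel[j] * x[i + j * rate] for j in range(k))
    return out

feat = np.arange(16, dtype=float)      # one row of a feature map
kernel = np.array([1.0, 0.0, -1.0])    # 3-tap edge-like filter
# The same filter applied at several rates, mimicking multi-rate atrous
# convolution over multi-scale feature maps.
multi_rate = [dilated_conv1d(feat, kernel, r) for r in (1, 2, 4)]
```

On the linear ramp input, the rate-2 output is constantly `x[i] - x[i+4] = -4`, showing how the effective span grows with the rate while the kernel stays three taps long.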

Journal ArticleDOI
TL;DR: In this article, a transformer encoder-decoder architecture is proposed: image features are extracted using the vision transformer (ViT) model, the question is embedded with a textual encoder transformer, and the resulting visual and textual representations are concatenated and fed into a multi-modal decoder that generates the answer in an autoregressive way.
Abstract: In the clinical and healthcare domains, medical images play a critical role. A mature medical visual question answering (VQA) system can improve diagnosis by answering clinical questions presented with a medical image. Despite its enormous potential in the healthcare industry and services, this technology is still in its infancy and is far from practical use. This paper introduces an approach based on a transformer encoder–decoder architecture. Specifically, we extract image features using the vision transformer (ViT) model, and we embed the question using a textual encoder transformer. Then, we concatenate the resulting visual and textual representations and feed them into a multi-modal decoder for generating the answer in an autoregressive way. In the experiments, we validate the proposed model on two VQA datasets for radiology images termed VQA-RAD and PathVQA. The model shows promising results compared to existing solutions. It yields closed and open accuracies of 84.99% and 72.97%, respectively, for VQA-RAD, and 83.86% and 62.37%, respectively, for PathVQA. Other metrics such as the BLEU score, showing the alignment between the predicted and true answer sentences, are also reported.

4 citations
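The autoregressive generation loop the abstract describes (feed the tokens produced so far back into the decoder, pick the next token, stop at end-of-sequence) can be sketched with a greedy decoder; the toy decoder below is a hypothetical stand-in for the fused ViT + text model, not the paper's network:

```python
import numpy as np

def greedy_decode(step_logits_fn, bos=0, eos=1, max_len=8):
    """Autoregressive answer generation: repeatedly score the next token
    given the tokens emitted so far, take the argmax, stop at EOS."""
    tokens = [bos]
    for _ in range(max_len):
        nxt = int(np.argmax(step_logits_fn(tokens)))
        tokens.append(nxt)
        if nxt == eos:
            break
    return tokens

def toy_decoder(tokens):
    # Hypothetical stand-in for the multi-modal decoder: deterministically
    # scores token index len(tokens) highest over a 10-word vocabulary.
    return np.eye(10)[min(len(tokens), 9)]

answer = greedy_decode(toy_decoder)   # emits BOS, then immediately EOS
```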


Cited by
Journal ArticleDOI
TL;DR: It is demonstrated that an end-to-end deep learning approach can classify a broad range of distinct arrhythmias from single-lead ECGs with high diagnostic performance similar to that of cardiologists.
Abstract: Computerized electrocardiogram (ECG) interpretation plays a critical role in the clinical ECG workflow1. Widely available digital ECG data and the algorithmic paradigm of deep learning2 present an opportunity to substantially improve the accuracy and scalability of automated ECG analysis. However, a comprehensive evaluation of an end-to-end deep learning approach for ECG analysis across a wide variety of diagnostic classes has not been previously reported. Here, we develop a deep neural network (DNN) to classify 12 rhythm classes using 91,232 single-lead ECGs from 53,549 patients who used a single-lead ambulatory ECG monitoring device. When validated against an independent test dataset annotated by a consensus committee of board-certified practicing cardiologists, the DNN achieved an average area under the receiver operating characteristic curve (ROC AUC) of 0.97. The average F1 score, which is the harmonic mean of the positive predictive value and sensitivity, for the DNN (0.837) exceeded that of average cardiologists (0.780). With specificity fixed at the average specificity achieved by cardiologists, the sensitivity of the DNN exceeded the average cardiologist sensitivity for all rhythm classes. These findings demonstrate that an end-to-end deep learning approach can classify a broad range of distinct arrhythmias from single-lead ECGs with high diagnostic performance similar to that of cardiologists. If confirmed in clinical settings, this approach could reduce the rate of misdiagnosed computerized ECG interpretations and improve the efficiency of expert human ECG interpretation by accurately triaging or prioritizing the most urgent conditions. Analysis of electrocardiograms using an end-to-end deep learning approach can detect and classify cardiac arrhythmia with high accuracy, similar to that of cardiologists.

1,632 citations
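The F1 comparison in the abstract above (DNN 0.837 vs. cardiologists 0.780) uses the harmonic mean of positive predictive value and sensitivity; the per-class values fed into it are not given here, so the inputs below are purely illustrative:

```python
def f1_score(ppv, sensitivity):
    """F1 score: harmonic mean of positive predictive value (precision)
    and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)

# Hypothetical per-class values, just to show the computation.
example = f1_score(0.9, 0.8)
```

Note that the harmonic mean is pulled toward the smaller of the two inputs, which is why F1 penalizes a model that trades sensitivity for precision (or vice versa) more than a simple average would.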

Journal ArticleDOI
TL;DR: This paper provides a comprehensive survey on the application of DL, RL, and deep RL techniques in mining biological data and compares the performances of DL techniques when applied to different data sets across various application domains.
Abstract: Rapid advances in hardware-based technologies during the past decades have opened up new possibilities for life scientists to gather multimodal data in various application domains, such as omics , bioimaging , medical imaging , and (brain/body)–machine interfaces . These have generated novel opportunities for development of dedicated data-intensive machine learning techniques. In particular, recent research in deep learning (DL), reinforcement learning (RL), and their combination (deep RL) promise to revolutionize the future of artificial intelligence. The growth in computational power accompanied by faster and increased data storage, and declining computing costs have already allowed scientists in various fields to apply these techniques on data sets that were previously intractable owing to their size and complexity. This paper provides a comprehensive survey on the application of DL, RL, and deep RL techniques in mining biological data. In addition, we compare the performances of DL techniques when applied to different data sets across various application domains. Finally, we outline open issues in this challenging research area and discuss future development perspectives.

622 citations

Journal ArticleDOI
TL;DR: The focus of this review is to provide in-depth summaries of deep learning methods for mobile and wearable sensor-based human activity recognition, and categorise the studies into generative, discriminative and hybrid methods.
Abstract: Human activity recognition systems are developed as part of a framework to enable continuous monitoring of human behaviours in the area of ambient assisted living, sports injury detection, elderly care, rehabilitation, and entertainment and surveillance in smart home environments. The extraction of relevant features is the most challenging part of the mobile and wearable sensor-based human activity recognition pipeline. Feature extraction influences the algorithm performance and reduces computation time and complexity. However, current human activity recognition relies on handcrafted features that are incapable of handling complex activities especially with the current influx of multimodal and high dimensional sensor data. With the emergence of deep learning and increased computation powers, deep learning and artificial intelligence methods are being adopted for automatic feature learning in diverse areas like health, image classification, and recently, for feature extraction and classification of simple and complex human activity recognition in mobile and wearable sensors. Furthermore, the fusion of mobile or wearable sensors and deep learning methods for feature learning provide diversity, offers higher generalisation, and tackles challenging issues in human activity recognition. The focus of this review is to provide in-depth summaries of deep learning methods for mobile and wearable sensor-based human activity recognition. The review presents the methods, uniqueness, advantages and their limitations. We not only categorise the studies into generative, discriminative and hybrid methods but also highlight their important advantages. Furthermore, the review presents classification and evaluation procedures and discusses publicly available datasets for mobile sensor human activity recognition. Finally, we outline and explain some challenges to open research problems that require further research and improvements.

601 citations

Journal ArticleDOI
TL;DR: A new deep learning approach for cardiac arrhythmia detection (17 classes) from long-duration electrocardiography (ECG) signal analysis, based on a new 1D convolutional neural network (1D-CNN) model.

548 citations

Journal ArticleDOI
Ozal Yildirim
TL;DR: It has been observed that the wavelet-based layer proposed in the study significantly improves the recognition performance of conventional networks and is an important approach that can be applied to similar signal processing problems.

527 citations