Showing papers in "Signal, Image and Video Processing in 2022"

Journal Article•DOI•

MLCA2F: Multi-Level Context Attentional Feature Fusion for COVID-19 lesion segmentation from CT scans

[...]

03 Aug 2022-Signal, Image and Video Processing

TL;DR: This paper presents a robust deep learning model for segmenting COVID-19 lesions from CT images, based on a novel multi-scale contextual information fusion strategy called Multi-Level Context Attentional Feature Fusion (MLCA2F), which is built from Multi-Scale Context-Attention Network (MSCA-Net) blocks.

Abstract: In the field of diagnosis and treatment planning of Coronavirus disease 2019 (COVID-19), accurate infected area segmentation is challenging due to the significant variations in the COVID-19 lesion size, shape, and position, boundary ambiguity, as well as complex structure. To bridge these gaps, this study presents a robust deep learning model based on a novel multi-scale contextual information fusion strategy, called Multi-Level Context Attentional Feature Fusion (MLCA2F), which consists of the Multi-Scale Context-Attention Network (MSCA-Net) blocks for segmenting COVID-19 lesions from Computed Tomography (CT) images. Unlike the previous classical deep learning models, the MSCA-Net integrates Multi-Scale Contextual Feature Fusion (MC2F) and Multi-Context Attentional Feature (MCAF) to learn more lesion details and guide the model to estimate the position of the boundary of infected regions, respectively. Practically, extensive experiments are performed on the Kaggle CT dataset to explore the optimal structure of MLCA2F. In comparison with the current state-of-the-art methods, the experiments show that the proposed methodology provides efficient results. Therefore, we can conclude that the MLCA2F framework has the potential to dramatically improve the conventional segmentation methods for assisting clinical decision-making.

20 citations

Journal Article•DOI•

CNN-MAO: Convolutional Neural Network-based Modified Aquilla Optimization Algorithm for Pothole Identification from Thermal Images

[...]

R. Sathya, B. Saleena

05 Apr 2022-Signal, Image and Video Processing

TL;DR: A novel method for detecting potholes in thermal images using a convolutional neural network (CNN)-based modified Aquilla optimization (AO) algorithm, which improves classification accuracy, precision, recall, and F1-score while reducing classification error and detection time.

17 citations

Journal Article•DOI•

Convolution and correlation theorems for Wigner–Ville distribution associated with the quaternion offset linear canonical transform

[...]

13 Jan 2022-Signal, Image and Video Processing

17 citations

Journal Article•DOI•

Real-time detection of flame and smoke using an improved YOLOv4 network

[...]

Yifan Wang, Changchun Hua, Wei Li Ding, Ruinan Wu

17 Jan 2022-Signal, Image and Video Processing

TL;DR: A lightweight detector called Light-YOLOv4 is proposed that meets the accuracy and real-time requirements of flame and smoke detection and shows good detection performance and speed in embedded scenarios.

15 citations

Journal Article•DOI•

Wiener model-based system identification using moth flame optimised Kalman filter algorithm

[...]

Lakshminarayana Janjanam, Suman Kumar Saha, Rajib Kar, Durbadal Mandal

26 Jan 2022-Signal, Image and Video Processing

13 citations

Journal Article•DOI•

A potential crack region method to detect crack using image processing of multiple thresholding

[...]

Cheng Chen, Hyungjoon Seo, Changhyun Jun, Yang Zhao

15 Jan 2022-Signal, Image and Video Processing

TL;DR: In this paper, a potential crack region method is proposed to detect road pavement cracks using an adaptive threshold that combines a global threshold and a local threshold to segment the image according to the grayscale distribution characteristics of the crack image.

Abstract: In this paper, a potential crack region method is proposed to detect road pavement cracks by using an adaptive threshold. To reduce image noise, a pre-treatment algorithm is applied in the following steps: grayscale processing, histogram equalization, and traffic lane filtering. For segmentation, the algorithm combines a global threshold and a local threshold. According to the grayscale distribution characteristics of the crack image, a sliding window is used to obtain the window deviation, and the deviation image is then segmented based on the maximum inter-class deviation. This yields a potential crack region, on which a local threshold-based segmentation algorithm is then performed. Real images of the pavement surface of Su Tong Li road in Suzhou, China, were used. It was found that the proposed approach could give a more explicit description of pavement cracks in images. The method was tested on 509 images of the German asphalt pavement distress (Gap) dataset, and the test results were found to be promising (precision = 0.82, recall = 0.81, F1 score = 0.83).
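
Note: segmenting "based on the maximum inter-class deviation" reads like Otsu-style global thresholding, where the threshold is chosen to maximize the variance between the dark and bright classes. The following is a minimal sketch under that assumption, not the authors' implementation. The function names `otsu_threshold` and `crack_candidates` are hypothetical, and the sliding-window deviation image and the local threshold stage are not reproduced.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    `pixels` is a flat list of integer gray levels in [0, levels).
    Class 0 holds pixels <= t, class 1 holds pixels > t.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in class 0 so far
    sum0 = 0  # sum of gray levels in class 0 so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def crack_candidates(pixels, t):
    """Assumption: cracks are darker than pavement, so keep pixels <= t."""
    return [p <= t for p in pixels]
```

In a two-stage pipeline of the kind the abstract describes, a global threshold like this would only mark candidate regions, and a separate local threshold would then refine each candidate.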

12 citations

Journal Article•DOI•

Abs-CAM: a gradient optimization interpretable approach for explanation of convolutional neural networks

[...]

Chunyan Zeng, Kang Yan, Zheng Wang, Yan Yu, Shiyan Xia, Nan Zhao

08 Jul 2022-Signal, Image and Video Processing

TL;DR: The authors propose an absolute value class activation mapping-based (Abs-CAM) method, which optimizes the gradients derived from backpropagation and turns all of them into positive gradients to enhance the visual features of output neurons' activation and improve the localization ability of the saliency map.

Abstract: The black-box nature of deep neural networks severely hinders its performance improvement and application in specific scenes. In recent years, class activation mapping-based method has been widely used to interpret the internal decisions of models in computer vision tasks. However, when this method uses backpropagation to obtain gradients, it will cause noise in the saliency map and even locate features that are irrelevant to decisions. In this paper, we propose an absolute value class activation mapping-based (Abs-CAM) method, which optimizes the gradients derived from the backpropagation and turns all of them into positive gradients to enhance the visual features of output neurons’ activation and improve the localization ability of the saliency map. The framework of Abs-CAM is divided into two phases: generating initial saliency map and generating final saliency map. The first phase improves the localization ability of the saliency map by optimizing the gradient, and the second phase linearly combines the initial saliency map with the original image to enhance the semantic information of the saliency map. We conduct qualitative and quantitative evaluation of the proposed method, including Deletion, Insertion, and Pointing Game. The experimental results show that the Abs-CAM can obviously eliminate the noise in the saliency map, and can better locate the features related to decisions, and is superior to the previous methods in recognition and localization tasks.
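
Note: the first phase described above can be illustrated with a Grad-CAM-style weighting in which each channel weight is the mean of the absolute gradients. This is a sketch under assumptions, since the abstract does not give the exact formula: the averaging, the ReLU, and the max-normalization are borrowed from common class activation mapping practice, and `abs_cam_initial_map` is a hypothetical name. The second phase (combining the map with the original image) is not shown.

```python
def abs_cam_initial_map(activations, gradients):
    """Build an initial saliency map from absolute-valued gradients.

    activations, gradients: lists of K channels, each an H x W nested list.
    Assumed weighting: w_k = mean(|gradient_k|), so no channel is
    suppressed just because its gradients are negative.
    """
    height, width = len(activations[0]), len(activations[0][0])
    saliency = [[0.0] * width for _ in range(height)]

    for act, grad in zip(activations, gradients):
        flat = [abs(g) for row in grad for g in row]
        weight = sum(flat) / len(flat)
        for i in range(height):
            for j in range(width):
                saliency[i][j] += weight * act[i][j]

    # Keep positive evidence only, then scale to [0, 1] if possible.
    saliency = [[max(v, 0.0) for v in row] for row in saliency]
    peak = max(max(row) for row in saliency)
    if peak > 0:
        saliency = [[v / peak for v in row] for row in saliency]
    return saliency
```

With signed averaging, a channel whose gradients are all negative, or whose positive and negative gradients cancel, would contribute nothing after the ReLU. Taking absolute values keeps such channels in the map.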

12 citations

Journal Article•DOI•

Multi-feature fusion partitioned local binary pattern method for finger vein recognition

[...]

Zhongxia Zhang, Mingwen Wang

20 Jan 2022-Signal, Image and Video Processing

12 citations

Journal Article•DOI•

Multiscale TransUNet++: dense hybrid U-Net with transformer for medical image segmentation

[...]

Bo Wang, Fan Wang, Peng-shu Dong, Chongyi Li

27 Jan 2022-Signal, Image and Video Processing

11 citations

Journal Article•DOI•

Fast HEVC intra-CU decision partition algorithm with modified LeNet-5 and AlexNet

[...]

Werda Imen, Maraoui Amna, Belghith Fatma, Sayadi Fatma Ezahra, Nouri Masmoudi

26 Jan 2022-Signal, Image and Video Processing

Journal Article•DOI•

Pest identification via hyperspectral image and deep learning

[...]

Zhitao Xiao, Kai Yin, Lei Geng, Jun Wu, Feng Zhang, Yanbei Liu

06 Jan 2022-Signal, Image and Video Processing

TL;DR: An end-to-end pest identification network that combines deep learning and hyperspectral imaging technology is proposed and results prove that this method has higher pest identification accuracy and is more suitable for pest identification tasks than other methods.

Journal Article•DOI•

A new transfer learning approach to detect cardiac arrhythmia from ECG signals

[...]

Mohebbanaaz, Lopamudra Kumar, Y. Padma Sai

18 Feb 2022-Signal, Image and Video Processing

Journal Article•DOI•

Drowning behavior detection in swimming pool based on deep learning

[...]

Fei Lei, Hengyu Zhu, Feifei Tang, Xinyuan Wang

03 Jan 2022-Signal, Image and Video Processing

TL;DR: The method proposed in this paper meets the real-time detection requirements and does well in swimmer behavior recognition and provides technical support for reducing drowning accidents in public swimming pools.

Journal Article•DOI•

Classification of white blood cells with SVM by selecting SqueezeNet and LIME properties by mRMR method

[...]

Erdal Başaran

21 Jan 2022-Signal, Image and Video Processing

Journal Article•DOI•

End-to-end deep learning of lane detection and path prediction for real-time autonomous driving

[...]

22 Apr 2022-Signal, Image and Video Processing

TL;DR: In this article, a lightweight UNet using depthwise separable convolutions (DSUNet) is proposed for end-to-end learning of lane detection and path prediction in autonomous driving.

Abstract: Inspired by the UNet architecture of semantic image segmentation, we propose a lightweight UNet using depthwise separable convolutions (DSUNet) for end-to-end learning of lane detection and path prediction (PP) in autonomous driving. We also design and integrate a PP algorithm with a convolutional neural network (CNN) to form a simulation model (CNN-PP) that can be used to assess the CNN's performance qualitatively, quantitatively, and dynamically in a host agent car driving along with other agents, all in a real-time autonomous manner. DSUNet is 5.12× lighter in model size and 1.61× faster in inference than UNet. DSUNet-PP outperforms UNet-PP in mean average errors of predicted curvature and lateral offset for path planning in dynamic simulation. DSUNet-PP outperforms a modified UNet in lateral error, which is tested in a real car on a real road. These results show that DSUNet is efficient and effective for lane detection and path prediction in autonomous driving.
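
Note: the size reduction from depthwise separable convolutions can be seen from a per-layer parameter count. The sketch below uses the standard formulas (bias terms ignored). The layer sizes are illustrative assumptions, not taken from DSUNet, and the function names are hypothetical.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
std = standard_conv_params(64, 128, 3)
sep = depthwise_separable_params(64, 128, 3)
print(std, sep, round(std / sep, 2))
```

A per-layer ratio like this need not match a whole-model figure such as the 5.12× reported above, since a network's total size also depends on which layers are replaced and on its other parameters.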

Journal Article•DOI•

COVID-19 detection on chest radiographs using feature fusion based deep learning

[...]

Fatih Bayram, Alaa Eleyan

24 Jan 2022-Signal, Image and Video Processing

TL;DR: In this article, a multi-stream convolutional neural network (CNN) is used for feature extraction and classification of X-ray images to diagnose COVID-19 patients.

Abstract: The year 2020 will certainly be remembered in human history as the year in which humans faced a global pandemic that drastically affected every living soul on planet earth. The COVID-19 pandemic certainly had a massive impact on people's social and daily lives. The economy and relations of all countries were also radically impacted. Due to such unexpected situations, healthcare systems either collapsed or failed under colossal pressure to cope with the overwhelming numbers of patients arriving at emergency rooms and intensive care units. The COVID-19 tests used for diagnosis were expensive, slow, and gave indecisive results. Unfortunately, such hindered diagnosis of the infection prevented prompt isolation of the infected people, which, in turn, caused the rapid spread of the virus. In this paper, we propose the use of cost-effective X-ray images in diagnosing COVID-19 patients. Compared to other imaging modalities, X-ray imaging is available in most healthcare units. Deep learning was used for feature extraction and classification by implementing a multi-stream convolutional neural network model. The model extracts and concatenates features from its three inputs, namely grayscale, local binary pattern, and histogram of oriented gradients images. Extensive experiments using fivefold cross-validation were carried out on a publicly available X-ray database with 3886 images of three classes. The obtained results outperform those of other algorithms, with an accuracy of 97.76%. The results also show that the proposed model can make a significant contribution to the rapidly increasing workload in health systems with an artificial intelligence-based automatic diagnosis tool.

Journal Article•DOI•

RBI-2RCNN: Residual Block Intensity Feature using a Two-stage Residual Convolutional Neural Network for Static Hand Gesture Recognition

[...]

Jaya Prakash Sahoo, Suraj Prakash Sahoo, Samit Ari, Sarat Kumar Patra

18 Feb 2022-Signal, Image and Video Processing

TL;DR: A two-stage residual CNN (2RCNN) architecture for learning of features from the color hand gesture images which overcomes the need of a specific preprocessing step, and a novel residual block intensity (RBI) feature to extract the global and local information from the hand gestures images.

Journal Article•DOI•

Detection of counterfeit banknotes by security components based on image processing and GoogLeNet deep learning network

[...]

Kamran Teymournezhad, Hossein Azgomi, Ali Asghari

10 Jan 2022-Signal, Image and Video Processing

Journal Article•DOI•

Low-dose CT image denoising using deep convolutional neural networks with extended receptive fields

[...]

Dinh Hoan Trinh¹, Edern Hirstein²

University of Lorraine¹, Quy Nhon University²

12 Feb 2022-Signal, Image and Video Processing

TL;DR: The authors propose a CNN-based method for low-dose CT image denoising in which preprocessing and post-processing techniques are integrated into a dilated convolutional neural network to extend receptive fields.

Abstract: How to reduce radiation dose while preserving the image quality as when using standard dose is an important topic in the computed tomography (CT) imaging domain because the quality of low-dose CT (LDCT) images is often strongly affected by noise and artifacts. Recently, there has been considerable interest in using deep learning as a post-processing step to improve the quality of reconstructed LDCT images. This paper provides, first, an overview of learning-based LDCT image denoising methods from patch-based early learning methods to state-of-the-art CNN-based ones and, then, a novel CNN-based method is presented. In the proposed method, preprocessing and post-processing techniques are integrated into a dilated convolutional neural network to extend receptive fields. Hence, large distance pixels in input images will participate in enriching feature maps of the learned model, leading to effective denoising. Experimental results showed that the proposed method is light, while its denoising effectiveness is competitive with well-known CNN-based models.

Journal Article•DOI•

Skin lesion segmentation with attention-based SC-Conv U-Net and feature map distortion

[...]

Yu Wang, Shengsheng Wang

18 Jan 2022-Signal, Image and Video Processing

Journal Article•DOI•

Unconstrained face mask and face-hand interaction datasets: building a computer vision system to help prevent the transmission of COVID-19

[...]

Fevziye Irem Eyiokur¹

Istanbul Technical University¹

22 Jul 2022-Signal, Image and Video Processing

TL;DR: In this article, a computer vision system is developed to help prevent the transmission of COVID-19 by detecting face mask usage, detecting face-hand interaction, and measuring social distance between people.

Abstract: Health organizations advise social distancing, wearing face mask, and avoiding touching face to prevent the spread of coronavirus. Based on these protective measures, we developed a computer vision system to help prevent the transmission of COVID-19. Specifically, the developed system performs face mask detection, face-hand interaction detection, and measures social distance. To train and evaluate the developed system, we collected and annotated images that represent face mask usage and face-hand interaction in the real world. Besides assessing the performance of the developed system on our own datasets, we also tested it on existing datasets in the literature without performing any adaptation on them. In addition, we proposed a module to track social distance between people. Experimental results indicate that our datasets represent the real-world's diversity well. The proposed system achieved very high performance and generalization capacity for face mask usage detection, face-hand interaction detection, and measuring social distance in a real-world scenario on unseen data. The datasets are available at https://github.com/iremeyiokur/COVID-19-Preventions-Control-System.

Journal Article•DOI•

DcsNet: a real-time deep network for crack segmentation

[...]

Jie Pang, Hua Zhang, Hao Zhao, Linjing Li

17 Jan 2022-Signal, Image and Video Processing

Journal Article•DOI•

Multi-classification speech emotion recognition based on two-stage bottleneck features selection and MCJD algorithm

[...]

Linhui Sun, Yiqing Huang, Qiuyan Li, Pingan Li

12 Jan 2022-Signal, Image and Video Processing

Journal Article•DOI•

Imbalance domain adaptation network with adversarial learning for fault diagnosis of rolling bearing

[...]

Hongqiu Zhu, Ziyi Huang, Biliang Lu, Fei Cheng, Can Zhou

17 May 2022-Signal, Image and Video Processing

TL;DR: A new imbalance domain adaptation network with adversarial learning (IDAL) is proposed that applies adversarial learning to data augmentation of the target domain and uses neural network-based domain adaptation to narrow the feature distribution discrepancy between the source and target domains.

Journal Article•DOI•

Speech emotion recognition using data augmentation method by cycle-generative adversarial networks

[...]

Arash Shilandari

09 Feb 2022-Signal, Image and Video Processing

TL;DR: In this article, a cycle-generative adversarial network (cycle-GAN) is proposed for data augmentation in speech emotion recognition (SER) systems; it is trained in an adversarial way to produce feature vectors similar to those in the training set.

Abstract: One of the obstacles in developing speech emotion recognition (SER) systems is the data scarcity problem, i.e., the lack of labeled data for training these systems. Data augmentation is an effective method for increasing the amount of training data. In this paper, we propose a cycle-generative adversarial network (cycle-GAN) for data augmentation in the SER systems. For each of the five emotions considered, an adversarial network is designed to generate data that have a similar distribution to the main data in that class but have a different distribution to those of other classes. These networks are trained in an adversarial way to produce feature vectors similar to those in the training set, which are then added to the original training sets. Instead of using the common cross-entropy loss to train cycle-GANs, we use the Wasserstein divergence to mitigate the gradient vanishing problem and to generate high-quality samples. The proposed network has been applied to SER using the EMO-DB dataset. The quality of the generated data is evaluated using two classifiers based on support vector machine and deep neural network. The results showed that the recognition accuracy in unweighted average recall was about 83.33%, which is better than the baseline methods compared.
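
Note: unweighted average recall (UAR), the metric reported above, is the mean of the per-class recalls, so each emotion class counts equally regardless of how many samples it has. A minimal sketch follows. The function name and the class labels in the example are hypothetical and are not taken from the paper.

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        indices = [i for i, label in enumerate(y_true) if label == cls]
        hits = sum(1 for i in indices if y_pred[i] == cls)
        recalls.append(hits / len(indices))
    return sum(recalls) / len(recalls)


# Imbalanced toy example: 8 "anger" samples, 2 "sadness" samples.
y_true = ["anger"] * 8 + ["sadness"] * 2
y_pred = ["anger"] * 8 + ["anger", "sadness"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
uar = unweighted_average_recall(y_true, y_pred)
print(accuracy, uar)
```

In this toy case, plain accuracy is dominated by the majority class, while UAR is pulled down by the poorly recognized minority class, which is why UAR is often preferred on imbalanced emotion datasets.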

Journal Article•DOI•

Elder emotion classification through multimodal fusion of intermediate layers and cross-modal transfer learning

[...]

P. Sreevidya, S. Veni, O. V. Ramana Murthy

18 Jan 2022-Signal, Image and Video Processing

TL;DR: In this article, a multi-modal system is developed that integrates information from audio and video modalities; features are extracted, and neural network models trained with backpropagation are used to build the models.

Abstract: The objective of the work is to develop an automated emotion recognition system specifically targeted to elderly people. A multi-modal system is developed that integrates information from audio and video modalities. The database selected for experiments is ElderReact, which contains 1323 video clips of 3 to 8 s duration of people above the age of 50. Here, all six available emotions (Disgust, Anger, Fear, Happiness, Sadness and Surprise) are considered. In order to develop an automated emotion recognition system for aged adults, we attempted different modeling techniques. Features are extracted, and neural network models with backpropagation are attempted for developing the models. Further, for the raw video model, transfer learning from pretrained networks is attempted. Convolutional neural network and long short-term memory-based models were used, maintaining the continuity in time between the frames while capturing the emotions. For the audio model, cross-modal transfer learning is applied. Both models are combined by fusion of intermediate layers. The layers are selected through a grid-based search algorithm. The accuracy and F1-score show that the proposed approach outperforms the state-of-the-art results. Classification of all the images shows a relative improvement over the baseline results ranging from a minimum of 6.5% for happiness to a maximum of 46% for sadness.

Journal Article•DOI•

A Novel Threshold-Based Segmentation Method for Quantification of COVID-19 Lung Abnormalities

[...]

Azrin Khan, Rachael Garner, Marianna La Rocca, Sana Salehi, Dominique Duncan

28 Mar 2022-Signal, Image and Video Processing

TL;DR: In this article, a semi-automatic threshold-based segmentation method is proposed to generate region of interest (ROI) segmentations of infection visible on lung computed tomography (CT) scans.

Abstract: Since December 2019, the novel coronavirus disease 2019 (COVID-19) has claimed the lives of more than 3.75 million people worldwide. Consequently, methods for accurate COVID-19 diagnosis and classification are necessary to facilitate rapid patient care and terminate viral spread. Lung infection segmentations are useful to identify unique infection patterns that may support rapid diagnosis, severity assessment, and patient prognosis prediction, but manual segmentations are time-consuming and depend on radiologic expertise. Deep learning-based methods have been explored to reduce the burdens of segmentation; however, their accuracies are limited due to the lack of large, publicly available annotated datasets that are required to establish ground truths. For these reasons, we propose a semi-automatic, threshold-based segmentation method to generate region of interest (ROI) segmentations of infection visible on lung computed tomography (CT) scans. Infection masks are then used to calculate the percentage of lung abnormality (PLA) to determine COVID-19 severity and to analyze the disease progression in follow-up CTs. Compared with other COVID-19 ROI segmentation methods, on average, the proposed method achieved improved precision (47.49%) and specificity (98.40%) scores. Furthermore, the proposed method generated PLAs with a difference of ±3.89% from the ground-truth PLAs. The improved ROI segmentation results suggest that the proposed method has potential to assist radiologists in assessing infection severity and analyzing disease progression in follow-up CTs.
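
Note: the quantities named above can be computed directly from binary masks. The sketch below assumes that PLA is the share of lung voxels marked as infected, which the abstract does not define explicitly. Masks are flattened lists of 0/1 values, and the function names are hypothetical.

```python
def percentage_lung_abnormality(infection_mask, lung_mask):
    """Assumed definition: 100 * infected lung voxels / all lung voxels.

    Assumes the lung mask contains at least one voxel.
    """
    lung_voxels = sum(1 for l in lung_mask if l)
    infected = sum(1 for i, l in zip(infection_mask, lung_mask) if i and l)
    return 100.0 * infected / lung_voxels


def precision_and_specificity(predicted, truth):
    """Voxel-wise precision TP/(TP+FP) and specificity TN/(TN+FP).

    Assumes at least one predicted positive and one true negative.
    """
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    tn = sum(1 for p, t in zip(predicted, truth) if not p and not t)
    return tp / (tp + fp), tn / (tn + fp)


# Toy example with 10 voxels, 8 of which are lung.
lung = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
truth = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

pla_pred = percentage_lung_abnormality(pred, lung)
pla_true = percentage_lung_abnormality(truth, lung)
precision, specificity = precision_and_specificity(pred, truth)
print(pla_pred - pla_true, precision, specificity)
```

Because infected voxels are usually a small fraction of a scan, specificity tends to be high even when precision is modest, which matches the pattern of the two scores quoted above.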
