
Showing papers on "Standard test image" published in 2021


Book ChapterDOI
27 Sep 2021
TL;DR: In this paper, a network based on self-attention between neighboring patches and without any convolution operations was proposed to achieve better segmentation performance than a traditional CNN model for medical image segmentation.
Abstract: Like other applications in computer vision, medical image segmentation has been most successfully addressed using deep learning models that rely on the convolution operation as their main building block. Convolutions enjoy important properties such as sparse interactions, weight sharing, and translation equivariance. These properties give convolutional neural networks (CNNs) a strong and useful inductive bias for vision tasks. However, the convolution operation also has important shortcomings: it performs a fixed operation on every test image regardless of the content, and it cannot efficiently model long-range interactions. In this work we show that a network based on self-attention between neighboring patches and without any convolution operations can achieve better results. Given a 3D image block, our network divides it into \(n^3\) 3D patches, where \(n=3 \text { or } 5\), and computes a 1D embedding for each patch. The network predicts the segmentation map for the center patch of the block based on the self-attention between these patch embeddings. We show that the proposed model can achieve higher segmentation accuracies than a state-of-the-art CNN. For scenarios with very few labeled images, we propose methods for pre-training the network on large corpora of unlabeled images. Our experiments show that with pre-training, the advantage of our proposed network over CNNs can be significant when the labeled training data is small.
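A rough sketch of the patch-embedding idea described in this abstract is given below in PyTorch: a 3D block is split into \(n^3\) patches, each patch is mapped to a 1D embedding, and a transformer encoder applies self-attention between the patch embeddings before a head predicts per-voxel labels for the center patch. The block size, embedding dimension, depth, and class count are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class PatchAttentionSegmenter(nn.Module):
    """Toy sketch: embed the n^3 3D patches of a block and use self-attention
    to predict the segmentation of the center patch (illustrative sizes)."""
    def __init__(self, n=3, patch=8, embed_dim=128, num_classes=2, heads=4):
        super().__init__()
        self.n, self.patch = n, patch
        self.embed = nn.Linear(patch ** 3, embed_dim)            # 1D embedding per patch
        self.pos = nn.Parameter(torch.zeros(n ** 3, embed_dim))  # learned positional encoding
        layer = nn.TransformerEncoderLayer(embed_dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(embed_dim, num_classes * patch ** 3)

    def forward(self, block):                  # block: (B, n*patch, n*patch, n*patch)
        B, n, p = block.shape[0], self.n, self.patch
        # split the block into n^3 non-overlapping patches and flatten each one
        patches = block.reshape(B, n, p, n, p, n, p)
        patches = patches.permute(0, 1, 3, 5, 2, 4, 6).reshape(B, n ** 3, p ** 3)
        tokens = self.encoder(self.embed(patches) + self.pos)   # self-attention between patches
        center = tokens[:, (n ** 3) // 2]                        # embedding of the center patch
        return self.head(center).reshape(B, -1, p, p, p)         # per-voxel class logits

x = torch.randn(2, 24, 24, 24)                 # a 3D image block (n=3, patch=8)
print(PatchAttentionSegmenter()(x).shape)      # torch.Size([2, 2, 8, 8, 8])
```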

69 citations


Journal ArticleDOI
TL;DR: In this article, a concatenation of two sub-networks, a relatively shallow image normalization network and a deep CNN segmentation network, is proposed for medical image segmentation.

66 citations


Posted Content
TL;DR: A method for estimating neural scene representations of objects given only a single image, based on a generative process that first maps a latent code to a voxelized shape and then renders it to an image, with the object appearance being controlled by a second latent code.
Abstract: We present a method for estimating neural scene representations of objects given only a single image. The core of our method is the estimation of a geometric scaffold for the object and its use as a guide for the reconstruction of the underlying radiance field. Our formulation is based on a generative process that first maps a latent code to a voxelized shape, and then renders it to an image, with the object appearance being controlled by a second latent code. During inference, we optimize both the latent codes and the networks to fit a test image of a new object. The explicit disentanglement of shape and appearance allows our model to be fine-tuned given a single image. We can then render new views in a geometrically consistent manner that faithfully represent the input object. Additionally, our method is able to generalize to images outside of the training domain (more realistic renderings and even real photographs). Finally, the inferred geometric scaffold is itself an accurate estimate of the object's 3D shape. We demonstrate in several experiments the effectiveness of our approach on both synthetic and real images.

53 citations


Journal ArticleDOI
TL;DR: In this article, a negative-positive prototypical part network (NP-ProtoPNet) is proposed to imitate human reasoning for image recognition while comparing the parts of a test image with the corresponding parts of the images from known classes.
Abstract: Interpretation of the reasoning process behind a prediction made by a deep learning model is always desired. However, when the predictions of a deep learning model directly impact the lives of people, interpretation becomes a necessity. In this paper, we introduce a deep learning model: the negative-positive prototypical part network (NP-ProtoPNet). This model attempts to imitate human reasoning for image recognition by comparing the parts of a test image with the corresponding parts of images from known classes. We demonstrate our model on a dataset of chest X-ray images of COVID-19 patients, pneumonia patients and normal people. The accuracy and precision that our model achieves are on par with the best-performing non-interpretable deep learning models.

41 citations


Proceedings ArticleDOI
Zhixiang Chi, Yang Wang, Yuanhao Yu, Jin Tang
01 Jun 2021
TL;DR: Chi et al. as mentioned in this paper proposed a self-supervised meta-auxiliary learning approach to improve the performance of deblurring by integrating both external and internal learning, which is able to exploit the internal information of the test image at test time via the auxiliary task to enhance deblurring performance.
Abstract: In this paper, we tackle the problem of dynamic scene deblurring. Most existing deep end-to-end learning approaches adopt the same generic model for all unseen test images. These solutions are sub-optimal, as they fail to utilize the internal information within a specific image. On the other hand, a self-supervised approach, SelfDeblur, enables internal training within a test image from scratch, but it does not fully take advantage of large external datasets. In this work, we propose a novel self-supervised meta-auxiliary learning to improve the performance of deblurring by integrating both external and internal learning. Concretely, we build a self-supervised auxiliary reconstruction task that shares a portion of the network with the primary deblurring task. The two tasks are jointly trained on an external dataset. Furthermore, we propose a meta-auxiliary training scheme to further optimize the pretrained model as a base learner, which is applicable for fast adaptation at test time. During training, the performance of both tasks is coupled. Therefore, we are able to exploit the internal information at test time via the auxiliary task to enhance the performance of deblurring. Extensive experimental results across evaluation datasets demonstrate the effectiveness of test-time adaptation of the proposed method.
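The general test-time adaptation loop the abstract describes can be sketched as follows: a shared encoder feeds both a primary deblurring head and a self-supervised auxiliary reconstruction head, and at test time only the auxiliary loss is used to fine-tune the shared parameters before the primary prediction is made. The architecture, loss, and step count below are illustrative stand-ins, not the paper's meta-auxiliary training scheme.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Shared encoder with two heads: primary deblurring and auxiliary self-reconstruction.
# Shapes and layer counts are illustrative, not the paper's actual architecture.
encoder = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
deblur_head = nn.Conv2d(32, 3, 3, padding=1)   # primary task (trained with sharp targets)
recon_head = nn.Conv2d(32, 3, 3, padding=1)    # auxiliary task (self-supervised)

def adapt_and_deblur(blurry, steps=5, lr=1e-4):
    """Fine-tune the shared encoder on one test image via the auxiliary
    reconstruction loss, then run the primary deblurring head."""
    params = list(encoder.parameters()) + list(recon_head.parameters())
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        recon = recon_head(encoder(blurry))
        loss = F.l1_loss(recon, blurry)        # no ground truth needed at test time
        loss.backward()
        opt.step()
    with torch.no_grad():
        return deblur_head(encoder(blurry))    # primary prediction after adaptation

test_image = torch.rand(1, 3, 64, 64)          # a single blurry test image
print(adapt_and_deblur(test_image).shape)      # torch.Size([1, 3, 64, 64])
```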

30 citations


Proceedings ArticleDOI
01 Jan 2021
TL;DR: DualSR as discussed by the authors proposes a dual-path architecture that learns an image-specific low-to-high resolution mapping using only patches of the input test image, where a downsampler learns the degradation process using a generative adversarial network, and an up-sampler learns to super-resolve that specific image.
Abstract: Advanced methods for single image super-resolution (SISR) based on deep learning have demonstrated remarkable reconstruction performance on downscaled images. However, for real-world low-resolution images (e.g. images captured straight from the camera) they often generate blurry images and exhibit unpleasant artifacts. The main reason is the training data, which does not reflect the real-world super-resolution problem: these methods train the network using images downsampled with an ideal (usually bicubic) kernel, whereas for real-world images the degradation process is more complex and can vary from image to image. This paper proposes a new dual-path architecture (DualSR) that learns an image-specific low-to-high resolution mapping using only patches of the input test image. For every image, a downsampler learns the degradation process using a generative adversarial network, and an upsampler learns to super-resolve that specific image. In the DualSR architecture, the upsampler and downsampler are trained simultaneously and improve each other using cycle consistency losses. For better visual quality and elimination of undesired artifacts, the upsampler is constrained by a masked interpolation loss. On standard benchmarks with unknown degradation kernels, DualSR outperforms recent blind and non-blind super-resolution methods in terms of SSIM and generates images with higher perceptual quality. On real-world LR images it generates visually pleasing and artifact-free results.

29 citations


Posted Content
TL;DR: This paper harnesses the power of Generative Adversarial Networks (GANs) and DCNNs in order to reconstruct the facial texture and shape from single images and achieves for the first time, to the best of the authors' knowledge, facial texture reconstruction with high-frequency details.
Abstract: A lot of work has been done towards reconstructing the 3D facial structure from single images by capitalizing on the power of Deep Convolutional Neural Networks (DCNNs). In recent works, the texture features either correspond to components of a linear texture space or are learned by auto-encoders directly from in-the-wild images. In all cases, the quality of the facial texture reconstruction is still not capable of modeling facial texture with high-frequency details. In this paper, we take a radically different approach and harness the power of Generative Adversarial Networks (GANs) and DCNNs in order to reconstruct the facial texture and shape from single images. That is, we utilize GANs to train a very powerful facial texture prior from a large-scale 3D texture dataset. Then, we revisit the original 3D Morphable Models (3DMMs) fitting, making use of non-linear optimization to find the optimal latent parameters that best reconstruct the test image, but under a new perspective. In order to be robust towards initialisation and to expedite the fitting process, we propose a novel self-supervised regression-based approach. We demonstrate excellent results in photorealistic and identity-preserving 3D face reconstructions and achieve, for the first time to the best of our knowledge, facial texture reconstruction with high-frequency details.

27 citations


Journal ArticleDOI
TL;DR: In this article, a self-domain-adapted network (SDA-Net) is proposed, which consists of three parts, all of them neural networks: a task model, which performs the image analysis task such as segmentation; a set of autoencoders; and a set of adaptors, which transform the test image and its features to minimize the domain shift.

24 citations


Proceedings ArticleDOI
17 May 2021
TL;DR: In this paper, the authors combine YOLO (You Only Look Once) algorithm with the VGG16 pre-trained convolutional neural network to propose an improvement for face detection systems.
Abstract: Face detection is not only one of the most studied topics in the computer vision field but also a very important task in many applications, such as security access control systems, video surveillance, human-computer interfaces, and image database management. Various methods have been developed for face detection, such as Viola-Jones, RCNN, and SSD, and many researchers are still trying to improve face detection systems across varied illuminations, poses, and skin colors, as well as for real-time detection. This paper combines the YOLO (You Only Look Once) algorithm with the VGG16 pre-trained convolutional neural network to propose an improvement for face detection systems. Experimental results show that the proposed method detects the test image set with over 95% average precision. Our proposed method also considerably increased face detection speed on real-time live video. The experiments in this work used the Image Processing Toolbox and the Deep Learning Toolbox in MATLAB.

20 citations


Journal ArticleDOI
TL;DR: A convolutional neural network architecture is proposed, consisting of five types of layers: a convolutional layer, an activation layer, a pooling layer, and a fully connected layer, followed by a softmax layer that gives the output probability for every class.
Abstract: Computer-aided diagnosis and design in the medical domain is an exciting field owing to the drastic growth in medical images. Earlier handcrafted feature learning techniques failed to achieve the targeted results in practical settings. In this paper, we adopt a deep learning approach to reduce the semantic gap that exists between the low-level information captured by imaging devices and the high-level information perceived by a human. The proposed work is twofold: first, we propose a convolutional neural network architecture consisting of five types of layers: a convolutional layer, an activation layer, a pooling layer, and a fully connected layer, followed by a softmax layer that gives the output probability for every class. The second contribution of this paper is to address an open problem in medical image analysis: "Uses of a pretrained model with adequate fine-tuning to eliminate the extra effort of making a new CNN architecture from scratch". To address this, we employ a pretrained VGG-16 model (a well-known CNN architecture trained on the ImageNet dataset) to train on the same dataset. Grad-CAM is used for visualizing the model's behavior with respect to a test image. The proposed methods are evaluated on the well-known, publicly available NIH ChestX-ray14 dataset and establish a new benchmark, achieving state-of-the-art results of 83.671% (scratch CNN) and 97.81% (transfer learning), which are much higher than those of other methods. Moreover, we also provide an in-depth comparison with existing works.
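The transfer-learning recipe discussed above (a pretrained VGG-16 with adequate fine-tuning instead of a scratch-built CNN) might look roughly like the Keras sketch below; the input size, added head, 14-class softmax output, and training schedule are assumptions for illustration rather than the paper's exact setup.

```python
import tensorflow as tf

NUM_CLASSES = 14  # assumed: one output per ChestX-ray14 pathology label

# Load VGG-16 pretrained on ImageNet without its classification head.
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False            # freeze convolutional features for the first stage

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),  # class probabilities
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()

# Assumed fine-tuning stage: unfreeze the base (or just its last block) and
# retrain with a smaller learning rate, e.g.
#   base.trainable = True
#   model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
#                 loss="categorical_crossentropy", metrics=["accuracy"])
```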

18 citations


Journal ArticleDOI
TL;DR: In this paper, an ensemble of improved convolutional neural networks combined with a test-time regularly spaced shifting technique was proposed for skin lesion classification, which showed a significant improvement on the well-known HAM10000 dataset in terms of accuracy and F-score.
Abstract: Skin lesions are caused by multiple factors, such as allergies, infections, and exposure to the sun. These skin diseases have become a challenge in medical diagnosis due to visual similarities, and image classification is an essential task for achieving an adequate diagnosis of different lesions. Melanoma is one of the best-known types of skin lesions because it accounts for the vast majority of skin cancer deaths. In this work, we propose an ensemble of improved convolutional neural networks combined with a test-time regularly spaced shifting technique for skin lesion classification. The shifting technique builds several versions of the test input image, which are shifted by displacement vectors that lie on a regular lattice in the plane of possible shifts. These shifted versions of the test image are subsequently passed on to each of the classifiers of an ensemble. Finally, all the outputs from the classifiers are combined to yield the final result. Experimental results show a significant improvement on the well-known HAM10000 dataset in terms of accuracy and F-score. In particular, it is demonstrated that our combination of ensembles with test-time regularly spaced shifting yields better performance than either of the two methods applied alone.
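A minimal sketch of the test-time regularly spaced shifting idea, assuming simple wrap-around shifts on a square lattice and averaging as the combination rule (neither is necessarily the paper's exact choice):

```python
import numpy as np

def shifted_versions(image, max_shift=8, step=4):
    """Build copies of the test image shifted by displacement vectors lying
    on a regular lattice in the plane of possible shifts."""
    offsets = range(-max_shift, max_shift + 1, step)
    return [np.roll(np.roll(image, dy, axis=0), dx, axis=1)
            for dy in offsets for dx in offsets]

def ensemble_predict(image, classifiers, max_shift=8, step=4):
    """Average class probabilities over all shifted copies and all ensemble members."""
    versions = shifted_versions(image, max_shift, step)
    probs = [clf(v) for clf in classifiers for v in versions]
    return np.mean(probs, axis=0)            # final combined prediction

# Toy usage with two dummy "classifiers" returning 7-class probability vectors
# (stand-ins for the trained CNNs of the ensemble; HAM10000 has 7 classes).
rng = np.random.default_rng(0)
dummy = [lambda x: rng.dirichlet(np.ones(7)) for _ in range(2)]
image = rng.random((224, 224, 3))
print(ensemble_predict(image, dummy))
```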

Journal ArticleDOI
TL;DR: The potential vulnerability of pre-trained convolutional neural network algorithms to the FGSM attack is explored in terms of two frequently used models, VGG16 and Inception-v3, showing that the correct class probability of any test image can drop for both models as the perturbation increases.
Abstract: The COVID-19 pandemic requires the rapid isolation of infected patients. Thus, high-sensitivity radiology images could be a key technique for diagnosing patients alongside the polymerase chain reaction approach. Deep learning algorithms have been proposed in several studies to detect COVID-19 symptoms due to their success in chest radiography image classification, cost efficiency, the lack of expert radiologists, and the need for faster processing during the pandemic. Most of the promising algorithms proposed in different studies are based on pre-trained deep learning models. Such open-source models and the lack of variation in the radiology image-capturing environment make the diagnosis system vulnerable to adversarial attacks such as the fast gradient sign method (FGSM) attack. This study therefore explored the potential vulnerability of pre-trained convolutional neural network algorithms to the FGSM attack in terms of two frequently used models, VGG16 and Inception-v3. Firstly, we developed two transfer learning models for X-ray and CT image-based COVID-19 classification and analyzed their performance extensively in terms of accuracy, precision, recall, and AUC. Secondly, our study illustrates that misclassification can occur with a very minor perturbation magnitude, such as 0.009 and 0.003 for the FGSM attack in these models for X-ray and CT images, respectively, without any effect on the visual perceptibility of the perturbation. In addition, we demonstrated that a successful FGSM attack can decrease the classification performance to 16.67% and 55.56% for X-ray images, as well as 36% and 40% in the case of CT images, for VGG16 and Inception-v3, respectively, without any human-recognizable perturbation effects in the adversarial images. Finally, we showed that the correct class probability of any test image, which should ideally be 1, can drop for both models with increased perturbation; it can drop to 0.24 and 0.17 for the VGG16 model in the cases of X-ray and CT images, respectively. Thus, despite the need for data sharing and automated diagnosis, practical deployment of such programs requires more robustness.
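For reference, the FGSM perturbation underlying the attack studied here is x_adv = x + eps * sign(∇_x L(x, y)). A minimal PyTorch sketch with a placeholder classifier (not the VGG16 or Inception-v3 models of the study) could look like this:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, eps=0.009):
    """Fast gradient sign method: perturb the input in the direction of the
    sign of the loss gradient, x_adv = x + eps * sign(dL/dx)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    adv = image + eps * image.grad.sign()
    return adv.clamp(0.0, 1.0).detach()            # keep pixel values in a valid range

# Toy usage with a tiny stand-in classifier (illustrative only).
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 2))
x = torch.rand(1, 3, 64, 64)                       # image stand-in, scaled to [0, 1]
y = torch.tensor([1])                              # assumed ground-truth class index
x_adv = fgsm_attack(model, x, y, eps=0.009)        # 0.009: magnitude quoted in the abstract
print((x_adv - x).abs().max())                     # perturbation bounded by eps
```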

Journal ArticleDOI
TL;DR: It is shown that it is possible to segment experimental materials science data using a SegNet-based CNN that was trained only using simple phase field simulations, and the CNN trained on phase field images segmented the experimental test image with 99% accuracy.

Journal ArticleDOI
TL;DR: In this article, a no-reference image quality assessment method is proposed using a set of quality-aware features which globally characterizes the statistics of a given test image, such as extended local fractal dimension distribution feature, extended first digit distribution features using different domains, Bilaplacian features, image moments, and a wide variety of perceptual features.
Abstract: The perceptual quality of digital images is often deteriorated during storage, compression, and transmission. The most reliable way of assessing image quality is to ask people to provide their opinions on a number of test images. However, this is an expensive and time-consuming process which cannot be applied in real-time systems. In this study, a novel no-reference image quality assessment method is proposed. The introduced method uses a set of novel quality-aware features which globally characterizes the statistics of a given test image, such as extended local fractal dimension distribution feature, extended first digit distribution features using different domains, Bilaplacian features, image moments, and a wide variety of perceptual features. Experimental results are demonstrated on five publicly available benchmark image quality assessment databases: CSIQ, MDID, KADID-10k, LIVE In the Wild, and KonIQ-10k.

Journal ArticleDOI
TL;DR: An unsupervised feature extraction approach for BIQA based on the Karhunen-Loève transform (KLT), where a normalization operation is first applied to the test image by calculating its mean subtracted contrast normalized (MSCN) coefficients, and a generalized Gaussian distribution is employed to model the KLT coefficient distribution in different spectral components as quality-relevant features.
Abstract: Blind image quality assessment (BIQA) plays an important role in image services as it is independent of the reference image. The design of perceptually relevant features is the core of BIQA methods, but their performance is still not satisfactory at present. In this work, we propose an unsupervised feature extraction approach for BIQA based on the Karhunen-Loève transform (KLT). Specifically, a normalization operation is first applied to the test image by calculating its mean subtracted contrast normalized (MSCN) coefficients. Then, KLT is employed as a data-driven feature extraction approach to extract image structural features, wherein kernels with different sizes are utilized to perform multi-scale analysis. Finally, a generalized Gaussian distribution (GGD) is employed to model the KLT coefficient distribution in different spectral components as quality-relevant features. Extensive experiments conducted on four widely utilized IQA databases demonstrate that the proposed multi-scale KLT (MsKLT) BIQA metric compares favorably with existing BIQA methods in terms of high accordance with human subjective scores on both common and uncommon distortion types.
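The MSCN normalization step mentioned above can be sketched in a few lines of NumPy/SciPy; the Gaussian window width and stabilizing constant below are common defaults, not necessarily the paper's values. The subsequent steps (patch extraction, KLT via an eigendecomposition of the patch covariance, and GGD fitting per spectral component) would operate on these coefficients.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn_coefficients(image, sigma=7/6, c=1.0):
    """Mean subtracted contrast normalized (MSCN) coefficients:
    (I - mu) / (sigma_local + C), with mu and sigma_local from a Gaussian window."""
    image = image.astype(np.float64)
    mu = gaussian_filter(image, sigma)
    var = gaussian_filter(image ** 2, sigma) - mu ** 2
    sigma_local = np.sqrt(np.maximum(var, 0))
    return (image - mu) / (sigma_local + c)

# Toy usage: normalize a random grayscale "test image" before patch-wise KLT.
rng = np.random.default_rng(0)
test_image = rng.integers(0, 256, size=(128, 128))
mscn = mscn_coefficients(test_image)
print(mscn.mean(), mscn.std())   # roughly zero-mean with unit-scale spread
```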

Proceedings ArticleDOI
19 May 2021
TL;DR: Experimental results show that the proposed framework produces images of superior spatial and spectral resolution compared to the current leading methods, whether model- or DL-based.
Abstract: Hyperspectral (HS) images contain detailed spectral information that has proven crucial in applications like remote sensing, surveillance, and astronomy. However, because of hardware limitations of HS cameras, the captured images have low spatial resolution. To improve them, the low-resolution hyperspectral images are fused with conventional high-resolution RGB images via a technique known as fusion based HS image super-resolution. Currently, the best performance in this task is achieved by deep learning (DL) methods. Such methods, however, cannot guarantee that the input measurements are satisfied in the recovered image, since the learned parameters by the network are applied to every test image. Conversely, model-based algorithms can typically guarantee such measurement consistency. Inspired by these observations, we propose a framework that integrates learning and model based methods. Experimental results show that our method produces images of superior spatial and spectral resolution compared to the current leading methods, whether model- or DL-based.

Posted ContentDOI
18 Feb 2021-bioRxiv
TL;DR: In this article, fixed biological filter banks, in particular banks of Gabor filters, are used to constrain the networks to avoid reliance on shortcuts, making them develop more structured internal representations and more tolerant to noise.
Abstract: Deep Convolutional Neural Networks (DNNs) have achieved superhuman accuracy on standard image classification benchmarks. Their success has reignited significant interest in their use as models of the primate visual system, bolstered by claims of their architectural and representational similarities. However, closer scrutiny of these models suggests that they rely on various forms of shortcut learning to achieve their impressive performance, such as using texture rather than shape information. Such superficial solutions to image recognition have been shown to make DNNs brittle in the face of more challenging tests such as noise-perturbed or out-of-domain images, casting doubt on their similarity to their biological counterparts. In the present work, we demonstrate that adding fixed biological filter banks, in particular banks of Gabor filters, helps to constrain the networks to avoid reliance on shortcuts, making them develop more structured internal representations and become more tolerant to noise. Importantly, they also gained around 20-30% improved accuracy when generalising to our novel out-of-domain test image sets over standard end-to-end trained architectures. We take these findings to suggest that these properties of the primate visual system should be incorporated into DNNs to make them better able to cope with real-world vision and better capture some of the more impressive aspects of human visual perception, such as generalisation.
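A fixed, non-trainable Gabor filter bank used as a network front end, as described above, could be sketched as follows in PyTorch; the kernel size, number of orientations, and filter parameters are illustrative choices rather than the paper's.

```python
import numpy as np
import torch
import torch.nn as nn

def gabor_kernel(size=11, theta=0.0, sigma=3.0, lambd=6.0, gamma=0.5, psi=0.0):
    """Real Gabor kernel: Gaussian envelope times an oriented cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return (np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            * np.cos(2 * np.pi * xr / lambd + psi))

# Fixed bank of 8 orientations used as a frozen first convolutional layer
# (an illustrative stand-in for the biological front end described above).
thetas = np.linspace(0, np.pi, 8, endpoint=False)
bank = np.stack([gabor_kernel(theta=t) for t in thetas])[:, None]   # (8, 1, 11, 11)

gabor_layer = nn.Conv2d(1, 8, kernel_size=11, padding=5, bias=False)
gabor_layer.weight = nn.Parameter(torch.tensor(bank, dtype=torch.float32),
                                  requires_grad=False)               # never trained

x = torch.rand(1, 1, 64, 64)          # grayscale input image
print(gabor_layer(x).shape)           # torch.Size([1, 8, 64, 64])
```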

Journal ArticleDOI
TL;DR: In this paper, the authors compared the performance of support vector machine (SVM), random forest (RF), and deep neural network (DNN) models on a single-and dual-polarized X-band SAR image.
Abstract: It is well known that the polarization characteristics in X-band synthetic aperture radar (SAR) image analysis can provide additional information for marine target classification and detection. Normally, dual- and single-polarized SAR images are acquired by SAR satellites, so we must determine how accurate the marine mapping performance from dual-polarized (pol) images is compared with that from single-pol images for a given machine learning model. The purpose of this study is to compare the performance of single- and dual-pol SAR image classification achieved by support vector machine (SVM), random forest (RF), and deep neural network (DNN) models. The test image is a TerraSAR-X dual-pol image acquired from the 2007 Kerch Strait oil spill event. For this, 824,026 pixels and 1,648,051 pixels were extracted from the image for training and testing, respectively, and sea, ship, oil, and land objects were classified from the image using the three machine learning methods. The mean f1-scores of the SVM, RF, and DNN models resulting from the single-pol image were approximately 0.822, 0.882, and 0.889, respectively, and those from the dual-pol image were about 0.852, 0.908, and 0.898, respectively. The performance improvement achieved by dual-pol data was about 3.6%, 2.9%, and 1% for SVM, RF, and DNN, respectively. The DNN model had the best performance (0.889) in the single-pol test, while the RF model was best (0.908) in the dual-pol test. The performance improvement was approximately 2.1% and not noticeable. Considering that dual-pol images have two-times lower spatial resolution than single-pol images in the azimuth direction, such a small improvement may not be valuable. Therefore, the results show that the performance improvement from X-band dual-pol images may not be remarkable when classifying sea, ship, oil spill, and land surfaces.
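A sketch of the three-model comparison with mean F1 scoring, using scikit-learn on synthetic stand-in features (the real study used per-pixel polarization features from the TerraSAR-X scene):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for per-pixel SAR features with 4 classes (sea, ship, oil, land).
rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 2))            # e.g. two backscatter-derived features
y = rng.integers(0, 4, size=4000)
X[np.arange(4000), y % 2] += y            # inject some class structure
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

models = {
    "SVM": SVC(),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "DNN": MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    score = f1_score(y_te, model.predict(X_te), average="macro")  # mean F1 over classes
    print(f"{name}: mean F1 = {score:.3f}")
```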

Journal ArticleDOI
TL;DR: In this article, a fast and secure encryption algorithm for medical images based on the 1D logistic map associated with pseudo-random numbers has been proposed, which has been tested for robustness and effectiveness using the standard tests available.
Abstract: A new, fast, and secure encryption algorithm for medical images based on the 1D logistic map associated with pseudo-random numbers is proposed. Initial values and parameters of the logistic map play an important role (as secret keys) in generating key matrices for shuffling and substituting pixels in the image. The proposed algorithm has been designed to give the user control over the level of security required by increasing or decreasing the number of rounds of the encryption process. During the encryption process, two pseudo-random rows and two pseudo-random columns are inserted on each side of the original image to counter chosen- and known-plain-image attacks. The proposed algorithm has been tested for robustness and effectiveness using the standard tests available. Further, differential and noise attacks have also been analyzed. Cryptanalysis of the proposed algorithm has been performed by testing it against most of the frequently used attacks, such as known- and chosen-plain-image attacks. The run time for different images has been recorded to check the efficiency of the proposed algorithm. The tests were performed on 50 grayscale and 50 RGB images. The average entropy and NPCR of encrypted images were approximately 7.99 and 99.6%, respectively, for the selected images. Some medical images, such as human brain, MRI, and lung images, have been selected to demonstrate the output of the proposed algorithm. The proposed algorithm has also been tested on a standard non-medical test image. The obtained results have been compared with existing competing algorithms. The proposed algorithm is apt for practical use.
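The core keystream idea, iterating the 1D logistic map from secret initial values to drive a pixel permutation and an XOR substitution, can be sketched as below. This is a single-round illustrative simplification that omits the paper's inserted pseudo-random rows/columns and multi-round control.

```python
import numpy as np

def logistic_keystream(x0, r, n):
    """Iterate the 1D logistic map x_{k+1} = r * x_k * (1 - x_k) for n steps."""
    xs = np.empty(n)
    x = x0
    for i in range(n):
        x = r * x * (1.0 - x)
        xs[i] = x
    return xs

def encrypt(image, x0=0.3456, r=3.99):
    """One round: key-dependent pixel permutation followed by an XOR substitution."""
    flat = image.flatten()
    ks = logistic_keystream(x0, r, flat.size)     # x0 and r act as the secret key
    perm = np.argsort(ks)                         # shuffling order from the key stream
    mask = (ks * 255).astype(np.uint8)            # substitution mask from the key stream
    return (flat[perm] ^ mask).reshape(image.shape)

def decrypt(cipher, x0=0.3456, r=3.99):
    """Regenerate the key stream, undo the XOR, then invert the permutation."""
    ks = logistic_keystream(x0, r, cipher.size)
    perm = np.argsort(ks)
    flat = cipher.flatten() ^ (ks * 255).astype(np.uint8)
    out = np.empty_like(flat)
    out[perm] = flat
    return out.reshape(cipher.shape)

img = np.random.default_rng(1).integers(0, 256, (64, 64), dtype=np.uint8)
assert np.array_equal(decrypt(encrypt(img)), img)   # round-trip check
```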

Journal ArticleDOI
TL;DR: A novel distance measurement scheme for NNC, called the dissimilarity-based nearest neighbor classifier (DNNC), is proposed and applied to SSFR; it first segments each image into non-overlapping patches of a given size and then generates an ordered image patch set.
Abstract: In single-sample face recognition (SSFR) tasks, the nearest neighbor classifier (NNC) is the most popular method for its simplicity in implementation. However, in complex situations with light, posture, expression, and obscuration, NNC cannot achieve good recognition performance when applying common distance measurements, such as the Euclidean distance. Thus, this paper proposes a novel distance measurement scheme for NNC and applies it to SSFR. The proposed method, called dissimilarity-based nearest neighbor classifier (DNNC), first segments each (training or test) image into non-overlapping patches with a given size and then generates an ordered image patch set. The dissimilarities between the given test image patch set and the training image patch sets are computed and taken as the distance measurement of NNC. The smaller the dissimilarity of image patch sets is, the closer is the distance from the test image to the training image. Therefore, the category of the test image can be determined according to the smallest dissimilarity. Extensive experiments on the AR face database demonstrate the effectiveness of DNNC, especially for the case of obscuration.
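A compact sketch of the DNNC idea: segment each image into an ordered set of non-overlapping patches, accumulate a patch-wise distance between corresponding patches, and assign the label of the least dissimilar training image. The patch size and per-patch Euclidean distance below are illustrative assumptions, not necessarily the paper's dissimilarity measure.

```python
import numpy as np

def to_patches(image, p=16):
    """Segment an image into an ordered set of non-overlapping p x p patches."""
    h, w = image.shape[0] // p * p, image.shape[1] // p * p
    img = image[:h, :w].astype(np.float64)
    return img.reshape(h // p, p, w // p, p).transpose(0, 2, 1, 3).reshape(-1, p * p)

def dissimilarity(patches_a, patches_b):
    """Total patch-wise dissimilarity between two ordered patch sets
    (Euclidean distance per corresponding patch; an illustrative choice)."""
    return np.linalg.norm(patches_a - patches_b, axis=1).sum()

def dnnc_classify(test_image, train_images, train_labels, p=16):
    """Assign the label of the training image whose patch set is least dissimilar."""
    test_patches = to_patches(test_image, p)
    scores = [dissimilarity(test_patches, to_patches(t, p)) for t in train_images]
    return train_labels[int(np.argmin(scores))]

# Toy usage: one training face per subject (single-sample setting).
rng = np.random.default_rng(0)
train = [rng.random((64, 64)) for _ in range(3)]
labels = ["subject_0", "subject_1", "subject_2"]
test = train[1] + 0.05 * rng.random((64, 64))     # a slightly perturbed view of subject 1
print(dnnc_classify(test, train, labels))          # expected: subject_1
```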

Book ChapterDOI
27 Sep 2021
TL;DR: In this article, the authors proposed a new pruning criterion that allows a fixed network to learn new data domains sequentially over time, without requiring access to their training data, while simultaneously avoiding catastrophic forgetting and maintaining accurate performance.
Abstract: Despite recent advances in deep learning based medical image computing, clinical implementations in patient-care settings have been limited with lack of sufficiently diverse data during training remaining a pivotal impediment to robust real-life model performance. Continual learning (CL) offers a desirable property of deep neural network models (DNNs), namely the ability to continually learn from new data to accumulate knowledge whilst retaining what has been previously learned. In this work we present a simple and effective CL approach for sequential multi-domain learning (MDL) and showcase its utility in the skin lesion image classification task. Specifically, we propose a new pruning criterion that allows for a fixed network to learn new data domains sequentially over time. Our MDL approach incrementally builds on knowledge gained from previously learned domains, without requiring access to their training data, while simultaneously avoiding catastrophic forgetting and maintaining accurate performance on all domain data learned. Our new pruning criterion detects culprit units associated with wrong classification in each domain and releases these units so they are dedicated for subsequent learning on new domains. To reduce the computational cost associated with retraining the network post pruning, we implement MergePrune, which efficiently merges the pruning and training stages into one step. Furthermore, at inference time, instead of using a test-time oracle, we design a smart gate using Siamese networks to assign a test image to the most appropriate domain and its corresponding learned model. We present extensive experiments on 6 skin lesion image databases, representing different domains with varying levels of data bias and class imbalance, including quantitative comparisons against multiple baselines and state-of-the-art methods, which demonstrate superior performance and efficient computations of our proposed method.

Book ChapterDOI
27 Sep 2021
Abstract: We propose a novel unsupervised out-of-distribution detection method for medical images based on implicit fields image representations. In our approach, an auto-decoder feed-forward neural network learns the distribution of healthy images in the form of a mapping between spatial coordinates and probabilities over a proxy for tissue types. At inference time, the learnt distribution is used to retrieve, from a given test image, a restoration, i.e. an image maximally consistent with the input one but belonging to the healthy distribution. Anomalies are localized using the voxel-wise probability predicted by our model for the restored image. We tested our approach in the task of unsupervised localization of gliomas on brain MR images and compared it to several other VAE-based anomaly detection methods. Results show that the proposed technique substantially outperforms them (average DICE 0.640 vs 0.518 for the best performing VAE-based alternative) while also requiring considerably less computing time.

Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper proposed a robust watermarking algorithm for medical images based on Harris-SURF-DCT, which can extract the watermark from the test image without the original image.
Abstract: To protect patient information in medical images, this article proposes a robust watermarking algorithm for medical images based on Harris-SURF-DCT. First, the corners of the medical image are extracted using the Harris corner detection algorithm, and the extracted corners are then described using the feature point description method of the SURF algorithm to generate a feature descriptor matrix. The feature descriptor matrix is then processed through a perceptual hash algorithm to obtain the feature vector of the medical image, which is a binary feature vector with a size of 32 bits. Secondly, to enhance the security of the watermark information, the logistic map algorithm is used to encrypt the watermark before embedding it. Finally, with the help of cryptography, a trusted third party, and zero-watermarking technology, the algorithm can embed the watermark without modifying the medical image. When extracting the watermark, the algorithm can recover it from the test image without the original image. In addition, the algorithm has strong robustness against conventional attacks and geometric attacks, and it performs especially well under geometric attacks.

Journal ArticleDOI
TL;DR: A method of 2D pose-invariant face recognition that assumes the search database contains only frontal view faces, and performs with acceptable and similar accuracy to conventional methods, while using only frontal faces in the test database.
Abstract: Personal identification systems that use face recognition work well for test images with a frontal view of the face, but often fail when the input face is a pose view. Most face databases come from picture ID sources such as passports or driver's licenses. In such databases, only the frontal view is available. This paper proposes a method of 2D pose-invariant face recognition that assumes the search database contains only frontal view faces. Given a non-frontal view of a test face, the pose-view angle is first calculated by matching the test image against a database of canonical faces with head rotations to find the best matched image. This database of canonical faces is used only to find the head rotation. The database does not contain images of the test face itself, but has a selection of template faces, each with rotation images of -45°, -30°, -15°, 0°, 15°, 30°, and 45°. The landmark features in the best matched rotated canonical face (say, the 15° rotation) and its corresponding frontal face at 0° are used to create a warp transformation that converts the 15°-rotated test face to a frontal face. This warp introduces some distortion artifacts, since some features of the non-frontal input face are not visible due to self-occlusion. The warped image is therefore enhanced by mixing intensities using the left/right facial symmetry assumption. The enhanced synthesized frontal face image is then used to find the best match target in the frontal face database. We test our approach using CMU Multi-PIE database images. Our method performs with acceptable accuracy, similar to conventional methods, while using only frontal faces in the test database.

Journal ArticleDOI
TL;DR: In this article, a block-level perceptual image compression framework is proposed, including a blocklevel just noticeable difference (JND) prediction model and a preprocessing scheme, which is able to achieve 16.75% bit saving as compared to the state-of-the-art method with similar subjective quality.
Abstract: A block-level perceptual image compression framework is proposed in this work, including a block-level just noticeable difference (JND) prediction model and a preprocessing scheme. Specifically, block-level JND values are first deduced by utilizing the OTSU method based on the variation of block-level structural similarity values between two adjacent picture-level JND values in the MCL-JCI dataset. After the JND value for each image block is generated, a convolutional neural network-based prediction model is designed to forecast block-level JND values for a given target image. Then, a preprocessing scheme is devised to modify the discrete cosine transform coefficients during JPEG compression on the basis of the distribution of block-level JND values of the target test image. Finally, the test image is compressed using the maximum JND value across all of its image blocks in light of the initial quality factor setting. The experimental results demonstrate that the proposed block-level perceptual image compression method is able to achieve 16.75% bit saving compared to the state-of-the-art method with similar subjective quality. The project page can be found at https://mic.tongji.edu.cn/43/3f/c9778a148287/page.htm.

Journal ArticleDOI
TL;DR: An approach based on the evaluation of the histogram of a common class of images considered as the target, showing that, at least when the images of the considered datasets are homogeneous enough, it is not necessary to resort to complex-to-implement DL techniques in order to attain effective detection of the COVID-19 disease.
Abstract: The global COVID-19 pandemic has certainly posed one of the more difficult challenges for researchers in the current century. The development of an automatic diagnostic tool able to detect the disease in its early stage could undoubtedly offer a great advantage in the battle against the pandemic. In this regard, most research efforts have been focused on the application of Deep Learning (DL) techniques to chest images, including traditional chest X-rays (CXRs) and Computed Tomography (CT) scans. Although these approaches have demonstrated their effectiveness in detecting the COVID-19 disease, they are of huge computational complexity and require large datasets for training. In addition, there may not be a large number of COVID-19 CXRs and CT scans available to researchers. To this end, in this paper, we propose an approach based on the evaluation of the histogram of a common class of images that is considered as the target. A suitable inter-histogram distance measures how far this target histogram is from the histogram evaluated on a test image: if this distance is greater than a threshold, the test image is labeled as an anomaly, i.e., the scan belongs to a patient affected by the COVID-19 disease. Extensive experimental results and comparisons with some benchmark state-of-the-art methods support the effectiveness of the developed approach and demonstrate that, at least when the images of the considered datasets are homogeneous enough (i.e., only a few outliers are present), it is not necessary to resort to complex-to-implement DL techniques in order to attain effective detection of the COVID-19 disease. Despite the simplicity of the proposed approach, all the considered metrics (i.e., accuracy, precision, recall, and F-measure) attain a value of 1.0 on the selected datasets, a result comparable to the corresponding state-of-the-art DNN approaches, but with remarkable computational simplicity.
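The histogram-distance rule described above can be sketched in a few lines; the chi-square distance and the threshold used here are illustrative choices, not necessarily those of the paper.

```python
import numpy as np

def normalized_histogram(image, bins=256):
    """Intensity histogram normalized to sum to one."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist / hist.sum()

def chi_square_distance(h1, h2, eps=1e-12):
    """One common inter-histogram distance (illustrative choice)."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def is_anomalous(test_image, target_hist, threshold):
    """Label the scan as an anomaly if its histogram is far from the target."""
    return chi_square_distance(normalized_histogram(test_image), target_hist) > threshold

# Toy usage: the target histogram is averaged over a set of "normal" scan stand-ins.
rng = np.random.default_rng(0)
normal_images = [rng.normal(120, 30, (256, 256)).clip(0, 255) for _ in range(20)]
target = np.mean([normalized_histogram(im) for im in normal_images], axis=0)
test = rng.normal(170, 50, (256, 256)).clip(0, 255)      # a differently distributed scan
print(is_anomalous(test, target, threshold=0.1))
```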

Proceedings ArticleDOI
12 Apr 2021
TL;DR: In this paper, a combinatorial approach is proposed to generate test images by applying a set of combinations of some basic image transformation operations to a seed image, and then design an input parameter model based on the valid transformations and generate a t-way (t=2) test set.
Abstract: Recent advancements in the field of deep learning have enabled its application in Autonomous Driving Systems (ADS). A Deep Neural Network (DNN) model is often used to perform tasks such as pedestrian detection, object detection, and steering control in ADS. Unfortunately, DNN models could exhibit incorrect or unexpected behavior in real-world scenarios. There is a need to rigorously test these models with real-world driving scenarios so that safety-critical bugs can be detected before their deployment in the real world. In this paper, we propose a combinatorial approach to testing DNN models. Our approach generates test images by applying a set of combinations of some basic image transformation operations to a seed image. First, we identify a set of valid transformation operations or simply transformations. Next, we design an input parameter model based on the valid transformations and generate a t-way (t=2) combinatorial test set. Each test represents a combination of transformations, and can be used to produce a test image. We execute the test images on a DNN model and distinguish between consistent and inconsistent behavior using a relation. We conducted an experimental evaluation of our approach on three DNN models that are used in the Udacity challenge. Our results suggest that test images generated by our approach can effectively identify inconsistent behaviors and can significantly increase neuron coverage. To the best of our knowledge, our work is the first effort to use a combinatorial testing approach to generating test images based on image transformations for testing DNNs used in ADS.
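A toy sketch of the combinatorial idea: define an input parameter model of valid transformations, enumerate combinations of two transformation settings (t = 2), and apply each combination to a seed image with Pillow. The transformations and levels are illustrative, and a real pipeline would use a covering-array generator rather than the exhaustive pairing shown here.

```python
from itertools import combinations, product
from PIL import Image, ImageEnhance

# Input parameter model: each parameter is one basic transformation with valid levels
# (the parameters and levels here are illustrative, not the paper's exact model).
PARAMETERS = {
    "brightness": [0.7, 1.0, 1.3],
    "contrast":   [0.8, 1.0, 1.2],
    "rotation":   [-5, 0, 5],          # degrees
}

def apply(image, name, value):
    """Apply one basic transformation at a given level to a copy of the image."""
    if name == "brightness":
        return ImageEnhance.Brightness(image).enhance(value)
    if name == "contrast":
        return ImageEnhance.Contrast(image).enhance(value)
    if name == "rotation":
        return image.rotate(value)
    raise ValueError(name)

def pairwise_tests(seed):
    """Generate one test image per pair of parameter settings (t = 2)."""
    tests = []
    for (p1, p2) in combinations(PARAMETERS, 2):
        for v1, v2 in product(PARAMETERS[p1], PARAMETERS[p2]):
            tests.append(((p1, v1, p2, v2), apply(apply(seed, p1, v1), p2, v2)))
    return tests

seed = Image.new("RGB", (128, 128), "gray")       # stand-in for a driving seed image
tests = pairwise_tests(seed)
print(len(tests), "test images generated")        # 3 parameter pairs x 9 value combos = 27
```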

DOI
24 Sep 2021
TL;DR: In this paper, a new decision making model for face recognition from the original image fused with their true and partial diagonal images by integrating the type-2 fuzzy set based approach to mitigate the factors that pretend the face recognition accuracy.
Abstract: This paper projects a new decision making model for face recognition from the original image fused with their true and partial diagonal images by integrating the type-2 fuzzy set based approach to mitigate the factors that pretend the face recognition accuracy. The G2DFLD based feature vectors corresponding to a test image are given input to neural network based classifier trained with the feature vectors of the fused images to generate the merit weights with respect to different classes (subjects) under consideration. A new scheme has been introduced in the present approach to generate a score by employing a fuzzy type-2 set based treatment. These scores with respect to each of the classes under consideration are rendered from the feature vectors of the test image and those of the diagonally fused training samples. For each class, the score is fused weighted by the corresponding merit weights to generate the concluding score. These class-wise concluding scores are deliberated in recognizing the test face image. Faces from the well-known databases (gallery) with varied pose, illumination and occlusion are used to evaluate the performance of the model. It has been found that our model exhibits more accurate classification performance than existing similar kind of image level fusion method.

Journal ArticleDOI
05 Oct 2021
TL;DR: Zhang et al. as mentioned in this paper designed and developed an integrated object recognition and super-resolution framework by proposing an image super-resolution technique that improves object recognition accuracy. However, in actual object recognition processes, recognition accuracy is often degraded due to resolution mismatches between training and test image data.
Abstract: Object detection and recognition are crucial in the field of computer vision and are an active area of research. However, in actual object recognition processes, recognition accuracy is often degraded due to resolution mismatches between training and test image data. To solve this problem, we designed and developed an integrated object recognition and super-resolution framework by proposing an image super-resolution technique that improves object recognition accuracy. In detail, we collected a number of license plate training images through web-crawling and artificial data generation, and the image super-resolution artificial neural network was trained by defining an objective function to be robust to image flips. To verify the performance of the proposed algorithm, we experimented with the trained image super-resolution and recognition on representative test images and confirmed that the proposed super-resolution technique improves the accuracy of character recognition. For character recognition with the 4× magnification, the proposed method remarkably increased the mean average precision by 49.94% compared to the existing state-of-the-art method.

Journal ArticleDOI
TL;DR: The adversarial haze attack problem is addressed using the dark channel prior (DCP) de-hazing method, and a feature fusion model is proposed to fuse handcrafted features and a pre-trained network model to obtain robust and discriminative features.