
Showing papers in "Iet Image Processing in 2019"


Journal ArticleDOI
TL;DR: A supervised deep learning-based method is presented for change detection in synthetic aperture radar (SAR) images and experimental results indicated that the proposed method had an acceptable implementation time in addition to its desirable performance and high accuracy.
Abstract: In solving the change detection problem, unsupervised methods are usually preferred to their supervised counterparts because of the difficulty of producing labelled data. Nevertheless, in this paper, a supervised deep learning-based method is presented for change detection in synthetic aperture radar (SAR) images. A deep belief network (DBN) was employed as the deep architecture in the proposed method, and the training process of this network comprised unsupervised feature learning followed by supervised fine-tuning. From a general perspective, the trained DBN produces a change detection map as its output. Studies on DBNs demonstrate that they do not produce ideal output without a proper training dataset. Therefore, the proposed method builds a dataset of appropriate volume and diversity for training the DBN from the input images and from images obtained by applying morphological operators to them. High computational cost and time-consuming training are the main drawbacks of deep learning-based algorithms. To overcome these disadvantages, a method was introduced that greatly reduces computation without compromising the performance of the trained DBN. Experimental results indicated that the proposed method achieved an acceptable running time in addition to desirable performance and high accuracy.
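The morphological operators used to enlarge the training set can be sketched as follows. The abstract does not specify which operators or window size, so grayscale dilation and erosion with a 3x3 window are shown here as a representative, hypothetical choice:

```python
# Hedged sketch: grayscale dilation and erosion with a 3x3 window, the
# kind of morphological operator the paper applies to the input SAR
# images to diversify the training data (the exact operators and window
# size are assumptions; the abstract does not specify them).

def morph(image, op):
    """Apply a 3x3 morphological operator; op is max (dilation) or min (erosion)."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Gather the 3x3 neighbourhood, clipped at the image borders.
            window = [image[y][x]
                      for y in range(max(0, i - 1), min(h, i + 2))
                      for x in range(max(0, j - 1), min(w, j + 2))]
            out[i][j] = op(window)
    return out

def dilate(image):
    return morph(image, max)

def erode(image):
    return morph(image, min)
```

Dilating and eroding each input image would yield additional, slightly perturbed copies for the DBN training set.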

104 citations


Journal ArticleDOI
TL;DR: The main objective of this study is to survey the recently conducted studies on depth perception in VR, augmented reality (AR), and mixed reality (MR).
Abstract: Depth perception is one of the important elements in virtual reality (VR). Perceived depth is influenced by head-mounted displays, which inevitably reduce the virtual content's depth perception. While several questions in this area are still under investigation, the main objective of this study is to survey recently conducted studies on depth perception in VR, augmented reality (AR), and mixed reality (MR). First, depth perception in the human visual system is discussed, including the different visual cues involved. Second, research performed to understand and confirm the depth perception issue is examined. The contributions made to improve depth perception, and specifically distance perception, are discussed with their main proposed design keys, advantages, and limitations. Most of the contributions were based on using one or two depth cues to improve depth perception in VR, AR, and MR.

61 citations


Journal ArticleDOI
TL;DR: A novel feature learning framework, hard pentaplet and identity loss network (HPILN), is proposed, which outperforms all existing ones in terms of cumulative match characteristic curve and mean average precision.
Abstract: Most video surveillance systems use both RGB and infrared cameras, making it vital to re-identify a person across the RGB and infrared modalities. This task can be challenging due to both the cross-modality variations caused by heterogeneous images in RGB and infrared, and the intra-modality variations caused by heterogeneous human poses, camera positions, light brightness, etc. To meet these challenges, a novel feature learning framework, the hard pentaplet and identity loss network (HPILN), is proposed. In the framework, existing single-modality re-identification models are modified to fit the cross-modality scenario, after which specifically designed hard pentaplet loss and identity loss are used to increase the accuracy of the modified cross-modality re-identification models. Extensive experiments on the SYSU-MM01 benchmark dataset show that the authors' method outperforms all existing ones in terms of the cumulative match characteristic curve and mean average precision.

58 citations


Journal ArticleDOI
TL;DR: A new chaotic map based on a real-time variable logistic map with a randomly selected decimal is proposed and applied to image encryption, showing that the new encryption algorithm can obtain a safely encrypted image at a minimal time complexity.
Abstract: Chaotic maps have been widely used in image encryption owing to their unpredictability, ergodicity, sensitivity to parameters and initial values, and close correspondence with cryptography. The logistic map, however, has the disadvantages of uneven distribution, low security, and a small parameter space. To overcome these disadvantages, this article proposes a new chaotic map based on a real-time variable logistic map with a randomly selected decimal. Furthermore, this chaotic map is applied to image encryption. Several simulation experiments show that the new encryption algorithm can produce a securely encrypted image with minimal time complexity.
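The classical logistic map underlying the proposal can be sketched as a keystream generator for XOR encryption. Note this shows only the plain map x_{n+1} = r·x_n·(1 − x_n); the paper's real-time perturbation with a randomly selected decimal is omitted, and the seed and parameter values below are arbitrary illustrations:

```python
# Hedged sketch: logistic-map keystream driving a simple XOR cipher on
# pixel bytes. The burn-in count, x0 and r are illustrative choices, not
# values from the paper.

def logistic_keystream(x0, r, n, burn_in=100):
    """Iterate x <- r*x*(1-x) and quantise each state to one byte."""
    x = x0
    for _ in range(burn_in):          # discard the transient so the
        x = r * x * (1 - x)           # stream starts inside the attractor
    stream = []
    for _ in range(n):
        x = r * x * (1 - x)
        stream.append(int(x * 256) % 256)
    return stream

def xor_cipher(pixels, x0=0.3456, r=3.99):
    """Encrypt (or decrypt: XOR is involutive) a flat list of pixel bytes."""
    ks = logistic_keystream(x0, r, len(pixels))
    return [p ^ k for p, k in zip(pixels, ks)]
```

Because XOR with the same keystream is its own inverse, applying `xor_cipher` twice with the same key parameters recovers the original pixels.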

58 citations


Journal ArticleDOI
TL;DR: An algorithm with a Laplacian pyramid built on the 1D discrete wavelet transform (DWT) is presented for multi-sensor medical image fusion; it is efficient for n-level decomposition and works with all mother wavelets at low computational complexity.
Abstract: Generally, image fusion is carried out on two-dimensional (2D) images, but operating on 2D images incurs higher computational complexity than operating on 1D vector data. This study presents an algorithm with a Laplacian pyramid built on the 1D discrete wavelet transform (DWT), called modified multi-resolution DWT (MMDWT), for multi-sensor medical image fusion; it is efficient for n-level decomposition and works with all mother wavelets at low computational complexity. The MMDWT methodology is compared with the discrete cosine transform, DWT, stationary WT, curvelet transform, principal component analysis, and fuzzy and neuro-fuzzy techniques, and the performance measures are analysed. The performance evaluation of the MMDWT technique is illustrated using several sets of medical images provided by Health Care Global Enterprises Ltd. (HCG) Hospital, Bangalore, based on subjective and objective analyses. The results were validated by radiologists from HCG for subjective evaluation. The outcome of the MMDWT methodology is compared with existing fusion algorithms and confirms the superiority of the final fusion results.
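The 1D DWT building block the pyramid relies on can be sketched for one decomposition level. The paper supports any mother wavelet; the Haar wavelet is used here purely because its filters are two taps long:

```python
# Hedged sketch: one level of a 1D discrete wavelet transform with the
# Haar wavelet, the kind of row-wise 1D operation that keeps MMDWT
# cheaper than a full 2D transform (Haar is an illustrative choice).
import math

def haar_dwt_1d(signal):
    """Split an even-length signal into approximation and detail coefficients."""
    s = math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt_1d(approx, detail):
    """Invert the single-level Haar transform exactly."""
    s = math.sqrt(2)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / s)
        signal.append((a - d) / s)
    return signal
```

An n-level decomposition would recurse on the approximation coefficients, giving the multi-resolution pyramid the abstract describes.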

54 citations


Journal ArticleDOI
TL;DR: Due to the high sensitivity introduced by the hyper digital chaos, a huge key space is provided for the encrypted image to ensure a high security level; thus the encryption algorithm is strongly secure against brute-force attacks.
Abstract: A digital image encryption algorithm based on dynamic deoxyribonucleic acid coding and chaotic operations using hyper digital chaos in the frequency domain is proposed and demonstrated, where both the amplitude and phase components in the frequency domain are diffused and scrambled. The proposed encryption algorithm is evaluated through various tests of key parameters such as histogram uniformity, entropy, and correlation. The encrypted image resists statistical attacks well, which implies that the statistical properties of the original image are completely destroyed. In the encryption procedure, each cipher pixel is affected by all of the plain pixels as well as the cipher pixels, owing to the chaotic diffusion and scrambling operations; this increases the sensitivity of the encrypted image to the plain text and improves security against differential attacks. Moreover, due to the high sensitivity introduced by the hyper digital chaos, a huge key space is provided for the encrypted image to ensure a high security level; thus the encryption algorithm is strongly secure against brute-force attacks.
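The DNA coding step can be sketched with one fixed coding rule. The paper uses *dynamic* coding, switching among the legal complementary rules under chaotic control; only a single common rule (00→A, 01→C, 10→G, 11→T, an illustrative choice) is shown:

```python
# Hedged sketch: encoding each 2-bit pair of a pixel byte as a DNA base
# under one fixed rule. The dynamic, chaos-controlled rule switching of
# the actual algorithm is not reproduced here.

RULE = {0b00: 'A', 0b01: 'C', 0b10: 'G', 0b11: 'T'}
INVERSE = {v: k for k, v in RULE.items()}

def pixel_to_dna(byte):
    """Encode one 0-255 pixel value as four bases, most significant pair first."""
    return ''.join(RULE[(byte >> shift) & 0b11] for shift in (6, 4, 2, 0))

def dna_to_pixel(bases):
    """Decode four bases back to the pixel value."""
    value = 0
    for b in bases:
        value = (value << 2) | INVERSE[b]
    return value
```

Diffusion and scrambling would then operate on these base strings before decoding back to pixel values.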

51 citations


Journal ArticleDOI
TL;DR: A deep pixel-to-pixel networks model for underwater image enhancement is proposed by designing an encoding-decoding framework that outperforms the state-of-the-art image restoration methods in underwater image defogging, denoising and colour enhancement.
Abstract: Turbid underwater environments pose great difficulties for the application of vision technologies. One of the biggest challenges is the complicated noise distribution of underwater images due to severe scattering and absorption. To alleviate this problem, this work proposes a deep pixel-to-pixel network model for underwater image enhancement by designing an encoding-decoding framework. It employs convolution layers as the encoder to filter noise, while deconvolution layers serve as the decoder to recover missing details and refine the image pixel by pixel. Moreover, skip connections are introduced into the network to avoid losing low-level features while accelerating the training process. The model achieves image enhancement in a self-adaptive, data-driven way rather than by modelling the physical environment. Several comparison experiments are carried out on different datasets. Results show that it outperforms state-of-the-art image restoration methods in underwater image defogging, denoising, and colour enhancement.

50 citations


Journal ArticleDOI
TL;DR: A biometric-based efficient medical image watermarking scheme for e-healthcare applications is proposed, providing authentication, confidentiality, and reliability.
Abstract: Information hiding is widely used in security applications to protect secret messages from unauthorised persons. Owing to the tremendous development of the Internet and its usage, the need for protection over the Internet is increasing. Under such conditions, transmitting information from sender to receiver requires more security. Accordingly, in the authors' previous research, an efficient medical image watermarking technique for e-healthcare applications using a combination of compression and cryptography was proposed; that system provides only confidentiality and reliability. To overcome this limitation, the authors propose a biometric-based efficient medical image watermarking scheme for e-healthcare applications, which provides authentication, confidentiality, and reliability. The proposed system utilises the fingerprint biometric for authentication, a cryptography process for confidentiality, and reversible watermarking for integrity. Basically, the proposed system consists of two stages: (i) a watermark embedding process and (ii) a watermark extraction process. Experiments were carried out on different medical images with electronic health records, and the effectiveness of the proposed algorithm is analysed with the help of the peak signal-to-noise ratio and normalised correlation.
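The two evaluation metrics the abstract names follow their standard textbook definitions and can be sketched directly:

```python
# Hedged sketch: peak signal-to-noise ratio (PSNR) between the original
# and watermarked images, and normalised correlation (NC) between the
# embedded and extracted watermarks, in their usual definitions.
import math

def psnr(original, processed, peak=255.0):
    """PSNR in dB between two equal-length sequences of pixel values."""
    mse = sum((o - p) ** 2 for o, p in zip(original, processed)) / len(original)
    if mse == 0:
        return float('inf')   # identical images: infinite PSNR
    return 10 * math.log10(peak ** 2 / mse)

def normalised_correlation(w, w_extracted):
    """NC between embedded and extracted watermark sequences (1.0 = identical shape)."""
    num = sum(a * b for a, b in zip(w, w_extracted))
    den = (math.sqrt(sum(a * a for a in w))
           * math.sqrt(sum(b * b for b in w_extracted)))
    return num / den
```

Higher PSNR indicates less distortion from embedding; NC close to 1 indicates the watermark survived extraction intact.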

49 citations


Journal ArticleDOI
TL;DR: To increase the accuracy of measuring image features, a hybrid feature-concatenation approach is presented that outperforms existing glaucoma detection methods.
Abstract: Glaucoma is a class of eye disorder that causes progressive deterioration of the optic nerve fibres. Discrete wavelet transforms (DWTs) and empirical wavelet transforms (EWTs) are widely used in the literature for feature extraction via image decomposition. To increase the accuracy of measuring image features, however, a hybrid concatenation approach is presented in this work. DWT decomposes images into approximation and detail coefficients, and EWT decomposes images into sub-band images. The concatenation approach combines the features obtained from DWT, EWT, and their cascaded combinations: extracted features from each of DWT, EWT, DWT-EWT, and EWT-DWT are concatenated. The concatenated features are normalised, ranked, and fed to singular value decomposition to find robust features. Fourteen robust features are used by a support vector machine classifier. The obtained accuracy, sensitivity, and specificity are 83.57, 86.40, and 80.80%, respectively, for ten-fold cross-validation, which outperforms existing glaucoma detection methods.

44 citations


Journal ArticleDOI
TL;DR: This study presents a hybrid active contour model with a novel preprocessing technique to segment the retinal blood vessels in different fundus images, and demonstrates its robustness by calculating a wide range of proven parameters.
Abstract: In the present scenario, retinal image processing still lacks an efficient algorithm for de-noising and segmenting blood vessels confined inside a closed curvature boundary. On this ground, this study presents a hybrid active contour model with a novel preprocessing technique to segment the retinal blood vessels in different fundus images. Contour-driven black top-hat transformation and a phase-based binarisation method have been implemented to preserve the edge and corner details of the vessels. In the proposed work, gradient vector flow (GVF)-based snake and balloon methods are combined to achieve better accuracy than existing active contour models. In earlier active contour models, the snake cannot enter inside the closed curvature, resulting in the loss of tiny blood vessels. To circumvent this problem, an inflation term F inf(balloon) is incorporated with the GVF-based snake to form a new internal snake energy for effective vessel segmentation. The evaluation parameters are calculated over four publicly available databases: STARE, DRIVE, CHASE, and VAMPIRE. The proposed model outperforms its competitors across a wide range of proven parameters, demonstrating its robustness. The proposed method achieves an accuracy of 0.97 for the DRIVE and CHASE datasets and 0.96 for the STARE and VAMPIRE datasets.

43 citations


Journal ArticleDOI
TL;DR: This study provides a comprehensive survey of state-of-the-art CNN-based image denoising methods and shows that PDNN gives the best PSNR results on both the BSD-68 and Set-12 datasets.
Abstract: Convolutional neural networks (CNNs) are deep neural networks that can be trained on large databases and show outstanding performance on object classification, segmentation, image denoising, etc. In the past few years, several image denoising techniques have been developed to improve image quality. CNN-based image denoising models have shown improved denoising performance compared with non-CNN methods such as block-matching and three-dimensional (3D) filtering, contemporary wavelet approaches, and Markov random field approaches, which had remained state-of-the-art for years. This study provides a comprehensive review of state-of-the-art CNN-based image denoising methods. The literature on different CNNs used for image restoration is reviewed, including residual learning based models (DnCNN-S, DnCNN-B, IDCNN), a non-locality reinforced model (NN3D), a fast and flexible network (FFDNet), a deep shrinkage CNN (SCNN), a model for mixed noise reduction, and a denoising prior driven network (PDNN). DnCNN-S and PDNN remove Gaussian noise of a fixed level, whereas DnCNN-B, IDCNN, NN3D, and SCNN are used for blind Gaussian denoising. FFDNet handles spatially variant Gaussian noise. The performance of these CNN models is analysed on the BSD-68 and Set-12 datasets; PDNN shows the best result in terms of PSNR on both.

Journal ArticleDOI
TL;DR: A new illumination boost algorithm is proposed that improves the brightness, enhances the contrast, and properly processes the colours of nighttime images, and it outperformed the comparison algorithms in terms of scored accuracy and visual quality.
Abstract: Nighttime images are often captured with low brightness, deficient contrast, and latent colours. Thus, it is important to improve these aspects in order to obtain images of acceptable quality. Hence, a new illumination boost algorithm is proposed in this study, which improves the brightness, enhances the contrast, and properly processes the colours of nighttime images. The proposed algorithm requires only a small number of steps and uses several processing concepts to achieve the desired results. Intensive experiments and tests with various naturally degraded nighttime images were conducted to validate the performance of the proposed algorithm. In addition, it was compared with eight contemporary algorithms, and the results of these comparisons were evaluated using two specialised image quality assessment metrics. From the experiments and comparisons, it became evident that the proposed algorithm provides satisfactory outcomes, producing visually pleasing results and outperforming the comparison algorithms in terms of scored accuracy and visual quality.

Journal ArticleDOI
TL;DR: A deep depthwise separable residual convolutional algorithm is introduced to perform binary melanoma classification on a dermoscopic skin lesion image dataset, and the general effectiveness of the model is shown through its performance on multiple skin lesion image datasets.
Abstract: Melanoma is one of the four major types of skin cancer, caused by malignant growth of the melanocyte cells. It is the rarest type, accounting for only 1% of all skin cancer cases, yet it is the deadliest among all skin cancer types. Owing to its rarity, efficient diagnosis of the disease is rather difficult. Here, a deep depthwise separable residual convolutional algorithm is introduced to perform binary melanoma classification on a dermoscopic skin lesion image dataset. Prior to training the model, noise removal using a non-local means filter is performed, followed by enhancement using contrast-limited adaptive histogram equalisation over a discrete wavelet transform. Images are fed to the model as multi-channel image matrices, with channels chosen across multiple colour spaces based on their ability to optimise the performance of the model. Proper lesion detection and classification ability of the model are tested by monitoring the gradient-weighted class activation maps and saliency maps, respectively. The general effectiveness of the model is shown through its performance on multiple skin lesion image datasets. The proposed model achieved an accuracy of 99.50% on the international skin imaging collaboration (ISIC), 96.77% on PH2, 94.44% on DermIS, and 95.23% on MED-NODE datasets.

Journal ArticleDOI
TL;DR: This study presents a detailed survey of fundus IQA research, covering its significance, present status, limitations, and future scope, and analyses the methodologies used.
Abstract: Various ocular diseases, such as cataract, diabetic retinopathy, and glaucoma, have affected a large proportion of the population worldwide. In ophthalmology, fundus photography is used for the diagnosis of such retinal disorders. Nowadays, the set-up of fundus image acquisition has changed from fixed positions to portable devices, making acquisition more vulnerable to distortions. However, a trustworthy diagnosis relies solely upon the quality of the fundus image. In recent years, fundus image quality assessment (IQA) has drawn much attention from researchers. This study presents a detailed survey of fundus IQA research. The survey covers a comprehensive discussion of the factors affecting fundus image quality and the real-time distortions. The fundus IQA algorithms are analysed on the basis of the methodologies used and divided into three classes, namely: (i) similarity-based, (ii) segmentation-based, and (iii) machine-learning-based. In addition, the limitations of the state of the art in this research field are presented along with possible solutions. The objective of this study is to provide detailed information about fundus IQA research, with its significance, present status, limitations, and future scope. To the best of the authors' knowledge, this is the first survey paper on fundus IQA research.

Journal ArticleDOI
TL;DR: This study presents a detailed survey of FER techniques, classifiers, and datasets used for analysing the efficacy of the recognition techniques, and discusses the challenges encountered by FER systems along with future directions.
Abstract: Over the past decades, facial expression recognition (FER) has become an interesting research area and has achieved substantial progress in computer vision. FER aims to detect human emotional states from facial biometric traits. Developing a machine-based human FER system is quite a challenging task. Various FER systems have been developed by analysing facial muscle motion and skin-deformation-based algorithms. Conventional FER systems use algorithms that work on constrained databases; in unconstrained environments, the efficacy of existing algorithms is limited by certain issues that arise during image acquisition. This study presents a detailed survey of FER techniques, classifiers, and datasets used for analysing the efficacy of the recognition techniques. Moreover, this survey will assist researchers in understanding the strategies and innovative methods that address these issues in real-time applications. Finally, the review presents the challenges encountered by FER systems along with future directions.

Journal ArticleDOI
TL;DR: A novel and more accurate method for automated glaucoma detection from digital fundus images using quasi-bivariate variational mode decomposition (QB-VMD) is presented, which may become a suitable method for ophthalmologists to examine eye disease more accurately using fundus images.
Abstract: Glaucoma is a critical and irreversible neurodegenerative eye disorder caused by damage to the optic nerve head due to increased intra-ocular pressure within the eye. Detecting glaucoma is a critical job for ophthalmologists. This study presents a novel and more accurate method for automated glaucoma detection from digital fundus images using quasi-bivariate variational mode decomposition (QB-VMD). In total, 505 fundus images were decomposed using the QB-VMD method, which gives band-limited sub-band images (SBIs) centred around particular frequencies. These SBIs are smooth and free from mode-mixing problems. Glaucoma detection accuracy depends on selecting the most useful features, as they capture the appropriate information. Seventy features are extracted from the QB-VMD SBIs. The extracted features are normalised and selected using the ReliefF method, then fed to singular value decomposition to reduce their dimensionality. Finally, the reduced features are classified using a least squares support vector machine classifier. The obtained glaucoma detection accuracies are 85.94 and 86.13% using three- and ten-fold cross-validation, respectively. The obtained results are better than those of existing methods. It may become a suitable method for ophthalmologists to examine eye disease more accurately using fundus images.

Journal ArticleDOI
TL;DR: Results indicated that no co-occurrence-matrix-based descriptor outperforms all the others on all data sets; nevertheless, some descriptors have shown remarkably higher performance than the others.
Abstract: The grey-level co-occurrence matrix (GLCM) is a widely used texture feature descriptor extracted from grey-level images. A considerable amount of work in the literature has tried to combine colour with GLCM features. The aim of this study is to examine the effect of integrating colour with GLCM. To this end, 14 descriptors proposed in the literature based on different colour-combining approaches, four convolutional-neural-network-based descriptors, two aggregation-based descriptors, and five other handcrafted descriptors are included in this comparison. Five widely known data sets are used to experimentally evaluate each descriptor, for both image classification and image retrieval. Results indicated that no co-occurrence-matrix-based descriptor outperforms all the others on all data sets; nevertheless, some descriptors have shown remarkably higher performance than the others.
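The grey-scale GLCM that every colour extension in the comparison starts from can be sketched for one displacement. Quantisation levels and offsets vary between the surveyed papers; a horizontal offset of one pixel is assumed here:

```python
# Hedged sketch: a minimal grey-level co-occurrence matrix for one
# displacement (dx, dy). This version is directional (not symmetrised)
# and leaves the counts unnormalised; both choices vary in the literature.

def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of grey-level pairs at the given displacement."""
    h, w = len(image), len(image[0])
    matrix = [[0] * levels for _ in range(levels)]
    for i in range(h):
        for j in range(w):
            ni, nj = i + dy, j + dx       # neighbour at the chosen offset
            if 0 <= ni < h and 0 <= nj < w:
                matrix[image[i][j]][image[ni][nj]] += 1
    return matrix
```

Texture features such as contrast, energy, and homogeneity are then computed from this matrix after normalising the counts to probabilities; the colour variants in the study build analogous matrices within or across colour channels.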

Journal ArticleDOI
TL;DR: The authors propose a robust feature descriptor named regional adaptive affinitive patterns (RADAP) for facial expression recognition which computes positional adaptive thresholds in the local neighbourhood and encodes multi-distance magnitude features which are robust to intra-class variations and irregular illumination variation in an image.
Abstract: Automated facial expression recognition plays a significant role in the study of human behaviour analysis. In this study, the authors propose a robust feature descriptor named regional adaptive affinitive patterns (RADAP) for facial expression recognition. RADAP computes positional adaptive thresholds in the local neighbourhood and encodes multi-distance magnitude features that are robust to intra-class variations and irregular illumination variation in an image. Furthermore, the authors established cross-distance co-occurrence relations in RADAP by using logical operators, proposing XRADAP, ARADAP, and DRADAP based on xor, adder, and decoder operations, respectively. XRADAP strengthens the robustness of RADAP features to intra-class variations through pairwise co-occurrence. Similarly, ARADAP and DRADAP extract more stable and illumination-invariant features and capture the minute expression features that are usually missed by regular descriptors. The performance of the proposed methods is evaluated through experiments on nine benchmark datasets: Cohn-Kanade+ (CK+), Japanese female facial expression (JAFFE), Multimedia Understanding Group (MUG), MMI, OULU-CASIA, Indian spontaneous expression database, DISFA, AFEW, and a combined (CK+, JAFFE, MUG, MMI & GEMEP-FERA) database, in both person-dependent and person-independent setups. The experimental results demonstrate the effectiveness of the proposed method over state-of-the-art approaches.

Journal ArticleDOI
Yan Xing1, Jian Xu1, Jieqing Tan1, Daolun Li1, Wenshu Zha1 
TL;DR: This study utilises CNN with the multi-layer structure for the removal of salt and pepper noise, which contains padding, batch normalisation and rectified linear unit, and obtains competitive results.
Abstract: Image denoising is a common problem in image processing. Salt and pepper noise contaminates an image by randomly converting some pixel values to 255 or 0. Traditional image denoising algorithms are based on filter design or interpolation. To the authors' knowledge, no previous work has used a convolutional neural network (CNN) to directly remove salt and pepper noise. In this study, they utilise a multi-layer CNN, containing padding, batch normalisation, and rectified linear units, for the removal of salt and pepper noise. For training, the images are divided into three parts: a training set, a validation set, and a test set. Experimental results demonstrate that the architecture can effectively remove salt and pepper noise from various noisy images. In addition, the model removes high-density noise well owing to the extensive local receptive fields of deep neural networks. Finally, extensive experimental results show that the denoiser is effective for images with a large number of interference pixels that may cause misjudgement. In summary, the authors generalise the application of CNNs to salt and pepper noise removal and obtain competitive results.
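The traditional filter-design baseline the abstract contrasts the CNN against can be sketched as a median filter, which suppresses salt (255) and pepper (0) impulses by replacing each pixel with its neighbourhood median. The 3x3 window is an illustrative choice:

```python
# Hedged sketch: a 3x3 median filter, the classic non-learning baseline
# for salt-and-pepper noise removal (not the paper's CNN).
import statistics

def median_filter(image):
    """Apply a 3x3 median filter to a 2D list of pixel values; border
    pixels use only the neighbours available inside the image."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [image[y][x]
                      for y in range(max(0, i - 1), min(h, i + 2))
                      for x in range(max(0, j - 1), min(w, j + 2))]
            out[i][j] = statistics.median(window)
    return out
```

A single impulse surrounded by clean pixels is removed outright; at high noise densities the median itself is often an impulse, which is where the learned CNN denoiser is claimed to do better.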

Journal ArticleDOI
TL;DR: This work addresses the problem of searching for and retrieving similar textual images based on detected text, opening new directions for textual image retrieval, and shows that detected text is efficient and valuable for image retrieval, specifically for textual images.
Abstract: This work addresses the problem of searching for and retrieving similar textual images based on detected text, opening new directions for textual image retrieval. For image retrieval, several methods have been proposed to extract visual features and social tags; however, extracting embedded and scene text within images and using that text as automatic keywords/tags is still a young research field for text-based and content-based image retrieval applications. Automatic text detection and retrieval is an emerging technology for robotics and artificial intelligence. In this study, the authors propose a novel approach to detect the text in an image and exploit it as keywords and tags for automatic text-based image retrieval. First, text regions are detected using the maximally stable extremal regions algorithm. Second, unwanted false-positive text regions are eliminated based on geometric properties and the stroke width transform. Third, the true text regions are passed to optical character recognition. Fourth, keywords are formed using a neural probabilistic language model. Finally, the textual images are indexed and retrieved based on the detected keywords. The experimental results on two benchmark datasets show that the detected text is efficient and valuable for image retrieval, specifically for textual images.

Journal ArticleDOI
TL;DR: A method based on the discrete orthonormal Stockwell transform and statistical features is proposed for discriminating between normal and diseased retinal images; comparison with existing algorithms shows that it detects DR with high accuracy.
Abstract: Microaneurysms (MAs) are the earliest pre-eminent indicators of diabetic retinopathy (DR) and are hard for ophthalmologists to distinguish on standard fundus images. This study proposes a method based on the discrete orthonormal Stockwell transform and statistical features for discriminating between normal and diseased retinal images. Features extracted by the two different approaches are consolidated, and a total of 24 features are fed to the classifier models. Training and testing of the proposed method were accomplished using 1140 retinal colour photographs. A comparative study using eight best-known classifiers is showcased for the detection of MAs, and the performance of the classifiers is evaluated on retinal images using a ten-fold cross-validation procedure. Simulation results demonstrate the efficiency and adequacy of the proposed method, which mainly characterises textural features. The proposed method is compared with existing algorithms, and the results show that the algorithm detects DR with high accuracy. With its high accuracy and positive prediction, the proposed system promises good results in the early diagnosis of DR.

Journal ArticleDOI
TL;DR: The experimental results indicate that the final enhanced image using the proposed method outperforms other methods, providing a more effective and accurate basis for medical workers to diagnose diseases.
Abstract: Medical image quality requirements have become increasingly stringent with recent developments in medical technology. To meet clinical diagnosis needs, an effective medical image enhancement method based on convolutional neural networks (CNNs) and frequency band broadening (FBB) is proposed. The curvelet transform is used to process the medical data by obtaining the curvelet coefficients in each scale and direction, and generalised cross-validation is applied to select the optimal threshold for denoising. Meanwhile, a cycle spinning scheme is used to remove visible ringing artefacts along the edges of medical images. Then, FBB and a new CNN model based on the retinex model are used to improve the resolution of the processed image. Finally, pixel-level fusion is performed between the two enhanced medical images from the CNN and FBB. In the authors' study, 50 groups of medical magnetic resonance imaging, X-ray, and computed tomography images in total were studied. The experimental results indicate that the final image enhanced with the proposed method outperforms those of other methods: the resolution and the edge details of the processed image are significantly enhanced, providing a more effective and accurate basis for medical workers to diagnose diseases.
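The final pixel-level fusion step can be sketched as follows. The abstract does not state the fusion rule, so a simple weighted average (with an assumed weight `alpha`) stands in for whatever rule the paper actually uses:

```python
# Hedged sketch: pixel-level fusion of the two enhanced images (from the
# CNN branch and the FBB branch) by weighted averaging. The real fusion
# rule is unspecified in the abstract; alpha=0.5 is an assumption.

def fuse(img_cnn, img_fbb, alpha=0.5):
    """Weighted pixel-wise fusion of two equally sized 2D images."""
    return [[alpha * a + (1 - alpha) * b
             for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_cnn, img_fbb)]
```

In practice, fusion weights are often chosen per pixel or per region (e.g. by local activity measures) rather than globally; the global `alpha` here is only the simplest instance of pixel-level fusion.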

Journal ArticleDOI
TL;DR: An end-to-end deep denoising model is designed to remove the noise of SAR images; with the help of abundant simulated SAR images, the model is trained effectively to estimate the noise component.
Abstract: The intrinsic noise of synthetic aperture radar (SAR) images has a significant influence on image-processing performance, especially in change detection (CD). Image denoising is an important branch of image restoration that aims at enhancing image quality. Since the detection accuracy of CD depends greatly on the quality of the difference image (DI), image denoising can be regarded as a vital step in SAR CD; however, few studies have focused on this problem. In this study, an end-to-end deep denoising model is first designed to remove the noise of SAR images. With the help of abundant simulated SAR images, the deep denoising model is trained effectively to estimate the noise component. A clean image can then be obtained by removing this noise component from the original SAR image. After denoising, the new image pair generates a clean DI. Finally, the DI is classified into changed and unchanged areas by a three-layer convolutional neural network (CNN). Three real SAR image pairs demonstrate the effectiveness of the proposed method.
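The abstract does not name the DI operator; for SAR image pairs the log-ratio is a common choice because it turns multiplicative speckle into an additive term. A minimal sketch under that assumption:

```python
import math

def log_ratio_di(img1, img2, eps=1e-6):
    # Log-ratio difference image, a common operator for SAR change
    # detection (an assumption here; the paper does not name its DI).
    # eps guards against zero-valued pixels.
    return [[abs(math.log((p2 + eps) / (p1 + eps)))
             for p1, p2 in zip(r1, r2)]
            for r1, r2 in zip(img1, img2)]
```

The DI is then thresholded or, as in this paper, fed to a classifier that labels each pixel as changed or unchanged.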

Journal ArticleDOI
TL;DR: The DDLA architecture with the LR classifier achieves higher accuracy than the other approaches in comparable computation time.
Abstract: Plant species recognition is performed using a dual deep learning architecture (DDLA) approach. DDLA consists of the MobileNet and DenseNet-121 architectures. The feature vectors obtained from the individual architectures are concatenated to form a final feature vector. The extracted features are then classified using machine learning (ML) classifiers such as linear discriminant analysis, multinomial logistic regression (LR), Naive Bayes, classification and regression tree, k-nearest neighbour, random forest, bagging, and multi-layer perceptron classifiers. The datasets considered in the study are standard datasets (Flavia, Folio, and Swedish Leaf) and a custom-collected dataset (Leaf-12). The MobileNet and DenseNet-121 architectures are also used as stand-alone feature extractors and classifiers. The DDLA architecture with the LR classifier produced the highest accuracies of 98.71, 96.38, 99.41, and 99.39% for the Flavia, Folio, Swedish Leaf, and Leaf-12 datasets, respectively. The observed accuracy for DDLA + LR is higher than that of the other approaches (DDLA + ML classifiers, MobileNet + ML classifiers, DenseNet-121 + ML classifiers, MobileNet + fully connected layer (FCL), DenseNet-121 + FCL), in comparable computation time.
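The DDLA fusion step, concatenating the two backbones' feature vectors before an ML classifier, can be sketched as below. The logistic head is a hypothetical binary version for illustration only; the paper uses multinomial LR over features produced by MobileNet and DenseNet-121:

```python
import math

def concat_features(f_a, f_b):
    # DDLA fusion: concatenate the feature vectors from the two backbones
    return list(f_a) + list(f_b)

def logistic_score(features, weights, bias):
    # Hypothetical binary logistic-regression head for illustration;
    # the paper's classifier is multinomial LR over many classes.
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))
```

The design point is that the classifier sees one joint vector, so complementary features learned by the two networks are weighted together rather than voted on separately.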

Journal ArticleDOI
TL;DR: A new approach to vehicle license plate location based on a new model, YOLO-L, and plate pre-identification; the model improves on YOLOv2 in two aspects, including k-means++ clustering of plate candidate boxes, to precisely locate the license plate area.
Abstract: Currently, conventional license plate location methods fail to detect the license plate under complex road environments such as severe weather conditions and viewpoint changes. Besides, it is difficult for license plate location methods based on machine learning to precisely locate the license plate area. Moreover, license plate location methods may incorrectly detect similar objects, such as billboards and road signs, as license plates. To alleviate these problems, this article proposes a new approach to vehicle license plate location based on a new model, YOLO-L, and plate pre-identification. The new model improves on YOLOv2 in two aspects to precisely locate the license plate area. First, it uses the k-means++ clustering algorithm to select the best number and sizes of plate candidate boxes. Second, it modifies the structure and depth of the YOLOv2 model. The plate pre-identification algorithm can effectively distinguish license plates from similar objects. The experimental results show that the authors' proposed method not only achieves a precision of 98.86% and a recall of 98.86%, outperforming existing methods, but also runs efficiently in real time.
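Selecting candidate-box sizes by k-means++ clustering, as in the first improvement, is commonly done with 1 − IoU (of concentric boxes) as the distance so that large and small boxes are compared fairly. A minimal sketch under that assumption, taking width/height pairs as input (the paper's exact settings are not given):

```python
import random

def iou_wh(b1, b2):
    # IoU of two boxes assumed concentric, so only width/height matter
    inter = min(b1[0], b2[0]) * min(b1[1], b2[1])
    union = b1[0] * b1[1] + b2[0] * b2[1] - inter
    return inter / union

def kmeanspp_anchors(boxes, k, iters=50, seed=0):
    rng = random.Random(seed)
    # k-means++ seeding: later centres chosen proportionally to
    # the squared distance from the nearest existing centre
    centres = [rng.choice(boxes)]
    while len(centres) < k:
        d2 = [min((1 - iou_wh(b, c)) ** 2 for c in centres) for b in boxes]
        r, acc = rng.uniform(0, sum(d2)), 0.0
        for b, w in zip(boxes, d2):
            acc += w
            if acc >= r:
                centres.append(b)
                break
    # Lloyd iterations with 1 - IoU as the distance measure
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            j = min(range(k), key=lambda i: 1 - iou_wh(b, centres[i]))
            clusters[j].append(b)
        centres = [
            (sum(b[0] for b in cl) / len(cl), sum(b[1] for b in cl) / len(cl))
            if cl else centres[i]
            for i, cl in enumerate(clusters)
        ]
    return centres
```

Running this for several values of k and plotting the mean IoU against k is the usual way to pick "the best number" of candidate boxes.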

Journal ArticleDOI
TL;DR: A CAD system for automatic ulcer detection in WCE images is proposed, based on a hidden Markov model that uses the classification scores of conventional methods as observations, together with a new feature extraction approach for recognising the segmented regions.
Abstract: Wireless capsule endoscopy (WCE) has revolutionised the diagnosis and treatment of the gastrointestinal tract, especially the small intestine, which is unreachable by traditional endoscopies. The drawback of WCE is that it produces a large number of images to be inspected by clinicians. Hence, a computer-aided diagnosis (CAD) system has great potential to help reduce the diagnosis time and improve detection accuracy. To address this problem, the authors propose a CAD system for automatic detection of ulcers in WCE images. First, they enhance the input images so they can be better exploited in the main steps of the proposed method. Afterwards, segmentation based on texture and colour saliency maps is applied to the WCE images in order to highlight ulcerous regions. Then, inspired by existing feature extraction approaches, a new one is proposed for the recognition of the segmented regions. Finally, a new recognition scheme is proposed based on a hidden Markov model that uses the classification scores of the conventional methods (support vector machine, multilayer perceptron, and random forest) as observations. Experimental results on two different datasets show that the proposed method gives promising results.

Journal ArticleDOI
TL;DR: This study proposes a novel network pipeline called convolutional neural network in network (which is deeper than existing approaches) that jointly utilises spatial and spectral information and produces high-level features from the original HSI.
Abstract: Classification is a principal technique in hyperspectral images (HSIs), where a label is assigned to each pixel based on its characteristics. However, due to the lack of labelled training instances in HSIs and their ultra-high dimensionality, deep learning approaches need special consideration for HSI classification. As one of the first works of its kind in HSI classification, this study proposes a novel network pipeline called convolutional neural network in network (which is deeper than existing approaches) that jointly utilises spatial and spectral information and produces high-level features from the original HSI. The initial component of the proposed pipeline exploits the spatial-spectral relationships of each individual pixel vector; the extracted features are then combined to form a joint spatial-spectral feature map. Finally, a recurrent neural network is trained on the extracted features, which contain rich spectral and spatial properties of the HSI, to predict the corresponding label of each vector. The model has been tested on two large-scale hyperspectral datasets in terms of classification accuracy, training error, and computational time.

Journal ArticleDOI
TL;DR: This study introduces a novel approach to detect face spoofing by extracting local binary pattern (LBP) and simplified Weber local descriptor (SWLD) features encoded by convolutional neural network (CNN) models; the WLD and LBP features are combined to preserve both the local intensity information and the orientations of the edges.
Abstract: Automatically recognising people by their biometric characteristics is a well-established research area. Biometric systems are vulnerable to many different types of presentation attacks made by persons showing a photo, video, or mask to spoof a real identity. This study introduces a novel approach to detect face spoofing by extracting local features, the local binary pattern (LBP) and the simplified Weber local descriptor (SWLD), encoded by convolutional neural network (CNN) models. The WLD and LBP features are combined to preserve both the local intensity information and the orientations of the edges, as the two components are complementary to each other. Specifically, differential excitation preserves the local intensity information but omits the orientations of edges; on the contrary, LBP describes the orientations of the edges but ignores the intensity information. The proposed approach presents a very low degree of complexity, which makes it suitable for real-time applications. Finally, a non-linear support vector machine (SVM) classifier with a kernel function is used to determine whether the input image corresponds to a live face or not. The authors' experimental analysis on two publicly available databases, REPLAY-ATTACK and CASIA Face Anti-Spoofing, showed that their approach performs better than state-of-the-art techniques following the provided evaluation protocols of each database.
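The complementarity of the two descriptors can be illustrated on a single 3×3 neighbourhood: LBP thresholds the eight neighbours against the centre pixel to encode edge orientation, while WLD's differential excitation is the arctangent of the summed neighbour-centre differences over the centre intensity. A minimal sketch (the basic operators only, not the SWLD variant or the CNN encoding used by the authors):

```python
import math

def lbp_code(patch):
    # LBP: threshold the 8 neighbours against the centre,
    # reading them clockwise from the top-left corner
    c = patch[1][1]
    nbrs = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
            patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    return sum(1 << i for i, p in enumerate(nbrs) if p >= c)

def differential_excitation(patch):
    # WLD differential excitation: arctan of the summed
    # neighbour-centre differences over the centre intensity
    # (centre assumed non-zero in this sketch)
    c = patch[1][1]
    diff = sum(patch[r][k] for r in range(3) for k in range(3)) - 9 * c
    return math.atan(diff / c)
```

Note that a uniform patch yields a zero differential excitation regardless of its brightness, while its LBP code is the same as any other uniform patch's: each descriptor discards exactly the information the other keeps.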

Journal ArticleDOI
TL;DR: A novel sub-image approach is proposed for extremely fast and highly accurate detection of duplicated forged objects in colour images; it exhibits high robustness against different attacks such as additive white Gaussian noise, JPEG compression, scaling, and rotation.
Abstract: Most of the existing copy-move forgery detection (CMFD) methods utilise a time-consuming overlapped block-based approach. Here, a novel sub-image approach is proposed for extremely fast and highly accurate detection of duplicated forged objects in colour images. The proposed approach consists of a few steps. The input colour images are converted into the hue-saturation-value (HSV) colour model. Then, the edges of all objects in the forged image are detected using the Sobel operator. A morphological opening operator and a median filter are used to remove unnecessary small objects. The boundaries of the duplicated objects are accurately detected, and a bounding rectangle is drawn around each detected object to form a sub-image. The features of this sub-image are extracted using the quaternion polar complex exponential transform moments (QPCETMs) and their invariants to rotation, scaling, and translation. Finally, the duplicated regions are matched by calculating the Euclidean distances and the correlation between the feature vectors. Experiments are performed using different types of duplicated regions. The results of the proposed method are more accurate than those of existing methods. The proposed method also exhibits high robustness against different attacks such as additive white Gaussian noise, JPEG compression, scaling, and rotation.
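The Sobel edge detection step of the pipeline amounts to convolving the image with horizontal and vertical gradient kernels and taking the magnitude of the response. A minimal single-channel sketch (the authors apply this after HSV conversion; here a plain greyscale grid stands in):

```python
def sobel_magnitude(img):
    # Gradient magnitude from the horizontal and vertical Sobel kernels;
    # the one-pixel border is left at zero for simplicity.
    gx_k = [(-1, 0, 1), (-2, 0, 2), (-1, 0, 1)]
    gy_k = [(-1, -2, -1), (0, 0, 0), (1, 2, 1)]
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(gx_k[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(gy_k[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out
```

Thresholding this magnitude map gives the object edges that the subsequent morphological opening and median filtering clean up before the bounding rectangles are drawn.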

Journal ArticleDOI
TL;DR: This study reviews different algorithms and methods, developed over the past two decades, to give a clearer picture of the techniques used in the image restoration process, specifically for underwater images.
Abstract: Underwater images are susceptible to various distortions compared with images taken on land, owing to the nature of the water environment. These images often suffer from diffraction, polarisation, absorption, scattering, colour loss, and attenuation of light. Each part of the ocean has its own sources of distortion, including flicker caused by direct sunlight, marine snow, the fluorescence of biological objects, the presence of macroscopic organisms, loss of stability in divers, loss of light, artificial lighting, and floating dust particles in the water. Numerous techniques and algorithms may be used to restore these underwater images. This study reviews different algorithms and methods, developed over the past two decades, to give a clearer picture of the techniques used in the image restoration process, specifically for underwater images.