
Showing papers in "Signal Processing-image Communication in 2021"


Journal ArticleDOI
TL;DR: This method is the first to use the HSV color space for deep-learning-based underwater image enhancement, efficiently and effectively integrating the RGB and HSV color spaces in a single CNN.
Abstract: Underwater image enhancement has attracted much attention due to the rise of marine resource development in recent years. Benefiting from the powerful representation capabilities of Convolutional Neural Networks (CNNs), multiple underwater image enhancement algorithms based on CNNs have been proposed in the past few years. However, almost all of these algorithms operate in the RGB color space, which is insensitive to image properties such as luminance and saturation. To address this problem, we propose the Underwater Image Enhancement Convolutional Neural Network using two Color Spaces (UIEC^2-Net), which efficiently and effectively integrates both the RGB and HSV color spaces in a single CNN. To the best of our knowledge, this method is the first to use the HSV color space for deep-learning-based underwater image enhancement. UIEC^2-Net is an end-to-end trainable network consisting of three blocks: an RGB pixel-level block that implements fundamental operations such as denoising and color-cast removal, an HSV global-adjust block that globally adjusts underwater image luminance, color, and saturation by adopting a novel neural curve layer, and an attention map block that combines the advantages of the RGB and HSV block outputs by assigning a weight to each pixel. Experimental results on synthetic and real-world underwater images show that the proposed method performs well in both subjective comparisons and objective metrics. The code is available at https://github.com/BIGWangYuDong/UWEnhancement .
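As a rough sketch of the attention map block described above (not the authors' trained network), the per-pixel fusion of an RGB-branch output and an HSV-branch output can be written in a few lines of NumPy; the shapes and the convention that an attention value of 1 fully trusts the RGB branch are assumptions for illustration:

```python
import numpy as np

def attention_fuse(rgb_out, hsv_out, attention):
    """Blend two enhancement results with a per-pixel attention map.

    rgb_out, hsv_out: H x W x 3 float arrays in [0, 1]
    attention:        H x W float array in [0, 1]; 1 favours rgb_out
    """
    w = attention[..., None]            # broadcast over colour channels
    return w * rgb_out + (1.0 - w) * hsv_out

# toy 2x2 example: the left column fully trusts the RGB branch
rgb = np.full((2, 2, 3), 0.8)
hsv = np.full((2, 2, 3), 0.2)
att = np.array([[1.0, 0.0],
                [1.0, 0.0]])
fused = attention_fuse(rgb, hsv, att)
```

In the paper the attention map itself is predicted by a learned block; here it is fixed by hand purely to show the weighting arithmetic.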

78 citations


Journal ArticleDOI
TL;DR: Simulation experiments and performance analysis show that the algorithm based on a four-wing hyperchaotic system combined with compressed sensing and DNA coding has good performance and security.
Abstract: An image encryption scheme based on a four-wing hyperchaotic system combined with compressed sensing and DNA coding is proposed. The scheme uses compressed sensing (CS) to reduce the image by a certain scale during encryption. The measurement matrix is constructed by combining the Kronecker product (KP) and a chaotic system: KP is used to extend a low-dimensional seed matrix to the high-dimensional measurement matrix, and the seed matrix is generated by the four-wing chaotic system. At the same time, the chaotic sequence generated by the chaotic system dynamically controls the DNA coding and then performs the XOR operation. Simulation experiments and performance analysis show that the algorithm has good performance and security.
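The Kronecker-product construction and the XOR diffusion step can be sketched as follows; the seed values and key stream below are hand-picked stand-ins for the chaos-generated sequences, not the paper's parameters:

```python
import numpy as np

# A small seed matrix is expanded into a larger measurement matrix with
# a Kronecker product; in the paper the seed comes from a chaotic system.
seed = np.array([[1.0, 0.5],
                 [0.5, 1.0]])          # stand-in for a chaos-generated seed
expander = np.eye(4)                   # stand-in second factor
phi = np.kron(expander, seed)          # 8 x 8 measurement matrix

# XOR diffusion with a (stand-in) chaotic key stream is self-inverse.
plain = np.array([12, 34, 56, 78], dtype=np.uint8)
key   = np.array([90, 21, 43, 65], dtype=np.uint8)
cipher = plain ^ key                   # encrypt
recovered = cipher ^ key               # decrypt: XOR is its own inverse
```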

74 citations


Journal ArticleDOI
TL;DR: Comparisons with state-of-the-art methods show that the proposed method outputs high-quality underwater images under both qualitative and quantitative evaluation.
Abstract: Underwater images often suffer from color cast and low visibility because light is scattered and absorbed as it travels through water. In this paper, we propose a novel method of color correction and bi-interval contrast enhancement to improve the quality of underwater images. First, a simple and effective color correction method based on a sub-interval linear transformation is employed to address color distortion. Then, a Gaussian low-pass filter is applied to the L channel to decompose it into low- and high-frequency components. Finally, the low- and high-frequency components are enhanced by a bi-interval histogram based on an optimal equalization threshold strategy and an S-shaped function to enhance image contrast and highlight image details. Inspired by multi-scale fusion, we employ a simple linear fusion to integrate the enhanced high- and low-frequency components. Comparisons with state-of-the-art methods show that the proposed method outputs high-quality underwater images under both qualitative and quantitative evaluation.
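A minimal stand-in for the sub-interval linear colour correction is a per-channel percentile stretch; the percentile choices here are assumptions for illustration, not the paper's transformation:

```python
import numpy as np

def channel_stretch(channel, low_pct=1, high_pct=99):
    """Linear stretch of one colour channel onto [0, 255].

    Values between the low/high percentiles are mapped linearly onto the
    full range, which spreads out a channel squeezed by a colour cast.
    """
    lo, hi = np.percentile(channel, [low_pct, high_pct])
    out = (channel.astype(np.float64) - lo) / (hi - lo) * 255.0
    return np.clip(out, 0, 255).astype(np.uint8)

# a channel squeezed into [100, 150] spreads back over [0, 255]
ch = np.linspace(100, 150, 256).reshape(16, 16)
corrected = channel_stretch(ch)
```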

73 citations


Journal ArticleDOI
TL;DR: This survey reviews existing relatively mature and representative underwater image processing models, classified into seven categories: enhancement, fog removal, noise reduction, segmentation, salient object detection, color constancy, and restoration.
Abstract: With increasing attention being drawn to underwater observation and the utilization of marine resources in recent years, underwater image processing and analysis have become an active research hotspot. Unlike general imaging, the marine environment usually presents complicated conditions such as underwater turbulence and diffusion, severe absorption and scattering by the water body, various noises, low contrast, non-uniform illumination, monotonous color, and complex underwater backgrounds. In response to these typical challenges, a large body of work on underwater image processing has appeared in recent years. This survey reviews existing relatively mature and representative underwater image processing models, which are classified into seven categories: enhancement, fog removal, noise reduction, segmentation, salient object detection, color constancy, and restoration. We then objectively evaluate the current situation and future development trends of underwater image processing, and provide some insights into prospective research directions to promote the development of underwater vision and beyond.

66 citations


Journal ArticleDOI
TL;DR: A more comprehensive color measurement in the spatial and frequency domains is designed by combining colorfulness, contrast, and sharpness cues, inspired by the different sensitivity of humans to high-frequency and low-frequency information.
Abstract: Owing to the complexity of the underwater environment and the limitations of imaging devices, the quality of underwater images varies widely, which may affect practical applications in modern military, scientific research, and other fields. Thus, a quality assessment that can distinguish underwater images of different quality has an important guiding role for subsequent tasks. In this paper, considering the underwater image degradation effect and the human visual perception scheme, an effective reference-free underwater image quality assessment metric is designed by combining colorfulness, contrast, and sharpness cues. Specifically, inspired by the different sensitivity of humans to high-frequency and low-frequency information, we design a more comprehensive color measurement in the spatial and frequency domains. In addition, for the low contrast caused by backward scattering, we propose a dark-channel-prior weighted contrast measure to enhance the discrimination ability of the original contrast measurement. The sharpness measurement is used to evaluate the blur caused by the forward scattering of the underwater image. Finally, these three measurements are combined by weighted summation, where the weighting coefficients are obtained by multiple linear regression. Moreover, we collect a large dataset for underwater image quality assessment for testing and evaluating different methods. Experiments on this dataset demonstrate superior performance both qualitatively and quantitatively.
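The weighted-summation step can be illustrated with simple spatial-domain proxies for the three cues (the Hasler–Süsstrunk colourfulness statistic, global contrast as luminance standard deviation, and mean gradient magnitude as sharpness); the weights below are placeholders, whereas the paper fits them by multiple linear regression:

```python
import numpy as np

def colourfulness(img):
    """Hasler-Susstrunk colourfulness on an H x W x 3 float image."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    rg, yb = r - g, 0.5 * (r + g) - b
    return np.sqrt(rg.std() ** 2 + yb.std() ** 2) + \
           0.3 * np.sqrt(rg.mean() ** 2 + yb.mean() ** 2)

def quality_score(img, weights=(0.4, 0.4, 0.2)):
    """Weighted sum of colourfulness, contrast, and sharpness cues."""
    gray = img.mean(axis=-1)
    contrast = gray.std()                   # global contrast proxy
    gy, gx = np.gradient(gray)
    sharpness = np.mean(np.hypot(gx, gy))   # mean gradient magnitude
    w1, w2, w3 = weights
    return w1 * colourfulness(img) + w2 * contrast + w3 * sharpness

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))
flat = np.full((32, 32, 3), 0.5)   # zero contrast, zero colourfulness
```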

37 citations


Journal ArticleDOI
TL;DR: This paper utilizes an effective public underwater benchmark dataset covering diverse underwater degradation scenes to enlarge the test scale, and proposes a fusion adversarial network for enhancing real underwater images.
Abstract: Underwater image enhancement algorithms have attracted much attention in underwater vision tasks. However, these algorithms are mainly evaluated on different datasets and metrics. In this paper, we utilize an effective public underwater benchmark dataset covering diverse underwater degradation scenes to enlarge the test scale, and propose a fusion adversarial network for enhancing real underwater images. The multiple inputs and the well-designed multi-term adversarial loss not only introduce multiple input image features but also balance the impact of the multi-term loss functions. Tested on the benchmark dataset, the proposed network achieves performance better than or comparable to the other state-of-the-art methods in terms of qualitative and quantitative evaluations. Moreover, an ablation study experimentally validates the contributions of each component and the hyper-parameter settings of the loss functions.

29 citations


Journal ArticleDOI
TL;DR: Evaluating the effect of a Fully Convolutional Neural Network (FCNN), used as a denoising attack, on watermarked images shows that this type of denoising attack preserves image quality while breaking the robustness of all evaluated watermarking schemes.
Abstract: Digital image watermarking has proved its suitability for copyright protection and copy control of digital images. In past years, various watermarking schemes were proposed to enhance the fidelity and robustness of watermarked images against different types of attacks such as additive noise, filtering, and geometric attacks. It is highly important to guarantee a sufficient level of robustness against such attacks. Recently, deep learning and neural networks have achieved noticeable development and improvement, especially in image processing, segmentation, and classification. Therefore, in this paper, we study the effect of a Fully Convolutional Neural Network denoising attack (FCNNDA) on watermarked images. This deep architecture improves the training process and denoising performance: the encoder–decoder removes the noise while preserving the detailed structure of the image. FCNNDA outperforms the other types of attacks because it destroys the watermarks while preserving good quality in the attacked images. Spread Transform Dither Modulation (STDM) and Spread Spectrum (SS) are used as watermarking schemes to embed the watermarks in the images under several scenarios. The evaluation shows that this type of denoising attack preserves image quality while breaking the robustness of all evaluated watermarking schemes, and it can be considered a deleterious attack.
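STDM belongs to the quantisation-index-modulation (QIM) family; a plain (non-spread) QIM sketch shows the lattice idea that such a denoising attack disrupts. The step size delta is an arbitrary choice here:

```python
import numpy as np

def qim_embed(values, bits, delta=8.0):
    """Quantisation Index Modulation, a simplified cousin of STDM.

    Each coefficient is snapped to a lattice selected by its bit:
    multiples of delta for 0, multiples offset by delta/2 for 1.
    """
    shift = np.where(bits == 1, delta / 2.0, 0.0)
    return np.round((values - shift) / delta) * delta + shift

def qim_extract(values, delta=8.0):
    """Recover bits by checking which lattice each value is nearest to."""
    d0 = np.abs(values - np.round(values / delta) * delta)
    d1 = np.abs(values - (np.round((values - delta / 2) / delta) * delta
                          + delta / 2))
    return (d1 < d0).astype(int)

coeffs = np.array([13.2, 40.7, 25.1, 9.9])
bits = np.array([1, 0, 1, 0])
marked = qim_embed(coeffs, bits)
```

A denoiser that pulls coefficients off their lattice points is exactly what flips the extracted bits, which is why the FCNN attack is so damaging.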

28 citations


Journal ArticleDOI
TL;DR: A new video compression framework (ViSTRA2) is presented which exploits adaptation of spatial resolution and effective bit depth, down-sampling these parameters at the encoder based on perceptual criteria and up-sampling at the decoder using a deep convolutional neural network.
Abstract: We present a new video compression framework (ViSTRA2) which exploits adaptation of spatial resolution and effective bit depth, down-sampling these parameters at the encoder based on perceptual criteria and up-sampling at the decoder using a deep convolutional neural network. ViSTRA2 has been integrated with the reference software of both HEVC (HM 16.20) and VVC (VTM 4.0.1), and evaluated under the Joint Video Exploration Team Common Test Conditions using the Random Access configuration. Our results show consistent and significant compression gains against HM and VTM based on Bjøntegaard Delta measurements, with average BD-rate savings of 12.6% (PSNR) and 19.5% (VMAF) over HM, and 5.5% (PSNR) and 8.6% (VMAF) over VTM.
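The Bjøntegaard delta metric used in this evaluation fits a cubic polynomial to each rate–distortion curve over log-rate and averages the gap between the fits; a minimal BD-PSNR sketch follows, with the rate points invented purely for illustration:

```python
import numpy as np

def bd_psnr(rates_a, psnr_a, rates_b, psnr_b):
    """Bjontegaard delta-PSNR: mean vertical gap between two fitted
    rate-distortion curves over their overlapping log-rate interval."""
    la, lb = np.log10(rates_a), np.log10(rates_b)
    pa = np.polyfit(la, psnr_a, 3)          # cubic fit, codec A
    pb = np.polyfit(lb, psnr_b, 3)          # cubic fit, codec B
    lo = max(la.min(), lb.min())
    hi = min(la.max(), lb.max())
    ia = np.polyval(np.polyint(pa), [lo, hi])
    ib = np.polyval(np.polyint(pb), [lo, hi])
    return ((ib[1] - ib[0]) - (ia[1] - ia[0])) / (hi - lo)

# codec B gains exactly 1 dB at every rate point, so BD-PSNR is 1 dB
rates = np.array([100.0, 200.0, 400.0, 800.0])
anchor = np.array([30.0, 33.0, 35.0, 36.5])
tested = anchor + 1.0
gain = bd_psnr(rates, anchor, rates, tested)
```

BD-rate (the percentage figures quoted above) is computed analogously but with the roles of rate and quality swapped.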

25 citations


Journal ArticleDOI
TL;DR: An approach based on the combined use of machine learning and eye tracking information is presented, showing that the considered features make it possible to distinguish children affected by autism spectrum disorder from typically developing ones.
Abstract: Autism Spectrum Disorder is a developmental disorder characterized by a deficit in social behaviour and specific interactions such as reduced eye contact and body gestures. Recent advances in software and hardware multimedia technologies provide the tools for early detection of this disorder. In this paper we present an approach based on the combined use of machine learning and eye tracking information. More specifically, features are extracted from image content and viewing behaviour, such as the presence of objects and fixations towards the centre of a scene. These features are used to train a machine-learning-based classifier. The obtained results show that the considered features make it possible to distinguish children affected by autism spectrum disorder from typically developing ones.

25 citations


Journal ArticleDOI
TL;DR: The experimental results show that the proposed blind color digital image watermarking method not only has better watermark invisibility and larger watermark capacity, but also has higher security and stronger robustness against geometric attacks.
Abstract: With the rapid development of Internet technology, the copyright protection of color images has become more and more important. To fulfill this purpose, this paper designs a blind color digital image watermarking method based on image correction and eigenvalue decomposition (EVD). Firstly, all the eigenvalues of each pixel block in the color host image are obtained by EVD. Then, the sum of the absolute values of the eigenvalues is quantized with variable quantization steps to embed the color watermark image, which is encrypted by an affine transform and encoded with a Hamming code. If the watermarked image is processed by a geometric attack, the attacked image can be corrected by using its geometric attributes. Finally, the inverse embedding process is performed to extract the color watermark. The advantages of the proposed method are as follows: (1) all Peak Signal-to-Noise Ratio (PSNR) values are greater than 42 dB; (2) the average Structural Similarity Index Metric (SSIM) values are greater than 0.97; (3) the maximum embedding capacity is 0.25 bpp; (4) the whole running time is less than 20 s; (5) the key space is larger than 2^450; (6) most Normalized Cross-correlation (NC) values are greater than 0.9. Compared with related methods, the experimental results show that the proposed method not only has better watermark invisibility and larger watermark capacity, but also higher security and stronger robustness against geometric attacks.
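The quantisation idea, snapping the sum of absolute eigenvalues of a block to a bit-dependent lattice, can be sketched as below. Embedding by uniformly rescaling the block and the step size of 24 are simplifications assumed here, not the paper's exact embedding rule:

```python
import numpy as np

def embed_bit(block, bit, step=24.0):
    """Embed one bit by quantising the block's absolute-eigenvalue sum.

    The sum is snapped to step*(k + 0.25) for bit 0 and step*(k + 0.75)
    for bit 1, by uniformly scaling the block (eigenvalues scale with it).
    """
    s = np.abs(np.linalg.eigvals(block.astype(np.float64))).sum()
    k = np.floor(s / step)
    target = step * (k + (0.75 if bit else 0.25))
    return block * (target / s)

def extract_bit(block, step=24.0):
    """Read the bit back from which half of the step interval the sum is in."""
    s = np.abs(np.linalg.eigvals(block.astype(np.float64))).sum()
    return int((s % step) > step / 2)

blk = np.array([[52.0, 55, 61, 66],
                [63, 59, 55, 90],
                [62, 59, 68, 113],
                [63, 58, 71, 122]])
marked0 = embed_bit(blk, 0)
marked1 = embed_bit(blk, 1)
```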

21 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a lossy hyperspectral image compression algorithm based on the concept of autoencoders, which uses a combination of the convolution layer and max-pooling layer to reduce the dimensions of the input image and generate a compressed image.
Abstract: The large size of hyperspectral imagery poses a significant obstacle to its use in real life due to the abundant information stored in it. The use of deep learning for such data processing is visible in recent applications. In this work, we propose a lossy hyperspectral image compression algorithm based on the concept of autoencoders. It uses a combination of convolution layers and max-pooling layers to reduce the dimensions of the input image and generate a compressed image. The original image is reconstructed, with some loss of information, using transpose convolution layers that reverse the procedure used by the encoder. The compressed image is entropy coded using an adaptive arithmetic coder for transmission or storage applications. The method provides an improvement of 28% in PSNR with a 21-times increase in the compression ratio. The effect of compression on classification has also been evaluated using a state-of-the-art classification algorithm; a negligible difference in classification accuracy was obtained, which proves the effectiveness of the proposed algorithm.

Journal ArticleDOI
TL;DR: Zhang et al. propose a deep neural network that performs multi-modal relation reasoning at multiple scales, successfully constructing a regional attention scheme that focuses on informative and question-related regions for better answering.
Abstract: The goal of Visual Question Answering (VQA) is to answer questions about images. For the same picture, there are often completely different types of questions. Therefore, the main difficulty of the VQA task lies in how to properly reason about relationships among multiple visual objects according to the type of input question. To address this difficulty, this paper proposes a deep neural network that performs multi-modal relation reasoning at multiple scales and successfully constructs a regional attention scheme to focus on informative and question-related regions for better answering. Specifically, we first design a regional attention scheme to select regions of interest based on an informativeness evaluation computed by a question-guided soft attention module. Afterwards, features computed by the regional attention scheme are fused in scaled combinations, thus generating more distinctive features with scalable information. Owing to the regional attention and multi-scale designs, the proposed method is capable of describing scaled relationships from multi-modal inputs to offer accurate question-guided answers. Experiments on the VQA v1 and VQA v2 datasets show that the proposed method outperforms most existing methods.

Journal ArticleDOI
TL;DR: Experimental results show the efficiency of the proposed system in detecting all inter-frame forgeries, even when the forged videos have undergone additional post-processing operations such as Gaussian noise, Gaussian blurring, brightness modification, and compression.
Abstract: Surveillance cameras are widely used to provide protection and security, and their videos are used as strong evidence in the courts. With the availability of video editing tools, it has become easy to distort this evidence. Sometimes, to hide the traces of forgery, post-processing operations are performed after editing. Hence, it has become urgent to scientifically validate the authenticity and integrity of surveillance videos. In this paper, we propose an inter-frame forgery (frame deletion, frame insertion, and frame duplication) detection system using a 2D convolutional neural network (2D-CNN) on spatiotemporal information, with fusion, for automatic deep feature extraction; a Gaussian RBF multi-class support vector machine (RBF-MSVM) is used for the classification process. Experimental results show the efficiency of the proposed system in detecting all inter-frame forgeries, even when the forged videos have undergone additional post-processing operations such as Gaussian noise, Gaussian blurring, brightness modification, and compression.

Journal ArticleDOI
TL;DR: This paper focuses on the main challenges, practical aspects, and mathematical core concepts of video stabilization techniques, and discusses some new research directions to overcome the limitations of existing methods.
Abstract: Video Stabilization (VS) has been an active area of research for the last two decades. Many approaches have been successfully proposed, and it is time to take a step back and offer a glimpse of, and a critical look at, this hot topic. Among the questions that should be answered: is VS a solved problem? Is there still room for further improvement, and at which level, and how? The main purpose of this contribution is to answer such questions and to provide a fairly unifying framework to allow a better understanding of the progress of this research subject, with appreciable industrial and academic benefits. In this paper we focus on the main challenges, practical aspects, and mathematical core concepts of video stabilization techniques. We also pay special attention to Video Stabilization Quality Assessment (VSQA) by introducing a new methodology inspired by research results on Image Quality Assessment (IQA) in its broad sense. Finally, we discuss some new research directions to overcome the limitations of existing methods.

Journal ArticleDOI
TL;DR: In this article, a multi-label semantics preserving based deep cross-modal hashing (MLSPH) method is proposed to improve the accuracy of cross-mode hashing retrieval by fully exploring the semantic relevance based on multiple labels of training data.
Abstract: Due to the storage and retrieval efficiency of hashing, as well as the highly discriminative features extracted by deep neural networks, deep cross-modal hashing retrieval has attracted increasing attention in recent years. However, most existing deep cross-modal hashing methods simply employ single labels to directly measure the semantic relevance across different modalities, neglecting the potential contributions of multiple category labels. To improve the accuracy of cross-modal hashing retrieval by fully exploring the semantic relevance based on the multiple labels of the training data, we propose a multi-label semantics preserving based deep cross-modal hashing (MLSPH) method. MLSPH first utilizes the multi-labels of instances to calculate the semantic similarity of the original data. Subsequently, a memory bank mechanism is introduced to preserve the multiple-label semantic similarity constraints and enforce the distinctiveness of the learned hash representations over the whole training batch. Extensive experiments on several benchmark datasets reveal that the proposed MLSPH surpasses prominent baselines and reaches state-of-the-art performance in the field of cross-modal hashing retrieval. Code is available at: https://github.com/SWU-CS-MediaLab/MLSPH .

Journal ArticleDOI
TL;DR: A new NR-IQA/BIQA model that operates on natural scene statistics in the contourlet domain is proposed; it shows high linearity against human subjective perception and outperforms state-of-the-art NR-IQA models.
Abstract: No-reference/blind image quality assessment (NR-IQA/BIQA) algorithms play an important role in image evaluation, as they can assess the quality of an image automatically, using only the distorted image whose quality is being assessed. Among the existing NR-IQA/BIQA methods, natural scene statistics (NSS) models, which can be expressed in different bandpass domains, show good consistency with human subjective judgments of quality. In this paper, we create new ‘quality-aware’ features based on the energy differences of the sub-band coefficients across scales via the contourlet transform, and propose a new NR-IQA/BIQA model that operates on natural scene statistics in the contourlet domain. Prior to applying the contourlet transform, we apply two preprocessing steps that help to create more information-dense, low-entropy representations: we transform the picture into the CIELAB color space and compute a gradient magnitude map. Then, a number of ‘quality-aware’ features are extracted in the contourlet transform domain: the energy of the sub-band coefficients within scales, the energy differences between scales, and measurements of the statistical relationships of pixels across scales. A detailed analysis is conducted to show how different distortions affect the statistical characteristics of these features, and the features are then fed to a support vector regression (SVR) model which learns to predict image quality. Experimental results show that the proposed method has high linearity against human subjective perception and outperforms state-of-the-art NR-IQA models.

Journal ArticleDOI
TL;DR: In this article, an improved Single Shot multi-box Detector based on feature fusion and dilated convolution (FD-SSD) is proposed to address the difficulty of detecting small objects.
Abstract: Objects that occupy a small portion of an image or frame contain fewer pixels and less information, which makes small object detection a challenging task in computer vision. In this paper, an improved Single Shot multi-box Detector based on feature fusion and dilated convolution (FD-SSD) is proposed to address the difficulty of detecting small objects. The proposed network uses VGG-16 as the backbone and mainly comprises a multi-layer feature fusion module and a multi-branch residual dilated convolution module. In the multi-layer feature fusion module, the last two feature maps are up-sampled and then concatenated at the channel level with the shallow feature map to enhance its semantic information. In the multi-branch residual dilated convolution module, three dilated convolutions with different dilation ratios, based on the residual network, are combined to obtain multi-scale context information without losing the original resolution of the feature map. In addition, deformable convolution is added to each detection layer to better adapt to the shapes of small objects. The proposed FD-SSD achieves 79.1% mAP on the PASCAL VOC2007 dataset and 29.7% mAP on the MS COCO dataset. Experimental results show that FD-SSD can effectively improve the utilization of multi-scale information for small objects, thus significantly improving small object detection.
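The benefit of the dilated branches is a larger receptive field at unchanged resolution; for stride-1 stacks the growth is easy to compute (a standard formula, not specific to this paper):

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of stacked stride-1 convolutions: each layer adds
    (kernel - 1) * dilation pixels on top of a single starting pixel."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf

# three 3x3 layers: plain vs dilation ratios 1, 2, 3 as in a multi-branch
# dilated module -- same parameter count, nearly double the context
plain   = receptive_field([3, 3, 3], [1, 1, 1])   # 7 pixels
dilated = receptive_field([3, 3, 3], [1, 2, 3])   # 13 pixels
```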

Journal ArticleDOI
TL;DR: Visual and quantitative experimental results show that the proposed PMG prior can achieve excellent performance and is superior to state-of-the-art methods in terms of computational efficiency and recovery quality in various specific scenarios such as natural, face, saturated, and text images.
Abstract: In this study, we propose a patch-wise maximum gradient (PMG) prior for effective blind image deblurring. Our work is motivated by the fact that the maximum gradient values of non-overlapping local patches are significantly diminished by blurring; we demonstrate this inherent property both theoretically and using real data. Based on this, we propose a blur kernel estimation model using an L0-regularized PMG prior and an L0-regularized gradient prior. Compared with previous image priors, our PMG prior exhibits a stronger ability to distinguish between clear and blurred images. It is also sparser, which significantly reduces the computational cost. To solve the proposed PMG and L0-regularized gradient terms, we design an efficient optimization algorithm by introducing a linear operator and improving the iteration strategy. Visual and quantitative experimental results show that our method achieves excellent performance and is superior to state-of-the-art methods in terms of computational efficiency and recovery quality in various specific scenarios such as natural, face, saturated, and text images.
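The core observation, that blurring shrinks the per-patch maximum gradient, can be reproduced numerically; the box blur and patch size below are arbitrary illustrative choices:

```python
import numpy as np

def patch_max_gradient(img, patch=8):
    """Maximum gradient magnitude inside each non-overlapping patch."""
    gy, gx = np.gradient(img.astype(np.float64))
    mag = np.hypot(gx, gy)
    h, w = mag.shape
    h, w = h - h % patch, w - w % patch
    blocks = mag[:h, :w].reshape(h // patch, patch, w // patch, patch)
    return blocks.max(axis=(1, 3))

def box_blur(img, k=5):
    """Separable moving-average blur with edge padding."""
    pad = k // 2
    padded = np.pad(img.astype(np.float64), pad, mode='edge')
    kern = np.ones(k) / k
    tmp = np.apply_along_axis(
        lambda r: np.convolve(r, kern, mode='valid'), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kern, mode='valid'), 0, tmp)

# a sharp step edge: blurring visibly shrinks the patch-wise max gradients
sharp = np.zeros((16, 16))
sharp[:, 8:] = 255.0
pmg_sharp = patch_max_gradient(sharp)
pmg_blur = patch_max_gradient(box_blur(sharp))
```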

Journal ArticleDOI
TL;DR: Zhang et al. propose a deep neural network (DNN) based method for dynamic multiple-histogram generation, which can establish histograms of different sizes for better redundancy exploitation.
Abstract: In previous multiple histograms modification (MHM) based reversible data hiding (RDH) methods, the prediction-error histograms are generated in a fixed manner, which may constrain performance owing to the lack of adaptivity. To compensate for this, we propose a deep neural network (DNN) based method for dynamic multiple-histogram generation. By learning prior knowledge, the DNN is able to establish histograms of different sizes for better redundancy exploitation. For each histogram, two optimal expansion bins are determined to minimize the distortion caused by the modification. Besides, a strategy consisting of the memo technique and an entropy measurement is applied to accelerate the parameter optimization. Experimental results show that the proposed method outperforms several state-of-the-art RDH methods.

Journal ArticleDOI
TL;DR: In this paper, the authors propose a hybrid image encryption and compression scheme that allows compression in the encryption domain; the encryption is based on chaos theory and is carried out in two steps, i.e., permutation and substitution.
Abstract: Compression and encryption are often performed together for image sharing and/or storage. The order in which the two operations are carried out affects the overall efficiency of digital image services. For example, encrypted data has little or no compressibility; on the other hand, it is challenging to ensure reasonable security without degrading compression performance. Therefore, incorporating one requirement into the other is an interesting approach. In this study, we propose a novel hybrid image encryption and compression scheme that allows compression in the encryption domain. The encryption is based on chaos theory and is carried out in two steps, i.e., permutation and substitution. Lossless compression is performed on the shuffled image, and the compressed bitstream is then grouped into 8-bit elements for the substitution stage. The lossless nature of the proposed method makes it suitable for medical image compression and encryption applications. The experimental results show that the proposed method achieves the necessary level of security and preserves the compression efficiency of a lossless algorithm. In addition, to improve the performance of the entropy encoder of the compression algorithm, we propose a data-to-symbol mapping method based on number theory that represents adjacent pixel values as a block. With this representation, the compression saving improves on average from 5.76% to 15.45% on the UCID dataset.
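The permutation stage of a chaos-based cipher is often derived by ranking a chaotic orbit; a logistic-map sketch follows, where the key values x0 and r are placeholders, not the paper's parameters:

```python
import numpy as np

def logistic_permutation(n, x0=0.3579, r=3.99):
    """Derive a pixel permutation from a logistic-map orbit.

    Iterates x <- r*x*(1-x) and ranks the orbit values; the resulting
    argsort order serves as a key-dependent shuffle.
    """
    x = x0
    orbit = np.empty(n)
    for i in range(n):
        x = r * x * (1.0 - x)
        orbit[i] = x
    return np.argsort(orbit)

pixels = np.arange(16, dtype=np.uint8)
perm = logistic_permutation(pixels.size)
shuffled = pixels[perm]               # permutation stage
restored = np.empty_like(pixels)
restored[perm] = shuffled             # inverse permutation for decryption
```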

Journal ArticleDOI
TL;DR: A novel variational convex optimization model for single SAR image SR reconstruction with speckle noise is proposed, one of the first works in this field, and the split Bregman algorithm is employed to solve it efficiently.
Abstract: Super-resolution (SR) is an attractive topic in image processing. In synthetic aperture radar (SAR) images, speckle noise, which is multiplicative, is a crucial problem. Therefore, numerous conventional SR methods that assume additive Gaussian noise cannot handle this image degradation model. The main contribution of this paper is a novel variational convex optimization model for single SAR image SR reconstruction with speckle noise, one of the first works in this field. The main idea is to employ a maximum a posteriori (MAP) estimator and propose an effective regularization based on a combination of sparse representation, total variation (TV), and a novel feature-space-based soft projection tool, in order to use the merits of each. To solve the proposed model, the split Bregman algorithm is employed efficiently. Experimental results on multiple synthetic and realistic SAR images show the effectiveness of the proposed method in terms of both fidelity and visual perception.

Journal ArticleDOI
TL;DR: In this paper, the authors propose two machine learning methods, a synthetic saccade approach and an image-based approach, to automatically classify ASD given children's eye gaze data collected from free-viewing tasks of natural images.
Abstract: As early intervention is highly effective for young children with autism spectrum disorder (ASD), it is imperative to make an accurate diagnosis as early as possible. ASD has often been associated with atypical visual attention, and eye gaze data can be collected at a very early age. An automatic screening tool based on eye gaze data that could identify ASD risk offers the opportunity for intervention before the full set of symptoms is present. In this paper, we propose two machine learning methods, a synthetic saccade approach and an image-based approach, to automatically classify ASD given children's eye gaze data collected from free-viewing tasks of natural images. The first approach uses a generative model of synthetic saccade patterns to represent the baseline scan-path of a typical non-ASD individual and combines it with the real scan-path, as well as other auxiliary data, as inputs to a deep learning classifier. The second approach adopts a more holistic image-based approach by feeding the input image and a sequence of fixation maps into a convolutional or recurrent neural network. Using a publicly accessible collection of children's gaze data, our experiments indicate that the ASD prediction accuracy reaches 67.23% on the validation dataset and 62.13% on the test dataset.

Journal ArticleDOI
TL;DR: In the proposed method, the hidden bits can still be extracted correctly under some typical content-preserving operations, such as JPEG/JPEG2000 compression and additive Gaussian noise.
Abstract: This paper proposes a robust and reversible watermarking scheme for encrypted images using the Paillier cryptosystem. In the proposed method, the original image is divided into a number of non-overlapping blocks of size 8 × 8, and the Paillier cryptosystem is applied to encrypt the pixels in each block. Firstly, a data hider can calculate the statistical values of encrypted blocks by employing the modular multiplicative inverse (MMI) method and looking up a mapping table. Then a watermark sequence can be embedded into the encrypted image by shifting the histogram of the statistical values. On the receiver side, the shifted histogram can be obtained from both the encrypted image and the decrypted image. Furthermore, the embedded watermark can be extracted from the shifted histogram. The encrypted original image can be restored by employing the inverse operations of histogram shifting. This is followed by a decryption operation to restore the original image. In the proposed method, the hidden bits can still be extracted correctly under some typical content-preserving operations, such as JPEG/JPEG2000 compression and additive Gaussian noise. Compared with previous reversible watermarking methods in the plaintext domain, the proposed method has satisfactory performance in image quality and robustness. Experimental results have shown the validity of the proposed method.
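The histogram-shifting step can be sketched in the plaintext domain (a simplified stand-in: the paper applies the analogous shift to statistical values of Paillier-encrypted blocks, and the pixel values below are illustrative):

```python
def hs_embed(pixels, bits):
    """Reversible histogram-shifting embedding (simplified plaintext sketch).

    peak: most frequent gray value; zero: first empty bin above it.
    Values strictly between them shift up by 1 to free the bin peak+1;
    each peak-valued pixel then carries one bit (stay = 0, +1 = 1)."""
    hist = [pixels.count(v) for v in range(256)]
    peak = hist.index(max(hist))
    zero = next(v for v in range(peak + 1, 256) if hist[v] == 0)
    it = iter(bits)
    marked = []
    for p in pixels:
        if peak < p < zero:
            marked.append(p + 1)            # shift to free the bin peak+1
        elif p == peak:
            marked.append(p + next(it, 0))  # embed one bit
        else:
            marked.append(p)
    return marked, peak, zero

def hs_extract(marked, peak, zero, n_bits):
    """Read the bits back and undo the shift to restore the original pixels."""
    bits, restored = [], []
    for p in marked:
        if p == peak:
            bits.append(0); restored.append(peak)
        elif p == peak + 1:
            bits.append(1); restored.append(peak)
        elif peak + 1 < p <= zero:
            restored.append(p - 1)          # undo the shift
        else:
            restored.append(p)
    return bits[:n_bits], restored

pixels = [2, 2, 2, 3, 5, 1, 2, 7]
marked, peak, zero = hs_embed(pixels, [1, 0, 1])
bits, restored = hs_extract(marked, peak, zero, 3)
```

Reversibility falls out of the construction: shifting is invertible and the peak/peak+1 bins are disjoint from the shifted range, so both the payload and the exact original pixels come back.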

Journal ArticleDOI
TL;DR: A new secret sharing scheme based on the Chinese remainder theorem for polynomial rings is used to design a RESIS scheme, which divides the secret image using polynomials over F2[x].
Abstract: In a secret image sharing (SIS) scheme, a secret image is divided into n shadow images. Then, any t or more shadow images can recover the secret image. Different from pure SIS, extended SIS (ESIS) further embeds shadow images into a cover image to generate stego images. In this way, ESIS can be more secure: shadow images are always noise-like images which might arouse the suspicion of attackers, while stego images are meaningful images. Furthermore, if both the secret image and the cover image can be recovered from enough stego images in an ESIS scheme, the scheme is called reversible ESIS (RESIS). This paper utilizes a new secret sharing scheme based on the Chinese remainder theorem for polynomial rings to design a RESIS scheme, which divides the secret image using polynomials over F2[x]. In addition, least significant bit (LSB) substitution is employed to hide shadow images and generate stego images. The proposed RESIS scheme guarantees totally lossless reconstruction of both the secret and cover images. Moreover, experimental results show that the shadow images have satisfactory quality that is not affected by different cover images.
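The final LSB-substitution step can be illustrated with a minimal sketch (the cover pixels and bit payload are made up, and the actual scheme embeds polynomial-ring shares rather than raw bits):

```python
def lsb_embed(cover, bits):
    """Hide one shadow bit per cover pixel in its least significant bit."""
    stego = list(cover)
    for i, b in enumerate(bits):
        stego[i] = (stego[i] & ~1) | b  # clear the LSB, then write the bit
    return stego

def lsb_extract(stego, n_bits):
    """Read the hidden bits back from the stego pixels."""
    return [p & 1 for p in stego[:n_bits]]

cover = [100, 101, 102, 103, 104]
bits = [1, 0, 1, 1]
stego = lsb_embed(cover, bits)
```

Because each pixel changes by at most 1, the stego image stays visually close to the cover, which is why LSB substitution pairs naturally with a reversible sharing layer on top.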

Journal ArticleDOI
TL;DR: A deep edge map guided depth SR method, which includes an edge prediction subnetwork and an SR subnetwork, takes advantage of the hierarchical representation of color and depth images to produce accurate edge maps, which promote the performance of the SR subnetwork.
Abstract: Accurate edge reconstruction is critical for depth map super resolution (SR). Therefore, many traditional SR methods utilize edge maps to guide depth SR. However, it is difficult to predict accurate edge maps from low resolution (LR) depth maps. In this paper, we propose a deep edge map guided depth SR method, which includes an edge prediction subnetwork and an SR subnetwork. The edge prediction subnetwork takes advantage of the hierarchical representation of color and depth images to produce accurate edge maps, which promote the performance of the SR subnetwork. The SR subnetwork is a disentangling cascaded network that progressively upsamples the SR result, where every level is made up of a weight sharing module and an adaptive module. The weight sharing module extracts the general features at different levels, while the adaptive module transfers the general features to specific features to adapt to different degraded inputs. Quantitative and qualitative evaluations on various datasets with different magnification factors demonstrate the effectiveness and promising performance of the proposed method. In addition, we construct a benchmark dataset captured by Kinect-v2 to facilitate research on real-world depth map SR.

Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors proposed a dynamic threshold color correction method based on image global information combined with the loss law of light propagation in water, which can restore the color information, remove blurring and boost detail for underwater images.
Abstract: Underwater image processing has played an important role in various fields such as submarine terrain scanning, submarine communication cable laying, underwater vehicles, and underwater search and rescue. However, there are many difficulties in the process of acquiring underwater images. Specifically, the water body selectively absorbs part of the light as it travels through the water, resulting in color degradation of underwater images. At the same time, due to floating substances in the water, the light is scattered to a certain degree, which causes serious problems such as blurred details and low contrast in underwater images. Therefore, using image processing technology to restore the real appearance of underwater images has high practical value. To solve the above problems, we combine a color correction method with a deblurring network to improve the quality of underwater images in this paper. Firstly, to address the insufficient number and diversity of underwater image samples, a network combining depth image reconstruction and underwater image generation is proposed to simulate underwater images based on a style transfer method. Secondly, for the problem of color distortion, we propose a dynamic threshold color correction method based on global image information combined with the loss law of light propagation in water. Finally, to remove the blurring caused by scattering and further improve overall image clarity, the color-corrected image is reconstructed by a multi-scale recursive convolutional neural network. Experimental results show that we can obtain images closer to the underwater style with shorter training time. Compared with several recent underwater image processing methods, the proposed method has obvious advantages in multiple underwater scenes. Simultaneously, it can restore color information, remove blurring, and boost detail in underwater images.
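The dynamic threshold correction itself is not specified in the abstract; as a hedged baseline, the classic gray-world correction it builds on can be sketched as follows (the patch values are illustrative):

```python
def gray_world(pixels):
    """Scale each RGB channel so its mean matches the global mean
    (the classic gray-world assumption: the scene averages to gray)."""
    n = len(pixels)
    means = [sum(px[c] for px in pixels) / n for c in range(3)]
    target = sum(means) / 3.0
    gains = [target / m if m > 0 else 1.0 for m in means]
    # Clip to the valid range after applying the per-channel gain.
    return [tuple(min(255.0, px[c] * gains[c]) for c in range(3))
            for px in pixels]

# A bluish underwater-looking patch: red attenuated, blue dominant.
patch = [(50, 80, 120), (60, 90, 130), (40, 70, 110)]
corrected = gray_world(patch)
```

The effect is to boost the absorbed red channel and suppress the dominant blue one; the paper's dynamic thresholding additionally adapts the correction to global image statistics and the attenuation law of light in water.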

Journal ArticleDOI
TL;DR: In this article, the authors employed genetic algorithms (GA) to discover an optimal solution in order to facilitate the computational process in producing better recognition results, which achieved an accuracy of 85.9% and F1-score of 83.7%.
Abstract: In recent years, numerous facial expression recognition applications have been commercialized, and many of them achieve promising and reliable performance in real-world settings. In contrast, research on automated micro-expression recognition systems is still greatly lacking. This is because of the nature of the micro-expression, which usually appears with relatively shorter duration and lower intensity. However, due to its uncontrollable, subtle, and spontaneous properties, it is capable of revealing one’s concealed genuine feelings. Therefore, this paper attempts to improve the performance of current micro-expression recognition systems by introducing an efficient and effective algorithm. Particularly, we employ genetic algorithms (GA) to discover an optimal solution in order to facilitate the computational process and produce better recognition results. Prior to the GA implementation, benchmark preprocessing methods and feature extractors are directly adopted. Succinctly, the complete proposed framework comprises three main steps: apex frame acquisition, optical flow approximation, and feature extraction with a CNN architecture. Experiments are conducted on a composite dataset made up of three publicly available databases, viz., CASME II, SMIC, and SAMM. The recognition performance surpasses state-of-the-art methods, attaining an accuracy of 85.9% and an F1-score of 83.7%.
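The GA search loop can be sketched with a minimal toy (a hypothetical stand-in: the paper evolves candidate solutions against recognition performance, whereas here the fitness is simply the number of ones in a bit chromosome):

```python
import random

def genetic_search(fitness, n_genes=12, pop_size=20, generations=60, seed=1):
    """Minimal GA: tournament selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        def pick():
            a, b = rng.sample(pop, 2)       # binary tournament
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_genes)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.1:           # occasional bit-flip mutation
                i = rng.randrange(n_genes)
                child[i] ^= 1
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

# Toy stand-in for recognition accuracy: reward chromosomes with more ones.
best = genetic_search(fitness=sum)
```

Swapping the toy fitness for a cross-validated recognition score turns the same loop into the kind of search the paper describes.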

Journal ArticleDOI
TL;DR: A convolutional neural network is proposed for lane offset estimation and lane line detection in complex road environments, which casts lane line detection as an instance segmentation problem and assigns each lane line its own instance.
Abstract: Deep learning has made remarkable progress in image classification and object detection. Nevertheless, in autonomous driving research, real-time lane line detection and lane offset estimation in complex traffic scenes remain challenging and difficult tasks. Traditional detection methods require manual parameter tuning and are still highly susceptible to interference from obstructing objects, illumination changes, and pavement wear, so designing a robust lane detection and lane offset estimation algorithm remains challenging. In this paper, we propose a convolutional neural network for lane offset estimation and lane line detection in complex road environments, which casts lane line detection as an instance segmentation problem: the network assigns each lane line to its own instance, making the method robust to changes in the number of lanes. A global scale perception optimization mechanism is designed to handle the case where the lane line width gradually narrows toward the vanishing point. At the same time, to realize multi-task processing and improve performance, an end-to-end lane offset estimation network is used in addition to the lane line detection network.

Journal ArticleDOI
TL;DR: In this article, the authors present the state-of-the-art machine learning-based models for offline signature verification systems across five aspects: datasets, preprocessing techniques, feature extraction methods, machine learning-based verification models, and performance evaluation metrics.
Abstract: Offline signatures are among the most widely adopted biometric authentication techniques in banking systems and administrative and financial applications due to their simplicity and uniqueness. Several automated techniques have been developed to verify the genuineness of an offline signature. However, recapitulations of the existing literature on machine learning-based offline signature verification (OfSV) systems are available in only a few review studies. The objective of this systematic review is to present the state-of-the-art machine learning-based models for OfSV systems across five aspects: datasets, preprocessing techniques, feature extraction methods, machine learning-based verification models, and performance evaluation metrics. Thus, five research questions were identified and analysed in this context. This review covers the articles published between January 2014 and October 2019. A systematic approach has been adopted to select the 56 articles. This systematic review revealed that, recently, deep learning-based neural networks attained the most promising results for OfSV systems on public datasets. The review consolidates the performance of state-of-the-art OfSV systems reported in the selected studies on five public datasets (CEDAR, GPDS, MCYT-75, UTSig and BHSig260). Finally, fifteen open research issues were identified for future development.

Journal ArticleDOI
TL;DR: The Grand Challenge “Saliency4ASD: Visual attention modeling for Autism Spectrum Disorder”, organized at IEEE ICME’19, is presented, aiming at supporting the research on VA modeling towards this healthcare societal challenge.
Abstract: Recent studies showing that gaze features can be useful in the identification of Autism Spectrum Disorder (ASD) have opened a new domain where Visual Attention (VA) modeling could be of great help. In this sense, this paper presents a report on the Grand Challenge “Saliency4ASD: Visual attention modeling for Autism Spectrum Disorder”, organized at IEEE ICME’19, aiming at supporting research on VA modeling towards this healthcare societal challenge. In particular, this paper describes the workflow, obtained results, and the datasets and tools that were used within this activity, in order to support the development and evaluation of two types of VA models: (1) models that predict saliency maps fitting the gaze behavior of people with ASD, and (2) models that identify individuals with ASD versus typical development.