
Showing papers on "JPEG published in 2021"


Journal ArticleDOI
Tong Chen1, Haojie Liu1, Zhan Ma1, Qiu Shen1, Xun Cao1, Yao Wang2 
TL;DR: An end-to-end learnt lossy image compression approach, which is built on top of the deep neural network (DNN)-based variational auto-encoder (VAE) structure with Non-Local Attention optimization and Improved Context modeling (NLAIC).
Abstract: This article proposes an end-to-end learnt lossy image compression approach, which is built on top of the deep neural network (DNN)-based variational auto-encoder (VAE) structure with Non-Local Attention optimization and Improved Context modeling (NLAIC). Our NLAIC 1) embeds non-local network operations as non-linear transforms in both main and hyper coders for deriving respective latent features and hyperpriors by exploiting both local and global correlations, 2) applies an attention mechanism to generate implicit masks that are used to weigh the features for adaptive bit allocation, and 3) implements improved conditional entropy modeling of latent features using joint 3D convolutional neural network (CNN)-based autoregressive contexts and hyperpriors. Towards practical application, additional enhancements are also introduced to speed up the computational processing (e.g., parallel 3D CNN-based context prediction), decrease the memory consumption (e.g., sparse non-local processing) and reduce the implementation complexity (e.g., a unified model for variable rates without re-training). The proposed model outperforms existing learnt and conventional (e.g., BPG, JPEG2000, JPEG) image compression methods on both the Kodak and Tecnick datasets with state-of-the-art compression efficiency, for both PSNR and MS-SSIM quality measurements. We have made all materials publicly accessible at https://njuvision.github.io/NIC for reproducible research.

142 citations


Journal ArticleDOI
TL;DR: A novel method to significantly enhance transformation-based compression standards like JPEG by transmitting much less data of one image at the sender's end, and a two-step method that combines a state-of-the-art signal-processing-based recovery method with a deep residual learning model to recover the original data at the receiver's end.
Abstract: With the development of big data and network technology, there are more use cases, such as edge computing, that require more secure and efficient multimedia big data transmission. Data compression methods can help achieve many tasks, such as providing data integrity, protection, and efficient transmission. Classical multimedia big data compression relies on methods like the spatial-frequency transformation for lossy compression. Recent approaches use deep learning to further explore the limits of data compression methods in communication-constrained use cases like the Internet of Things (IoT). In this article, we propose a novel method to significantly enhance transformation-based compression standards like JPEG by transmitting much less data of one image at the sender's end. At the receiver's end, we propose a two-step method that combines a state-of-the-art signal-processing-based recovery method with a deep residual learning model to recover the original data. Therefore, in IoT use cases, a sender such as an edge device can transmit only 60% of the data of the original JPEG image without any additional calculation steps, while the image quality can still be recovered at the receiver's end (e.g., cloud servers) with a peak signal-to-noise ratio over 31 dB.

104 citations


Posted Content
TL;DR: Palette as mentioned in this paper is a simple and general framework for image-to-image translation using conditional diffusion models, which outperforms strong GAN and regression baselines, and establishes a new state of the art.
Abstract: We introduce Palette, a simple and general framework for image-to-image translation using conditional diffusion models. On four challenging image-to-image translation tasks (colorization, inpainting, uncropping, and JPEG decompression), Palette outperforms strong GAN and regression baselines, and establishes a new state of the art. This is accomplished without task-specific hyper-parameter tuning, architecture customization, or any auxiliary loss, demonstrating a desirable degree of generality and flexibility. We uncover the impact of using $L_2$ vs. $L_1$ loss in the denoising diffusion objective on sample diversity, and demonstrate the importance of self-attention through empirical architecture studies. Importantly, we advocate a unified evaluation protocol based on ImageNet, and report several sample quality scores including FID, Inception Score, Classification Accuracy of a pre-trained ResNet-50, and Perceptual Distance against reference images for various baselines. We expect this standardized evaluation protocol to play a critical role in advancing image-to-image translation research. Finally, we show that a single generalist Palette model trained on 3 tasks (colorization, inpainting, JPEG decompression) performs as well or better than task-specific specialist counterparts.

73 citations


Proceedings Article
03 Mar 2021
TL;DR: A new simple approach for image compression: instead of storing the RGB values for each pixel of an image, the weights of a neural network overfitted to the image are stored, and this approach outperforms JPEG at low bit-rates, even without entropy coding or learning a distribution over weights.
Abstract: We propose a new simple approach for image compression: instead of storing the RGB values for each pixel of an image, we store the weights of a neural network overfitted to the image. Specifically, to encode an image, we fit it with an MLP which maps pixel locations to RGB values. We then quantize and store the weights of this MLP as a code for the image. To decode the image, we simply evaluate the MLP at every pixel location. We found that this simple approach outperforms JPEG at low bit-rates, even without entropy coding or learning a distribution over weights. While our framework is not yet competitive with state of the art compression methods, we show that it has various attractive properties which could make it a viable alternative to other neural data compression approaches.
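As a rough illustration of the encode/decode loop described above, the sketch below overfits a small coordinate MLP to a single image and decodes it by evaluating the network at every pixel. The ReLU architecture, layer sizes, and training schedule are illustrative placeholders rather than the paper's configuration, and the quantization and weight-storage steps are omitted.

```python
import torch
import torch.nn as nn

def fit_image_mlp(image, hidden=64, layers=4, steps=2000, lr=1e-3):
    """image: float tensor of shape (H, W, 3) with values in [0, 1]."""
    h, w, _ = image.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)   # pixel locations, normalized
    targets = image.reshape(-1, 3)                          # RGB values to regress

    blocks = [nn.Linear(2, hidden), nn.ReLU()]
    for _ in range(layers - 2):
        blocks += [nn.Linear(hidden, hidden), nn.ReLU()]
    blocks += [nn.Linear(hidden, 3), nn.Sigmoid()]
    mlp = nn.Sequential(*blocks)

    opt = torch.optim.Adam(mlp.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((mlp(coords) - targets) ** 2).mean()
        loss.backward()
        opt.step()
    return mlp   # quantizing and storing mlp.state_dict() would yield the image code

def decode_image_mlp(mlp, h, w):
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
    with torch.no_grad():
        return mlp(coords).reshape(h, w, 3)
```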

73 citations


Proceedings ArticleDOI
28 Mar 2021
TL;DR: Li et al. as mentioned in this paper designed an Invertible Image Signal Processing (InvISP) pipeline, which not only enables rendering visually appealing sRGB images but also allows recovering nearly perfect RAW data.
Abstract: Unprocessed RAW data is a highly valuable image format for image editing and computer vision. However, since the file size of RAW data is huge, most users can only get access to processed and compressed sRGB images. To bridge this gap, we design an Invertible Image Signal Processing (InvISP) pipeline, which not only enables rendering visually appealing sRGB images but also allows recovering nearly perfect RAW data. Due to our framework’s inherent reversibility, we can reconstruct realistic RAW data instead of synthesizing RAW data from sRGB images, without any memory overhead. We also integrate a differentiable JPEG compression simulator that empowers our framework to reconstruct RAW data from JPEG images. Extensive quantitative and qualitative experiments on two DSLR cameras demonstrate that our method obtains much higher quality in both rendered sRGB images and reconstructed RAW data than alternative methods.

68 citations


Proceedings ArticleDOI
30 Apr 2021
TL;DR: The NTIRE 2021 Challenge on Image Deblurring as mentioned in this paper comprised two competition tracks that both aim to recover a high-quality clean image from a blurry image, with different artifacts jointly involved in each track.
Abstract: Motion blur is a common photography artifact in dynamic environments that typically comes jointly with other types of degradation. This paper reviews the NTIRE 2021 Challenge on Image Deblurring. In this challenge report, we describe the challenge specifics and the evaluation results from the 2 competition tracks with the proposed solutions. While both tracks aim to recover a high-quality clean image from a blurry image, different artifacts are jointly involved. In track 1, the blurry images are in a low resolution, while track 2 images are compressed in JPEG format. The two tracks had 338 and 238 registered participants, respectively, and 18 and 17 teams competed in the final testing phase. The winning methods demonstrate state-of-the-art performance on the image deblurring task with the jointly combined artifacts.

65 citations


Journal ArticleDOI
TL;DR: The experimental results show that DWFCAT is highly efficient compared with various state-of-the-art approaches for authentication and tamper localization of industrial images and can withstand a range of hybrid signal processing and geometric attacks.
Abstract: The image data received through various sensors are of significant importance in Industry 4.0. Unfortunately, these data are highly vulnerable to various malicious attacks during their transit to the destination. Although the use of pervasive edge computing (PEC) with the Internet of Things (IoT) has solved various issues, such as latency, proximity, and real-time processing, the security and authentication of data between the nodes is still a significant concern in PEC-based industrial-IoT scenarios. In this article, we present “DWFCAT,” a dual watermarking framework for content authentication and tamper localization for industrial images. The robust and fragile watermarks, along with overhead bits related to the cover image for tamper localization, are embedded in different planes of the cover image. We use discrete cosine transform (DCT) coefficients and exploit their energy compaction property for robust watermark embedding. We make use of a four-point neighborhood to predict the value of a predefined pixel and use it for embedding the fragile watermark bits in the spatial domain. Chaotic and deoxyribonucleic acid (DNA) encryption is used to encrypt the robust watermark before embedding to enhance its security. The results indicate that DWFCAT can withstand a range of hybrid signal processing and geometric attacks, such as Gaussian noise, salt-and-pepper noise, joint photographic experts group (JPEG) compression, rotation, low-pass filtering, resizing, cropping, sharpening, and histogram equalization. The experimental results show that DWFCAT is highly efficient compared with various state-of-the-art approaches for authentication and tamper localization of industrial images.
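For context only, the snippet below shows a generic way to exploit the DCT energy-compaction idea mentioned above: a robust bit is embedded by quantization index modulation of one mid-frequency coefficient of an 8x8 block. This is a textbook-style illustration, not the DWFCAT embedding algorithm, and the coefficient position and quantization step are arbitrary choices.

```python
import numpy as np
from scipy.fft import dctn, idctn

def embed_bit(block, bit, coeff=(3, 2), delta=24.0):
    """block: 8x8 float array; bit: 0 or 1. Returns the watermarked block."""
    c = dctn(block, norm="ortho")
    q = int(np.round(c[coeff] / delta))
    if q % 2 != bit:              # force the parity of the quantized index to carry the bit
        q += 1
    c[coeff] = q * delta
    return idctn(c, norm="ortho")

def extract_bit(block, coeff=(3, 2), delta=24.0):
    c = dctn(block, norm="ortho")
    return int(np.round(c[coeff] / delta)) % 2
```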

50 citations


Journal ArticleDOI
TL;DR: Comparisons with prior state-of-the-art schemes demonstrate that the proposed robust JPEG steganographic algorithm can provide a more robust performance and statistical security.
Abstract: Social networks are everywhere and currently transmit messages on a very large scale. As a result, transmitting secret messages in such an environment is worth researching. However, the images used in transmitting messages are usually compressed with a JPEG compression channel, which is lossy and damages the transmitted data. Therefore, to prevent secret messages from being damaged, a robust JPEG steganographic method is urgently needed. In this paper, a secure robust JPEG steganographic scheme based on an autoencoder with adaptive BCH (Bose-Chaudhuri-Hocquenghem) encoding is proposed. In particular, the autoencoder is first pretrained to fit the transformation relationship between the JPEG image before and after compression by the compression channel. In addition, the BCH encoding is adaptively utilized according to the content of the cover image to decrease the error rate of secret message extraction. The DCT (Discrete Cosine Transform) coefficient adjustment based on practical JPEG channel characteristics further improves the robustness and statistical security. Comparisons with prior state-of-the-art schemes demonstrate that the proposed robust JPEG steganographic algorithm can provide more robust performance and statistical security.

44 citations


Proceedings ArticleDOI
01 Jan 2021
TL;DR: CAT-Net as discussed by the authors is an end-to-end fully convolutional neural network including RGB and DCT streams, which learns forensic features of compression artifacts on the RGB and DCT domains jointly.
Abstract: Detecting and localizing image splicing has become essential to fight against malicious forgery. A major challenge to localize spliced areas is to discriminate between authentic and tampered regions with intrinsic properties such as compression artifacts. We propose CAT-Net, an end-to-end fully convolutional neural network including RGB and DCT streams, to learn forensic features of compression artifacts on RGB and DCT domains jointly. Each stream considers multiple resolutions to deal with spliced object’s various shapes and sizes. The DCT stream is pretrained on double JPEG detection to utilize JPEG artifacts. The proposed method outperforms state-of-the-art neural networks for localizing spliced regions in JPEG or non-JPEG images.

37 citations


Journal ArticleDOI
TL;DR: A novel RDH scheme for JPEG images based on multiple histogram modification (MHM) and rate-distortion optimization is proposed and can yield better embedding performance compared with state-of-the-art methods in terms of both visual quality and file size preservation.
Abstract: Most current reversible data hiding (RDH) techniques are designed for uncompressed images. However, JPEG images are more commonly used in our daily lives. Up to now, several RDH methods for JPEG images have been proposed, yet few of them investigated adaptive data embedding, owing to the lack of an accurate measurement of the embedding distortion. To realize adaptive embedding and optimize the embedding performance, in this article, a novel RDH scheme for JPEG images based on multiple histogram modification (MHM) and rate-distortion optimization is proposed. Firstly, with selected coefficients, RDH for JPEG images is generalized into an MHM embedding framework. Then, by estimating the embedding distortion, the rate-distortion model is formulated so that the expansion bins can be adaptively determined for different histograms and images. Finally, to optimize the embedding performance in real time, a greedy algorithm with low computational complexity is proposed to derive the nearly optimal embedding efficiently. Experiments show that the proposed method yields better embedding performance than state-of-the-art methods in terms of both visual quality and file size preservation.

34 citations


Proceedings ArticleDOI
17 Oct 2021
TL;DR: Wang et al. as discussed by the authors proposed an end-to-end training architecture, which utilizes Mini-Batch of Real and Simulated JPEG compression (MBRS) to enhance the JPEG robustness.
Abstract: Based on the powerful feature extraction ability of deep learning architectures, deep-learning-based watermarking algorithms have recently been widely studied. The basic framework of such algorithms is an auto-encoder-like end-to-end architecture with an encoder, a noise layer and a decoder. The key to guaranteeing robustness is adversarial training with a differentiable noise layer. However, we found that none of the existing frameworks can ensure robustness against JPEG compression, which is non-differentiable but is an essential and important image processing operation. To address this limitation, we propose a novel end-to-end training architecture, which utilizes Mini-Batch of Real and Simulated JPEG compression (MBRS) to enhance JPEG robustness. Precisely, for different mini-batches, we randomly choose one of real JPEG, simulated JPEG and a noise-free layer as the noise layer. Besides, we suggest utilizing Squeeze-and-Excitation blocks, which learn better features in the embedding and extracting stages, and propose a "message processor" to expand the message in a more appropriate way. Meanwhile, to improve the robustness against crop attacks, we add an additive diffusion block to the network. Extensive experimental results demonstrate the superior performance of the proposed scheme compared with state-of-the-art algorithms. Under JPEG compression with quality factor $Q=50$, our models achieve a bit error rate of less than 0.01% for extracted messages, with PSNR larger than 36 dB for the encoded images, which shows the well-enhanced robustness against JPEG attacks. Besides, under many other distortions such as Gaussian filtering, crop, cropout and dropout, the proposed framework also obtains strong robustness. The PyTorch implementation is available at https://github.com/jzyustc/MBRS.
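A minimal sketch of the per-mini-batch noise-layer selection described above is given below. The real-JPEG round-trip, the placeholder differentiable surrogate, and the gradient pass-through trick for the non-differentiable codec are assumptions made for illustration, not taken from the authors' released code.

```python
import io
import random

import torch
from PIL import Image
from torchvision.transforms.functional import to_pil_image, to_tensor

def real_jpeg(batch, quality=50):
    """Round-trip each image in the batch through an actual JPEG codec (no gradients)."""
    out = []
    for img in batch:
        buf = io.BytesIO()
        to_pil_image(img.clamp(0, 1).cpu()).save(buf, format="JPEG", quality=quality)
        buf.seek(0)
        out.append(to_tensor(Image.open(buf).convert("RGB")))
    return torch.stack(out).to(batch.device)

def simulated_jpeg(batch, quality=50):
    """Placeholder for a differentiable JPEG surrogate (e.g. block DCT + soft rounding)."""
    return batch  # swap in a real surrogate here

def noise_layer(encoded):
    """Pick one noise layer at random for the current mini-batch."""
    choice = random.choice(["real", "simulated", "identity"])
    if choice == "real":
        # inject the real-JPEG result in the forward pass; gradients bypass the codec
        return encoded + (real_jpeg(encoded) - encoded).detach()
    if choice == "simulated":
        return simulated_jpeg(encoded)
    return encoded
```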

Proceedings ArticleDOI
17 Jun 2021
TL;DR: In this paper, the EfficientNet family pre-trained on ImageNet was used for steganalysis using transfer learning, and the modified models were evaluated by their detection accuracy, the number of parameters, the memory consumption and the total floating point operations (FLOPs) on the ALASKA II dataset.
Abstract: In this paper, we study the EfficientNet family pre-trained on ImageNet when used for steganalysis using transfer learning. We show that certain "surgical modifications" aimed at maintaining the input resolution in EfficientNet architectures significantly boost their performance in JPEG steganalysis, thus establishing new benchmarks. The modified models are evaluated by their detection accuracy, the number of parameters, the memory consumption, and the total floating point operations (FLOPs) on the ALASKA II dataset. We also show that, surprisingly, EfficientNets in their "vanilla form" do not perform as well as the SRNet on BOSSbase+BOWS2. This is because, unlike ALASKA II images, BOSSbase+BOWS2 contains aggressively subsampled images with more complex content. The surgical modifications in EfficientNet remedy this underperformance as well.
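As a rough, hedged illustration of the idea of keeping the input resolution in the stem (not the exact surgical modifications evaluated in the paper), one could override the stride of the stem convolution of a timm EfficientNet before fine-tuning it for cover/stego classification:

```python
# Assumes the timm package; conv_stem is the stem convolution in timm's EfficientNet.
import timm

model = timm.create_model("efficientnet_b0", pretrained=True, num_classes=2)  # cover vs. stego
model.conv_stem.stride = (1, 1)   # stock stride is 2; stride 1 preserves the pixel-level stego signal
```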

Journal ArticleDOI
Yingqiang Qiu1, Zhenxing Qian2, Han He1, Hui Tian1, Xinpeng Zhang2 
TL;DR: This paper proposes a new framework of lossless data hiding (LDH) in JPEG images that contains two algorithms, i.e., the optimized basic LDH and the relay transfer based extension, which provide better performances than previous arts.
Abstract: This paper proposes a new framework of lossless data hiding (LDH) in JPEG images. The proposed framework contains two algorithms, i.e., the optimized basic LDH and the relay transfer based extension. In the basic algorithm, we aim to preserve the filesize after data embedding. The data hiding process is optimized by variable-length-code (VLC) mapping, combination and permutation. In the extended algorithm, we focus on embedding more bits into the bitstream under the condition that a filesize increment is allowed. To decrease the filesize increment, we propose a relay transfer based algorithm to preprocess the JPEG bitstream. Subsequently, we embed data into the processed bitstream using the basic LDH. Both algorithms provide better performance than prior art. After lossless data hiding, the marked JPEG bitstream is compliant with common JPEG decoders. Since all operations are implemented on the VLCs and Huffman codes, no distortion is generated on the image pixels. Experimental results demonstrate that the proposed approach outperforms previous methods.

Journal ArticleDOI
TL;DR: A dual-stream recursive residual network (STRRN) which consists of structure and texture streams for separately reducing the specific artifacts related to high-frequency or low-frequency image components and which reduces the total number of training parameters significantly.
Abstract: JPEG is the most widely used lossy image compression standard. When using JPEG with high compression ratios, visual artifacts cannot be avoided. These artifacts not only degrade the user experience but also negatively affect many low-level image processing tasks. Recently, convolutional neural network (CNN)-based compression artifact removal approaches have achieved significant success, however, at the cost of high computational complexity due to an enormous number of parameters. To address this issue, we propose a dual-stream recursive residual network (STRRN) which consists of structure and texture streams for separately reducing the specific artifacts related to high-frequency or low-frequency image components. The outputs of these streams are combined and fed into an aggregation network to further enhance the restored images. By using parameter sharing, the proposed network reduces the total number of training parameters significantly. Moreover, experiments conducted on five commonly used datasets confirm that the proposed STRRN can efficiently reduce the compression artifacts, while using up to 4.6 times less training parameters and 5 times less running time compared to the state-of-the-art approaches.

Proceedings ArticleDOI
10 Jan 2021
TL;DR: The authors' results show a non-linear and non-uniform relationship between network performance and the level of lossy compression applied, and there is a correlation between architectures employing an encoder-decoder pipeline and those that demonstrate resilience to lossy image compression.
Abstract: Recent advances in generalized image understanding have seen a surge in the use of deep convolutional neural networks (CNN) across a broad range of image-based detection, classification and prediction tasks. Whilst the reported performance of these approaches is impressive, this study investigates the hitherto unapproached question of the impact of commonplace image and video compression techniques on the performance of such deep learning architectures. Focusing on the JPEG and H.264 (MPEG-4 AVC) as a representative proxy for contemporary lossy image/video compression techniques that are in common use within network-connected image/video devices and infrastructure, we examine the impact on performance across five discrete tasks: human pose estimation, semantic segmentation, object detection, action recognition, and monocular depth estimation. As such, within this study we include a variety of network architectures and domains spanning end-to-end convolution, encoder-decoder, region-based CNN (R-CNN), dual-stream, and generative adversarial networks (GAN). Our results show a non-linear and non-uniform relationship between network performance and the level of lossy compression applied. Notably, performance decreases significantly below a JPEG quality (quantization) level of 15% and a H.264 Constant Rate Factor (CRF) of 40. However, retraining said architectures on pre-compressed imagery conversely recovers network performance by up to 78.4% in some cases. Furthermore, there is a correlation between architectures employing an encoder-decoder pipeline and those that demonstrate resilience to lossy image compression. The characteristics of the relationship between input compression to output task performance can be used to inform design decisions within future image/video devices and infrastructure.
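A minimal sketch of the kind of robustness sweep described above is shown below: each input is re-encoded at a range of JPEG quality levels before being fed to a fixed, pretrained model. The `dataset` and `predict` arguments are placeholders for a task-specific loader and network; the quality levels roughly bracket the 15% threshold reported in the abstract.

```python
import io
from PIL import Image

def jpeg_roundtrip(pil_image, quality):
    """Re-encode a PIL image at the given JPEG quality and decode it again."""
    buf = io.BytesIO()
    pil_image.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

def accuracy_vs_quality(dataset, predict, qualities=(95, 75, 50, 30, 15, 10, 5)):
    """dataset: iterable of (PIL image, label); predict: callable image -> label."""
    results = {}
    for q in qualities:
        correct = total = 0
        for image, label in dataset:
            correct += int(predict(jpeg_roundtrip(image, q)) == label)
            total += 1
        results[q] = correct / max(total, 1)
    return results
```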

Journal ArticleDOI
TL;DR: The authors conclude that the accuracy of lens-aberration-based detection techniques deteriorates when different source cameras from the same brand are under consideration, and that the performance of color-filter-array-based detection techniques drops when post-processing operations are applied to the images.
Abstract: Images are acquired and stored digitally these days. Image forensics is a science concerned with revealing the underlying facts about an image. Universal approaches provide a general strategy to perform image forensics irrespective of the type of manipulation, and identification of the acquisition device is one of the significant universal approaches. This review paper aims at analyzing the different types of device identification approaches. All research papers addressing camera and mobile-device identification through image analysis were acquired, and the 60 most suitable papers were included. Of these, 32 state-of-the-art papers were critically analyzed and compared. Since every research effort starts with a literature review, such an analysis is significant. To the authors' knowledge, this is the first such evaluation of source camera and source mobile-device detection. The authors conclude that the accuracy of lens-aberration-based detection techniques deteriorates when different source cameras from the same brand are under consideration, and that the performance of color-filter-array-based detection techniques drops when post-processing operations are applied to the images. These techniques are also vulnerable to high compression rates for JPEG images.

Journal ArticleDOI
TL;DR: This paper proposes a dual-image RDH method based on a modification of the discrete cosine transform (DCT) coefficients, achieving high embedding capacity and satisfactory visual quality while suppressing file size expansion.

Proceedings ArticleDOI
01 Jun 2021
TL;DR: Wang et al. as discussed by the authors used the DeepQA as a baseline model on a challenge database that includes various distortions and further improved the baseline model by dividing it into three parts and modifying each: distortion encoding network, sensitivity generation network, and score regression.
Abstract: Previous full-reference image quality assessment methods aim to evaluate the quality of images impaired by traditional distortions such as JPEG, white noise, Gaussian blur, and so on. However, there is a lack of research measuring the quality of images generated by various image processing algorithms, including super-resolution, denoising, restoration, etc. Motivated by the previous model that predicts distortion sensitivity maps, we use DeepQA as a baseline model on a challenge database that includes various distortions. We have further improved the baseline model by dividing it into three parts and modifying each: 1) distortion encoding network, 2) sensitivity generation network, and 3) score regression. Through rigorous experiments, the proposed model achieves better prediction accuracy on the challenge database than other methods. Also, the proposed method shows better visualization results compared to the baseline model. We submitted our model to the NTIRE 2021 Perceptual Image Quality Assessment Challenge and placed 12th in the main score.

Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors proposed a model-driven deep unfolding method for JPEG artifacts removal, with interpretable network structures, where each module corresponds to a specific operation of the iterative algorithm.
Abstract: Deep learning-based methods have achieved notable progress in removing blocking artifacts caused by lossy JPEG compression on images. However, most deep learning-based methods handle this task by designing black-box network architectures to directly learn the relationships between the compressed images and their clean versions. These network architectures often lack sufficient interpretability, which limits their further improvement in deblocking performance. To address this issue, in this article, we propose a model-driven deep unfolding method for JPEG artifacts removal with interpretable network structures. First, we build a maximum a posteriori (MAP) model for deblocking using convolutional dictionary learning and design an iterative optimization algorithm using proximal operators. Second, we unfold this iterative algorithm into a learnable deep network structure, where each module corresponds to a specific operation of the iterative algorithm. In this way, our network inherits the benefits of both the powerful modeling ability of data-driven deep learning methods and the interpretability of traditional model-driven methods. By training the proposed network in an end-to-end manner, all learnable modules can be automatically explored to well characterize the representations of both JPEG artifacts and image content. Experiments on synthetic and real-world datasets show that our method is able to generate competitive or even better deblocking results compared with state-of-the-art methods, both quantitatively and qualitatively.

Journal ArticleDOI
TL;DR: The challenges related to the standardization of compression technologies for holographic content and associated test methodologies are addressed.
Abstract: JPEG Pleno is a standardization framework addressing the compression and signaling of plenoptic modalities. While the standardization of solutions to handle light field content is currently reaching its final stage, the Joint Photographic Experts Group (JPEG) committee is now preparing for the standardization of solutions targeting point cloud and holographic modalities. This paper addresses the challenges related to the standardization of compression technologies for holographic content and associated test methodologies.

Journal ArticleDOI
TL;DR: An adaptive statistical model-based detector for JPEG steganography is proposed, based on the strategy of assigning weights to DCT channels and covering both channel-selected and non-channel-selected algorithms.
Abstract: In current steganalysis, widely-adopted supervised schemes rely on a large scale of samples and require a training stage, while few studies focus on the design of a training-free, unsupervised, adaptive detector with high efficiency. To fill the gap, we investigate an adaptive statistical model-based detector designed for detecting JPEG steganography. First, by virtue of hypothesis testing theory, together with the distribution of quantized DCT coefficients, we establish the general framework of the statistical model-based detector. Second, based on this framework, we mainly analyze the performance of the detector with respect to the selection of the statistical model, parameter estimation, and less significant payload prediction. Third, to improve the reliability of detection, based on the strategy of assigning weights to DCT channels, novel adaptive statistical model-based detectors, covering both channel-selected and non-channel-selected algorithms, are proposed for detecting JPEG steganography. Extensive experiments highlight the effectiveness of the proposed methodology. Moreover, when detecting JPEG images produced by two steganographic schemes with a small payload, the experimental results show that the Area Under Curve (AUC) of our proposed optimal adaptive detector can reach as high as 0.9567 and 0.9895, respectively, both better than that of the non-adaptive detector.

Journal ArticleDOI
TL;DR: A unified framework is proposed to detect double JPEG compression regardless of whether the quantization matrices are the same or different, which means the approach can be applied in more practical Web forensics tasks.
Abstract: Recently, double joint photographic experts group (JPEG) compression detection tasks have received much more attention in the field of Web image forensics. Although several useful methods have been proposed for detecting double JPEG compression when the quantization matrices differ between the primary and secondary compression processes, it is still a difficult problem when the quantization matrices are the same. Moreover, the existing methods for the different and the same quantization matrices are implemented in independent ways. This paper aims to build a new unified framework for detecting double JPEG compression. First, the Y channel of JPEG images is cut into 8 × 8 nonoverlapping blocks, and two groups of features that characterize the artifacts caused by double JPEG compression with the same and with different quantization matrices are extracted from those blocks. Then, Riemannian manifold learning is applied for dimensionality reduction while preserving the local intrinsic structure of the features. Finally, a deep stacked autoencoder network with seven layers is designed to detect double JPEG compression. Experimental results with different quality factors show that the proposed approach performs much better than state-of-the-art approaches. Since verifying the integrity and authenticity of Web images is increasingly important, research on double JPEG compression detection is receiving growing attention, and the proposed unified framework handles the scenario whether the quantization matrix is different or not, which means it can be applied in more practical Web forensics tasks.

Journal ArticleDOI
TL;DR: This paper proposes an embedding mechanism to perform NS in the JPEG domain after linear developments by explicitly computing the correlations between DCT coefficients before quantization by developing the matrix representation of demosaicking, luminance averaging, pixel section, and 2D-DCT.
Abstract: In order to achieve high practical security, Natural Steganography (NS) uses cover images captured at ISO sensitivity $ISO_{1}$ and generates stego images mimicking ISO sensitivity $ISO_{2}>ISO_{1}$. This is achieved by adding a stego signal to the cover that mimics the sensor photonic noise. This paper proposes an embedding mechanism to perform NS in the JPEG domain after linear developments by explicitly computing the correlations between DCT coefficients before quantization. In order to compute the covariance matrix of the photonic noise in the DCT domain, we first develop the matrix representation of demosaicking, luminance averaging, pixel section, and 2D-DCT. A detailed analysis of the resulting covariance matrix is done in order to explain the origins of the correlations between the coefficients of $3\times 3$ DCT blocks. An embedding scheme is then presented that takes into account all the correlations. It employs 4 sub-lattices and 64 lattices per sub-lattice. The modification probabilities of each DCT coefficient are then derived by computing conditional probabilities from a multivariate Gaussian distribution using the Cholesky decomposition of the covariance matrix. This derivation is also used to compute the embedding capacity of each image. Using a specific database called E1Base, we show that in the JPEG domain NS (J-Cov-NS) enables achieving high capacity (more than 2 bits per non-zero AC DCT coefficient) with high practical security ($P_{\mathrm{E}}\simeq 40\%$ using DCTR and $P_{\mathrm{E}}\simeq 32\%$ using SRNet) from QF 75 to QF 100.
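The final sampling step mentioned above (drawing a correlated stego signal from a multivariate Gaussian via the Cholesky factor of the covariance matrix) can be sketched generically as follows; the construction of the DCT-domain covariance itself, which is the paper's contribution, is not reproduced here.

```python
import numpy as np

def sample_correlated_noise(cov, rng=None):
    """cov: (d, d) positive-definite covariance of the DCT-domain noise."""
    rng = np.random.default_rng() if rng is None else rng
    chol = np.linalg.cholesky(cov)          # cov = L @ L.T
    z = rng.standard_normal(cov.shape[0])   # i.i.d. N(0, 1) draws
    return chol @ z                         # a single N(0, cov) sample
```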

Journal ArticleDOI
TL;DR: The presented model for quantitative assessment of medical image quality may be helpful in determining thresholds for the parameters of irreversible image post-processing algorithms (e.g., the JPEG quality factor) in order to avoid misdiagnosis.

Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this article, two effective novel blocks are developed: analysis and synthesis block that employs the convolution layer and Generalized Divisive Normalization (GDN) in the variable-rate encoder and decoder side.
Abstract: Image compression is a method to remove spatial redundancy between adjacent pixels and reconstruct a high-quality image. In the past few years, deep learning has gained huge attention from the research community and produced promising image reconstruction results. Therefore, recent methods focused on developing deeper and more complex networks, which significantly increased network complexity. In this paper, two effective novel blocks are developed: an analysis block and a synthesis block that employ convolution layers and Generalized Divisive Normalization (GDN) on the variable-rate encoder and decoder sides. Our network utilizes a pixel RNN approach for quantization. Furthermore, to improve the whole network, we encode a residual image using LSTM cells to reduce unnecessary information. Experimental results demonstrate that the proposed variable-rate framework with novel blocks outperforms existing methods and standard image codecs, such as George’s [11] and JPEG, in terms of image similarity. The project page, along with code and models, is available at https://github.com/khawar512/cvpr_image_compress
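For reference, a simplified GDN layer of the kind used in such analysis/synthesis blocks is sketched below, implementing $y_i = x_i / \sqrt{\beta_i + \sum_j \gamma_{ij} x_j^2}$. Reference implementations additionally reparameterize $\beta$ and $\gamma$ to keep them positive; the clamping used here is a simplification, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GDN(nn.Module):
    """Generalized Divisive Normalization: y_i = x_i / sqrt(beta_i + sum_j gamma_ij * x_j^2)."""
    def __init__(self, channels, inverse=False, eps=1e-6):
        super().__init__()
        self.inverse = inverse                       # inverse GDN (IGDN) multiplies on the decoder side
        self.eps = eps
        self.beta = nn.Parameter(torch.ones(channels))
        self.gamma = nn.Parameter(0.1 * torch.eye(channels))

    def forward(self, x):                            # x: (N, C, H, W)
        c = x.size(1)
        weight = self.gamma.clamp(min=0).view(c, c, 1, 1)   # 1x1 conv mixes squared channels
        bias = self.beta.clamp(min=self.eps)
        norm = torch.sqrt(F.conv2d(x * x, weight, bias))
        return x * norm if self.inverse else x / norm
```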

Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper proposed a JPEG image steganography payload location method based on optimal estimation of cover co-frequency sub-images, which estimates the cover JPEG image based on the Markov model of co-frequency sub-images.
Abstract: Accurate cover estimation is very important for payload location in JPEG image steganography, but it is still hard to exactly estimate the quantized DCT coefficients of the cover JPEG image. Therefore, this paper proposes a JPEG image steganography payload location method based on optimal estimation of cover co-frequency sub-images, which estimates the cover JPEG image based on the Markov model of co-frequency sub-images. The proposed method combines the coefficients at the same position in each 8 × 8 block of the JPEG image to obtain 64 co-frequency sub-images and then uses the maximum a posteriori (MAP) probability algorithm to find the optimal estimates of the cover co-frequency sub-images under the Markov model. Then, the residual of each DCT coefficient is obtained by computing the absolute difference between it and its estimated cover version, and the average residual over coefficients in the same position of multiple stego images embedded along the same path is used to estimate the stego positions. The experimental results show that the proposed payload location method can significantly improve the locating accuracy of the stego positions in low frequencies.
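The residual-averaging step at the end of the pipeline described above can be sketched as follows; the MAP estimation of the cover co-frequency sub-images, which is the core of the method, is not reproduced and the estimated covers are simply taken as input.

```python
import numpy as np

def locate_payload(stego_coeffs, estimated_covers, top_k):
    """stego_coeffs, estimated_covers: arrays of shape (n_images, n_positions), aligned on
    the same embedding path. Returns indices of the top_k most suspicious positions."""
    residual = np.abs(stego_coeffs - estimated_covers)   # per-image, per-position residual
    avg_residual = residual.mean(axis=0)                 # average over images at each position
    return np.argsort(avg_residual)[::-1][:top_k]
```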

Proceedings ArticleDOI
17 Oct 2021
TL;DR: In this paper, the authors disentangle the forward and backward propagation of an attack simulation layer to make the pipeline compatible with non-differentiable and/or black-box distortion, such as lossy compression and photoshop effects.
Abstract: Data hiding is one widely used approach for proving ownership through blind watermarking. Deep learning has been widely used in data hiding, for which inserting an attack simulation layer (ASL) after the watermarked image has been widely recognized as the most effective approach for improving the pipeline robustness against distortions. Despite its wide usage, the gain of enhanced robustness from ASL is usually interpreted through the lens of augmentation, while our work explores this gain from a new perspective by disentangling the forward and backward propagation of such ASL. We find that the main influential component is forward propagation instead of backward propagation. This observation motivates us to use forward ASL to make the pipeline compatible with non-differentiable and/or black-box distortion, such as lossy (JPEG) compression and photoshop effects. Extensive experiments demonstrate the efficacy of our simple approach.
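One simple way to place a non-differentiable or black-box distortion in the forward pass of such a pipeline, in the spirit of a forward-only attack simulation layer, is sketched below. Whether the backward pass should act as an identity (as here) or be fully blocked is a design choice; this is an illustration, not the authors' implementation.

```python
import torch

def forward_only_attack(watermarked, distortion):
    """watermarked: tensor carrying gradients; distortion: any callable (e.g. a real JPEG
    round-trip or a photoshop-style filter) applied outside the autograd graph."""
    with torch.no_grad():
        attacked = distortion(watermarked)
    # forward value = attacked image; backward gradient w.r.t. `watermarked` = identity
    return watermarked + (attacked - watermarked).detach()
```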

Journal ArticleDOI
TL;DR: A generic clustering-based framework is proposed to improve the performance of existing quantization step estimation methods, and the idea behind it might provide inspiration for other forensic tasks to alleviate performance issues induced by sample insufficiency.
Abstract: Quantization plays a pivotal role in JPEG compression with respect to the tradeoff between image fidelity and storage size, and the blind estimation of quantization parameters has attracted considerable interest in the fields of image steganalysis and forensics. Existing estimation methods have made great progress, but they usually suffer a sharp decline in accuracy when addressing small-size JPEG decompressed bitmaps due to the insufficiency of coefficients. Aiming to alleviate this issue, this paper proposes a generic clustering-based framework to improve the performance of the existing methods. The core idea is to gather as many coefficients as possible by clustering subbands before feeding them into a step estimator. The proposed framework is implemented using hierarchical clustering with two kinds of histogram-like features. Extensive experiments are conducted to validate the effectiveness of the proposed framework on a variety of images of different sizes and quality factors, and the results show that notable improvements can be achieved. In addition to quantization step estimation, we believe the idea behind the proposed framework might provide inspiration for other forensic tasks to alleviate their performance issues induced by sample insufficiency.
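As a naive baseline for the underlying estimation task (not the clustering framework proposed in the paper), a quantization step can be guessed by checking which candidate step leaves nearly all recomputed DCT coefficients close to one of its multiples; it is exactly this kind of estimator that becomes unreliable when a small image provides too few coefficients.

```python
import numpy as np

def estimate_quantization_step(coeffs, max_step=64, frac=0.9):
    """coeffs: 1-D array of DCT coefficients recomputed from one subband of a decompressed
    JPEG bitmap. Returns the largest candidate step whose multiples fit nearly all of them."""
    vals = np.abs(coeffs[np.abs(coeffs) > 1.0])
    if vals.size == 0:
        return 1                                       # all-zero subband: step is unrecoverable
    best = 1
    for q in range(2, max_step + 1):
        dist = np.abs(vals - q * np.round(vals / q))   # distance to the nearest multiple of q
        if np.mean(dist <= 1.0) >= frac:
            best = q
    return best
```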

Journal ArticleDOI
TL;DR: The proposed work primarily aims to investigate internet communication and to deter unwanted incidents that could arise from covert communication in digital mass media, using the technique of steganalysis.
Abstract: The spectacular progress of information and communication technology throughout the past epoch has made the internet a powerful medium for faster data communication. While this technology is widely admired, it equally poses the challenge of safeguarding personal data and privacy against leakage and misuse. Hence, the proposed work primarily aims to investigate internet communication and to deter unwanted incidents that could arise from covert communication. The probable presence of hidden messages is inspected in digital mass media using the technique of steganalysis. The distinctive features to be identified, selected and extracted for universal (blind) steganalysis are determined by the image format and its transformation. In this paper, the analysis is carried out on JPEG images with 10% embedding and 10-fold cross-validation. The technique of calibration is used to obtain an estimate of the cover image. Four embedding techniques are analyzed: Least Significant Bit (LSB) Matching, LSB Replacement, Pixel Value Differencing (PVD) and F5. Four different sampling strategies, namely linear, shuffle, stratified and automatic, are considered. The classifiers used for the comparative study are Support Vector Machine (SVM) and SVM with Particle Swarm Optimization (SVM-PSO). Several kernels, namely linear, Epanechnikov, multiquadric, radial, ANOVA and polynomial, are used in classification. The classifier is trained to examine every coefficient as a separate unit for analysis, and the outcome of this analysis supports the steganalysis decision.
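A minimal sketch of the classification stage described above, assuming the calibrated features have already been extracted into an array, is shown below; scikit-learn's SVC covers the linear, polynomial and radial (RBF) kernels mentioned, while the Epanechnikov, multiquadric and ANOVA kernels and the PSO-tuned variant are not reproduced.

```python
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def evaluate_steganalyzer(features, labels, kernel="rbf", folds=10):
    """features: (n_images, n_features) calibrated feature matrix; labels: 0 = cover, 1 = stego."""
    clf = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    scores = cross_val_score(clf, features, labels, cv=folds)   # 10-fold CV as in the abstract
    return scores.mean()
```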