
Showing papers on "Quantization (image processing)" published in 2019


Proceedings ArticleDOI
15 Jun 2019
TL;DR: The experimental results show that the proposed “feature distillation” can significantly surpass the latest input-transformation based mitigations such as Quilting and TV Minimization in three aspects: defense efficiency, accuracy of benign images after defense, and processing time per image.
Abstract: Image compression-based approaches for defending against adversarial-example attacks, which threaten the safe use of deep neural networks (DNN), have been investigated recently. However, prior works mainly rely on directly tuning parameters like compression rate to blindly reduce image features, thereby lacking guarantees on both defense efficiency (i.e. accuracy of polluted images) and classification accuracy of benign images after applying defense methods. To overcome these limitations, we propose a JPEG-based defensive compression framework, namely “feature distillation”, to effectively rectify adversarial examples without impacting classification accuracy on benign data. Our framework significantly escalates the defense efficiency with marginal accuracy reduction using a two-step method: First, we maximize the filtering of malicious adversarial input perturbations by developing defensive quantization in the frequency domain of JPEG compression or decompression, guided by a semi-analytical method; Second, we suppress the distortions of benign features to restore classification accuracy through a DNN-oriented quantization refinement process. Our experimental results show that the proposed “feature distillation” can significantly surpass the latest input-transformation based mitigations such as Quilting and TV Minimization in three aspects, including defense efficiency (improving classification accuracy from ∼20% to ∼90% on adversarial examples), accuracy of benign images after defense (≤1% accuracy degradation), and processing time per image (∼259× speedup). Moreover, our solution can also provide the best defense efficiency (∼60% accuracy) against the latest BPDA attack with the least accuracy reduction (∼1%) on benign images among all other input-transformation based defense methods.
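The defensive step amounts to re-quantizing the image's 8×8 DCT blocks with a table that is deliberately coarse where adversarial perturbations tend to concentrate. Below is a minimal sketch of such frequency-domain quantization; the band split at u + v < 8 and the two step sizes are illustrative placeholders, not the semi-analytically derived table from the paper.

    import numpy as np
    from scipy.fft import dctn, idctn  # type-II DCT, as used in JPEG

    def defensive_quantize(image, q_low=10, q_high=80):
        """Quantize 8x8 DCT blocks, using a coarser step for high frequencies.

        q_low/q_high are illustrative step sizes, not the paper's table.
        """
        h, w = image.shape
        out = image.astype(np.float64).copy()
        # Coarser quantization for high-frequency bands, where adversarial
        # perturbations tend to live.
        u, v = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
        qtable = np.where(u + v < 8, q_low, q_high).astype(np.float64)
        for i in range(0, h - h % 8, 8):
            for j in range(0, w - w % 8, 8):
                block = image[i:i+8, j:j+8].astype(np.float64) - 128.0
                coeffs = dctn(block, norm="ortho")
                coeffs = np.round(coeffs / qtable) * qtable   # quantize / dequantize
                out[i:i+8, j:j+8] = idctn(coeffs, norm="ortho") + 128.0
        return np.clip(out, 0, 255)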

185 citations


Journal ArticleDOI
22 Feb 2019-PLOS ONE
TL;DR: By redefining the gray-level co-occurrence matrix (GLCM) as a discretized probability density function, it becomes asymptotically invariant to quantization; using the resulting invariant Haralick features, an image pattern gives the same texture feature values independent of image quantization.
Abstract: Haralick texture features are common texture descriptors in image analysis. To compute the Haralick features, the image gray-levels are reduced, a process called quantization. The resulting features depend heavily on the quantization step, so Haralick features are not reproducible unless the same quantization is performed. The aim of this work was to develop Haralick features that are invariant to the number of quantization gray-levels. By redefining the gray-level co-occurrence matrix (GLCM) as a discretized probability density function, it becomes asymptotically invariant to the quantization. The invariant and original features were compared using logistic regression classification to separate two classes based on the texture features. Classifiers trained on the invariant features showed higher accuracies, and had similar performance when training and test images had very different quantizations. In conclusion, using the invariant Haralick features, an image pattern will give the same texture feature values independent of image quantization.
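The key idea is to treat the GLCM as a discretized probability density rather than a matrix of counts indexed by integer gray levels. A minimal sketch under that reading: the matrix is normalized to sum to one and features are evaluated on gray-level bin centres in [0, 1], so their scale no longer grows with the number of quantization levels (the bin count, the horizontal offset, and the contrast feature shown are illustrative choices, not the paper's exact estimator).

    import numpy as np

    def glcm_pdf(image, n_levels=32):
        """GLCM for a horizontal offset, normalized to a discrete probability density."""
        img = np.asarray(image, dtype=np.float64)
        # Quantize intensities into n_levels bins.
        bins = np.minimum((img - img.min()) / (np.ptp(img) + 1e-12) * n_levels,
                          n_levels - 1).astype(int)
        a, b = bins[:, :-1].ravel(), bins[:, 1:].ravel()     # horizontally adjacent pairs
        glcm = np.zeros((n_levels, n_levels))
        np.add.at(glcm, (a, b), 1.0)
        glcm /= glcm.sum()                                   # joint probability mass
        centres = (np.arange(n_levels) + 0.5) / n_levels     # gray-level bin centres in [0, 1]
        return glcm, centres

    def contrast(glcm, centres):
        # Evaluating the feature on bin centres in [0, 1] keeps its scale
        # roughly independent of n_levels.
        i, j = np.meshgrid(centres, centres, indexing="ij")
        return float(np.sum(glcm * (i - j) ** 2))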

97 citations


Journal ArticleDOI
TL;DR: The paper introduces three of the currently defined profiles in JPEG XT, each constraining the common decoder architecture to a subset of allowable configurations, and assesses the coding efficiency of each profile extensively through subjective assessments, with 24 naïve subjects evaluating 20 images, and through objective evaluations.
Abstract: Standards play an important role in providing a common set of specifications and allowing inter-operability between devices and systems. Until recently, no standard for high-dynamic-range (HDR) image coding had been adopted by the market, and HDR imaging relied on proprietary and vendor-specific formats which are unsuitable for storage or exchange of such images. To resolve this situation, the JPEG Committee is developing a new coding standard called JPEG XT that is backward compatible to the popular JPEG compression, allowing it to be implemented using standard 8-bit JPEG coding hardware or software. In this paper, we present design principles and technical details of JPEG XT. It is based on a two-layer design: a base layer containing a low-dynamic-range image accessible to legacy implementations, and an extension layer providing the full dynamic range. The paper introduces three of the currently defined profiles in JPEG XT, each constraining the common decoder architecture to a subset of allowable configurations. We assess the coding efficiency of each profile extensively through subjective assessments, using 24 naive subjects to evaluate 20 images, and objective evaluations, using 106 images with five different tone-mapping operators and at 100 different bit rates. The objective results (based on benchmarking with subjective scores) demonstrate that JPEG XT can encode HDR images at bit rates varying from 1.1 to 1.9 bit/pixel for estimated mean opinion score (MOS) values above 4.5 out of 5, which is considered fully transparent in many applications. This corresponds to a 23-times bitstream reduction compared to lossless OpenEXR PIZ compression.

65 citations


Journal ArticleDOI
TL;DR: Experimental results on two publicly available image databases show that the proposed method not only satisfies the invisibility requirement but also performs better in terms of robustness and real-time operation, indicating that it combines the advantages of the spatial domain and the frequency domain.
Abstract: In this paper, a novel spatial domain color image watermarking technique is proposed to rapidly and effectively protect the copyright of the color image. First, the direct current (DC) coefficient of the 2D-DFT obtained in the spatial domain is discussed, and the relationship between the change of each pixel in the spatial domain and the change of the DC coefficient in the Fourier transform is proved. Then, the DC coefficient is used to embed and extract the watermark in the spatial domain by the proposed quantization technique. The novelties of this paper include three points: 1) the DC coefficient of the 2D-DFT is obtained in the spatial domain without performing the true 2D-DFT; 2) the relationship between the change of each pixel in the image block and the change of the DC coefficient of the 2D-DFT is found; and 3) the proposed method has a short running time and strong robustness. The experimental results on two publicly available image databases (CVG-UGR and USC-SIPI) show that the proposed method not only satisfies the invisibility requirement but also performs better in terms of robustness and real-time operation, indicating that it combines the advantages of the spatial domain and the frequency domain.
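The spatial-domain shortcut rests on the fact that the DC coefficient of an (unnormalized) 2D-DFT is simply the sum of the block's pixels, so it can be read and modified without computing the transform. A minimal quantization-index-modulation sketch built on that fact follows; the step size and the even/odd embedding rule are illustrative, not the paper's exact quantization technique.

    import numpy as np

    def embed_bit_dc(block, bit, step=24.0):
        """Embed one watermark bit by quantizing the DC coefficient of the 2D-DFT.

        For an M x N block, the (unnormalized) DFT DC coefficient is the sum of
        all pixels, so it is computed and modified entirely in the spatial
        domain. Clipping to [0, 255] may perturb the embedded value slightly.
        """
        block = block.astype(np.float64)
        m, n = block.shape
        dc = block.sum()                        # F(0, 0) of the 2D-DFT
        q = np.floor(dc / step)
        if int(q) % 2 != bit:                   # force index parity to match the bit
            q += 1
        new_dc = (q + 0.5) * step
        # Spread the required DC change evenly over the pixels.
        return np.clip(block + (new_dc - dc) / (m * n), 0, 255)

    def extract_bit_dc(block, step=24.0):
        return int(np.floor(block.astype(np.float64).sum() / step)) % 2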

58 citations


Journal ArticleDOI
TL;DR: The experimental results validate the effectiveness of the proposed framework in terms of BER and embedding capacity compared to other state-of-the-art methods; the method therefore finds potential application in the prevention of patient identity theft in e-health applications.
Abstract: In this paper, an improved wavelet-based medical image watermarking algorithm is proposed. Initially, the proposed technique decomposes the cover medical image into ROI and NROI regions and embeds three different watermarks into the non-region-of-interest (NROI) part of the DWT-transformed cover image for compact and secure medical data transmission in an e-health environment. In addition, the method addresses the problem of channel noise distortion, which may lead to a faulty watermark, by applying error-correcting codes (ECCs) before embedding the watermarks into the cover image. Further, the bit error rate (BER) performance of the proposed method is determined for different kinds of attacks, including ‘Checkmark’ attacks. Experimental results indicate that the Turbo code performs better than the BCH (Bose–Chaudhuri–Hocquenghem) error correction code. Furthermore, the experimental results validate the effectiveness of the proposed framework in terms of BER and embedding capacity compared to other state-of-the-art methods. Therefore, the proposed method finds potential application in the prevention of patient identity theft in e-health applications.
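A minimal sketch of the NROI embedding step, assuming a one-level Haar DWT and a simple QIM rule on the horizontal-detail coefficients; the three-watermark design and the BCH/Turbo ECC stage described in the paper are omitted (the bits passed in would already be ECC-encoded).

    import numpy as np
    import pywt

    def embed_nroi_bits(cover, bits, roi_mask, step=12.0):
        """Embed watermark bits into DWT detail coefficients outside the ROI.

        Illustrative QIM embedding only; roi_mask is 1 inside the ROI, 0 in
        the NROI.
        """
        cA, (cH, cV, cD) = pywt.dwt2(cover.astype(np.float64), "haar")
        # Downsample the ROI mask to subband resolution; NROI = mask == 0.
        nroi = roi_mask[::2, ::2][:cH.shape[0], :cH.shape[1]] == 0
        idx = np.argwhere(nroi)[:len(bits)]
        for (r, c), b in zip(idx, bits):
            q = np.floor(cH[r, c] / step)
            if int(q) % 2 != b:
                q += 1
            cH[r, c] = (q + 0.5) * step
        return pywt.idwt2((cA, (cH, cV, cD)), "haar")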

57 citations


Journal ArticleDOI
TL;DR: This work proposes an efficient strategy to compress the VGG16 model by introducing global average pooling, performing iterative pruning on the filters with the proposed order-deciding scheme in order to prune more efficiently, applying truncated SVD to the fully-connected layer, and performing quantization.
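Of the four steps, the truncated-SVD step is the easiest to make concrete: the fully-connected weight matrix is replaced by two thin factors. A minimal sketch follows (the rank choice and the surrounding pruning/quantization steps are out of scope here).

    import numpy as np

    def truncated_svd_fc(W, rank):
        """Factor a fully-connected weight matrix W (out x in) into two thinner
        matrices, replacing one large layer with two smaller ones."""
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        W1 = Vt[:rank, :] * s[:rank, None]    # rank x in  (first small layer)
        W2 = U[:, :rank]                      # out x rank (second small layer)
        # W is approximated by W2 @ W1; parameter count drops from out*in
        # to rank*(out + in).
        return W1, W2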

54 citations


Journal ArticleDOI
TL;DR: A notion called error threshold is introduced to theoretically analyze the performance of the proposed SSAES and DQAQT watermarking schemes, showing that they outperform the state-of-the-art methods in terms of imperceptibility, robustness, computational cost, and adaptability.
Abstract: Watermarking plays an important role in identifying the copyright of an image and related issues. The state-of-the-art watermark embedding schemes, spread spectrum and quantization, suffer from host signal interference (HSI) and scaling attacks, respectively. Both of them use a fixed embedding parameter, which makes it difficult to take both robustness and imperceptibility into account for all images. This paper solves these problems by proposing two novel blind watermarking schemes: a spread spectrum scheme with adaptive embedding strength (SSAES) and a differential quantization scheme with adaptive quantization threshold (DQAQT). Their adaptiveness comes from the proposed adaptive embedding strategy (AEP), which maximizes the embedding strength or quantization threshold while guaranteeing the peak signal-to-noise ratio (PSNR) of the host image after embedding the watermark, and strikes a balance between robustness and imperceptibility. SSAES is HSI-free by factoring in prior knowledge about HSI. In DQAQT, an effective quantization mode is proposed to resist scaling attacks by utilizing the difference between two selected DCT coefficients with high stability. Both SSAES and DQAQT can be easily applied to other watermarking frameworks. We introduce a notion called error threshold to theoretically analyze the performance of our proposed methods in detail. The experimental results consistently demonstrate that SSAES and DQAQT outperform the state-of-the-art methods in terms of imperceptibility, robustness, computational cost, and adaptability.
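The PSNR-guaranteed adaptive strength can be made concrete with a short calculation: since PSNR = 10·log10(255²/MSE), a target PSNR fixes an MSE budget, and for a ±1 spread-spectrum carrier added to a subset of coefficients the largest admissible strength follows directly. The sketch below is one plausible reading of that constraint, not the exact AEP rule from the paper.

    import numpy as np

    def max_strength_for_psnr(n_pixels, n_modified, target_psnr=42.0, peak=255.0):
        """Largest embedding strength alpha for a +/-1 spread-spectrum carrier
        added to n_modified samples of an n_pixels image, such that the PSNR of
        the watermarked image stays at or above target_psnr.

        Illustrative derivation only: the added MSE is
        alpha^2 * n_modified / n_pixels for a +/-1 carrier.
        """
        mse_budget = peak ** 2 / (10.0 ** (target_psnr / 10.0))
        return float(np.sqrt(mse_budget * n_pixels / n_modified))

    # Example: embedding into 1024 mid-frequency coefficients of a 512x512 image.
    alpha = max_strength_for_psnr(512 * 512, 1024, target_psnr=42.0)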

53 citations


Journal ArticleDOI
TL;DR: A reversed-pruning strategy is proposed which reduces the number of parameters of AlexNet by a factor of 13× without accuracy loss on the ImageNet dataset, and an efficient storage technique that reduces the cache overhead of the convolutional and fully connected layers is presented.
Abstract: Field programmable gate arrays (FPGAs) are widely considered a promising platform for convolutional neural network (CNN) acceleration. However, the large number of parameters of CNNs causes heavy computing and memory burdens for FPGA-based CNN implementation. To solve this problem, this paper proposes an optimized compression strategy and realizes an FPGA-based accelerator for CNNs. Firstly, a reversed-pruning strategy is proposed which reduces the number of parameters of AlexNet by a factor of 13× without accuracy loss on the ImageNet dataset. Peak-pruning is further introduced to achieve better compressibility. Moreover, quantization gives another 4× reduction with negligible loss of accuracy. Secondly, efficient storage techniques, which aim to reduce the overall cache overhead of the convolutional layer and the fully connected layer, are presented. Finally, the effectiveness of the proposed strategy is verified by an accelerator implemented on a Xilinx ZCU104 evaluation board. By improving existing pruning techniques and the storage format of sparse data, we significantly reduce the size of AlexNet by 28×, from 243 MB to 8.7 MB. In addition, the overall performance of our accelerator achieves 9.73 fps for the compressed AlexNet. Compared with central processing unit (CPU) and graphics processing unit (GPU) platforms, our implementation achieves 182.3× and 1.1× improvements in latency and throughput, respectively, on the convolutional (CONV) layers of AlexNet, with 822.0× and 15.8× improvements in energy efficiency, respectively. This novel compression strategy provides a reference for other neural network applications, including CNNs, long short-term memory (LSTM), and recurrent neural networks (RNNs).
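The storage side of the design can be illustrated with a CSR-style packing of a pruned, quantized layer: only non-zero weights are kept, as 8-bit integers plus their column indices and row pointers. This is an illustrative format, not the paper's exact on-chip layout.

    import numpy as np

    def pack_pruned_layer(W, threshold, n_bits=8):
        """Prune small weights, quantize the survivors to n_bits, and store them
        in a CSR-like (values, column index, row pointer) layout."""
        W = np.where(np.abs(W) >= threshold, W, 0.0)   # magnitude pruning
        max_abs = np.abs(W).max()
        scale = max_abs / (2 ** (n_bits - 1) - 1) if max_abs > 0 else 1.0
        values, col_idx, row_ptr = [], [], [0]
        for row in W:
            nz = np.nonzero(row)[0]
            col_idx.extend(nz.tolist())
            values.extend(np.round(row[nz] / scale).astype(np.int8).tolist())
            row_ptr.append(len(values))
        return (np.array(values, dtype=np.int8), np.array(col_idx, dtype=np.int32),
                np.array(row_ptr, dtype=np.int32), scale)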

53 citations


Posted Content
TL;DR: A series of novel methodological changes is introduced that significantly improves the accuracy of binarized neural networks (i.e., networks where both the features and the weights are binary), and the extent to which network binarization and knowledge distillation can be combined is investigated.
Abstract: Big neural networks trained on large datasets have advanced the state-of-the-art for a large variety of challenging problems, improving performance by a large margin. However, under low memory and limited computational power constraints, the accuracy on the same problems drops considerably. In this paper, we propose a series of techniques that significantly improve the accuracy of binarized neural networks (i.e., networks where both the features and the weights are binary). We evaluate the proposed improvements on two diverse tasks: fine-grained recognition (human pose estimation) and large-scale image recognition (ImageNet classification). Specifically, we introduce a series of novel methodological changes including: (a) more appropriate activation functions, (b) reverse-order initialization, (c) progressive quantization, and (d) network stacking, and show that these additions significantly improve existing state-of-the-art network binarization techniques. Additionally, for the first time, we also investigate the extent to which network binarization and knowledge distillation can be combined. When tested on the challenging MPII dataset, our method shows a performance improvement of more than 4% in absolute terms. Finally, we further validate our findings by applying the proposed techniques to large-scale object recognition on the ImageNet dataset, on which we report a reduction of error rate by 4%.
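The baseline these techniques build on is standard weight binarization: each filter is replaced by its sign pattern plus one real-valued scale. A minimal sketch of that baseline (XNOR-Net-style per-filter scaling is assumed; the paper's own contributions sit on top of this step):

    import numpy as np

    def binarize_filters(W):
        """Binarize convolutional filters: keep the sign and a per-filter scale
        alpha equal to the mean absolute weight, so W_f ~ alpha_f * sign(W_f).

        W has shape (n_filters, in_channels, k, k).
        """
        alpha = np.mean(np.abs(W), axis=(1, 2, 3), keepdims=True)   # per-filter scale
        B = np.where(W >= 0, 1.0, -1.0)                             # binary weights
        return alpha * B, B, alpha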

48 citations


Posted Content
TL;DR: It is concluded that structured pruning has greater potential than non-structured pruning, and that the first fully binarized (for all layers) DNNs can be lossless in accuracy in many cases.
Abstract: Large deep neural network (DNN) models pose the key challenge to energy efficiency due to the significantly higher energy consumption of off-chip DRAM accesses than arithmetic or SRAM operations. This motivates intensive research on model compression with two main approaches. Weight pruning leverages the redundancy in the number of weights and can be performed in a non-structured manner, which has higher flexibility and pruning rate but incurs index accesses due to irregular weights, or in a structured manner, which preserves the full matrix structure with a lower pruning rate. Weight quantization leverages the redundancy in the number of bits in weights. Compared to pruning, quantization is much more hardware-friendly, and has become a "must-do" step for FPGA and ASIC implementations. This paper provides a definitive answer, for the first time, to the question of whether non-structured or structured pruning is preferable. First, we build ADMM-NN-S by extending and enhancing ADMM-NN, a recently proposed joint weight pruning and quantization framework. Second, we develop a methodology for a fair and fundamental comparison of non-structured and structured pruning in terms of both storage and computation efficiency. Our results show that ADMM-NN-S consistently outperforms the prior art: (i) it achieves 348x, 36x, and 8x overall weight pruning on LeNet-5, AlexNet, and ResNet-50, respectively, with (almost) zero accuracy loss; (ii) we demonstrate that the first fully binarized (for all layers) DNNs can be lossless in accuracy in many cases. These results provide a strong baseline and credibility for our study. Based on the proposed comparison framework, with the same accuracy and quantization, the results show that non-structured pruning is not competitive in terms of either storage or computation efficiency. Thus, we conclude that non-structured pruning is considered harmful. We urge the community not to continue pursuing DNN inference acceleration for non-structured sparsity.

47 citations


Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed lossy compression scheme achieves better rate-distortion performance than some of the state-of-the-art schemes.
Abstract: In this paper, a novel lossy compression scheme for encrypted images based on image inpainting is proposed. In order to maintain confidentiality, the content owner encrypts the original image through modulo-256 addition encryption and block permutation to mask the image content. Then, a third party, such as a cloud server, can compress the selectively encrypted image before transmitting it to the receiver. During compression, encrypted blocks are categorized into four sets corresponding to different complexity degrees in the plaintext domain without loss of security. By allocating various bit rates to the encrypted blocks from different sets, flexible compression can be achieved with difference quantization. After parsing and decoding the compressed bit stream, the receiver first recovers partial encrypted pixels and then decrypts them. The other missing pixels are further recovered with the assistance of image inpainting based on a total variation model, and the final reconstructed image can be produced. Experimental results demonstrate that the proposed scheme achieves better rate-distortion performance than some of the state-of-the-art schemes.
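The content owner's step combines a modulo-256 addition with a block permutation; a minimal sketch is shown below, with a seeded NumPy generator standing in for the scheme's actual keystream and permutation-key derivation.

    import numpy as np

    def encrypt_image(img, key=12345, block=16):
        """Modulo-256 addition encryption followed by block permutation.

        Trailing rows/cols beyond a multiple of the block size are dropped
        for brevity.
        """
        rng = np.random.default_rng(key)
        img = img.astype(np.uint8)
        stream = rng.integers(0, 256, size=img.shape, dtype=np.uint8)
        cipher = (img.astype(np.uint16) + stream) % 256              # modulo-256 addition
        h, w = (s - s % block for s in cipher.shape)
        blocks = (cipher[:h, :w].reshape(h // block, block, w // block, block)
                  .transpose(0, 2, 1, 3).reshape(-1, block, block))
        perm = rng.permutation(len(blocks))                          # block permutation
        blocks = blocks[perm]
        out = (blocks.reshape(h // block, w // block, block, block)
               .transpose(0, 2, 1, 3).reshape(h, w))
        return out.astype(np.uint8), perm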

Journal ArticleDOI
01 Aug 2019
TL;DR: Experimental results prove that the proposed method is superior to contemporary AMBTC-based data hiding methods in terms of data hiding capacity and has comparable image quality.
Abstract: In this paper, we propose a new Absolute Moment Block Truncation Coding (AMBTC) based data hiding scheme using Hamming distance and pixel value differencing methods. The proposed method first pre-processes the original image using a smoothing filter so that its quality can be maintained after AMBTC compression. It then applies AMBTC compression to the processed image and uses two thresholds to categorize the image blocks into three categories, namely Smooth, Less_Complex, and Highly_Complex blocks, so that complex blocks can also be used to embed the secret data. The secret data is then embedded into the smooth blocks using a simple replacement strategy. Further, the proposed method embeds 8 bits of secret data into the bit plane of Less_Complex blocks using a Hamming distance calculation, and a few bits into the quantization levels of Highly_Complex blocks using the pixel value differencing (PVD) method, to increase the data hiding capacity without having any major impact on image quality. Thus, the proposed method additionally embeds on average 6 more bits in complex blocks. Experimental results also prove that the proposed method is superior to contemporary AMBTC-based data hiding methods in terms of data hiding capacity and has comparable image quality.
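AMBTC itself reduces each block to a bit plane and two quantization levels, which are exactly the quantities the data-hiding rules above operate on. A minimal sketch of the codec side (thresholding at the block mean, group means as the two levels):

    import numpy as np

    def ambtc_block(block):
        """Absolute Moment Block Truncation Coding of one image block.

        Returns the bit plane and the two quantization levels (low, high).
        """
        block = block.astype(np.float64)
        mean = block.mean()
        bitplane = block >= mean
        high = block[bitplane].mean() if bitplane.any() else mean     # level for '1' pixels
        low = block[~bitplane].mean() if (~bitplane).any() else mean  # level for '0' pixels
        return bitplane.astype(np.uint8), float(low), float(high)

    def ambtc_reconstruct(bitplane, low, high):
        return np.where(bitplane == 1, high, low)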

Proceedings ArticleDOI
01 Jun 2019
TL;DR: In this article, the convolutional filters are binarized in residual blocks and a learnable weight for each binary filter is adopted to reduce over-parametrization and large amounts of redundancy.
Abstract: Deep convolutional neural networks (DCNNs) have recently demonstrated high-quality results in single-image super-resolution (SR). DCNNs often suffer from over-parametrization and large amounts of redundancy, which results in inefficient inference and high memory usage, preventing massive applications on mobile devices. As a way to significantly reduce model size and computation time, binarized neural networks have only been shown to excel on semantic-level tasks such as image classification and recognition. However, little network-quantization effort has been spent on image enhancement tasks like SR, as network quantization is usually assumed to sacrifice pixel-level accuracy. In this work, we explore a network-binarization approach for SR tasks without sacrificing much reconstruction accuracy. To achieve this, we binarize the convolutional filters only in residual blocks, and adopt a learnable weight for each binary filter. We evaluate this idea on several state-of-the-art DCNN-based architectures, and show that binarized SR networks achieve comparable qualitative and quantitative results to their real-weight counterparts. Moreover, the proposed binarization strategy could help reduce model size by 80% when applied to SRResNet, and could potentially speed up inference by 5×.

Journal ArticleDOI
TL;DR: This work proposes to jointly dequantize and contrast-enhance JPEG images captured in poor lighting conditions in a single graph-signal restoration framework, adopting accelerated proximal gradient (APG) algorithms in the transform domain, with backtracking line search for further speedup.
Abstract: JPEG images captured in poor lighting conditions suffer from both low luminance contrast and coarse quantization artifacts due to lossy compression. Performing dequantization and contrast enhancement in separate back-to-back steps would amplify the residual compression artifacts, resulting in low visual quality. Leveraging on recent development in graph signal processing (GSP), we propose to jointly dequantize and contrast-enhance such images in a single graph-signal restoration framework. Specifically, we separate each observed pixel patch into illumination and reflectance via Retinex theory, where we define generalized smoothness prior and signed graph smoothness prior according to their respective unique signal characteristics. Given only a transform-coded image patch, we compute robust edge weights for each graph via low-pass filtering in the dual graph domain. We compute the illumination and reflectance components for each patch alternately, adopting accelerated proximal gradient (APG) algorithms in the transform domain, with backtracking line search for further speedup. Experimental results show that our generated images outperform the state-of-the-art schemes noticeably in the subjective quality evaluation.

Journal ArticleDOI
TL;DR: The results suggest that the proposed cross-view multi-lateral filtering scheme, which improves the quality of compressed depth maps/videos within the framework of asymmetric multi-view video with depth compression, outperforms state-of-the-art filters and is suitable for use in multi-view color-plus-depth-based interaction- and remote-oriented applications.
Abstract: Multi-view depth is crucial for describing positioning information in 3D space for virtual reality, free viewpoint video, and other interaction- and remote-oriented applications. However, in cases of lossy compression for bandwidth limited remote applications, the quality of multi-view depth video suffers from quantization errors, leading to the generation of obvious artifacts in consequent virtual view rendering during interactions. Considerable efforts must be made to properly address these artifacts. In this paper, we propose a cross-view multi-lateral filtering scheme to improve the quality of compressed depth maps/videos within the framework of asymmetric multi-view video with depth compression. Through this scheme, a distorted depth map is enhanced via non-local candidates selected from current and neighboring viewpoints of different time-slots. Specifically, these candidates are clustered into a macro super pixel denoting the physical and semantic cross-relationships of the cross-view, spatial and temporal priors. The experimental results show that gains from static depth maps and dynamic depth videos can be obtained from PSNR and SSIM metrics, respectively. In subjective evaluations, even object contours are recovered from a compressed depth video. We also verify our method via several practical applications. For these verifications, artifacts on object contours are properly managed for the development of interactive video and discontinuous object surfaces are restored for 3D modeling. Our results suggest that the proposed filter outperforms state-of-the-art filters and is suitable for use in multi-view color plus depth-based interaction- and remote-oriented applications.

Journal ArticleDOI
01 Jun 2019
TL;DR: A general pipeline for applying the most suitable methods to compress recurrent neural networks for language modeling is proposed, showing that the most efficient results in terms of speed and compression–perplexity balance are obtained by matrix decomposition techniques.
Abstract: Recurrent neural networks have proved to be an effective method for statistical language modeling. However, in practice their memory and run-time complexity are usually too large to be implemented in real-time offline mobile applications. In this paper we consider several compression techniques for recurrent neural networks, including Long Short-Term Memory models. We pay particular attention to the high-dimensional output problem caused by the very large vocabulary size. We focus on effective compression methods in the context of their exploitation on devices: pruning, quantization, and matrix decomposition approaches (low-rank factorization and tensor train decomposition, in particular). For each model we investigate the trade-off between its size, suitability for fast inference, and perplexity. We propose a general pipeline for applying the most suitable methods to compress recurrent neural networks for language modeling. The experimental study on the Penn Treebank (PTB) dataset shows that the most efficient results in terms of speed and compression–perplexity balance are obtained by matrix decomposition techniques.

Journal ArticleDOI
TL;DR: A novel watermarking method based on the discrete cosine transform (DCT) is proposed which guarantees robustness and low computational complexity, and which performs faster and more robustly than previous methods.
Abstract: In many studies related to watermarking, spatial-domain methods have a relatively low information-hiding capacity and limited robustness, and transform-domain methods are not applicable in real-time processes because of their considerably high computational time. In this paper, we propose a novel watermarking method based on a discrete cosine transform (DCT) which guarantees robustness and low computational complexity. First, we calculate the DCT coefficient of a specific location. Then, a variation value is calculated according to the embedding bits and quantization steps to modify the coefficient. Finally, we embed the watermark bits by directly modifying the pixel values without a full-frame DCT. Tests comparing invisibility, robustness, and computational time were conducted to determine the feasibility of the proposed method. The results showed that the proposed method was faster and more robust than previous methods.
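Because the 2D DCT basis is orthonormal, a single coefficient C(u, v) is just the inner product of the block with one basis image, and adding δ·b_uv to the pixels changes that coefficient by exactly δ. A minimal sketch of quantization-based embedding built on this identity follows; the coefficient position, step size, and parity rule are illustrative rather than the paper's exact settings.

    import numpy as np

    def dct_basis(u, v, n=8):
        """Orthonormal 2D DCT-II basis function b_uv for an n x n block."""
        def a(k):
            return np.sqrt(1.0 / n) if k == 0 else np.sqrt(2.0 / n)
        i = np.arange(n)
        bi = a(u) * np.cos((2 * i + 1) * u * np.pi / (2 * n))
        bj = a(v) * np.cos((2 * i + 1) * v * np.pi / (2 * n))
        return np.outer(bi, bj)

    def embed_bit_single_coeff(block, bit, u=3, v=2, step=20.0):
        """Embed one bit by quantizing a single DCT coefficient C(u, v), computed
        and modified directly in the spatial domain (no full block DCT)."""
        b = dct_basis(u, v, block.shape[0])
        block = block.astype(np.float64)
        coeff = np.sum(block * b)               # C(u, v) without a full DCT
        q = np.floor(coeff / step)
        if int(q) % 2 != bit:
            q += 1
        delta = (q + 0.5) * step - coeff
        return block + delta * b                # spatial-domain modification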

Posted Content
TL;DR: This paper proposes a framework to jointly prune and quantize the DNNs automatically according to a target model size without using any hyper-parameters to manually set the compression ratio for each layer.
Abstract: Deep Neural Networks (DNNs) are applied in a wide range of use cases. There is an increased demand for deploying DNNs on devices that do not have abundant resources such as memory and computation units. Recently, network compression through a variety of techniques such as pruning and quantization has been proposed to reduce the resource requirements. A key parameter that all existing compression techniques are sensitive to is the compression ratio (e.g., pruning sparsity, quantization bitwidth) of each layer. Traditional solutions treat the compression ratios of each layer as hyper-parameters and tune them using human heuristics. Recent work has started to use black-box hyper-parameter optimization, but this introduces new hyper-parameters and has efficiency issues. In this paper, we propose a framework to jointly prune and quantize the DNNs automatically according to a target model size, without using any hyper-parameters to manually set the compression ratio for each layer. In the experiments, we show that our framework can compress the weights data of ResNet-50 to be 836× smaller without accuracy loss on CIFAR-10, and compress AlexNet to be 205× smaller without accuracy loss on ImageNet classification.

Journal ArticleDOI
TL;DR: Wavelet compression of the amplitude/phase and real/imaginary parts of the Fourier spectrum of filtered off-axis digital holograms is compared, and the combination of frequency filtering, compression of the obtained spectral components, and extra compression of the wavelet decomposition coefficients by threshold processing and quantization is analyzed.
Abstract: Compression of digital holograms allows one to store, transmit, and reconstruct large sets of holographic data. There are many digital image compression methods, and usually wavelets are used for this task; however, the compression of digital holograms has many significant peculiarities. As a result, it is preferable to use a set of methods that includes filtering, scalar and vector quantization, wavelet processing, etc. Used in conjunction, these methods allow one to achieve an acceptable quality of reconstructed images and significant compression ratios. In this paper, wavelet compression of the amplitude/phase and real/imaginary parts of the Fourier spectrum of filtered off-axis digital holograms is compared. The combination of frequency filtering, compression of the obtained spectral components, and extra compression of the wavelet decomposition coefficients by threshold processing and quantization is analyzed. Computer-generated and experimentally recorded digital holograms are compressed, and the quality of the reconstructed images is estimated. The results demonstrate the possibility of compression ratios of 380 using real/imaginary parts. Amplitude/phase compression allows ratios that are a factor of 2–4 lower for similar quality of the reconstructed objects.

Journal ArticleDOI
TL;DR: This work proposes two optimized lossy compression strategies under a state-of-the-art three-staged compression framework (prediction + quantization + entropy-encoding) and demonstrates that the two strategies exhibit the best compression qualities on different types of data sets respectively.
Abstract: An effective data compressor is becoming increasingly critical to today's scientific research, and many lossy compressors are developed in the context of absolute error bounds. Based on physical/chemical definitions of simulation fields or multiresolution demand, however, many scientific applications need to compress the data with a pointwise relative error bound (i.e., the smaller the data value, the smaller the compression error to tolerate). To this end, we propose two optimized lossy compression strategies under a state-of-the-art three-staged compression framework (prediction + quantization + entropy-encoding). The first strategy (called the block-based strategy) splits the data set into many small blocks and computes an absolute error bound for each block, so it is particularly suitable for data with relatively high consecutiveness in space. The second strategy (called the multi-threshold-based strategy) splits the whole value range into multiple groups with exponentially increasing thresholds and performs the compression in each group separately, which is particularly suitable for data with a relatively large value range and spiky value changes. We implement the two strategies rigorously and evaluate them comprehensively by using two scientific applications which both require lossy compression with a pointwise relative error bound. Experiments show that the two strategies exhibit the best compression qualities on different types of data sets, respectively. The compression ratio of our lossy compressor is higher than that of other state-of-the-art compressors by 172–618 percent on the climate simulation data and 30–210 percent on the N-body simulation data, with the same relative error bound and without degradation of the overall visualization effect of the entire data.
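For the block-based strategy, the essential move is converting the pointwise relative bound into one absolute bound per block. A minimal sketch of that conversion (using the smallest magnitude in each block as the conservative choice; the prediction, quantization, and entropy-coding stages are unchanged from the underlying framework and omitted here):

    import numpy as np

    def blockwise_abs_bounds(data, rel_bound=1e-3, block=64):
        """Convert a pointwise relative error bound into one absolute bound per
        block, using the smallest magnitude in the block so every point in the
        block still satisfies its relative bound."""
        data = np.asarray(data, dtype=np.float64).ravel()
        n_blocks = int(np.ceil(data.size / block))
        bounds = np.empty(n_blocks)
        for k in range(n_blocks):
            chunk = data[k * block:(k + 1) * block]
            bounds[k] = rel_bound * np.min(np.abs(chunk))   # conservative per-block bound
        return bounds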

Journal ArticleDOI
Zeke Wang1, Kaan Kara1, Hantian Zhang1, Gustavo Alonso1, Onur Mutlu1, Ce Zhang1 
01 Mar 2019
TL;DR: MLWeaving is a data structure and hardware acceleration technique intended to speed up learning of generalized linear models over low-precision data; it provides a compact in-memory representation that enables the retrieval of data at any level of precision.
Abstract: Learning from the data stored in a database is an important function increasingly available in relational engines. Methods using lower-precision input data are of special interest given their overall higher efficiency. However, in databases, these methods have a hidden cost: the quantization of the real value into a smaller number is an expensive step. To address this issue, we present MLWeaving, a data structure and hardware acceleration technique intended to speed up learning of generalized linear models over low-precision data. MLWeaving provides a compact in-memory representation that enables the retrieval of data at any level of precision. MLWeaving also provides a highly efficient implementation of stochastic gradient descent on FPGAs and enables the dynamic tuning of precision, instead of using a fixed precision level during learning. Experimental results show that MLWeaving converges up to 16× faster than low-precision implementations of first-order methods on CPUs.
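The retrieval-at-any-precision property comes from storing values one bit-plane at a time, most significant bit first, so reading the first b planes yields a b-bit approximation of every value. A conceptual NumPy sketch of that weaving (the real layout is tuned to FPGA memory access and differs in detail; values are assumed pre-normalized to [0, 1]):

    import numpy as np

    def weave(values, n_bits=8):
        """Quantize values in [0, 1] and store them MSB-first, one bit-plane at
        a time, so any precision prefix can be read back independently."""
        q = np.clip(np.round(values * (2 ** n_bits - 1)), 0, 2 ** n_bits - 1).astype(np.uint32)
        planes = [(q >> (n_bits - 1 - b)) & 1 for b in range(n_bits)]   # MSB plane first
        return np.stack(planes).astype(np.uint8)

    def unweave(planes, read_bits):
        """Reconstruct an approximation using only the first `read_bits` planes."""
        n_bits = planes.shape[0]
        q = np.zeros(planes.shape[1:], dtype=np.uint32)
        for b in range(read_bits):
            q |= planes[b].astype(np.uint32) << (n_bits - 1 - b)
        return q / (2 ** n_bits - 1)   # back to [0, 1]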

Journal ArticleDOI
TL;DR: Experimental results show that the proposed scheme ensures confidentiality, integrity and format compatibility, while image retrieval of different quality factors is still effective.

Journal ArticleDOI
TL;DR: An end-to-end image compression framework based on convolutional neural networks is proposed to resolve the non-differentiability of the quantization function in the standard codec, together with an advanced learning algorithm to train the deep neural networks for compression.
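Rounding has zero gradient almost everywhere, so end-to-end learned codecs train with a differentiable proxy. Two standard workarounds are sketched below, a straight-through estimator and additive uniform noise; they illustrate the general approach and are not necessarily the specific solution adopted in this paper.

    import torch

    class RoundSTE(torch.autograd.Function):
        """Round in the forward pass, pass gradients straight through in the
        backward pass, so a quantizer can sit inside an end-to-end trained codec."""
        @staticmethod
        def forward(ctx, x):
            return torch.round(x)

        @staticmethod
        def backward(ctx, grad_output):
            return grad_output          # identity gradient (straight-through)

    def quantize_latent(y, training):
        if training:
            # Common alternative during training: additive uniform noise in
            # [-0.5, 0.5] as a differentiable proxy for rounding.
            return y + torch.empty_like(y).uniform_(-0.5, 0.5)
        return RoundSTE.apply(y)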

Proceedings ArticleDOI
15 Jun 2019
TL;DR: Deep Spherical Quantization (DSQ) is put forward, a novel method to make deep convolutional neural networks generate supervised and compact binary codes for efficient image search and an easy-to-implement extension of the quantization technique that enforces sparsity on the codebooks is introduced.
Abstract: Hashing methods, which encode high-dimensional images with compact discrete codes, have been widely applied to enhance large-scale image retrieval. In this paper, we put forward Deep Spherical Quantization (DSQ), a novel method to make deep convolutional neural networks generate supervised and compact binary codes for efficient image search. Our approach simultaneously learns a mapping that transforms the input images into a low-dimensional discriminative space, and quantizes the transformed data points using multi-codebook quantization. To eliminate the negative effect of norm variance on codebook learning, we force the network to L2-normalize the extracted features and then quantize the resulting vectors using a new supervised quantization technique specifically designed for points lying on a unit hypersphere. Furthermore, we introduce an easy-to-implement extension of our quantization technique that enforces sparsity on the codebooks. Extensive experiments demonstrate that DSQ and its sparse variant can generate semantically separable compact binary codes outperforming many state-of-the-art image retrieval methods on three benchmarks.
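A minimal sketch of the encoding path: features are L2-normalized onto the unit hypersphere and then assigned to codewords from M codebooks applied to successive residuals. This is a generic multi-codebook quantizer standing in for DSQ; the supervised codebook training and the sparsity extension are not shown.

    import numpy as np

    def l2_normalize(x, eps=1e-12):
        return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

    def quantize_residual(x, codebooks):
        """Encode L2-normalized features with M codebooks applied to successive
        residuals.

        codebooks: list of M arrays of shape (K, d), learned beforehand.
        """
        x = l2_normalize(x)
        codes, residual = [], x.copy()
        for C in codebooks:
            # Nearest codeword for every point in the current residual.
            d2 = ((residual[:, None, :] - C[None, :, :]) ** 2).sum(-1)
            idx = d2.argmin(axis=1)
            codes.append(idx)
            residual = residual - C[idx]
        return np.stack(codes, axis=1)   # (n_points, M) integer codes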

Journal ArticleDOI
TL;DR: Two new joint encryption and compression schemes are proposed, one emphasizing compression performance and the other protection performance; performance evaluations using various criteria show that the first scheme has better compression efficiency, while the second has better defense against statistical attack.

Journal ArticleDOI
TL;DR: This paper presents fast computation methods for N-point DCT-V and DCT-VIII, which reduce the number of addition and multiplication operations by 38% and 80.3% on average, respectively, compared to the original JEM.
Abstract: Joint exploration model (JEM) reference codecs of ISO/IEC and ITU-T utilize multiple types of integer transforms based on the DCT and DST of various transform sizes for intra- and inter-predictive coding, which has brought a significant improvement in coding efficiency. JEM adopts three types of integer DCTs (DCT-II, DCT-V, and DCT-VIII) and two types of integer DSTs (DST-I and DST-VII). The fast computation of integer DCT-II and DST-I is well known, but few studies have addressed the other types, such as DCT-V, DCT-VIII, and DST-VII, for all transform sizes. In this paper, we present fast computation methods for N-point DCT-V and DCT-VIII. For this, we first decompose the DCT-VIII into a pre-processing matrix, the DST-VII, and a post-processing matrix, so that it can be computed quickly using the linear relation between DCT-VIII and DST-VII. Then, we approximate integer kernels of N = 4, 8, 16, and 32 for DCT-V, DCT-VIII, and DST-VII with norm scaling and bit-shift to be compatible with quantization in each stage of multiplications between decomposed matrices for video coding. In various experiments, the proposed fast computation methods have been shown to effectively reduce the total complexity of the matrix operations with little loss in BDBR performance. In particular, our methods reduce the number of addition and multiplication operations by 38% and 80.3% on average, respectively, compared to the original JEM.

Proceedings ArticleDOI
Stanislav Morozov1, Artem Babenko1
01 Oct 2019
TL;DR: In this article, a DNN architecture based on multi-codebook quantization is proposed for unsupervised visual descriptors compression, which is designed to incorporate both fast data encoding and efficient distances computation via lookup tables.
Abstract: We tackle the problem of unsupervised visual descriptor compression, which is a key ingredient of large-scale image retrieval systems. While the deep learning machinery has benefited literally all computer vision pipelines, the existing state-of-the-art compression methods employ shallow architectures, and we aim to close this gap with this paper. In more detail, we introduce a DNN architecture for unsupervised compressed-domain retrieval, based on multi-codebook quantization. The proposed architecture is designed to incorporate both fast data encoding and efficient distance computation via lookup tables. We demonstrate the exceptional advantage of our scheme over existing quantization approaches on several datasets of visual descriptors, outperforming the previous state of the art by a large margin.
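The lookup-table evaluation the architecture is built around can be illustrated with product-quantization-style asymmetric distance computation: the query is compared to every codeword once, after which each database item costs only M table lookups. The paper's multi-codebook scheme defines its codebooks differently, but relies on the same kind of table-based evaluation.

    import numpy as np

    def build_luts(query, codebooks):
        """Split the query into M sub-vectors and precompute, for each codebook,
        the squared distance from the query sub-vector to every codeword.

        codebooks: list of M arrays of shape (K, d_sub)."""
        d = query.size // len(codebooks)
        return [((query[m * d:(m + 1) * d][None, :] - C) ** 2).sum(axis=1)
                for m, C in enumerate(codebooks)]

    def adc_distances(codes, luts):
        """Asymmetric distance computation: per database item, sum M table
        lookups instead of touching the original descriptor."""
        dist = np.zeros(codes.shape[0])
        for m in range(codes.shape[1]):
            dist += luts[m][codes[:, m]]
        return dist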

Journal ArticleDOI
TL;DR: A new approach for partial image encryption and compression is suggested that adopts a chaotic 3D cat map to de-correlate relations among pixels, in conjunction with an adaptive thresholding technique that serves as a lossy compression step in place of complex quantization techniques.
Abstract: The advances in digital image processing and communications have created a great demand for real-time secure image transmission over networks. However, the development of effective, fast, and secure dependent image compression–encryption systems is still a research problem, as intrinsic features of images such as bulk data capacity and high correlation among pixels hinder the use of traditional joint encryption–compression methods. A new approach is suggested in this paper for partial image encryption and compression that adopts a chaotic 3D cat map to de-correlate relations among pixels, in conjunction with an adaptive thresholding technique that is utilized as a lossy compression technique instead of complex quantization techniques and also as a substitution technique to increase the security of the cipher image. The proposed scheme is based on employing lossless compression with encryption on the most significant part of the image after the contourlet transform, while the least significant parts are lossy-compressed by employing a simple thresholding rule and arithmetic coding to render the image totally unrecognizable. Due to the weakness of the 3D cat map to chosen-plaintext attacks, the suggested scheme incorporates a mechanism to generate a random key depending on the contents of the image (context key). Several experiments were done on benchmark images to ensure the validity of the proposed technique. The compression analysis and security outcomes indicate that the suggested technique is efficacious and safe for real-time image applications.
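The permutation stage can be illustrated with the 2D Arnold cat map, the planar ancestor of the 3D cat map used in the paper: pixel coordinates are repeatedly mapped through an area-preserving modular transform, which de-correlates neighboring pixels. The sketch below is a 2D stand-in only; the iteration count and the requirement of a square image are assumptions.

    import numpy as np

    def cat_map_permute(img, iterations=5):
        """Permute pixels by repeatedly applying the 2D Arnold cat-map index
        transform  x' = (x + y) mod N,  y' = (x + 2y) mod N  to an N x N image.
        The map is area-preserving and invertible, so decryption undoes it by
        applying the inverse transform the same number of times."""
        n = img.shape[0]
        assert img.shape[0] == img.shape[1], "cat map as written needs a square image"
        out = img.copy()
        x, y = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
        for _ in range(iterations):
            out = out[(x + y) % n, (x + 2 * y) % n]
        return out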

Journal ArticleDOI
TL;DR: A comprehensive study and evaluation of existing single image compression artifacts removal algorithms, using a new 4K resolution benchmark including diversified foreground objects and background scenes with rich structures, is presented in this paper.
Abstract: We present a comprehensive study and evaluation of existing single image compression artifacts removal algorithms, using a new 4K resolution benchmark including diversified foreground objects and background scenes with rich structures, called the Large-scale Ideal Ultra high definition 4K (LIU4K) benchmark. Compression artifacts removal, as a common post-processing technique, aims at alleviating undesirable artifacts such as blockiness, ringing, and banding caused by quantization and approximation in the compression process. In this work, a systematic listing of the reviewed methods is presented based on their basic models (handcrafted models and deep networks). The main contributions and novelties of these methods are highlighted, and the main development directions, including architectures, multi-domain sources, signal structures, and new targeted units, are summarized. Furthermore, based on a unified deep learning configuration (i.e., the same training data, loss function, optimization algorithm, etc.), we evaluate recent deep learning-based methods using diversified evaluation measures. The experimental results show the state-of-the-art performance comparison of existing methods based on full-reference, no-reference, and task-driven metrics. Our survey provides a comprehensive reference source for future research on single image compression artifacts removal and inspires new directions in related fields.

Journal ArticleDOI
01 Apr 2019
TL;DR: A modified version of the cohort intelligence (CI) algorithm, referred to as Improved Cohort Intelligence, is used as a cryptography technique to generate optimized cipher text and is further employed for JPEG image steganography to propose a reversible data hiding scheme.
Abstract: Recently, a high level of information security has been attained by combining the concepts of cryptography and steganography with nature-inspired optimization algorithms. However, in today's world computational speed plays a vital role in the success of any scientific method. Optimization algorithms such as Cohort Intelligence with Cognitive Computing (CICC) and Modified-Multi Random Start Local Search (M-MRSLS) have already been implemented and applied to JPEG image steganography for 8 × 8 and 16 × 16 quantization tables, respectively. Although the results were satisfactory in terms of image quality and capacity, the computational time was high for most of the test images. To overcome this challenge, the paper proposes a modified version of the cohort intelligence (CI) algorithm, referred to as Improved Cohort Intelligence. The Improved CI algorithm was used as a cryptography technique and implemented to generate optimized cipher text. It was further employed for JPEG image steganography to propose a reversible data hiding scheme. Experimentation was done on grey-scale images of size 256 × 256, for both 8 × 8 and 16 × 16 quantization tables. Validation of the proposed work exhibited very encouraging improvements in computational cost.