
Showing papers on "Quantization (image processing)" published in 2019


Proceedings ArticleDOI
15 Jun 2019
TL;DR: The experimental results show that the proposed “feature distillation” can significantly surpass the latest input-transformation based mitigations such as Quilting and TV Minimization in three aspects: defense efficiency, accuracy of benign images after defense, and processing time per image.
Abstract: Image compression-based approaches for defending against adversarial-example attacks, which threaten the safe use of deep neural networks (DNN), have been investigated recently. However, prior works mainly rely on directly tuning parameters like compression rate to blindly reduce image features, thereby lacking guarantees on both defense efficiency (i.e. accuracy of polluted images) and classification accuracy of benign images after applying defense methods. To overcome these limitations, we propose a JPEG-based defensive compression framework, namely “feature distillation”, to effectively rectify adversarial examples without impacting classification accuracy on benign data. Our framework significantly escalates the defense efficiency with marginal accuracy reduction using a two-step method: First, we maximize the filtering of malicious adversarial input perturbations by developing defensive quantization in the frequency domain of JPEG compression or decompression, guided by a semi-analytical method; Second, we suppress the distortions of benign features to restore classification accuracy through a DNN-oriented quantization refinement process. Our experimental results show that the proposed “feature distillation” can significantly surpass the latest input-transformation based mitigations such as Quilting and TV Minimization in three aspects, including defense efficiency (improving classification accuracy from ∼20% to ∼90% on adversarial examples), accuracy of benign images after defense (≤1% accuracy degradation), and processing time per image (∼259× speedup). Moreover, our solution can also provide the best defense efficiency (∼60% accuracy) against the latest BPDA attack with the least accuracy reduction (∼1%) on benign images among all other input-transformation based defense methods.
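The defensive step amounts to re-quantizing the image's 8×8 DCT blocks with a table that is deliberately coarse where adversarial perturbations tend to concentrate. Below is a minimal sketch of such frequency-domain quantization; the band split at u + v < 8 and the two step sizes are illustrative placeholders, not the semi-analytically derived table from the paper.

    import numpy as np
    from scipy.fft import dctn, idctn  # type-II DCT, as used in JPEG

    def defensive_quantize(image, q_low=10, q_high=80):
        """Quantize 8x8 DCT blocks, using a coarser step for high frequencies.

        q_low/q_high are illustrative step sizes, not the paper's table.
        """
        h, w = image.shape
        out = image.astype(np.float64).copy()
        # Coarser quantization for high-frequency bands, where adversarial
        # perturbations tend to live.
        u, v = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
        qtable = np.where(u + v < 8, q_low, q_high).astype(np.float64)
        for i in range(0, h - h % 8, 8):
            for j in range(0, w - w % 8, 8):
                block = image[i:i+8, j:j+8].astype(np.float64) - 128.0
                coeffs = dctn(block, norm="ortho")
                coeffs = np.round(coeffs / qtable) * qtable   # quantize / dequantize
                out[i:i+8, j:j+8] = idctn(coeffs, norm="ortho") + 128.0
        return np.clip(out, 0, 255)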

185 citations


Journal ArticleDOI
22 Feb 2019-PLOS ONE
TL;DR: By redefining the gray-level co-occurrence matrix (GLCM) as a discretized probability density function, it becomes asymptotically invariant to quantization; using the resulting invariant Haralick features, an image pattern gives the same texture feature values independent of image quantization.
Abstract: Haralick texture features are common texture descriptors in image analysis. To compute the Haralick features, the image gray-levels are reduced, a process called quantization. The resulting features depend heavily on the quantization step, so Haralick features are not reproducible unless the same quantization is performed. The aim of this work was to develop Haralick features that are invariant to the number of quantization gray-levels. By redefining the gray-level co-occurrence matrix (GLCM) as a discretized probability density function, it becomes asymptotically invariant to the quantization. The invariant and original features were compared using logistic regression classification to separate two classes based on the texture features. Classifiers trained on the invariant features showed higher accuracies, and had similar performance when training and test images had very different quantizations. In conclusion, using the invariant Haralick features, an image pattern will give the same texture feature values independent of image quantization.
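The key idea is to treat the GLCM as a discretized probability density rather than a matrix of counts indexed by integer gray levels. A minimal sketch under that reading: the matrix is normalized to sum to one and features are evaluated on gray-level bin centres in [0, 1], so their scale no longer grows with the number of quantization levels (the bin count, the horizontal offset, and the contrast feature shown are illustrative choices, not the paper's exact estimator).

    import numpy as np

    def glcm_pdf(image, n_levels=32):
        """GLCM for a horizontal offset, normalized to a discrete probability density."""
        img = np.asarray(image, dtype=np.float64)
        # Quantize intensities into n_levels bins.
        bins = np.minimum((img - img.min()) / (np.ptp(img) + 1e-12) * n_levels,
                          n_levels - 1).astype(int)
        a, b = bins[:, :-1].ravel(), bins[:, 1:].ravel()     # horizontally adjacent pairs
        glcm = np.zeros((n_levels, n_levels))
        np.add.at(glcm, (a, b), 1.0)
        glcm /= glcm.sum()                                   # joint probability mass
        centres = (np.arange(n_levels) + 0.5) / n_levels     # gray-level bin centres in [0, 1]
        return glcm, centres

    def contrast(glcm, centres):
        # Evaluating the feature on bin centres in [0, 1] keeps its scale
        # roughly independent of n_levels.
        i, j = np.meshgrid(centres, centres, indexing="ij")
        return float(np.sum(glcm * (i - j) ** 2))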

97 citations


Journal ArticleDOI
TL;DR: The paper introduces three of the currently defined profiles in JPEG XT, each constraining the common decoder architecture to a subset of allowable configurations, and assesses the coding efficiency of each profile extensively through subjective assessments, with 24 naïve subjects evaluating 20 images, and through objective evaluations.
Abstract: Standards play an important role in providing a common set of specifications and allowing inter-operability between devices and systems. Until recently, no standard for high-dynamic-range (HDR) image coding had been adopted by the market, and HDR imaging relied on proprietary and vendor-specific formats which are unsuitable for storage or exchange of such images. To resolve this situation, the JPEG Committee is developing a new coding standard called JPEG XT that is backward compatible to the popular JPEG compression, allowing it to be implemented using standard 8-bit JPEG coding hardware or software. In this paper, we present design principles and technical details of JPEG XT. It is based on a two-layer design: a base layer containing a low-dynamic-range image accessible to legacy implementations, and an extension layer providing the full dynamic range. The paper introduces three of the currently defined profiles in JPEG XT, each constraining the common decoder architecture to a subset of allowable configurations. We assess the coding efficiency of each profile extensively through subjective assessments, using 24 naive subjects to evaluate 20 images, and objective evaluations, using 106 images with five different tone-mapping operators and at 100 different bit rates. The objective results (based on benchmarking with subjective scores) demonstrate that JPEG XT can encode HDR images at bit rates varying from 1.1 to 1.9 bit/pixel for estimated mean opinion score (MOS) values above 4.5 out of 5, which is considered fully transparent in many applications. This corresponds to a 23-times bitstream reduction compared to lossless OpenEXR PIZ compression.

65 citations


Journal ArticleDOI
TL;DR: Experimental results on two publicly available image databases show that the proposed method not only satisfies the invisibility requirement but also performs better in terms of robustness and real-time operation, indicating that it combines the advantages of the spatial domain and the frequency domain.
Abstract: In this paper, a novel spatial domain color image watermarking technique is proposed to rapidly and effectively protect the copyright of the color image. First, the direct current (DC) coefficient of the 2D-DFT obtained in the spatial domain is discussed, and the relationship between the change of each pixel in the spatial domain and the change of the DC coefficient in the Fourier transform is proved. Then, the DC coefficient is used to embed and extract the watermark in the spatial domain by the proposed quantization technique. The novelties of this paper include three points: 1) the DC coefficient of the 2D-DFT is obtained in the spatial domain without performing the true 2D-DFT; 2) the relationship between the change of each pixel in the image block and the change of the DC coefficient of the 2D-DFT is found; and 3) the proposed method has a short running time and strong robustness. The experimental results on two publicly available image databases (CVG-UGR and USC-SIPI) show that the proposed method not only satisfies the invisibility requirement but also performs better in terms of robustness and real-time operation, indicating that it combines the advantages of the spatial domain and the frequency domain.
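The spatial-domain shortcut rests on the fact that the DC coefficient of an (unnormalized) 2D-DFT is simply the sum of the block's pixels, so it can be read and modified without computing the transform. A minimal quantization-index-modulation sketch built on that fact follows; the step size and the even/odd embedding rule are illustrative, not the paper's exact quantization technique.

    import numpy as np

    def embed_bit_dc(block, bit, step=24.0):
        """Embed one watermark bit by quantizing the DC coefficient of the 2D-DFT.

        For an M x N block, the (unnormalized) DFT DC coefficient is the sum of
        all pixels, so it is computed and modified entirely in the spatial
        domain. Clipping to [0, 255] may perturb the embedded value slightly.
        """
        block = block.astype(np.float64)
        m, n = block.shape
        dc = block.sum()                        # F(0, 0) of the 2D-DFT
        q = np.floor(dc / step)
        if int(q) % 2 != bit:                   # force index parity to match the bit
            q += 1
        new_dc = (q + 0.5) * step
        # Spread the required DC change evenly over the pixels.
        return np.clip(block + (new_dc - dc) / (m * n), 0, 255)

    def extract_bit_dc(block, step=24.0):
        return int(np.floor(block.astype(np.float64).sum() / step)) % 2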

58 citations


Journal ArticleDOI
TL;DR: The experimental results validate the effectiveness of the proposed framework in terms of BER and embedding capacity compared to other state-of-the-art methods; the method therefore finds potential application in the prevention of patient identity theft in e-health applications.
Abstract: In this paper, an improved wavelet-based medical image watermarking algorithm is proposed. Initially, the proposed technique decomposes the cover medical image into ROI and NROI regions and embeds three different watermarks into the non-region-of-interest (NROI) part of the DWT-transformed cover image for compact and secure medical data transmission in an e-health environment. In addition, the method addresses the problem of channel noise distortion, which may lead to a faulty watermark, by applying error-correcting codes (ECCs) before embedding the watermarks into the cover image. Further, the bit error rate (BER) performance of the proposed method is determined for different kinds of attacks, including ‘Checkmark’ attacks. Experimental results indicate that the Turbo code performs better than the BCH (Bose–Chaudhuri–Hocquenghem) error correction code. Furthermore, the experimental results validate the effectiveness of the proposed framework in terms of BER and embedding capacity compared to other state-of-the-art methods. Therefore, the proposed method finds potential application in the prevention of patient identity theft in e-health applications.
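A minimal sketch of the NROI embedding step, assuming a one-level Haar DWT and a simple QIM rule on the horizontal-detail coefficients; the three-watermark design and the BCH/Turbo ECC stage described in the paper are omitted (the bits passed in would already be ECC-encoded).

    import numpy as np
    import pywt

    def embed_nroi_bits(cover, bits, roi_mask, step=12.0):
        """Embed watermark bits into DWT detail coefficients outside the ROI.

        Illustrative QIM embedding only; roi_mask is 1 inside the ROI, 0 in
        the NROI.
        """
        cA, (cH, cV, cD) = pywt.dwt2(cover.astype(np.float64), "haar")
        # Downsample the ROI mask to subband resolution; NROI = mask == 0.
        nroi = roi_mask[::2, ::2][:cH.shape[0], :cH.shape[1]] == 0
        idx = np.argwhere(nroi)[:len(bits)]
        for (r, c), b in zip(idx, bits):
            q = np.floor(cH[r, c] / step)
            if int(q) % 2 != b:
                q += 1
            cH[r, c] = (q + 0.5) * step
        return pywt.idwt2((cA, (cH, cV, cD)), "haar")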

57 citations


Journal ArticleDOI
TL;DR: This work proposes an efficient strategy to compress the VGG16 model by introducing global average pooling, performing iterative pruning on the filters with the proposed order-deciding scheme in order to prune more efficiently, applying truncated SVD to the fully-connected layer, and performing quantization.
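Of the four steps, the truncated-SVD step is the easiest to make concrete: the fully-connected weight matrix is replaced by two thin factors. A minimal sketch follows (the rank choice and the surrounding pruning/quantization steps are out of scope here).

    import numpy as np

    def truncated_svd_fc(W, rank):
        """Factor a fully-connected weight matrix W (out x in) into two thinner
        matrices, replacing one large layer with two smaller ones."""
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        W1 = Vt[:rank, :] * s[:rank, None]    # rank x in  (first small layer)
        W2 = U[:, :rank]                      # out x rank (second small layer)
        # W is approximated by W2 @ W1; parameter count drops from out*in
        # to rank*(out + in).
        return W1, W2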

54 citations


Journal ArticleDOI
TL;DR: A notion called error threshold is introduced to theoretically analyze the performance of the proposed SSAES and DQAQT watermarking schemes, showing that they outperform the state-of-the-art methods in terms of imperceptibility, robustness, computational cost, and adaptability.
Abstract: Watermarking plays an important role in identifying the copyright of an image and related issues. The state-of-the-art watermark embedding schemes, spread spectrum and quantization, suffer from host signal interference (HSI) and scaling attacks, respectively. Both of them use a fixed embedding parameter, which makes it difficult to take both robustness and imperceptibility into account for all images. This paper solves these problems by proposing two novel blind watermarking schemes: a spread spectrum scheme with adaptive embedding strength (SSAES) and a differential quantization scheme with adaptive quantization threshold (DQAQT). Their adaptiveness comes from the proposed adaptive embedding strategy (AEP), which maximizes the embedding strength or quantization threshold while guaranteeing the peak signal-to-noise ratio (PSNR) of the host image after embedding the watermark, and strikes a balance between robustness and imperceptibility. SSAES is HSI-free by factoring in prior knowledge about HSI. In DQAQT, an effective quantization mode is proposed to resist scaling attacks by utilizing the difference between two selected DCT coefficients with high stability. Both SSAES and DQAQT can be easily applied to other watermarking frameworks. We introduce a notion called error threshold to theoretically analyze the performance of our proposed methods in detail. The experimental results consistently demonstrate that SSAES and DQAQT outperform the state-of-the-art methods in terms of imperceptibility, robustness, computational cost, and adaptability.
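The PSNR-guaranteed adaptive strength can be made concrete with a short calculation: since PSNR = 10·log10(255²/MSE), a target PSNR fixes an MSE budget, and for a ±1 spread-spectrum carrier added to a subset of coefficients the largest admissible strength follows directly. The sketch below is one plausible reading of that constraint, not the exact AEP rule from the paper.

    import numpy as np

    def max_strength_for_psnr(n_pixels, n_modified, target_psnr=42.0, peak=255.0):
        """Largest embedding strength alpha for a +/-1 spread-spectrum carrier
        added to n_modified samples of an n_pixels image, such that the PSNR of
        the watermarked image stays at or above target_psnr.

        Illustrative derivation only: the added MSE is
        alpha^2 * n_modified / n_pixels for a +/-1 carrier.
        """
        mse_budget = peak ** 2 / (10.0 ** (target_psnr / 10.0))
        return float(np.sqrt(mse_budget * n_pixels / n_modified))

    # Example: embedding into 1024 mid-frequency coefficients of a 512x512 image.
    alpha = max_strength_for_psnr(512 * 512, 1024, target_psnr=42.0)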

53 citations


Journal ArticleDOI
TL;DR: A reversed-pruning strategy is proposed which reduces the number of parameters of AlexNet by a factor of 13× without accuracy loss on the ImageNet dataset, and an efficient storage technique that reduces the cache overhead of the convolutional and fully connected layers is presented.
Abstract: Field programmable gate arrays (FPGAs) are widely considered a promising platform for convolutional neural network (CNN) acceleration. However, the large number of parameters of CNNs causes heavy computing and memory burdens for FPGA-based CNN implementation. To solve this problem, this paper proposes an optimized compression strategy and realizes an FPGA-based accelerator for CNNs. Firstly, a reversed-pruning strategy is proposed which reduces the number of parameters of AlexNet by a factor of 13× without accuracy loss on the ImageNet dataset. Peak-pruning is further introduced to achieve better compressibility. Moreover, quantization gives another 4× reduction with negligible loss of accuracy. Secondly, efficient storage techniques, which aim to reduce the overall cache overhead of the convolutional layer and the fully connected layer, are presented. Finally, the effectiveness of the proposed strategy is verified by an accelerator implemented on a Xilinx ZCU104 evaluation board. By improving existing pruning techniques and the storage format of sparse data, we significantly reduce the size of AlexNet by 28×, from 243 MB to 8.7 MB. In addition, the overall performance of our accelerator achieves 9.73 fps for the compressed AlexNet. Compared with central processing unit (CPU) and graphics processing unit (GPU) platforms, our implementation achieves 182.3× and 1.1× improvements in latency and throughput, respectively, on the convolutional (CONV) layers of AlexNet, with 822.0× and 15.8× improvements in energy efficiency, respectively. This novel compression strategy provides a reference for other neural network applications, including CNNs, long short-term memory (LSTM), and recurrent neural networks (RNNs).
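The storage side of the design can be illustrated with a CSR-style packing of a pruned, quantized layer: only non-zero weights are kept, as 8-bit integers plus their column indices and row pointers. This is an illustrative format, not the paper's exact on-chip layout.

    import numpy as np

    def pack_pruned_layer(W, threshold, n_bits=8):
        """Prune small weights, quantize the survivors to n_bits, and store them
        in a CSR-like (values, column index, row pointer) layout."""
        W = np.where(np.abs(W) >= threshold, W, 0.0)   # magnitude pruning
        max_abs = np.abs(W).max()
        scale = max_abs / (2 ** (n_bits - 1) - 1) if max_abs > 0 else 1.0
        values, col_idx, row_ptr = [], [], [0]
        for row in W:
            nz = np.nonzero(row)[0]
            col_idx.extend(nz.tolist())
            values.extend(np.round(row[nz] / scale).astype(np.int8).tolist())
            row_ptr.append(len(values))
        return (np.array(values, dtype=np.int8), np.array(col_idx, dtype=np.int32),
                np.array(row_ptr, dtype=np.int32), scale)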

53 citations


Posted Content
TL;DR: A series of novel methodological changes is introduced that significantly improves the accuracy of binarized neural networks (i.e., networks where both the features and the weights are binary), and the extent to which network binarization and knowledge distillation can be combined is investigated.
Abstract: Big neural networks trained on large datasets have advanced the state-of-the-art for a large variety of challenging problems, improving performance by a large margin. However, under low memory and limited computational power constraints, the accuracy on the same problems drops considerably. In this paper, we propose a series of techniques that significantly improve the accuracy of binarized neural networks (i.e., networks where both the features and the weights are binary). We evaluate the proposed improvements on two diverse tasks: fine-grained recognition (human pose estimation) and large-scale image recognition (ImageNet classification). Specifically, we introduce a series of novel methodological changes including: (a) more appropriate activation functions, (b) reverse-order initialization, (c) progressive quantization, and (d) network stacking, and show that these additions significantly improve existing state-of-the-art network binarization techniques. Additionally, for the first time, we also investigate the extent to which network binarization and knowledge distillation can be combined. When tested on the challenging MPII dataset, our method shows a performance improvement of more than 4% in absolute terms. Finally, we further validate our findings by applying the proposed techniques to large-scale object recognition on the ImageNet dataset, on which we report a reduction of error rate by 4%.
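The baseline these techniques build on is standard weight binarization: each filter is replaced by its sign pattern plus one real-valued scale. A minimal sketch of that baseline (XNOR-Net-style per-filter scaling is assumed; the paper's own contributions sit on top of this step):

    import numpy as np

    def binarize_filters(W):
        """Binarize convolutional filters: keep the sign and a per-filter scale
        alpha equal to the mean absolute weight, so W_f ~ alpha_f * sign(W_f).

        W has shape (n_filters, in_channels, k, k).
        """
        alpha = np.mean(np.abs(W), axis=(1, 2, 3), keepdims=True)   # per-filter scale
        B = np.where(W >= 0, 1.0, -1.0)                             # binary weights
        return alpha * B, B, alpha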

48 citations


Posted Content
TL;DR: It is concluded that structured pruning has greater potential than non-structured pruning, and that the first fully binarized (for all layers) DNNs can be lossless in accuracy in many cases.
Abstract: Large deep neural network (DNN) models pose the key challenge to energy efficiency due to the significantly higher energy consumption of off-chip DRAM accesses than arithmetic or SRAM operations. This motivates intensive research on model compression with two main approaches. Weight pruning leverages the redundancy in the number of weights and can be performed in a non-structured manner, which has higher flexibility and pruning rate but incurs index accesses due to irregular weights, or in a structured manner, which preserves the full matrix structure with a lower pruning rate. Weight quantization leverages the redundancy in the number of bits in weights. Compared to pruning, quantization is much more hardware-friendly, and has become a "must-do" step for FPGA and ASIC implementations. This paper provides a definitive answer, for the first time, to the question of whether non-structured or structured pruning is preferable. First, we build ADMM-NN-S by extending and enhancing ADMM-NN, a recently proposed joint weight pruning and quantization framework. Second, we develop a methodology for a fair and fundamental comparison of non-structured and structured pruning in terms of both storage and computation efficiency. Our results show that ADMM-NN-S consistently outperforms the prior art: (i) it achieves 348x, 36x, and 8x overall weight pruning on LeNet-5, AlexNet, and ResNet-50, respectively, with (almost) zero accuracy loss; (ii) we demonstrate that the first fully binarized (for all layers) DNNs can be lossless in accuracy in many cases. These results provide a strong baseline and credibility for our study. Based on the proposed comparison framework, with the same accuracy and quantization, the results show that non-structured pruning is not competitive in terms of either storage or computation efficiency. Thus, we conclude that non-structured pruning is considered harmful. We urge the community not to continue pursuing DNN inference acceleration for non-structured sparsity.

47 citations


Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed lossy compression scheme achieves better rate-distortion performance than some of the state-of-the-art schemes.
Abstract: In this paper, a novel lossy compression scheme for encrypted images based on image inpainting is proposed. In order to maintain confidentiality, the content owner encrypts the original image through modulo-256 addition encryption and block permutation to mask the image content. Then, a third party, such as a cloud server, can compress the selectively encrypted image before transmitting it to the receiver. During compression, encrypted blocks are categorized into four sets corresponding to different complexity degrees in the plaintext domain without loss of security. By allocating various bit rates to the encrypted blocks from different sets, flexible compression can be achieved with difference quantization. After parsing and decoding the compressed bit stream, the receiver first recovers partial encrypted pixels and then decrypts them. The other missing pixels are further recovered with the assistance of image inpainting based on a total variation model, and the final reconstructed image can be produced. Experimental results demonstrate that the proposed scheme achieves better rate-distortion performance than some of the state-of-the-art schemes.
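The content owner's step combines a modulo-256 addition with a block permutation; a minimal sketch is shown below, with a seeded NumPy generator standing in for the scheme's actual keystream and permutation-key derivation.

    import numpy as np

    def encrypt_image(img, key=12345, block=16):
        """Modulo-256 addition encryption followed by block permutation.

        Trailing rows/cols beyond a multiple of the block size are dropped
        for brevity.
        """
        rng = np.random.default_rng(key)
        img = img.astype(np.uint8)
        stream = rng.integers(0, 256, size=img.shape, dtype=np.uint8)
        cipher = (img.astype(np.uint16) + stream) % 256              # modulo-256 addition
        h, w = (s - s % block for s in cipher.shape)
        blocks = (cipher[:h, :w].reshape(h // block, block, w // block, block)
                  .transpose(0, 2, 1, 3).reshape(-1, block, block))
        perm = rng.permutation(len(blocks))                          # block permutation
        blocks = blocks[perm]
        out = (blocks.reshape(h // block, w // block, block, block)
               .transpose(0, 2, 1, 3).reshape(h, w))
        return out.astype(np.uint8), perm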

Journal ArticleDOI
01 Aug 2019
TL;DR: Experimental results prove that the proposed method is superior to contemporary AMBTC-based data hiding methods in terms of data hiding capacity and has comparable image quality.
Abstract: In this paper, we propose a new Absolute Moment Block Truncation Coding (AMBTC) based data hiding scheme using Hamming distance and pixel value differencing methods. The proposed method first pre-processes the original image using a smoothing filter so that its quality can be maintained after AMBTC compression. It then applies AMBTC compression to the processed image and uses two thresholds to categorize the image blocks into three categories, namely Smooth, Less_Complex, and Highly_Complex blocks, so that complex blocks can also be used to embed the secret data. The secret data is then embedded into the smooth blocks using a simple replacement strategy. Further, the proposed method embeds 8 bits of secret data into the bit plane of Less_Complex blocks using a Hamming distance calculation, and a few bits into the quantization levels of Highly_Complex blocks using the pixel value differencing (PVD) method, to increase the data hiding capacity without having any major impact on image quality. Thus, the proposed method additionally embeds on average 6 more bits in complex blocks. Experimental results also prove that the proposed method is superior to contemporary AMBTC-based data hiding methods in terms of data hiding capacity and has comparable image quality.
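AMBTC itself reduces each block to a bit plane and two quantization levels, which are exactly the quantities the data-hiding rules above operate on. A minimal sketch of the codec side (thresholding at the block mean, group means as the two levels):

    import numpy as np

    def ambtc_block(block):
        """Absolute Moment Block Truncation Coding of one image block.

        Returns the bit plane and the two quantization levels (low, high).
        """
        block = block.astype(np.float64)
        mean = block.mean()
        bitplane = block >= mean
        high = block[bitplane].mean() if bitplane.any() else mean     # level for '1' pixels
        low = block[~bitplane].mean() if (~bitplane).any() else mean  # level for '0' pixels
        return bitplane.astype(np.uint8), float(low), float(high)

    def ambtc_reconstruct(bitplane, low, high):
        return np.where(bitplane == 1, high, low)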

Proceedings ArticleDOI
01 Jun 2019
TL;DR: In this article, the convolutional filters are binarized in residual blocks and a learnable weight for each binary filter is adopted to reduce over-parametrization and large amounts of redundancy.
Abstract: Deep convolutional neural networks (DCNNs) have recently demonstrated high-quality results in single-image super-resolution (SR). DCNNs often suffer from over-parametrization and large amounts of redundancy, which results in inefficient inference and high memory usage, preventing massive applications on mobile devices. As a way to significantly reduce model size and computation time, binarized neural networks have only been shown to excel on semantic-level tasks such as image classification and recognition. However, little network-quantization effort has been spent on image enhancement tasks like SR, as network quantization is usually assumed to sacrifice pixel-level accuracy. In this work, we explore a network-binarization approach for SR tasks without sacrificing much reconstruction accuracy. To achieve this, we binarize the convolutional filters only in residual blocks, and adopt a learnable weight for each binary filter. We evaluate this idea on several state-of-the-art DCNN-based architectures, and show that binarized SR networks achieve comparable qualitative and quantitative results to their real-weight counterparts. Moreover, the proposed binarization strategy could help reduce model size by 80% when applied to SRResNet, and could potentially speed up inference by 5×.

Journal ArticleDOI
TL;DR: This work proposes to jointly dequantize and contrast-enhance JPEG images captured in poor lighting conditions in a single graph-signal restoration framework, adopting accelerated proximal gradient (APG) algorithms in the transform domain, with backtracking line search for further speedup.
Abstract: JPEG images captured in poor lighting conditions suffer from both low luminance contrast and coarse quantization artifacts due to lossy compression. Performing dequantization and contrast enhancement in separate back-to-back steps would amplify the residual compression artifacts, resulting in low visual quality. Leveraging on recent development in graph signal processing (GSP), we propose to jointly dequantize and contrast-enhance such images in a single graph-signal restoration framework. Specifically, we separate each observed pixel patch into illumination and reflectance via Retinex theory, where we define generalized smoothness prior and signed graph smoothness prior according to their respective unique signal characteristics. Given only a transform-coded image patch, we compute robust edge weights for each graph via low-pass filtering in the dual graph domain. We compute the illumination and reflectance components for each patch alternately, adopting accelerated proximal gradient (APG) algorithms in the transform domain, with backtracking line search for further speedup. Experimental results show that our generated images outperform the state-of-the-art schemes noticeably in the subjective quality evaluation.

Journal ArticleDOI
TL;DR: The results suggest that the proposed cross-view multi-lateral filtering scheme, which improves the quality of compressed depth maps/videos within the framework of asymmetric multi-view video with depth compression, outperforms state-of-the-art filters and is suitable for use in multi-view color-plus-depth-based interaction- and remote-oriented applications.
Abstract: Multi-view depth is crucial for describing positioning information in 3D space for virtual reality, free viewpoint video, and other interaction- and remote-oriented applications. However, in cases of lossy compression for bandwidth limited remote applications, the quality of multi-view depth video suffers from quantization errors, leading to the generation of obvious artifacts in consequent virtual view rendering during interactions. Considerable efforts must be made to properly address these artifacts. In this paper, we propose a cross-view multi-lateral filtering scheme to improve the quality of compressed depth maps/videos within the framework of asymmetric multi-view video with depth compression. Through this scheme, a distorted depth map is enhanced via non-local candidates selected from current and neighboring viewpoints of different time-slots. Specifically, these candidates are clustered into a macro super pixel denoting the physical and semantic cross-relationships of the cross-view, spatial and temporal priors. The experimental results show that gains from static depth maps and dynamic depth videos can be obtained from PSNR and SSIM metrics, respectively. In subjective evaluations, even object contours are recovered from a compressed depth video. We also verify our method via several practical applications. For these verifications, artifacts on object contours are properly managed for the development of interactive video and discontinuous object surfaces are restored for 3D modeling. Our results suggest that the proposed filter outperforms state-of-the-art filters and is suitable for use in multi-view color plus depth-based interaction- and remote-oriented applications.

Journal ArticleDOI
01 Jun 2019
TL;DR: A general pipeline for applying the most suitable methods to compress recurrent neural networks for language modeling is proposed, showing that the most efficient results in terms of speed and compression–perplexity balance are obtained by matrix decomposition techniques.
Abstract: Recurrent neural networks have proved to be an effective method for statistical language modeling. However, in practice their memory and run-time complexity are usually too large to be implemented in real-time offline mobile applications. In this paper we consider several compression techniques for recurrent neural networks, including Long Short-Term Memory models. We pay particular attention to the high-dimensional output problem caused by the very large vocabulary size. We focus on effective compression methods in the context of their exploitation on devices: pruning, quantization, and matrix decomposition approaches (low-rank factorization and tensor train decomposition, in particular). For each model we investigate the trade-off between its size, suitability for fast inference, and perplexity. We propose a general pipeline for applying the most suitable methods to compress recurrent neural networks for language modeling. The experimental study on the Penn Treebank (PTB) dataset shows that the most efficient results in terms of speed and compression–perplexity balance are obtained by matrix decomposition techniques.

Journal ArticleDOI
TL;DR: A novel watermarking method based on the discrete cosine transform (DCT) is proposed which guarantees robustness and low computational complexity, and which performs faster and more robustly than previous methods.
Abstract: In many studies related to watermarking, spatial-domain methods have a relatively low information-hiding capacity and limited robustness, and transform-domain methods are not applicable in real-time processes because of their considerably high computational time. In this paper, we propose a novel watermarking method based on a discrete cosine transform (DCT) which guarantees robustness and low computational complexity. First, we calculate the DCT coefficient of a specific location. Then, a variation value is calculated according to the embedding bits and quantization steps to modify the coefficient. Finally, we embed the watermark bits by directly modifying the pixel values without a full-frame DCT. Tests comparing invisibility, robustness, and computational time were conducted to determine the feasibility of the proposed method. The results showed that the proposed method was faster and more robust than previous methods.
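Because the 2D DCT basis is orthonormal, a single coefficient C(u, v) is just the inner product of the block with one basis image, and adding δ·b_uv to the pixels changes that coefficient by exactly δ. A minimal sketch of quantization-based embedding built on this identity follows; the coefficient position, step size, and parity rule are illustrative rather than the paper's exact settings.

    import numpy as np

    def dct_basis(u, v, n=8):
        """Orthonormal 2D DCT-II basis function b_uv for an n x n block."""
        def a(k):
            return np.sqrt(1.0 / n) if k == 0 else np.sqrt(2.0 / n)
        i = np.arange(n)
        bi = a(u) * np.cos((2 * i + 1) * u * np.pi / (2 * n))
        bj = a(v) * np.cos((2 * i + 1) * v * np.pi / (2 * n))
        return np.outer(bi, bj)

    def embed_bit_single_coeff(block, bit, u=3, v=2, step=20.0):
        """Embed one bit by quantizing a single DCT coefficient C(u, v), computed
        and modified directly in the spatial domain (no full block DCT)."""
        b = dct_basis(u, v, block.shape[0])
        block = block.astype(np.float64)
        coeff = np.sum(block * b)               # C(u, v) without a full DCT
        q = np.floor(coeff / step)
        if int(q) % 2 != bit:
            q += 1
        delta = (q + 0.5) * step - coeff
        return block + delta * b                # spatial-domain modification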

Posted Content
TL;DR: This paper proposes a framework to jointly prune and quantize the DNNs automatically according to a target model size without using any hyper-parameters to manually set the compression ratio for each layer.
Abstract: Deep Neural Networks (DNNs) are applied in a wide range of use cases. There is an increased demand for deploying DNNs on devices that do not have abundant resources such as memory and computation units. Recently, network compression through a variety of techniques such as pruning and quantization has been proposed to reduce the resource requirements. A key parameter that all existing compression techniques are sensitive to is the compression ratio (e.g., pruning sparsity, quantization bitwidth) of each layer. Traditional solutions treat the compression ratios of each layer as hyper-parameters and tune them using human heuristics. Recent work has started to use black-box hyper-parameter optimization, but this introduces new hyper-parameters and has efficiency issues. In this paper, we propose a framework to jointly prune and quantize the DNNs automatically according to a target model size, without using any hyper-parameters to manually set the compression ratio for each layer. In the experiments, we show that our framework can compress the weights data of ResNet-50 to be 836× smaller without accuracy loss on CIFAR-10, and compress AlexNet to be 205× smaller without accuracy loss on ImageNet classification.

Journal ArticleDOI
TL;DR: Wavelet compression of the amplitude/phase and real/imaginary parts of the Fourier spectrum of filtered off-axis digital holograms is compared, and the combination of frequency filtering, compression of the obtained spectral components, and extra compression of the wavelet decomposition coefficients by threshold processing and quantization is analyzed.
Abstract: Compression of digital holograms allows one to store, transmit, and reconstruct large sets of holographic data. There are many digital image compression methods, and usually wavelets are used for this task; however, the compression of digital holograms has many significant peculiarities. As a result, it is preferable to use a set of methods that includes filtering, scalar and vector quantization, wavelet processing, etc. Used in conjunction, these methods allow one to achieve an acceptable quality of reconstructed images and significant compression ratios. In this paper, wavelet compression of the amplitude/phase and real/imaginary parts of the Fourier spectrum of filtered off-axis digital holograms is compared. The combination of frequency filtering, compression of the obtained spectral components, and extra compression of the wavelet decomposition coefficients by threshold processing and quantization is analyzed. Computer-generated and experimentally recorded digital holograms are compressed, and the quality of the reconstructed images is estimated. The results demonstrate the possibility of compression ratios of 380 using real/imaginary parts. Amplitude/phase compression allows ratios that are a factor of 2–4 lower for similar quality of the reconstructed objects.

Journal ArticleDOI
TL;DR: This work proposes two optimized lossy compression strategies under a state-of-the-art three-staged compression framework (prediction + quantization + entropy-encoding) and demonstrates that the two strategies exhibit the best compression qualities on different types of data sets respectively.
Abstract: An effective data compressor is becoming increasingly critical to today's scientific research, and many lossy compressors are developed in the context of absolute error bounds. Based on physical/chemical definitions of simulation fields or multiresolution demand, however, many scientific applications need to compress the data with a pointwise relative error bound (i.e., the smaller the data value, the smaller the compression error to tolerate). To this end, we propose two optimized lossy compression strategies under a state-of-the-art three-staged compression framework (prediction + quantization + entropy-encoding). The first strategy (called the block-based strategy) splits the data set into many small blocks and computes an absolute error bound for each block, so it is particularly suitable for data with relatively high consecutiveness in space. The second strategy (called the multi-threshold-based strategy) splits the whole value range into multiple groups with exponentially increasing thresholds and performs the compression in each group separately, which is particularly suitable for data with a relatively large value range and spiky value changes. We implement the two strategies rigorously and evaluate them comprehensively by using two scientific applications which both require lossy compression with a pointwise relative error bound. Experiments show that the two strategies exhibit the best compression qualities on different types of data sets, respectively. The compression ratio of our lossy compressor is higher than that of other state-of-the-art compressors by 172–618 percent on the climate simulation data and 30–210 percent on the N-body simulation data, with the same relative error bound and without degradation of the overall visualization effect of the entire data.
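For the block-based strategy, the essential move is converting the pointwise relative bound into one absolute bound per block. A minimal sketch of that conversion (using the smallest magnitude in each block as the conservative choice; the prediction, quantization, and entropy-coding stages are unchanged from the underlying framework and omitted here):

    import numpy as np

    def blockwise_abs_bounds(data, rel_bound=1e-3, block=64):
        """Convert a pointwise relative error bound into one absolute bound per
        block, using the smallest magnitude in the block so every point in the
        block still satisfies its relative bound."""
        data = np.asarray(data, dtype=np.float64).ravel()
        n_blocks = int(np.ceil(data.size / block))
        bounds = np.empty(n_blocks)
        for k in range(n_blocks):
            chunk = data[k * block:(k + 1) * block]
            bounds[k] = rel_bound * np.min(np.abs(chunk))   # conservative per-block bound
        return bounds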

Journal ArticleDOI
Zeke Wang1, Kaan Kara1, Hantian Zhang1, Gustavo Alonso1, Onur Mutlu1, Ce Zhang1 
01 Mar 2019
TL;DR: MLWeaving is a data structure and hardware acceleration technique intended to speed up learning of generalized linear models over low-precision data; it provides a compact in-memory representation that enables the retrieval of data at any level of precision.
Abstract: Learning from the data stored in a database is an important function increasingly available in relational engines. Methods using lower-precision input data are of special interest given their overall higher efficiency. However, in databases, these methods have a hidden cost: the quantization of the real value into a smaller number is an expensive step. To address this issue, we present MLWeaving, a data structure and hardware acceleration technique intended to speed up learning of generalized linear models over low-precision data. MLWeaving provides a compact in-memory representation that enables the retrieval of data at any level of precision. MLWeaving also provides a highly efficient implementation of stochastic gradient descent on FPGAs and enables the dynamic tuning of precision, instead of using a fixed precision level during learning. Experimental results show that MLWeaving converges up to 16× faster than low-precision implementations of first-order methods on CPUs.
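The retrieval-at-any-precision property comes from storing values one bit-plane at a time, most significant bit first, so reading the first b planes yields a b-bit approximation of every value. A conceptual NumPy sketch of that weaving (the real layout is tuned to FPGA memory access and differs in detail; values are assumed pre-normalized to [0, 1]):

    import numpy as np

    def weave(values, n_bits=8):
        """Quantize values in [0, 1] and store them MSB-first, one bit-plane at
        a time, so any precision prefix can be read back independently."""
        q = np.clip(np.round(values * (2 ** n_bits - 1)), 0, 2 ** n_bits - 1).astype(np.uint32)
        planes = [(q >> (n_bits - 1 - b)) & 1 for b in range(n_bits)]   # MSB plane first
        return np.stack(planes).astype(np.uint8)

    def unweave(planes, read_bits):
        """Reconstruct an approximation using only the first `read_bits` planes."""
        n_bits = planes.shape[0]
        q = np.zeros(planes.shape[1:], dtype=np.uint32)
        for b in range(read_bits):
            q |= planes[b].astype(np.uint32) << (n_bits - 1 - b)
        return q / (2 ** n_bits - 1)   # back to [0, 1]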

Journal ArticleDOI
TL;DR: Experimental results show that the proposed scheme ensures confidentiality, integrity and format compatibility, while image retrieval of different quality factors is still effective.

Journal ArticleDOI
TL;DR: An end-to-end image compression framework based on convolutional neural networks is proposed to resolve the non-differentiability of the quantization function in the standard codec, together with an advanced learning algorithm to train the deep neural networks for compression.
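Rounding has zero gradient almost everywhere, so end-to-end learned codecs train with a differentiable proxy. Two standard workarounds are sketched below, a straight-through estimator and additive uniform noise; they illustrate the general approach and are not necessarily the specific solution adopted in this paper.

    import torch

    class RoundSTE(torch.autograd.Function):
        """Round in the forward pass, pass gradients straight through in the
        backward pass, so a quantizer can sit inside an end-to-end trained codec."""
        @staticmethod
        def forward(ctx, x):
            return torch.round(x)

        @staticmethod
        def backward(ctx, grad_output):
            return grad_output          # identity gradient (straight-through)

    def quantize_latent(y, training):
        if training:
            # Common alternative during training: additive uniform noise in
            # [-0.5, 0.5] as a differentiable proxy for rounding.
            return y + torch.empty_like(y).uniform_(-0.5, 0.5)
        return RoundSTE.apply(y)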

Proceedings ArticleDOI
15 Jun 2019
TL;DR: Deep Spherical Quantization (DSQ) is put forward, a novel method to make deep convolutional neural networks generate supervised and compact binary codes for efficient image search and an easy-to-implement extension of the quantization technique that enforces sparsity on the codebooks is introduced.
Abstract: Hashing methods, which encode high-dimensional images with compact discrete codes, have been widely applied to enhance large-scale image retrieval. In this paper, we put forward Deep Spherical Quantization (DSQ), a novel method to make deep convolutional neural networks generate supervised and compact binary codes for efficient image search. Our approach simultaneously learns a mapping that transforms the input images into a low-dimensional discriminative space, and quantizes the transformed data points using multi-codebook quantization. To eliminate the negative effect of norm variance on codebook learning, we force the network to L2-normalize the extracted features and then quantize the resulting vectors using a new supervised quantization technique specifically designed for points lying on a unit hypersphere. Furthermore, we introduce an easy-to-implement extension of our quantization technique that enforces sparsity on the codebooks. Extensive experiments demonstrate that DSQ and its sparse variant can generate semantically separable compact binary codes outperforming many state-of-the-art image retrieval methods on three benchmarks.
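A minimal sketch of the encoding path: features are L2-normalized onto the unit hypersphere and then assigned to codewords from M codebooks applied to successive residuals. This is a generic multi-codebook quantizer standing in for DSQ; the supervised codebook training and the sparsity extension are not shown.

    import numpy as np

    def l2_normalize(x, eps=1e-12):
        return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

    def quantize_residual(x, codebooks):
        """Encode L2-normalized features with M codebooks applied to successive
        residuals.

        codebooks: list of M arrays of shape (K, d), learned beforehand.
        """
        x = l2_normalize(x)
        codes, residual = [], x.copy()
        for C in codebooks:
            # Nearest codeword for every point in the current residual.
            d2 = ((residual[:, None, :] - C[None, :, :]) ** 2).sum(-1)
            idx = d2.argmin(axis=1)
            codes.append(idx)
            residual = residual - C[idx]
        return np.stack(codes, axis=1)   # (n_points, M) integer codes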

Journal ArticleDOI
TL;DR: Two new joint encryption and compression schemes are proposed, one emphasizing compression performance and the other protection performance; performance evaluations using various criteria show that the first scheme has better compression efficiency, while the second has better defense against statistical attack.

Journal ArticleDOI
TL;DR: This paper presents fast computation methods for N-point DCT-V and DCT-VIII, which reduce the number of addition and multiplication operations by 38% and 80.3% on average, respectively, compared to the original JEM.
Abstract: Joint exploration model (JEM) reference codecs of ISO/IEC and ITU-T utilize multiple types of integer transforms based on the DCT and DST of various transform sizes for intra- and inter-predictive coding, which has brought a significant improvement in coding efficiency. JEM adopts three types of integer DCTs (DCT-II, DCT-V, and DCT-VIII) and two types of integer DSTs (DST-I and DST-VII). The fast computation of integer DCT-II and DST-I is well known, but few studies have addressed the other types, such as DCT-V, DCT-VIII, and DST-VII, for all transform sizes. In this paper, we present fast computation methods for N-point DCT-V and DCT-VIII. For this, we first decompose the DCT-VIII into a pre-processing matrix, the DST-VII, and a post-processing matrix, so that it can be computed quickly using the linear relation between DCT-VIII and DST-VII. Then, we approximate integer kernels of N = 4, 8, 16, and 32 for DCT-V, DCT-VIII, and DST-VII with norm scaling and bit-shift to be compatible with quantization in each stage of multiplications between decomposed matrices for video coding. In various experiments, the proposed fast computation methods have been shown to effectively reduce the total complexity of the matrix operations with little loss in BDBR performance. In particular, our methods reduce the number of addition and multiplication operations by 38% and 80.3% on average, respectively, compared to the original JEM.

Proceedings ArticleDOI
Stanislav Morozov1, Artem Babenko1
01 Oct 2019
TL;DR: In this article, a DNN architecture based on multi-codebook quantization is proposed for unsupervised visual descriptors compression, which is designed to incorporate both fast data encoding and efficient distances computation via lookup tables.
Abstract: We tackle the problem of unsupervised visual descriptor compression, which is a key ingredient of large-scale image retrieval systems. While the deep learning machinery has benefited literally all computer vision pipelines, the existing state-of-the-art compression methods employ shallow architectures, and we aim to close this gap with this paper. In more detail, we introduce a DNN architecture for unsupervised compressed-domain retrieval, based on multi-codebook quantization. The proposed architecture is designed to incorporate both fast data encoding and efficient distance computation via lookup tables. We demonstrate the exceptional advantage of our scheme over existing quantization approaches on several datasets of visual descriptors, outperforming the previous state of the art by a large margin.
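The lookup-table evaluation the architecture is built around can be illustrated with product-quantization-style asymmetric distance computation: the query is compared to every codeword once, after which each database item costs only M table lookups. The paper's multi-codebook scheme defines its codebooks differently, but relies on the same kind of table-based evaluation.

    import numpy as np

    def build_luts(query, codebooks):
        """Split the query into M sub-vectors and precompute, for each codebook,
        the squared distance from the query sub-vector to every codeword.

        codebooks: list of M arrays of shape (K, d_sub)."""
        d = query.size // len(codebooks)
        return [((query[m * d:(m + 1) * d][None, :] - C) ** 2).sum(axis=1)
                for m, C in enumerate(codebooks)]

    def adc_distances(codes, luts):
        """Asymmetric distance computation: per database item, sum M table
        lookups instead of touching the original descriptor."""
        dist = np.zeros(codes.shape[0])
        for m in range(codes.shape[1]):
            dist += luts[m][codes[:, m]]
        return dist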

Journal ArticleDOI
TL;DR: A new approach for partial image encryption and compression is suggested that adopts a chaotic 3D cat map to de-correlate relations among pixels, in conjunction with an adaptive thresholding technique that serves as a lossy compression step in place of complex quantization techniques.
Abstract: The advances in digital image processing and communications have created a great demand for real-time secure image transmission over networks. However, the development of effective, fast, and secure dependent image compression–encryption systems is still a research problem, as intrinsic features of images such as bulk data capacity and high correlation among pixels hinder the use of traditional joint encryption–compression methods. A new approach is suggested in this paper for partial image encryption and compression that adopts a chaotic 3D cat map to de-correlate relations among pixels, in conjunction with an adaptive thresholding technique that is utilized as a lossy compression technique instead of complex quantization techniques and also as a substitution technique to increase the security of the cipher image. The proposed scheme is based on employing lossless compression with encryption on the most significant part of the image after the contourlet transform, while the least significant parts are lossy-compressed by employing a simple thresholding rule and arithmetic coding to render the image totally unrecognizable. Due to the weakness of the 3D cat map to chosen-plaintext attacks, the suggested scheme incorporates a mechanism to generate a random key depending on the contents of the image (context key). Several experiments were done on benchmark images to ensure the validity of the proposed technique. The compression analysis and security outcomes indicate that the suggested technique is efficacious and safe for real-time image applications.
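The permutation stage can be illustrated with the 2D Arnold cat map, the planar ancestor of the 3D cat map used in the paper: pixel coordinates are repeatedly mapped through an area-preserving modular transform, which de-correlates neighboring pixels. The sketch below is a 2D stand-in only; the iteration count and the requirement of a square image are assumptions.

    import numpy as np

    def cat_map_permute(img, iterations=5):
        """Permute pixels by repeatedly applying the 2D Arnold cat-map index
        transform  x' = (x + y) mod N,  y' = (x + 2y) mod N  to an N x N image.
        The map is area-preserving and invertible, so decryption undoes it by
        applying the inverse transform the same number of times."""
        n = img.shape[0]
        assert img.shape[0] == img.shape[1], "cat map as written needs a square image"
        out = img.copy()
        x, y = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
        for _ in range(iterations):
            out = out[(x + y) % n, (x + 2 * y) % n]
        return out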

Journal ArticleDOI
TL;DR: A comprehensive study and evaluation of existing single image compression artifacts removal algorithms, using a new 4K resolution benchmark including diversified foreground objects and background scenes with rich structures, is presented in this paper.
Abstract: We present a comprehensive study and evaluation of existing single image compression artifacts removal algorithms, using a new 4K resolution benchmark including diversified foreground objects and background scenes with rich structures, called the Large-scale Ideal Ultra high definition 4K (LIU4K) benchmark. Compression artifacts removal, as a common post-processing technique, aims at alleviating undesirable artifacts such as blockiness, ringing, and banding caused by quantization and approximation in the compression process. In this work, a systematic listing of the reviewed methods is presented based on their basic models (handcrafted models and deep networks). The main contributions and novelties of these methods are highlighted, and the main development directions, including architectures, multi-domain sources, signal structures, and new targeted units, are summarized. Furthermore, based on a unified deep learning configuration (i.e., the same training data, loss function, optimization algorithm, etc.), we evaluate recent deep learning-based methods using diversified evaluation measures. The experimental results show the state-of-the-art performance comparison of existing methods based on full-reference, no-reference, and task-driven metrics. Our survey provides a comprehensive reference source for future research on single image compression artifacts removal and inspires new directions in related fields.

Journal ArticleDOI
01 Apr 2019
TL;DR: A modified version of the cohort intelligence (CI) algorithm, referred to as Improved Cohort Intelligence, is used as a cryptography technique to generate optimized cipher text and is further employed for JPEG image steganography to propose a reversible data hiding scheme.
Abstract: Recently, a high level of information security has been attained by combining the concepts of cryptography and steganography with nature-inspired optimization algorithms. However, in today's world computational speed plays a vital role in the success of any scientific method. Optimization algorithms such as Cohort Intelligence with Cognitive Computing (CICC) and Modified-Multi Random Start Local Search (M-MRSLS) have already been implemented and applied to JPEG image steganography for 8 × 8 and 16 × 16 quantization tables, respectively. Although the results were satisfactory in terms of image quality and capacity, the computational time was high for most of the test images. To overcome this challenge, the paper proposes a modified version of the cohort intelligence (CI) algorithm, referred to as Improved Cohort Intelligence. The Improved CI algorithm was used as a cryptography technique and implemented to generate optimized cipher text. It was further employed for JPEG image steganography to propose a reversible data hiding scheme. Experimentation was done on grey-scale images of size 256 × 256, for both 8 × 8 and 16 × 16 quantization tables. Validation of the proposed work exhibited very encouraging improvements in computational cost.