
Showing papers on "Entropy encoding published in 2023"


Journal ArticleDOI
TL;DR: Zhang et al. as mentioned in this paper exploit the relationship between the code maps produced by deep neural networks and introduce proxy similarity functions as a workaround, so that the global similarity within the context can be taken into account for more accurate entropy estimation.
Abstract: The entropy of the codes usually serves as the rate loss in the recent learned lossy image compression methods. Precise estimation of the probabilistic distribution of the codes plays a vital role in reducing the entropy and boosting the joint rate-distortion performance. However, existing deep learning based entropy models generally assume the latent codes are statistically independent or depend on some side information or local context, which fails to take the global similarity within the context into account and thus hinders the accurate entropy estimation. To address this issue, we propose a special nonlocal operation for context modeling by employing the global similarity within the context. Specifically, due to the constraint of context, nonlocal operation is incalculable in context modeling. We exploit the relationship between the code maps produced by deep neural networks and introduce the proxy similarity functions as a workaround. Then, we combine the local and the global context via a nonlocal attention block and employ it in masked convolutional networks for entropy modeling. Taking the consideration that the width of the transforms is essential in training low distortion models, we finally produce a U-net block in the transforms to increase the width with manageable memory consumption and time complexity. Experiments on Kodak and Tecnick datasets demonstrate the priority of the proposed context-based nonlocal attention block in entropy modeling and the U-net block in low distortion situations. On the whole, our model performs favorably against the existing image compression standards and recent deep image compression models.
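
For illustration, here is a minimal PyTorch sketch of the kind of masked (causal) convolution used for local context modeling in learned entropy models, which the paper combines with its nonlocal attention block. This is not the authors' implementation; channel counts and kernel size are illustrative.

```python
# Minimal sketch of a PixelCNN-style masked convolution for context modeling.
# Not the authors' implementation; layer sizes are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedConv2d(nn.Conv2d):
    """Conv2d whose kernel only sees already-decoded (causal) neighbours."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        mask = torch.ones_like(self.weight)
        _, _, kh, kw = self.weight.shape
        mask[:, :, kh // 2, kw // 2:] = 0   # hide the current and later positions in this row
        mask[:, :, kh // 2 + 1:, :] = 0     # hide all following rows
        self.register_buffer("mask", mask)

    def forward(self, x):
        return F.conv2d(x, self.weight * self.mask, self.bias,
                        self.stride, self.padding, self.dilation, self.groups)

context_model = MaskedConv2d(192, 384, kernel_size=5, padding=2)
y_hat = torch.randn(1, 192, 16, 16)     # stand-in for quantized latent codes
entropy_params = context_model(y_hat)   # e.g. per-position mean/scale estimates
```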

6 citations


Journal ArticleDOI
01 Jan 2023
TL;DR: In this paper, a hybrid coding framework for the lossless recompression of JPEG images (LLJPEG) using transform-domain intra prediction is proposed, including block partition and intra prediction, transform and quantization, and entropy coding.
Abstract: JPEG, which was developed 30 years ago, is the most widely used image coding format, especially favored by resource-constrained devices due to its simplicity and efficiency. With the evolution of the Internet and the popularity of mobile devices, a huge number of user-generated JPEG images are uploaded to social media sites like Facebook and Flickr or stored on personal computers and notebooks, which leads to an increase in storage cost. However, the performance of JPEG is far from that of state-of-the-art coding methods. Lossless recompression of JPEG images therefore urgently needs to be studied, as it can further reduce storage cost while maintaining image fidelity. In this paper, a hybrid coding framework for the lossless recompression of JPEG images (LLJPEG) using transform-domain intra prediction is proposed, comprising block partition and intra prediction, transform and quantization, and entropy coding. Specifically, in LLJPEG, intra prediction is first used to obtain a predicted block. The predicted block is then transformed by the DCT and quantized to obtain the predicted coefficients. After that, the predicted coefficients are subtracted from the original coefficients to obtain the DCT coefficient residuals. Finally, the DCT residuals are entropy coded. In LLJPEG, some new coding tools are proposed for intra prediction and the entropy coding is redesigned. Experiments show that LLJPEG reduces the storage space by 29.43% and 26.40% on the Kodak and DIV2K datasets respectively, without any loss for JPEG images, while maintaining low decoding complexity.
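
To make the coefficient-residual pipeline concrete, here is a schematic Python sketch of the steps the abstract describes (prediction, DCT, quantization, residual). The flat quantization table and the DC-plane predictor are placeholders, not LLJPEG's actual coding tools.

```python
# Schematic sketch of the coefficient-residual idea: predict a block, transform and
# quantize the prediction, and entropy-code only the difference to the original
# quantized JPEG coefficients. The quantization table and predictor are placeholders.
import numpy as np
from scipy.fft import dctn

def quantized_dct(block8x8, qtable):
    coeffs = dctn(block8x8.astype(np.float64), norm="ortho")
    return np.round(coeffs / qtable).astype(np.int32)

qtable = np.full((8, 8), 16.0)                 # placeholder quantization table
original = np.random.randint(0, 256, (8, 8))   # decoded 8x8 pixel block
predicted = np.full((8, 8), original.mean())   # toy intra prediction (flat DC plane)

orig_q = quantized_dct(original, qtable)       # coefficients as stored in the JPEG
pred_q = quantized_dct(predicted, qtable)      # predicted coefficients
residual = orig_q - pred_q                     # entropy-code this instead of orig_q
```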

1 citation



Proceedings ArticleDOI
16 Mar 2023
TL;DR: In this article, a predictive coding algorithm for lossless image compression is introduced, which uses the local change rate of the image in the decorrelation phase to improve prediction accuracy and uses error-feedback technology in the coding phase to further reduce the error.
Abstract: A predictive coding algorithm for lossless image compression is introduced. In the prediction stage, the algorithm uses the local change rate of the pixel values to adjust the prediction model adaptively, and in the coding stage, error-feedback technology is used to further reduce the information entropy of the error image. Simulation tests on standard images show that the performance of the algorithm is significantly better than that of standard lossless compression algorithms. In summary, the proposed compression algorithm uses the local change rate of the image in the decorrelation phase to improve prediction accuracy, and in the coding phase uses error feedback to further reduce the error.
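
The abstract does not specify the prediction model, so the following is only a toy sketch of the two ingredients it names: adapting the predictor to the local change rate and feeding back the previous prediction error. The weighting and feedback factor are illustrative.

```python
# Toy 1-D sketch: blend two simple predictors according to the local change rate,
# then add a fraction of the previous prediction error (error feedback).
# The weighting scheme and feedback factor are illustrative, not the paper's model.
import numpy as np

def predict_with_feedback(signal, beta=0.5):
    pred = np.zeros_like(signal, dtype=np.float64)
    prev_err = 0.0
    for i in range(2, len(signal)):
        local_rate = abs(signal[i - 1] - signal[i - 2])        # local change rate
        w = 1.0 / (1.0 + local_rate)                           # flat area -> repeat last sample
        base = w * signal[i - 1] + (1 - w) * (2 * signal[i - 1] - signal[i - 2])
        pred[i] = base + beta * prev_err                       # error feedback
        prev_err = signal[i] - pred[i]
    return pred

x = np.array([10, 12, 15, 19, 24, 30, 30, 31], dtype=np.float64)
residual = x - predict_with_feedback(x)    # lower-entropy residual to entropy-code
```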

Proceedings ArticleDOI
01 Jan 2023
TL;DR: In this article, a quantization-aware posterior and prior are proposed to enable easy quantization and entropy coding for image compression; the model compresses images in a coarse-to-fine fashion and supports parallel encoding and decoding.
Abstract: Recent work has shown a strong theoretical connection between variational autoencoders (VAEs) and the rate distortion theory. Motivated by this, we consider the problem of lossy image compression from the perspective of generative modeling. Starting from ResNet VAEs, which are originally designed for data (image) distribution modeling, we redesign their latent variable model using a quantization-aware posterior and prior, enabling easy quantization and entropy coding for image compression. Along with improved neural network blocks, we present a powerful and efficient class of lossy image coders, outperforming previous methods on natural image (lossy) compression. Our model compresses images in a coarse-to-fine fashion and supports parallel encoding and decoding, leading to fast execution on GPUs. Code is made available online.
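
The paper's exact posterior and prior are not reproduced here; the sketch below only illustrates the common "quantization-aware" trick in learned compression of training with additive uniform noise and switching to hard rounding at test time.

```python
# Generic sketch of a quantization-aware latent: additive uniform noise acts as a
# differentiable stand-in for rounding during training; hard rounding is used at
# test time. This shows the common trick, not the paper's specific VAE design.
import torch

def quantize_latent(y, training):
    if training:
        return y + torch.empty_like(y).uniform_(-0.5, 0.5)   # noise ~ U(-0.5, 0.5)
    return torch.round(y)                                    # actual quantization

y = torch.randn(1, 192, 16, 16)
y_train = quantize_latent(y, training=True)    # used for the rate/distortion losses
y_test = quantize_latent(y, training=False)    # integers passed to the entropy coder
```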

Posted ContentDOI
07 May 2023
TL;DR: In this article, a data-driven method based on machine learning that leverages the universal function approximation capability of artificial neural networks is proposed for lossy compression of an information source when the decoder has lossless access to a correlated one.
Abstract: We consider lossy compression of an information source when the decoder has lossless access to a correlated one. This setup, also known as the Wyner-Ziv problem, is a special case of distributed source coding. To this day, real-world applications of this problem have neither been fully developed nor heavily investigated. We propose a data-driven method based on machine learning that leverages the universal function approximation capability of artificial neural networks. We find that our neural network-based compression scheme re-discovers some principles of the optimum theoretical solution of the Wyner-Ziv setup, such as binning in the source space as well as linear decoder behavior within each quantization index, for the quadratic-Gaussian case. These behaviors emerge although no structure exploiting knowledge of the source distributions was imposed. Binning is a widely used tool in information theoretic proofs and methods, and to our knowledge, this is the first time it has been explicitly observed to emerge from data-driven learning.

Posted ContentDOI
05 Apr 2023
TL;DR: MMVC, as discussed by the authors, is a block-wise mode-ensemble deep video compression framework that selects the optimal mode for feature-domain prediction adapting to different motion patterns and entropy models, addressing a wide range of cases from static scenes without apparent motion to dynamic scenes with a moving camera.
Abstract: Learning-based video compression has been extensively studied over the past years, but it still has limitations in adapting to various motion patterns and entropy models. In this paper, we propose multi-mode video compression (MMVC), a block-wise mode-ensemble deep video compression framework that selects the optimal mode for feature-domain prediction adapting to different motion patterns. The proposed modes include ConvLSTM-based feature-domain prediction, optical-flow-conditioned feature-domain prediction, and feature propagation, addressing a wide range of cases from static scenes without apparent motion to dynamic scenes with a moving camera. We partition the feature space into blocks for temporal prediction in spatial block-based representations. For entropy coding, we consider both dense and sparse post-quantization residual blocks, and apply optional run-length coding to sparse residuals to improve the compression rate. In this sense, our method uses a dual-mode entropy coding scheme guided by a binary density map, which offers a significant rate reduction surpassing the extra cost of transmitting the binary selection map. We validate our scheme on some of the most popular benchmarking datasets. Compared with state-of-the-art video compression schemes and standard codecs, our method yields better or competitive results measured in PSNR and MS-SSIM.
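
A simple sketch of the sparse-residual path described above: a binary density map flags whether each post-quantization residual block is coded densely or as zero run-lengths. The density threshold and the (run, value) format are illustrative, not MMVC's exact scheme.

```python
# Sketch of dual-mode residual coding: a binary density map decides per block whether
# the residual is coded densely or as (zero-run, value) pairs. Threshold and format
# are illustrative only.
import numpy as np

def encode_block(block, density_threshold=0.25):
    flat = block.ravel()
    density = np.count_nonzero(flat) / flat.size
    if density >= density_threshold:
        return 1, flat.tolist()                 # mode bit 1: dense coding
    runs, run = [], 0
    for v in flat:                              # mode bit 0: zero run-length coding
        if v == 0:
            run += 1
        else:
            runs.append((run, int(v)))
            run = 0
    runs.append((run, 0))                       # terminator for trailing zeros
    return 0, runs

block = np.zeros((4, 4), dtype=np.int32)
block[0, 1], block[2, 3] = 3, -2
mode_bit, payload = encode_block(block)         # mode_bit feeds the binary density map
```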

Journal ArticleDOI
TL;DR: In this article, two typical lossless compression codes, Huffman coding and arithmetic coding, are studied in depth, covering their coding principles and their error resiliency; the ability to resist channel errors is an important index for data compression in communication.
Abstract: With the development of communication technology and computer technology, many related industries, such as multimedia entertainment, place higher requirements on storing and transmitting information data. Research on data compression technology has therefore attracted more and more attention, and the error resiliency of data compression algorithms is particularly important. How to enhance the error resiliency of data compression communication systems has been a hot topic for researchers. This paper mainly introduces lossless data compression technology, its basic principles, and its performance indices. Two typical lossless compression codes, Huffman coding and arithmetic coding, are studied in depth, including their coding principles and the problem of error resiliency. Huffman coding and arithmetic coding are two widely used and very important lossless compression codes, and the ability to resist channel errors is an important index for data compression in communication. Studying these two codes and their ability to resist channel errors is of great significance for further improving the channel adaptability of data compression.
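
For readers unfamiliar with the first of the two coders, a textbook Huffman code construction is sketched below; the paper's error-resilience analysis is not reproduced.

```python
# Textbook Huffman code construction from symbol frequencies.
import heapq
from collections import Counter

def huffman_code(data):
    heap = [[freq, i, {sym: ""}] for i, (sym, freq) in enumerate(Counter(data).items())]
    heapq.heapify(heap)
    if len(heap) == 1:                          # degenerate single-symbol source
        return {next(iter(heap[0][2])): "0"}
    tie = len(heap)                             # unique tiebreaker for equal frequencies
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)         # two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, [f1 + f2, tie, merged])
        tie += 1
    return heap[0][2]

codes = huffman_code("abracadabra")
encoded = "".join(codes[s] for s in "abracadabra")
```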

Journal ArticleDOI
01 Mar 2023-Entropy
TL;DR: In this article, a new approach for lossless raster image compression employing interpolative coding is proposed, which can be implemented in less than 60 lines of programming code for the coder and 60 lines for the decoder.
Abstract: A new approach is proposed for lossless raster image compression employing interpolative coding. A new multifunction prediction scheme is presented first. Then, interpolative coding, which has not been applied frequently to image compression, is explained briefly, and a simplification of the original approach is introduced. It is determined that the JPEG LS predictor reduces the information entropy slightly better than the multifunction approach. Furthermore, interpolative coding was moderately more efficient than the most frequently used arithmetic coding. Finally, our compression pipeline is compared against JPEG LS, JPEG 2000 in lossless mode, and PNG using 24 standard grayscale benchmark images. JPEG LS turned out to be the most efficient, followed by JPEG 2000, while our approach using simplified interpolative coding was moderately better than PNG. The implementation of the proposed encoder is extremely simple and can be realized in less than 60 lines of programming code for the coder and 60 lines for the decoder, which is demonstrated in the given pseudocodes.
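
Since interpolative coding is rarely used for images, here is a sketch of its classic form for a strictly increasing integer sequence; the simplified variant used in the paper, and its application to image data, are not reproduced.

```python
# Classic binary interpolative coding of a strictly increasing integer sequence in
# [lo, hi]: code the middle element with just enough bits for its possible range,
# then recurse on both halves. A plain fixed-width binary code is used here.
def encode_interpolative(seq, lo, hi, out_bits):
    if not seq:
        return
    mid = len(seq) // 2
    value = seq[mid]
    low_bound = lo + mid                          # mid strictly increasing values precede it
    high_bound = hi - (len(seq) - 1 - mid)        # and len-1-mid values follow it
    span = high_bound - low_bound + 1
    if span > 1:                                  # span == 1 means the value is implied
        width = (span - 1).bit_length()
        out_bits.append(format(value - low_bound, "0{}b".format(width)))
    encode_interpolative(seq[:mid], lo, value - 1, out_bits)
    encode_interpolative(seq[mid + 1:], value + 1, hi, out_bits)

bits = []
encode_interpolative([2, 5, 7, 11, 14], lo=0, hi=15, out_bits=bits)
bitstream = "".join(bits)
```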

Proceedings ArticleDOI
17 Apr 2023
TL;DR: In this paper, an entropy model of spatio-temporal characteristics in video compression using machine learning algorithms is proposed, which makes it possible to effectively evaluate both the spatial and temporal characteristics of compressed video data.
Abstract: The paper investigates an entropy model of spatio-temporal characteristics in video compression using machine learning algorithms. A significant drawback of the vast majority of existing works on neural video codecs is the emphasis on optimizing the creation of a latent representation by studying various network structures, while reusing well-studied image compression models as the entropy model. The model under study makes it possible to effectively evaluate both the spatial and temporal characteristics of compressed video data, which allows a greater reduction in video redundancy. In addition, the universality of the entropy model allows the quantization step to be set per spatial channel. This content-adapted quantization mechanism, similar to rate control in standard codecs, not only helps achieve smooth compression-ratio adjustment, but also improves final performance by dynamically distributing quantization intervals. The results of computational experiments on real video sequences confirm the efficiency of the studied video compression method.
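
A minimal sketch of the channel-wise quantization-step mechanism mentioned above; how the step sizes are adapted to the content is not shown, and the values below are illustrative.

```python
# Minimal sketch of a channel-wise quantization step for a latent tensor, the kind of
# content-adapted rate-control knob described above. How the per-channel steps are
# derived from the content is not shown here.
import torch

def quantize_per_channel(y, q):                # y: (N, C, H, W), q: (C,)
    q = q.view(1, -1, 1, 1)
    return torch.round(y / q) * q              # coarser q_c -> fewer bits for channel c

y = torch.randn(1, 8, 16, 16)
q = torch.linspace(0.5, 4.0, steps=8)          # illustrative per-channel step sizes
y_hat = quantize_per_channel(y, q)
```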

Journal ArticleDOI
TL;DR: In this paper , an extension of the two-parameter hypothesis was proposed which consists of three and four-parameters hypothesis, which showed the importance of proper calibration of parameter values of the method on efficiency of data compression.
Abstract: Hybrid video compression plays an invaluable role in digital video transmission and storage services and systems. It performs a several-hundred-fold reduction in the amount of video data, which makes these systems much more efficient. An important element of hybrid video compression is entropy coding of the data. The state of the art in this field is the newest variant of the Context-based Adaptive Binary Arithmetic Coding (CABAC) entropy compression algorithm, which recently became part of the new Versatile Video Coding (VVC) technology. This work is part of research currently underway to further improve the CABAC technique. The paper analyzes the potential for further improvement of CABAC through more accurate calculation of the probabilities of data symbols. The CABAC technique calculates those probabilities using the idea of the two-parameter hypothesis. For the analysis presented in this paper, an extension of this idea to three- and four-parameter hypotheses is proposed. In addition, the paper shows how proper calibration of the method's parameter values affects the efficiency of data compression. Experimental results show that, for the considered variants of the algorithm, the possible efficiency gain is at the level of 0.11% and 0.167% for the three- and four-parameter hypotheses, respectively.
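
To illustrate the idea of multi-parameter probability estimation, here is a schematic floating-point sketch in which several estimates with different adaptation rates are averaged into the coding probability; CABAC itself uses an integer two-hypothesis form, and the rates below are illustrative, not the paper's calibrated values.

```python
# Schematic multi-hypothesis probability estimation: several exponentially-weighted
# estimates with different adaptation speeds are averaged into the probability fed to
# the binary arithmetic coder. Rates and the number of hypotheses are illustrative.
def update(estimates, rates, bin_value):
    for i, a in enumerate(rates):
        estimates[i] += a * (bin_value - estimates[i])   # fast vs. slow adaptation
    return sum(estimates) / len(estimates)               # probability that bin == 1

estimates = [0.5, 0.5, 0.5]                # three-hypothesis example
rates = [1 / 8, 1 / 32, 1 / 128]           # different adaptation speeds
for b in [1, 1, 0, 1, 1, 1, 0, 1]:
    p_one = update(estimates, rates, b)    # feed p_one to the arithmetic coder
```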

Book ChapterDOI
01 Jan 2023
TL;DR: In this paper, a multispectral image compression framework based on channel stationarity is proposed, in which the multispectral image first goes through a feature extraction network to obtain feature maps, which are down-sampled and compressed and then input to the arithmetic encoder and the entropy prior module, respectively.
Abstract: Multispectral images have a large number of complex features. Existing network models achieve good compression performance for multispectral images, but their encoding and decoding times are long. In order to extract latent statistical features to guide arithmetic coding and save compression time as much as possible, a multispectral image compression framework based on channel stationarity is proposed. The multispectral image first goes through a feature extraction network to obtain feature maps, which are down-sampled and compressed and then input to the arithmetic encoder and the entropy prior module, respectively. The weight features obtained by the entropy prior module are fed as an aid into the image features of the autoregressive module for entropy coding. The autoregressive module realizes the decoding of images, thus saving the time required for encoding and decoding. In addition, the balance between image reconstruction quality and bit rate is achieved by using distortion optimization technology. The image is then reconstructed by the inverse quantization, up-sampling, and deconvolution modules. Experimental results show that the method is better than JPEG2000 at similar bit rates and that the compression time is shorter, while the network time complexity of the method is higher.

Journal ArticleDOI
26 Apr 2023-Sensors
TL;DR: In this article, a prediction-based context model prefetching strategy is proposed to eliminate the clock consumption of the context model when accessing data in memory, and a multi-result context model update (MCMU) is also proposed to reduce the critical path delay of context model updates in a multi-bin/clock architecture.
Abstract: Recently, specifically designed video codecs have been preferred due to the expansion of video data in Internet of Things (IoT) devices. Context Adaptive Binary Arithmetic Coding (CABAC) is the entropy coding module widely used in recent video coding standards such as HEVC/H.265 and VVC/H.266. CABAC is a well-known throughput bottleneck due to its strong data dependencies: because the required context model of the current bin often depends on the results of the previous bin, the context model cannot be prefetched early enough, which results in pipeline stalls. To solve this problem, we propose a prediction-based context model prefetching strategy, effectively eliminating the clock consumption of the context model when accessing data in memory. Moreover, we propose a multi-result context model update (MCMU) to reduce the critical path delay of context model updates in the multi-bin/clock architecture. Furthermore, we apply pre-range update and pre-renormalization techniques to reduce the multiplexed BAE's path delay due to the incomplete reliance on the encoding process. To further speed up the processing, we also propose to process four regular bins and several bypass bins in parallel with a variable bypass bin incorporation (VBBI) technique. Finally, a quad-loop cache is developed to improve the compatibility of data interactions between the entropy encoder and the other video encoder modules. As a result, the pipeline architecture based on the context model prefetching strategy removes up to 45.66% of the coding time caused by regular-bin stalls, and the parallel architecture saves on average 29.25% of the coding time spent on model updates when the Quantization Parameter (QP) equals 22. At the same time, the throughput of the proposed parallel architecture reaches 2191 Mbin/s, which is sufficient to meet the requirements of 8K Ultra High Definition Television (UHDTV). Additionally, the hardware efficiency (Mbins/s per k gates) of the proposed architecture is higher than that of existing advanced pipeline and parallel architectures.

Posted ContentDOI
11 Jan 2023
TL;DR: In this article, the authors analyze the compression performance of three kinds of approaches, namely direct entropy coding, predictive coding, and wavelet-based coding, and show that for very noisy signals it is more advantageous to directly use an entropy coder without advanced preprocessing steps.
Abstract: Especially in lossless image coding, the obtainable compression ratio strongly depends on the amount of noise included in the data, as all of the noise has to be coded, too. Different approaches exist for lossless image coding. We analyze the compression performance of three kinds of approaches, namely direct entropy coding, predictive coding, and wavelet-based coding. The results from our theoretical model are compared to simulated results from standard algorithms that are based on the three approaches. As long as no clipping occurs, more bits are needed for lossless compression as the noise increases. We show that for very noisy signals it is more advantageous to directly use an entropy coder without advanced preprocessing steps.
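
A small numpy experiment in the spirit of this analysis: estimating the zeroth-order entropy of a smooth signal before and after adding noise shows why noisier data needs more bits. The signal and noise levels are illustrative.

```python
# Small experiment: the zeroth-order entropy of a signal grows as noise is added
# (until clipping), so lossless coding needs more bits per sample.
import numpy as np

def entropy_bits(samples):
    _, counts = np.unique(samples, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
clean = np.repeat(np.arange(0, 256, 8), 64).astype(np.float64)   # smooth 8-bit ramp
for sigma in (0, 2, 8, 32):
    noisy = np.clip(clean + rng.normal(0, sigma, clean.shape), 0, 255).round()
    print(sigma, entropy_bits(noisy))        # entropy (bits/sample) rises with sigma
```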


Journal ArticleDOI
TL;DR: In this paper, an improved Median Edge Detection (iMED) predictor is proposed to improve the prediction process by minimizing the entropy error, and the performance of the proposed predictor is evaluated on the standard grey-scale test image dataset and the KODAK image dataset.
Abstract: A wide variety of applications use lossless image compression models, especially in the medical, space, and aerial imaging domains. Predictive coding improves the performance of lossless image compression, which relies heavily on the entropy error; a lower entropy error results in better image compression. The main focus of this research is to improve the prediction process by minimizing the entropy error. This paper proposes a novel idea for an improved Median Edge Detection (iMED) predictor for lossless image compression. The MED predictor is improved using k-means clustering, finding the local context of pixels using a 20-Dimensional Difference (DDx20) for input images, and updating the cluster weights using learning rates (µi) to minimize the pixel prediction errors. The performance of the proposed predictor is evaluated on the standard grey-scale test image dataset and the KODAK image dataset. Results are obtained and compared with the MED, GAP, FLIF, and LBP predictors based on entropy error, bits per pixel (bpp), and computational running time in seconds (s). The proposed iMED predictor improves significantly in terms of entropy error, bpp, and computational running time compared with these state-of-the-art predictors.
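
For reference, the baseline MED predictor from JPEG-LS that iMED builds on is sketched below; the k-means clustering and learning-rate extensions of iMED are not shown.

```python
# Baseline Median Edge Detection (MED) predictor from JPEG-LS.
# a = left neighbour, b = top neighbour, c = top-left neighbour of the current pixel.
def med_predict(a, b, c):
    if c >= max(a, b):
        return min(a, b)        # likely edge: pick the smaller neighbour
    if c <= min(a, b):
        return max(a, b)        # likely edge: pick the larger neighbour
    return a + b - c            # smooth region: planar prediction

# Example: predict pixel x from its causal neighbours, then entropy-code x - prediction.
a, b, c, x = 100, 104, 98, 103
residual = x - med_predict(a, b, c)
```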

Proceedings ArticleDOI
04 Jun 2023
TL;DR: In this paper, an end-to-end multiscale point cloud attribute coding method (MNeT) is proposed that progressively projects the attributes onto multiscale latent spaces.
Abstract: In recent years, several point cloud geometry compression methods that utilize advanced deep learning techniques have been proposed, but there are limited works on attribute compression, especially lossless compression. In this work, we build an end-to-end multiscale point cloud attribute coding method (MNeT) that progressively projects the attributes onto multiscale latent spaces. The multiscale architecture provides an accurate context for the attribute probability modeling and thus minimizes the coding bitrate with a single network prediction. Besides, our method allows scalable coding, in which lower-quality versions can be easily extracted from the losslessly compressed bitstream. We validate our method on a set of point clouds from MVUB and MPEG and show that it outperforms recently proposed methods and is on par with the latest G-PCC version 14. Moreover, our coding time is substantially shorter than that of G-PCC.

Book ChapterDOI
30 Jun 2023
TL;DR: In this article, an algorithmic approach to the design of compressors based on the statistics of the symbols present in the text to be compressed is presented, with particular attention to the time efficiency and algorithmic properties of the discussed statistical coders, while also evaluating their space performance in terms of the empirical entropy of the input text.
Abstract: This chapter deals with a classic topic in data compression and information theory, namely the design of compressors based on the statistics of the symbols present in the text to be compressed. The topic is addressed by means of an algorithmic approach that gives much attention to the time efficiency and algorithmic properties of the discussed statistical coders, while also evaluating their space performance in terms of the empirical entropy of the input text. The chapter deals in detail with classic Huffman coding and arithmetic coding, and also discusses their engineered versions, known as canonical Huffman coding and range coding. Its final part is dedicated to describing and commenting on the prediction by partial matching (PPM) coder, whose algorithmic structure is at the core of some of the best statistical coders to date.
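
As a small illustration of the "engineered version" mentioned above, canonical Huffman coding assigns codewords from the code lengths alone, so a decoder only needs the lengths; the sketch below shows the standard assignment rule.

```python
# Canonical Huffman code assignment: given only the code length per symbol, assign
# codewords in a fixed (canonical) order so the decoder can rebuild the table.
def canonical_codes(lengths):                      # lengths: {symbol: code length}
    order = sorted(lengths, key=lambda s: (lengths[s], s))
    codes, code, prev_len = {}, 0, 0
    for sym in order:
        code <<= (lengths[sym] - prev_len)         # extend the code when the length grows
        codes[sym] = format(code, "0{}b".format(lengths[sym]))
        prev_len = lengths[sym]
        code += 1
    return codes

print(canonical_codes({"a": 1, "b": 2, "c": 3, "d": 3}))
# {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
```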

Journal ArticleDOI
TL;DR: In this article, a rearranging method for prefix codes is proposed to support a certain level of direct access to the encoded stream without requiring additional data space, along with an efficient decoding algorithm based on lookup tables.
Abstract: Entropy coding is a widely used technique for lossless data compression. Entropy coding schemes supporting direct access to the encoded stream have been investigated in recent years; however, all prior schemes require auxiliary space to support the direct access ability. This paper proposes a rearranging method for prefix codes to support a certain level of direct access to the encoded stream without requiring additional data space. An efficient decoding algorithm based on lookup tables is then proposed. The simulation results show that when the encoded stream does not allow additional space, the number of bits read per access with the proposed method is more than two orders of magnitude smaller than with the conventional method. In contrast, the alternative solution consumes on average at least one more bit per symbol than the proposed method to support direct access. This indicates that the proposed scheme achieves a good trade-off between space usage and access performance. In addition, if a small amount of additional storage space is allowed (approximately 0.057% in the simulation), the number of bits read per access in our proposal can be reduced significantly, by about 90%.
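
The paper's rearranging scheme is not reproduced here; the sketch below only illustrates the standard lookup-table technique for decoding prefix codes that such decoding algorithms build on. The example code is arbitrary.

```python
# Generic lookup-table decoding of a prefix code: a table indexed by the next MAXLEN
# bits directly returns (symbol, codeword length). The example prefix code is arbitrary.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}
MAXLEN = max(len(c) for c in CODE.values())

table = {}
for sym, cw in CODE.items():
    pad = MAXLEN - len(cw)
    for suffix in range(1 << pad):                 # every completion of the codeword
        key = cw if pad == 0 else cw + format(suffix, "0{}b".format(pad))
        table[key] = (sym, len(cw))

def decode(bits):
    out, pos = [], 0
    while pos < len(bits):
        window = bits[pos:pos + MAXLEN].ljust(MAXLEN, "0")
        sym, length = table[window]
        out.append(sym)
        pos += length
    return "".join(out)

print(decode("010110111"))    # -> "abcd"
```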

Proceedings ArticleDOI
04 Jun 2023
TL;DR: In this article, the authors propose a segmentation method that identifies natural regions, enabling better adaptive treatment of images with synthetic and pictorial content; the image is then split into two subimages (natural and synthetic parts).
Abstract: In recent years, it has been found that screen content images (SCI) can be effectively compressed based on appropriate probability modelling and suitable entropy coding methods such as arithmetic coding. The key objective is determining the best probability distribution for each pixel position. This strategy works particularly well for images with synthetic (textual) content. However, screen content images usually consist not only of synthetic but also of pictorial (natural) regions. These images require diverse probability distribution models to be optimally compressed. One way to achieve this is to separate synthetic and natural regions. This paper proposes a segmentation method that identifies natural regions, enabling better adaptive treatment. It supplements a compression method known as Soft Context Formation (SCF) and operates as a pre-processing step. If at least one natural segment is found within the SCI, the image is split into two subimages (natural and synthetic parts) and the process of modelling and coding is performed separately for both. For SCIs with natural regions, the proposed method achieves bit-rate reductions of up to 11.6% and 1.52% with respect to HEVC and the previous version of the SCF, respectively.