
Showing papers on "Entropy encoding published in 2019"


Proceedings ArticleDOI
TL;DR: A deep generative model for lossy video compression is presented that outperforms the state-of-the-art learned video compression networks based on motion compensation or interpolation and opens up novel video compression applications, which have not been feasible with classical codecs.
Abstract: In this paper we present a deep generative model for lossy video compression. We employ a model that consists of a 3D autoencoder with a discrete latent space and an autoregressive prior used for entropy coding. Both autoencoder and prior are trained jointly to minimize a rate-distortion loss, which is closely related to the ELBO used in variational autoencoders. Despite its simplicity, we find that our method outperforms the state-of-the-art learned video compression networks based on motion compensation or interpolation. We systematically evaluate various design choices, such as the use of frame-based or spatio-temporal autoencoders, and the type of autoregressive prior. In addition, we present three extensions of the basic method that demonstrate the benefits over classical approaches to compression. First, we introduce semantic compression, where the model is trained to allocate more bits to objects of interest. Second, we study adaptive compression, where the model is adapted to a domain with limited variability, e.g., videos taken from an autonomous car, to achieve superior compression on that domain. Finally, we introduce multimodal compression, where we demonstrate the effectiveness of our model in joint compression of multiple modalities captured by non-standard imaging sensors, such as quad cameras. We believe that this opens up novel video compression applications, which have not been feasible with classical codecs.
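For orientation (this is a generic statement of such objectives, not a formula quoted from the paper), a jointly trained autoencoder and autoregressive prior of this kind typically minimize a rate-distortion Lagrangian of the form

    \min_{\theta, \phi}\; \mathbb{E}_{x}\!\left[ d\big(x, g_\theta(\hat z)\big) \;+\; \beta \big(-\log_2 p_\phi(\hat z)\big) \right],
    \qquad \hat z = \mathrm{quantize}\big(f_\theta(x)\big),

where d is the distortion measure, the negative log-probability under the prior p_\phi gives the expected bit cost of entropy coding the latents, and \beta sets the rate-distortion trade-off; with a suitable variational relaxation of the quantizer this matches the negative ELBO up to constants.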

125 citations


Proceedings ArticleDOI
Fabian Mentzer1, Eirikur Agustsson1, Michael Tschannen1, Radu Timofte1, Luc Van Gool1 
15 Jun 2019
TL;DR: L3C as discussed by the authors is a fully parallelizable hierarchical probabilistic model for adaptive entropy coding which is optimized end-to-end for the compression task, and it outperforms the popular engineered codecs, PNG, WebP and JPEG 2000.
Abstract: We propose the first practical learned lossless image compression system, L3C, and show that it outperforms the popular engineered codecs, PNG, WebP and JPEG 2000. At the core of our method is a fully parallelizable hierarchical probabilistic model for adaptive entropy coding which is optimized end-to-end for the compression task. In contrast to recent autoregressive discrete probabilistic models such as PixelCNN, our method i) models the image distribution jointly with learned auxiliary representations instead of exclusively modeling the image distribution in RGB space, and ii) only requires three forward-passes to predict all pixel probabilities instead of one for each pixel. As a result, L3C obtains over two orders of magnitude speedups when sampling compared to the fastest PixelCNN variant (Multiscale-PixelCNN). Furthermore, we find that learning the auxiliary representation is crucial and outperforms predefined auxiliary representations such as an RGB pyramid significantly.
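As background on the adaptive entropy coding step (an illustrative sketch of the general principle, not code from the L3C release): once a model outputs a probability for every symbol, an adaptive entropy coder can approach the cross-entropy of those predictions, so the compressed size can be estimated directly from the model output.

    import numpy as np

    def estimated_code_length_bits(symbols, probs):
        # Ideal adaptive-entropy-coding cost of `symbols` (shape [N]) under a
        # model assigning probability probs[i, s] to symbol s at position i.
        # An arithmetic coder driven by these probabilities approaches this
        # bound to within a few bits.
        p = probs[np.arange(len(symbols)), symbols]
        return float(-np.log2(np.clip(p, 1e-12, 1.0)).sum())

    # Toy usage: 1000 8-bit "pixels" under a stand-in predicted distribution.
    rng = np.random.default_rng(0)
    pixels = rng.integers(0, 256, size=1000)
    probs = rng.dirichlet(np.ones(256), size=1000)
    print(estimated_code_length_bits(pixels, probs) / 8, "bytes (estimated)")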

114 citations


Posted Content
TL;DR: The proposed NLAIC framework embeds non-local operations in the encoders and decoders for both image and latent feature probability information to capture both local and global correlations, and applies an attention mechanism to generate masks that are used to weigh the features for the image and hyperprior.
Abstract: This paper proposes a novel Non-Local Attention Optimized Deep Image Compression (NLAIC) framework, which is built on top of the popular variational auto-encoder (VAE) structure. Our NLAIC framework embeds non-local operations in the encoders and decoders for both image and latent feature probability information (known as hyperprior) to capture both local and global correlations, and applies an attention mechanism to generate masks that are used to weigh the features for the image and hyperprior, implicitly adapting bit allocation for different features based on their importance. Furthermore, both hyperpriors and spatial-channel neighbors of the latent features are used to improve entropy coding. The proposed model outperforms the existing methods on the Kodak dataset, including learned (e.g., Balle2019, Balle2018) and conventional (e.g., BPG, JPEG2000, JPEG) image compression methods, for both PSNR and MS-SSIM distortion metrics.

62 citations


Proceedings ArticleDOI
Georgios Georgiadis1
15 Jun 2019
TL;DR: In this article, the authors propose a three-stage compression and acceleration pipeline that sparsifies, quantizes and encodes activation maps of Convolutional Neural Networks (CNNs).
Abstract: The deep learning revolution brought us an extensive array of neural network architectures that achieve state-of-the-art performance in a wide variety of Computer Vision tasks including, among others, classification, detection and segmentation. In parallel, we have also been observing an unprecedented demand for computational and memory resources, rendering the efficient use of neural networks in low-powered devices virtually unattainable. Towards this end, we propose a three-stage compression and acceleration pipeline that sparsifies, quantizes and entropy encodes activation maps of Convolutional Neural Networks. Sparsification increases the representational power of activation maps leading to both acceleration of inference and higher model accuracy. Inception-V3 and MobileNet-V1 can be accelerated by as much as 1.6x with an increase in accuracy of 0.38% and 0.54% on the ImageNet and CIFAR-10 datasets respectively. Quantizing and entropy coding the sparser activation maps lead to higher compression over the baseline, reducing the memory cost of the network execution. Inception-V3 and MobileNet-V1 activation maps, quantized to 16 bits, are compressed by as much as 6x with an increase in accuracy of 0.36% and 0.55% respectively.
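To make the three stages concrete, here is a toy sketch (my own, with made-up parameter values and a random activation map; the empirical entropy of the quantized values stands in for an actual entropy coder):

    import numpy as np

    def compress_activation_estimate(act, threshold=0.1, bits=16):
        # Toy sparsify -> quantize -> entropy-code pipeline for one activation
        # map; returns an estimated compressed size in bits.
        sparse = np.where(np.abs(act) < threshold, 0.0, act)        # sparsify
        scale = (2 ** (bits - 1) - 1) / max(np.abs(sparse).max(), 1e-9)
        q = np.round(sparse * scale).astype(np.int32)               # quantize
        values, counts = np.unique(q, return_counts=True)
        p = counts / counts.sum()
        bits_per_symbol = float(-(p * np.log2(p)).sum())            # empirical entropy
        return bits_per_symbol * q.size

    act = np.random.randn(64, 56, 56).astype(np.float32)            # fake feature map
    print(compress_activation_estimate(act) / 8 / 1024, "KiB (estimated)")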

53 citations


Proceedings ArticleDOI
14 Aug 2019
TL;DR: In this article, a 3D autoencoder with a discrete latent space and an autoregressive prior is used for entropy coding, which is similar to the ELBO used in variational autoencoders.
Abstract: In this paper we present a deep generative model for lossy video compression. We employ a model that consists of a 3D autoencoder with a discrete latent space and an autoregressive prior used for entropy coding. Both autoencoder and prior are trained jointly to minimize a rate-distortion loss, which is closely related to the ELBO used in variational autoencoders. Despite its simplicity, we find that our method outperforms the state-of-the-art learned video compression networks based on motion compensation or interpolation. We systematically evaluate various design choices, such as the use of frame-based or spatio-temporal autoencoders, and the type of autoregressive prior. In addition, we present three extensions of the basic method that demonstrate the benefits over classical approaches to compression. First, we introduce semantic compression, where the model is trained to allocate more bits to objects of interest. Second, we study adaptive compression, where the model is adapted to a domain with limited variability, e.g., videos taken from an autonomous car, to achieve superior compression on that domain. Finally, we introduce multimodal compression, where we demonstrate the effectiveness of our model in joint compression of multiple modalities captured by non-standard imaging sensors, such as quad cameras. We believe that this opens up novel video compression applications, which have not been feasible with classical codecs.

53 citations


Journal ArticleDOI
TL;DR: A high performance architecture to implement the aforementioned 2-D transform types for 4×4, 8×8, 16×16, and 32×32 sizes, synthesized for low-, medium-, and high-end FPGA chips, with a moderate consumption of hardware resources.
Abstract: Versatile video coding (VVC) will be the next generation video coding standard, which is expected to replace HEVC in CE devices, such as tablets, smartphones, and TV sets beyond 2020. The new standard will still be based on transform, quantization, and entropy coding, but a multiple transform selection scheme has been proposed, involving three different types of 2-D Discrete Sine/Cosine transforms (DCT-II, DCT-VIII, and DST-VII), and the transform unit sizes range from 4×4 to 64×64. To handle the computational complexity of these algorithms, it is useful to explore hardware solutions that could be employed as accelerators. In this paper, a high performance architecture to implement the aforementioned 2-D transform types for 4×4, 8×8, 16×16, and 32×32 sizes is proposed. The design has been synthesized for low-, medium-, and high-end FPGA chips, being able to process up to 23 fps at 3840×2160 for 32×32 transform sizes and up to 86 fps at 3840×2160 for pictures containing an even distribution of the four block sizes. Moreover, these performance results have been obtained with a moderate consumption of hardware resources.
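The separable structure that such hardware exploits computes the 2-D transform as a 1-D transform over rows followed by one over columns; a floating-point sketch is below (the standardized transforms are fixed-point integer approximations of DCT-II, DCT-VIII and DST-VII, which this sketch does not reproduce):

    import numpy as np

    def dct2_matrix(n):
        # Orthonormal DCT-II matrix of size n x n.
        k = np.arange(n)[:, None]
        i = np.arange(n)[None, :]
        c = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
        c[0, :] *= 1 / np.sqrt(2)
        return c * np.sqrt(2.0 / n)

    def transform_2d(block, c):
        # Row-column decomposition: one 1-D transform over rows, one over columns.
        return c @ block @ c.T

    c32 = dct2_matrix(32)
    block = np.random.randint(-128, 128, size=(32, 32)).astype(float)
    coeffs = transform_2d(block, c32)
    assert np.allclose(c32.T @ coeffs @ c32, block)   # inverse recovers the block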

35 citations


Journal ArticleDOI
Yanjie Song1, Zhiliang Zhu1, Wei Zhang1, Li Guo1, Xue Yang1, Hai Yu1 
TL;DR: Experimental and analytical results illustrate the superiority of the proposed joint scheme compared with the existing compression–encryption schemes and JPEG, as well as good encryption performance.
Abstract: Recently, compressive sensing (CS)-based joint compression–encryption schemes have been widely investigated due to their high efficiency and good security for images. However, the existing schemes typically have a lower compression ratio (CR), and there may be a flaw during their compression processes. Therefore, in this paper, according to the intrinsic features of images, we propose a novel compression architecture to enhance the CR. Meanwhile, based on this architecture, a joint image compression–encryption scheme using entropy coding and CS is designed to implement a complete compression and encryption process. In this joint scheme, a presented bit-level lossless compression–encryption algorithm based on entropy coding for the higher bit-planes is incorporated to improve the quality of the reconstructed image and ensure the security. In addition, this joint scheme contains an improved CS-based lossy compression–encryption algorithm for the lower bit-planes, which can guarantee the efficiency and security. Through the cooperation between the proposed lossless and lossy coding, higher reconstruction performance can be achieved. SHA-256 is combined with all initial keys in the proposed joint scheme to generate the updated keys for the chaos cryptosystem to maintain high security and resist some common attacks. Experimental and analytical results illustrate the superiority of the proposed joint scheme compared with the existing compression–encryption schemes and JPEG, as well as good encryption performance.

35 citations


Proceedings ArticleDOI
26 Mar 2019
TL;DR: It is shown that the coding efficiency of transform coding can be improved by replacing scalar quantization with trellis-coded quantization (TCQ) and using advanced entropy coding techniques for coding the quantization indexes.
Abstract: In state-of-the-art video coding, the prediction error signals are transmitted using transform coding, which consists of an orthogonal transform, scalar quantization, and entropy coding of the quantization indexes. We show that the coding efficiency of transform coding can be improved by replacing scalar quantization with trellis-coded quantization (TCQ) and using advanced entropy coding techniques for coding the quantization indexes. The proposed approach was implemented into the first test model (VTM-1) of the new standardization project Versatile Video Coding (VVC). Our coding experiments yielded average bit-rate savings of 4.9% for intra-only coding and 3.3% for typical random access configurations, where bit-rate savings of 3.5% (intra-only) and 2.4% (random access) can be attributed to the usage of TCQ. These coding gains are obtained at a 5-10% increase in encoder run time and without any change in decoder run time.
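For orientation (a generic formulation, not one quoted from the paper): both scalar quantization and TCQ can be viewed as picking the quantization-index sequence q that minimizes a rate-distortion Lagrangian, with TCQ letting each reconstruction depend on a trellis state driven by the previous indexes and carrying out the minimization with a Viterbi-style search:

    q^{*} = \arg\min_{q} \sum_{i}\big(t_i - r(q_i, s_i)\big)^2 + \lambda\, R(q),
    \qquad s_{i+1} = \delta(s_i, q_i),

where t_i are the transform coefficients, r(q_i, s_i) is the state-dependent reconstruction level, R(q) is the entropy-coded rate of the index sequence, and \delta is the trellis transition function.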

29 citations


Journal ArticleDOI
TL;DR: This paper proposes two additional steps that further improve the compression rate of Gaussian random projections: a decimation preprocessing step tailored to attenuate frequency components in which PRNU traces are already suppressed in JPEG compressed images, and a dead-zone quantizer that enables an entropy coding scheme to save bitrate when storing PRNU fingerprints or sending residuals over a communication channel.
Abstract: In the last decade, the extremely rapid proliferation of digital devices capable of acquiring and sharing images over the Web has significantly increased the amount of digital images publicly accessible by everyone with Internet access. Despite the obvious benefits of such technological improvements, it is becoming mandatory to verify the origin and trustworthiness of such shared pictures. Photo response non-uniformity (PRNU) is the reference signal for forensic investigators when it comes to verifying or identifying which camera device shot a picture under analysis. In spite of this, PRNU is almost a white-shaped noise, thus being very difficult to compress for storage or large scale search purposes, which are frequent investigation scenarios. To overcome the issue, the forensic community has developed a series of compression algorithms. Lately, Gaussian random projections have proved to achieve state-of-the-art performance. In this paper, we propose two additional steps that further improve the compression rate of Gaussian random projections: 1) a decimation preprocessing step tailored to attenuate frequency components in which PRNU traces are already suppressed in JPEG compressed images and 2) a dead-zone quantizer (rather than the commonly used binary one) that enables an entropy coding scheme to save bitrate when storing PRNU fingerprints or sending residuals over a communication channel. Reported results show the effectiveness of the proposed improvements, both under controlled JPEG compression and in a real case scenario.
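A dead-zone quantizer of the kind mentioned above maps small-magnitude values to zero and quantizes the rest uniformly; a minimal sketch follows (parameter names and values are illustrative, not taken from the paper):

    import numpy as np

    def deadzone_quantize(x, step, deadzone):
        # |x| <= deadzone maps to index 0; larger magnitudes are quantized
        # uniformly with the given step. Indexes are then entropy coded.
        mag = np.maximum(np.abs(x) - deadzone, 0.0)
        return (np.sign(x) * np.ceil(mag / step)).astype(np.int64)

    def deadzone_dequantize(q, step, deadzone):
        # Reconstruct at the centre of each nonzero quantization cell.
        return np.sign(q) * (deadzone + (np.abs(q) - 0.5) * step) * (q != 0)

    residual = np.random.laplace(scale=2.0, size=10_000)   # stand-in residual signal
    q = deadzone_quantize(residual, step=1.0, deadzone=1.5)
    rec = deadzone_dequantize(q, step=1.0, deadzone=1.5)
    print("zero fraction:", np.mean(q == 0), " MSE:", np.mean((rec - residual) ** 2))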

28 citations


Journal ArticleDOI
11 Mar 2019-Sensors
TL;DR: A detailed comparative analysis of EXPer with other state-of-the-art encryption algorithms confirms that EXPer provides significant confidentiality with a small computational cost and a negligible encryption bitrate overhead, demonstrating that the proposed security scheme is a suitable choice for constrained devices in an Internet of Multimedia Things environment.
Abstract: Within an Internet of Multimedia Things, the risk of disclosing streamed video content, such as that arising from video surveillance, is of heightened concern. This leads to the encryption of that content. To reduce the overhead and the lack of flexibility arising from full encryption of the content, a good number of selective-encryption algorithms have been proposed in the last decade. Some of them have limitations, in terms of: significant delay due to computational cost, or excess memory utilization, or, despite being energy efficient, not providing a satisfactory level of confidentiality, due to their simplicity. To address such limitations, this paper presents a lightweight selective encryption scheme, in which encoder syntax elements are encrypted with the innovative EXPer (extended permutation with exclusive OR). The selected syntax elements are taken from the final stage of video encoding, that is, the entropy coding stage. As a diagnostic tool, the Encryption Space Ratio measures the encoding complexity of the video relative to the level of encryption so as to judge the success of the encryption process, according to the entropy coder employed. A detailed comparative analysis of EXPer with other state-of-the-art encryption algorithms confirms that EXPer provides significant confidentiality with a small computational cost and a negligible encryption bitrate overhead. Thus, the results demonstrate that the proposed security scheme is a suitable choice for constrained devices in an Internet of Multimedia Things environment.

26 citations


Proceedings ArticleDOI
02 Jul 2019
TL;DR: In this article, a deep learning based channel state matrix compression scheme, called DeepCMC, composed of convolutional layers followed by quantization and entropy coding blocks, was proposed for massive MIMO systems.
Abstract: Massive multiple-input multiple-output (MIMO) systems require downlink channel state information (CSI) at the base station (BS) to better utilize the available spatial diversity and multiplexing gains. However, in a frequency division duplex (FDD) massive MIMO system, the huge CSI feedback overhead becomes restrictive and degrades the overall spectral efficiency. In this paper, we propose a deep learning based channel state matrix compression scheme, called DeepCMC, composed of convolutional layers followed by quantization and entropy coding blocks. Simulation results demonstrate that DeepCMC significantly outperforms the state of the art compression schemes in terms of the reconstruction quality of the channel state matrix for the same compression rate, measured in bits per channel dimension.

Journal ArticleDOI
TL;DR: Two new joint encryption and compression schemes are proposed, where one scheme emphasizes compression performance and the other highlights protection performance; performance evaluations using various criteria show that the first scheme has better compression efficiency, while the second scheme has better defense ability against the statistical attack.

Journal ArticleDOI
TL;DR: The proposed FreeCast achieves graceful video quality that improves with wireless channel quality under a low overhead requirement, by exploiting a fitting function based on a multidimensional Gaussian Markov random field model for overhead reduction to mitigate the rate and power loss caused by large overhead.
Abstract: Wireless multi-view plus depth (MVD) video streaming enables free viewpoint video playback on wireless devices, where a viewer can freely synthesize any preferred virtual viewpoint from the received MVD frames. Existing schemes of wireless MVD streaming use digital-based compression to achieve better coding efficiency. However, the digital-based schemes have an issue called the cliff effect, where the video quality is a step function in terms of wireless channel quality. In addition, parameter optimization to assign quantization levels and transmission power across MVD frames is cumbersome. To realize high-quality wireless MVD video streaming, we propose a novel graceful video delivery scheme, called FreeCast. FreeCast directly transmits linear-transformed signals based on 5-D discrete cosine transform, without digital quantization and entropy coding operations. In addition, we exploit a fitting function based on a multidimensional Gaussian Markov random field model for overhead reduction to mitigate rate and power loss due to large overhead. The proposed FreeCast achieves graceful video quality with the improvement of wireless channel quality under a low overhead requirement. In addition, the parameter optimization to achieve the highest video quality can be simplified by only controlling the transmission power assignment. Performance results with several test MVD video sequences show that FreeCast yields better video quality in band-limited environments by significantly decreasing the amount of overhead. For instance, the structural similarity (SSIM) performance of FreeCast is approximately 0.127 higher than that of the existing graceful video delivery schemes across wireless channel quality, i.e., signal-to-noise ratio, of 0–25 dB at a transmission symbol rate of 37.5 Msymbols/s.

Journal ArticleDOI
TL;DR: A sequence statistical code based data compression algorithm is proposed to improve the energy efficiency of sensors, using SDC and FOST codes to achieve a better compression ratio.
Abstract: Sensors play an integral part in the technologically advanced real world. Wireless sensors are powered by batteries with limited capacity. Hence, energy efficiency is one of the major issues with wireless sensors. Many techniques have been proposed in order to improve sensor efficiency. This paper discusses improving the energy efficiency of sensors through data compression. A sequence statistical code based data compression algorithm is proposed to improve the energy efficiency of sensors. SDC and FOST codes are used in this algorithm in order to achieve a better compression ratio. The simulation results were compared with arithmetic data compression techniques. The computation process of the proposed algorithm is much simpler than that of arithmetic data compression techniques.

Posted Content
Siwei Dong1, Lin Zhu1, Daoyuan Xu1, Yonghong Tian1, Tiejun Huang1 
TL;DR: This work proposes an intensity-based measurement for spike train distance and designs an efficient coding method to meet the challenge, by investigating the spatiotemporal distribution of the spikes.
Abstract: Recently, a novel bio-inspired spike camera has been proposed, which continuously accumulates luminance intensity and fires spikes when the dispatch threshold is reached. Compared to conventional frame-based cameras and the emerging dynamic vision sensors, the spike camera has shown great advantages in capturing fast-moving scenes in a frame-free manner with full texture reconstruction capabilities. However, it is difficult to transmit or store the large amount of spike data. To address this problem, we first investigate the spatiotemporal distribution of inter-spike intervals and propose an intensity-based measurement of spike train distance. Then, we design an efficient spike coding method, which integrates the techniques of adaptive temporal partitioning, intra-/inter-pixel prediction, quantization and entropy coding into a unified lossy coding framework. Finally, we construct a PKU-Spike dataset captured by the spike camera to evaluate the compression performance. The experimental results on the dataset demonstrate that the proposed approach is effective in compressing such spike data while maintaining the fidelity.

Proceedings ArticleDOI
01 Nov 2019
TL;DR: This paper proposes a learning based Semantically Structured Coding (SSC) framework to generate a Semantically Structured Bit-stream (SSB), where each part of the bit-stream represents a certain object and can be directly used for the aforementioned tasks.
Abstract: With the development of 5G and edge computing, it is increasingly important to offload intelligent media computing to edge devices. Traditional media coding schemes code the media into one binary stream without a semantic structure, which prevents many important intelligent applications from operating directly at the bit-stream level, including semantic analysis, parsing specific content, media editing, etc. Therefore, in this paper, we propose a learning based Semantically Structured Coding (SSC) framework to generate a Semantically Structured Bit-stream (SSB), where each part of the bit-stream represents a certain object and can be directly used for the aforementioned tasks. Specifically, we integrate an object detection module in our compression framework to locate and align the object in the feature domain. After applying quantization and entropy coding, the features are re-organized according to the detected and aligned objects to form a bit-stream. In addition, different from existing learning-based compression schemes that individually train models for specific bit-rates, we share most of the model parameters among various bit-rates to significantly reduce the model size for variable-rate compression. Experimental results demonstrate that, only at the cost of negligible overhead, objects can be completely reconstructed from a partial bit-stream. We also verified that classification and pose estimation can be directly performed on a partial bit-stream without performance degradation.

Posted Content
TL;DR: A deep learning based channel state matrix compression scheme, called DeepCMC, composed of convolutional layers followed by quantization and entropy coding blocks is proposed, which significantly outperforms the state of the art compression schemes in terms of the reconstruction quality of theChannel state matrix for the same compression rate.
Abstract: Coded caching provides significant gains over conventional uncoded caching by creating multicasting opportunities among distinct requests. Massive multiple-input multiple-output (MIMO) systems require downlink channel state information (CSI) at the base station (BS) to better utilize the available spatial diversity and multiplexing gains. However, in a frequency division duplex (FDD) massive MIMO system, the huge CSI feedback overhead becomes restrictive and degrades the overall spectral efficiency. In this paper, we propose a deep learning based channel state matrix compression scheme, called DeepCMC, composed of convolutional layers followed by quantization and entropy coding blocks. In comparison with previous works, the main contributions of DeepCMC are two-fold: i) DeepCMC is fully convolutional, and it can be used in a wide range of scenarios with various numbers of sub-channels and transmit antennas; ii) DeepCMC includes quantization and entropy coding blocks and minimizes a cost function that accounts for both the rate of compression and the reconstruction quality of the channel matrix at the BS. Simulation results demonstrate that DeepCMC significantly outperforms the state of the art compression schemes in terms of the reconstruction quality of the channel state matrix for the same compression rate, measured in bits per channel dimension.

Proceedings ArticleDOI
26 May 2019
TL;DR: This paper focuses on the syntax elements of inter prediction information that consists of merge flag, merge index, reference index, motion vector difference and motion vector prediction index in HEVC under low-delay P (LDP) setting.
Abstract: Entropy coding is a fundamental technique in video coding to remove the statistical redundancy in syntax elements. Currently, context-adaptive binary arithmetic coding (CABAC) is used as the entropy coding tool in HEVC. Considering that the manually designed binarization and context models are not flexible enough to estimate the probability of the syntax elements, we use neural networks to estimate the probability of the syntax elements; the estimated probabilities, together with the values of the syntax elements, are then fed into an arithmetic coding engine to perform entropy coding. In this paper, we focus on the syntax elements of inter prediction information, which consist of the merge flag, merge index, reference index, motion vector difference and motion vector prediction index in HEVC under the low-delay P (LDP) setting. Compared with the previous work on neural network-based arithmetic coding for intra prediction modes and intra DC coefficients, there are three new characteristics in this paper. First, surrounding syntax elements are directly fed into the neural network without converting to reconstructed pixels. Second, unified neural networks are designed for different prediction block sizes. Finally, dependency among the syntax elements in the current prediction unit is omitted to improve parallelism. Experimental results show that, compared with HEVC, our proposed method achieves up to 0.5% and on average 0.3% BD-rate reduction in the LDP configuration.
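To see why probability estimation drives the bit cost (an illustrative sketch, not the authors' network or coding engine): an ideal binary arithmetic coder fed an estimated probability p that a bin equals 1 spends about -log2(p) bits on a 1 and -log2(1-p) bits on a 0, so sharper estimates of the syntax-element bins translate directly into rate savings.

    import math

    def bin_cost_bits(bins, probs_of_one):
        # Ideal cost, in bits, of coding binary syntax-element bins with a
        # binary arithmetic coder driven by per-bin probability estimates
        # (e.g., produced by a neural network instead of CABAC context models).
        total = 0.0
        for b, p in zip(bins, probs_of_one):
            p = min(max(p, 1e-6), 1 - 1e-6)      # keep probabilities valid
            total += -math.log2(p if b == 1 else 1 - p)
        return total

    # Hypothetical merge-flag bins and the probabilities a model assigned to them.
    print(bin_cost_bits([1, 0, 0, 1, 0], [0.8, 0.1, 0.3, 0.9, 0.2]), "bits")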

Posted Content
Haojie Liu, Tong Chen, Ming Lu, Qiu Shen, Zhan Ma 
TL;DR: A neural video compression framework, leveraging the spatial and temporal priors, independently and jointly to exploit the correlations in intra texture, optical flow based temporal motion and residuals and to accurately model the signal distribution for entropy coding.
Abstract: The pursuit of higher compression efficiency continuously drives the advances of video coding technologies. Fundamentally, we wish to find better "predictions" or "priors" that are reconstructed previously to remove the signal dependency efficiently and to accurately model the signal distribution for entropy coding. In this work, we propose a neural video compression framework, leveraging the spatial and temporal priors, independently and jointly, to exploit the correlations in intra texture, optical flow based temporal motion and residuals. Spatial priors are generated using downscaled low-resolution features, while temporal priors (from previous reference frames and residuals) are captured using a convolutional neural network based long short-term memory (ConvLSTM) structure in a temporal recurrent fashion. All of these parts are connected and trained jointly towards the optimal rate-distortion performance. Compared with the High-Efficiency Video Coding (HEVC) Main Profile (MP), our method has demonstrated an average 38% Bjontegaard-Delta Rate (BD-Rate) improvement using standard common test sequences, where the distortion metric is multi-scale structural similarity (MS-SSIM).

Proceedings ArticleDOI
01 Sep 2019
TL;DR: For improving coding efficiency relative to the state-of-the-art video coding standard HEVC, the following modifications are proposed: Replacing scalar quantization with trellis-coded quantization; and utilizing additional statistical dependencies between quantization indexes for entropy coding.
Abstract: One key component of all block-based hybrid video codecs is transform coding of prediction residues, which consists of an orthogonal block transform, scalar quantization of transform coefficients, and entropy coding of the resulting quantization indexes. For improving coding efficiency relative to the state-of-the-art video coding standard HEVC, we propose the following modifications: (1) Replacing scalar quantization with trellis-coded quantization; and (2) utilizing additional statistical dependencies between quantization indexes for entropy coding. The proposed approach was integrated into the first test model VTM-1 for the new standardization project Versatile Video Coding (VVC). Our coding experiments showed average bit-rate savings of 4.9 % for intra-only, 3.4 % for random access, and 2.8 % for low-delay configurations.

Journal ArticleDOI
TL;DR: This work proposes a novel lossless recompression approach to eliminate statistical redundancy in the coded video bit stream by increasing the probability of repetitive patterns of continuous "0" or "1" symbol strings in the binary symbol sequence through rearranging the sequence.
Abstract: With the increasing amount of HD or even UHD video, video streaming transmission over the mobile network is confronted with various quality of experience issues. To facilitate the delivery of compressed video over the network, it is extremely helpful to further compress the coded video bit stream without any information loss, so-called lossless recompression. However, the existing lossless video/image compression algorithms aim to remove the pixel-domain (e.g., lossless coding mode of HEVC) or the coefficient-domain (e.g., entropy coding) redundancy rather than the bit-stream-domain redundancy. To this end, we propose a novel lossless recompression approach to eliminate statistical redundancy in the coded video bit stream. The core idea is to increase the probability of repetitive patterns of continuous "0" or "1" symbol strings in the binary symbol sequence by rearranging the sequence, including value mapping and position aggregation. In particular, a Fibonacci-based mapping rule is used to convert the original sequence (those binary symbols in Fibonacci positions) into a new one with more continuous "0" or "1" symbols. Besides, we aggregate as many identical symbols as possible together to further increase the probability of the occurrence of repetitive patterns. Finally, the lossless compression algorithm and the asymptotic lossless compression scheme are designed to achieve a compact representation of the rearranged symbol sequence. We also regulate formal syntactic structures for the proposed mapping rule, aggregation algorithm, and recompression scheme so as to deterministically revert from the recoded version to the original bit stream. The experimental results on compressed H.264/AVC and H.265/HEVC video data show that our approach can considerably reduce the bit rate of streaming video.

Journal ArticleDOI
TL;DR: The experimental results show that the proposed approach outperforms JPEG, JPEG2000, BPG, and some mainstream neural network-based image compression methods, and produces better visual quality with clearer details and textures because more high-frequency coefficients can be preserved, thanks to the high-frequency prediction.
Abstract: In this paper, we propose to use deep neural networks for image compression in the wavelet transform domain. When the input image is transformed from the spatial pixel domain to the wavelet transform domain, one low-frequency sub-band (LF sub-band) and three high-frequency sub-bands (HF sub-bands) are generated. The low-frequency sub-band is first used to predict each high-frequency sub-band to eliminate redundancy between the sub-bands, after which the sub-bands are fed into different auto-encoders to do the encoding. In order to further improve the compression efficiency, we use a conditional probability model to estimate the context-dependent prior probability of the encoded codes, which can be used for entropy coding. The entire training process is unsupervised, and the auto-encoders and the conditional probability model are trained jointly. The experimental results show that the proposed approach outperforms JPEG, JPEG2000, BPG, and some mainstream neural network-based image compression methods. Furthermore, it produces better visual quality with clearer details and textures because more high-frequency coefficients can be preserved, thanks to the high-frequency prediction.
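As a small illustration of the sub-band split that the method operates on (assuming the PyWavelets package; the LF-to-HF prediction network and the auto-encoders themselves are not reproduced here):

    import numpy as np
    import pywt

    # One-level 2-D DWT: the image splits into a low-frequency sub-band (LL)
    # and three high-frequency detail sub-bands. In the paper's pipeline the
    # LL band would be used to predict each HF band before the sub-bands are
    # fed to separate auto-encoders; that learned part is not shown here.
    image = np.random.rand(256, 256)
    ll, (detail_h, detail_v, detail_d) = pywt.dwt2(image, "haar")
    print(ll.shape, detail_h.shape, detail_v.shape, detail_d.shape)   # each 128 x 128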

Journal ArticleDOI
Jianyu Lin1
TL;DR: A new lossless compression scheme that compresses the initially-acquired continuous-intensity images with a lossy compression algorithm to obtain higher compression efficiency is proposed, and the entropy-constrained scalar quantization is implemented using a novel and simple thresholding method.
Abstract: A new lossless compression scheme that compresses the initially-acquired continuous-intensity images with a lossy compression algorithm to obtain higher compression efficiency is proposed. Even though a lossy algorithm is employed, there is no loss of data for the decoded original images, in the same sense as the conventional lossless scheme. To realize the new idea, the compression efficiency of the existing lossy subband compression algorithm is improved at high bitrates. For the entropy coding part, a run-length based, symbol-grouping entropy coding method is introduced. For the quantization part, the entropy-constrained scalar quantization is implemented using a novel and simple thresholding method. Coding results show that the bit savings of the proposed lossless scheme, which employs a lossy algorithm, over the conventional lossless scheme achieve a maximum of 27.2% and an average of 11.4% in our test.
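The abstract does not spell out the symbol-grouping scheme, but as generic background, run-length coding groups runs of zeros with the following nonzero level; a minimal sketch:

    import numpy as np

    def zero_run_length(coeffs):
        # Convert a 1-D array of quantized coefficients into (run, level) pairs:
        # `run` counts preceding zeros, `level` is the next nonzero value. A
        # trailing run of zeros is emitted as a single end marker (run, 0).
        pairs, run = [], 0
        for c in coeffs:
            if c == 0:
                run += 1
            else:
                pairs.append((run, int(c)))
                run = 0
        if run:
            pairs.append((run, 0))
        return pairs

    q = np.array([5, 0, 0, -2, 0, 0, 0, 1, 0, 0])
    print(zero_run_length(q))   # [(0, 5), (2, -2), (3, 1), (2, 0)]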

Journal ArticleDOI
03 Jul 2019-Sensors
TL;DR: This work shows that the proposed analog JSCC system exhibits a performance similar to that of the digital scheme based on JPEG compression, with noticeably less visual degradation to the human eye, a lower computational complexity, and a negligible delay.
Abstract: An analog joint source-channel coding (JSCC) system designed for the transmission of still images is proposed and its performance is compared to that of two digital alternatives which differ in the source encoding operation: Joint Photographic Experts Group (JPEG) and JPEG without entropy coding (JPEGw/oEC), respectively, both relying on an optimized channel encoder–modulator tandem. Apart from a visual comparison, the figures of merit considered in the assessment are the structural similarity (SSIM) index and the time required to transmit an image through additive white Gaussian noise (AWGN) and Rayleigh channels. This work shows that the proposed analog system exhibits a performance similar to that of the digital scheme based on JPEG compression, with noticeably less visual degradation to the human eye, a lower computational complexity, and a negligible delay. These results confirm the suitability of analog JSCC for the transmission of still images in scenarios with severe constraints on power consumption, computational capabilities, and for real-time applications. For these reasons the proposed system is a good candidate for surveillance systems, low-constrained devices, Internet of things (IoT) applications, etc.

Journal ArticleDOI
TL;DR: In this paper, an alternative approach based on processing the residual block with integer-to-integer (i2i) transforms was proposed, which can map integer pixels to integer transform coefficients without increasing the dynamic range and can be used for lossless compression.
Abstract: Video coding standards are primarily designed for efficient lossy compression, but it is also desirable to support efficient lossless compression within video coding standards using small modifications to the lossy coding architecture. A simple approach is to skip transform and quantization, and simply entropy code the prediction residual. However, this approach is inefficient at compression. A more efficient and popular approach is to skip transform and quantization but also process the residual block in some modes with differential pulse code modulation (DPCM), along the horizontal or vertical direction, prior to entropy coding. This paper explores an alternative approach based on processing the residual block with integer-to-integer (i2i) transforms. I2i transforms can map integer pixels to integer transform coefficients without increasing the dynamic range and can be used for lossless compression. We focus on lossless intra coding and develop novel i2i approximations of the odd type-3 discrete sine transform (ODST-3). Experimental results with the high efficiency video coding (HEVC) reference software show that when the developed i2i approximations of the ODST-3 are used along the DPCM method of HEVC, an average 2.7% improvement of lossless intra frame compression efficiency is achieved over HEVC version 2, which uses only the DPCM method, without a significant increase in computational complexity.
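For context on the DPCM baseline that the i2i transforms are compared against (my sketch, not the HEVC reference software): horizontal residual DPCM replaces each residual sample by its difference from its left neighbour, which is exactly invertible and therefore usable for lossless coding.

    import numpy as np

    def horizontal_dpcm(block):
        # Each sample becomes its difference from the sample to its left
        # (first column kept as-is). Exactly invertible, hence lossless.
        out = block.astype(np.int64).copy()
        out[:, 1:] -= block[:, :-1]
        return out

    def inverse_horizontal_dpcm(dpcm):
        return np.cumsum(dpcm, axis=1)

    block = np.random.randint(0, 256, size=(8, 8))
    assert np.array_equal(inverse_horizontal_dpcm(horizontal_dpcm(block)), block)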

Proceedings ArticleDOI
12 May 2019
TL;DR: A multi-channel ECG lossless compression scheme which uses adaptive linear prediction for intra- and inter-channel decorrelation; the coefficients of the linear predictor and the Golomb-Rice codec are self-adjusted during the process.
Abstract: Electrocardiogram (ECG) is the recording of the heart's electrical activity and is used to diagnose heart disease nowadays. The diagnosis normally requires a large amount of time to acquire enough multi-channel data. Thus, storage and transmission of 12-lead ECG data result in massive cost. In this work, we propose a multi-channel ECG lossless compression scheme which uses adaptive linear prediction for intra- and inter-channel decorrelation. The proposed technique is based on an adaptive Golomb-Rice codec for entropy coding with adaptive linear prediction. Thus, the coefficients of the linear predictor and the Golomb-Rice codec are self-adjusted during the process. Finally, we evaluate the proposed algorithm with the MIT-BIH Arrhythmia database for single-channel compression and the PTB database for multi-channel compression.
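A minimal Golomb-Rice encoder of the kind referenced above is sketched below (the paper's adaptive selection of the Rice parameter and its adaptive linear predictor are not reproduced; k is fixed here):

    def golomb_rice_encode(residuals, k):
        # Encode signed prediction residuals with a Golomb-Rice code of
        # parameter k: map each residual to a non-negative integer (zigzag),
        # then emit the quotient in unary and the remainder in k bits.
        out = []
        for r in residuals:
            u = 2 * r if r >= 0 else -2 * r - 1        # zigzag mapping
            q, rem = u >> k, u & ((1 << k) - 1)
            out.append("1" * q + "0" + format(rem, "b").zfill(k))
        return "".join(out)

    # Toy usage on a few prediction residuals; in the paper's codec, k would
    # be adapted to the local residual statistics.
    print(golomb_rice_encode([0, -1, 3, -4, 2], k=1))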

Proceedings ArticleDOI
20 May 2019
TL;DR: This work proposes a novel scheme of point cloud delivery, called HoloCast, to gracefully improve the reconstruction quality with the improvement of wireless channel quality, and shows that HoloCast yields better 3D reconstruction quality compared to digital-based methods in noisy wireless environment.
Abstract: In conventional point cloud delivery, a sender uses octree-based digital video compression to stream three-dimensional (3D) points and the corresponding color attributes over band-limited links, e.g., wireless channels, for 3D scene reconstructions. However, the digital-based delivery schemes have an issue called the cliff effect, where the 3D reconstruction quality is a step function in terms of wireless channel quality. We propose a novel scheme of point cloud delivery, called HoloCast, to gracefully improve the reconstruction quality with the improvement of wireless channel quality. HoloCast regards the 3D points and color components as graph signals and directly transmits linear-transformed signals based on the graph Fourier transform (GFT), without digital quantization and entropy coding operations. One of the main contributions of HoloCast is that the use of the GFT can deal with non-ordered and non-uniformly distributed multidimensional signals, such as holographic data, unlike conventional delivery schemes. Performance results with point cloud data show that HoloCast yields better 3D reconstruction quality compared to digital-based methods in noisy wireless environments.
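As a small illustration of the graph Fourier transform that HoloCast builds on (my sketch; HoloCast's actual graph construction, power allocation, and analog transmission chain are not shown): build a neighbourhood graph over the 3D points, form the graph Laplacian, and project a colour attribute onto its eigenvectors.

    import numpy as np

    def gft_basis(points, sigma=0.5, radius=1.0):
        # GFT basis for a small point cloud: connect points within `radius`
        # with Gaussian weights, form the combinatorial Laplacian L = D - W,
        # and return its eigenvectors (columns ordered by graph frequency).
        d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
        w = np.exp(-d2 / (2 * sigma**2)) * (d2 <= radius**2)
        np.fill_diagonal(w, 0.0)
        lap = np.diag(w.sum(axis=1)) - w
        _, eigvecs = np.linalg.eigh(lap)
        return eigvecs

    points = np.random.rand(64, 3)                 # toy point cloud
    color = np.random.rand(64)                     # one colour channel per point
    basis = gft_basis(points)
    gft_coeffs = basis.T @ color                   # forward GFT of the attribute
    assert np.allclose(basis @ gft_coeffs, color)  # inverse GFT recovers it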

Patent
09 Jul 2019
TL;DR: In this article, a variable code rate image coding system and method based on deep learning was proposed, and the system comprises a forward multi-scale decomposition transformation network module which decomposes an input original image into image features of multiple scales; a quantization module quantizes the image features into integers; a self-adaptive code rate distribution module is used for carrying out block-level code rate distributions on image features quantized into integers according to a given target code rate.
Abstract: The invention discloses a variable code rate image coding system and method based on deep learning. The system comprises a forward multi-scale decomposition transformation network module, which decomposes an input original image into image features of multiple scales; a quantization module, which quantizes the image features into integers; a self-adaptive code rate distribution module, which carries out block-level code rate distribution on the quantized image features according to a given target code rate; and an entropy encoding and decoding module, which encodes the image features subjected to code rate distribution into binary code streams. Meanwhile, the invention provides a variable code rate image decoding system and method for decoding the codes formed by the encoding system and the encoding method. Forward and inverse multi-scale decomposition transforms are constructed using a deep convolutional neural network, training is carried out with a large amount of data to obtain optimal model parameters, and variable code rate image coding and decoding can be realized in practical applications in combination with an adaptive code rate distribution method based on image complexity.

Patent
04 Jan 2019
TL;DR: In this paper, a large-capacity hiding method for reversible data in a bitstream encryption domain of a JPEG image was proposed, where a user performs exclusive or encryption on extension bits of entropy coding in each of image blocks according to an encryption key in JPEG bitstreams, performs pseudo-random scrambling on remaining AC coefficients except a last non-zero AC coefficient, and then performs the pseudorandom scrambling of all the image blocks; a cloud performs embedding of extraneous information by a histogram shifting method on the AC coefficients in bitstream ciphertext according to a hiding key
Abstract: The invention discloses a large-capacity hiding method for reversible data in the bitstream encryption domain of a JPEG image. According to the method, a user performs exclusive-or encryption on the extension bits of the entropy coding in each of the image blocks according to an encryption key in the JPEG bitstreams, performs pseudo-random scrambling on the remaining AC coefficients except the last non-zero AC coefficient, and then performs pseudo-random scrambling on the entropy coding of all the image blocks; a cloud performs embedding of extraneous information by a histogram shifting method on the AC coefficients in the bitstream ciphertext according to a hiding key, and can perform non-destructive extraction; and a receiver performs pseudo-random scrambling resumption on the entropy coding of the ciphertext image blocks in the bitstream ciphertext according to the encryption key, and then the pseudo-random scrambling resumption and exclusive-or decryption of the AC coefficients are performed in the entropy coding of each of the ciphertext image blocks to obtain decrypted bitstreams the same as the original JPEG bitstreams. The method not only achieves encryption of the extension bits of the entropy coding, but also encrypts the Huffman coding of the entropy coding; the steganographic capacity is large, and the safety is high.

Posted Content
TL;DR: In this article, a graph Fourier transform (GFT) based point cloud delivery scheme is proposed to improve the 3D reconstruction quality with the improvement of wireless channel quality in noisy wireless environment.
Abstract: In conventional point cloud delivery, a sender uses octree-based digital video compression to stream three-dimensional (3D) points and the corresponding color attributes over band-limited links, e.g., wireless channels, for 3D scene reconstructions. However, the digital-based delivery schemes have an issue called the cliff effect, where the 3D reconstruction quality is a step function in terms of wireless channel quality. We propose a novel scheme of point cloud delivery, called HoloCast, to gracefully improve the reconstruction quality with the improvement of wireless channel quality. HoloCast regards the 3D points and color components as graph signals and directly transmits linear-transformed signals based on the graph Fourier transform (GFT), without digital quantization and entropy coding operations. One of the main contributions of HoloCast is that the use of the GFT can deal with non-ordered and non-uniformly distributed multi-dimensional signals, such as holographic data, unlike conventional delivery schemes. Performance results with point cloud data show that HoloCast yields better 3D reconstruction quality compared to digital-based methods in noisy wireless environments.