
Showing papers on "Entropy encoding published in 2019"


Proceedings ArticleDOI
TL;DR: A deep generative model for lossy video compression is presented that outperforms the state-of-the-art learned video compression networks based on motion compensation or interpolation and opens up novel video compression applications, which have not been feasible with classical codecs.
Abstract: In this paper we present a deep generative model for lossy video compression. We employ a model that consists of a 3D autoencoder with a discrete latent space and an autoregressive prior used for entropy coding. Both autoencoder and prior are trained jointly to minimize a rate-distortion loss, which is closely related to the ELBO used in variational autoencoders. Despite its simplicity, we find that our method outperforms the state-of-the-art learned video compression networks based on motion compensation or interpolation. We systematically evaluate various design choices, such as the use of frame-based or spatio-temporal autoencoders, and the type of autoregressive prior. In addition, we present three extensions of the basic method that demonstrate the benefits over classical approaches to compression. First, we introduce semantic compression, where the model is trained to allocate more bits to objects of interest. Second, we study adaptive compression, where the model is adapted to a domain with limited variability, e.g., videos taken from an autonomous car, to achieve superior compression on that domain. Finally, we introduce multimodal compression, where we demonstrate the effectiveness of our model in joint compression of multiple modalities captured by non-standard imaging sensors, such as quad cameras. We believe that this opens up novel video compression applications, which have not been feasible with classical codecs.
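For orientation (this is a generic statement of such objectives, not a formula quoted from the paper), a jointly trained autoencoder and autoregressive prior of this kind typically minimize a rate-distortion Lagrangian of the form

    \min_{\theta, \phi}\; \mathbb{E}_{x}\!\left[ d\big(x, g_\theta(\hat z)\big) \;+\; \beta \big(-\log_2 p_\phi(\hat z)\big) \right],
    \qquad \hat z = \mathrm{quantize}\big(f_\theta(x)\big),

where d is the distortion measure, the negative log-probability under the prior p_\phi gives the expected bit cost of entropy coding the latents, and \beta sets the rate-distortion trade-off; with a suitable variational relaxation of the quantizer this matches the negative ELBO up to constants.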

125 citations


Proceedings ArticleDOI
Fabian Mentzer1, Eirikur Agustsson1, Michael Tschannen1, Radu Timofte1, Luc Van Gool1 
15 Jun 2019
TL;DR: L3C as discussed by the authors is a fully parallelizable hierarchical probabilistic model for adaptive entropy coding which is optimized end-to-end for the compression task, and it outperforms the popular engineered codecs, PNG, WebP and JPEG 2000.
Abstract: We propose the first practical learned lossless image compression system, L3C, and show that it outperforms the popular engineered codecs, PNG, WebP and JPEG 2000. At the core of our method is a fully parallelizable hierarchical probabilistic model for adaptive entropy coding which is optimized end-to-end for the compression task. In contrast to recent autoregressive discrete probabilistic models such as PixelCNN, our method i) models the image distribution jointly with learned auxiliary representations instead of exclusively modeling the image distribution in RGB space, and ii) only requires three forward-passes to predict all pixel probabilities instead of one for each pixel. As a result, L3C obtains over two orders of magnitude speedups when sampling compared to the fastest PixelCNN variant (Multiscale-PixelCNN). Furthermore, we find that learning the auxiliary representation is crucial and outperforms predefined auxiliary representations such as an RGB pyramid significantly.
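As background on the adaptive entropy coding step (an illustrative sketch of the general principle, not code from the L3C release): once a model outputs a probability for every symbol, an adaptive entropy coder can approach the cross-entropy of those predictions, so the compressed size can be estimated directly from the model output.

    import numpy as np

    def estimated_code_length_bits(symbols, probs):
        # Ideal adaptive-entropy-coding cost of `symbols` (shape [N]) under a
        # model assigning probability probs[i, s] to symbol s at position i.
        # An arithmetic coder driven by these probabilities approaches this
        # bound to within a few bits.
        p = probs[np.arange(len(symbols)), symbols]
        return float(-np.log2(np.clip(p, 1e-12, 1.0)).sum())

    # Toy usage: 1000 8-bit "pixels" under a stand-in predicted distribution.
    rng = np.random.default_rng(0)
    pixels = rng.integers(0, 256, size=1000)
    probs = rng.dirichlet(np.ones(256), size=1000)
    print(estimated_code_length_bits(pixels, probs) / 8, "bytes (estimated)")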

114 citations


Posted Content
TL;DR: The proposed NLAIC framework embeds non-local operations in the encoders and decoders for both image and latent feature probability information to capture both local and global correlations, and applies an attention mechanism to generate masks that are used to weigh the features for the image and hyperprior.
Abstract: This paper proposes a novel Non-Local Attention Optimized Deep Image Compression (NLAIC) framework, which is built on top of the popular variational auto-encoder (VAE) structure. Our NLAIC framework embeds non-local operations in the encoders and decoders for both image and latent feature probability information (known as hyperprior) to capture both local and global correlations, and applies an attention mechanism to generate masks that are used to weigh the features for the image and hyperprior, implicitly adapting bit allocation for different features based on their importance. Furthermore, both hyperpriors and spatial-channel neighbors of the latent features are used to improve entropy coding. The proposed model outperforms the existing methods on the Kodak dataset, including learned (e.g., Balle2019, Balle2018) and conventional (e.g., BPG, JPEG2000, JPEG) image compression methods, for both PSNR and MS-SSIM distortion metrics.

62 citations


Proceedings ArticleDOI
Georgios Georgiadis1
15 Jun 2019
TL;DR: In this article, the authors propose a three-stage compression and acceleration pipeline that sparsifies, quantizes and encodes activation maps of Convolutional Neural Networks (CNNs).
Abstract: The deep learning revolution brought us an extensive array of neural network architectures that achieve state-of-the-art performance in a wide variety of Computer Vision tasks including, among others, classification, detection and segmentation. In parallel, we have also been observing an unprecedented demand for computational and memory resources, rendering the efficient use of neural networks in low-powered devices virtually unattainable. Towards this end, we propose a three-stage compression and acceleration pipeline that sparsifies, quantizes and entropy encodes activation maps of Convolutional Neural Networks. Sparsification increases the representational power of activation maps leading to both acceleration of inference and higher model accuracy. Inception-V3 and MobileNet-V1 can be accelerated by as much as 1.6x with an increase in accuracy of 0.38% and 0.54% on the ImageNet and CIFAR-10 datasets respectively. Quantizing and entropy coding the sparser activation maps lead to higher compression over the baseline, reducing the memory cost of the network execution. Inception-V3 and MobileNet-V1 activation maps, quantized to 16 bits, are compressed by as much as 6x with an increase in accuracy of 0.36% and 0.55% respectively.
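To make the three stages concrete, here is a toy sketch (my own, with made-up parameter values and a random activation map; the empirical entropy of the quantized values stands in for an actual entropy coder):

    import numpy as np

    def compress_activation_estimate(act, threshold=0.1, bits=16):
        # Toy sparsify -> quantize -> entropy-code pipeline for one activation
        # map; returns an estimated compressed size in bits.
        sparse = np.where(np.abs(act) < threshold, 0.0, act)        # sparsify
        scale = (2 ** (bits - 1) - 1) / max(np.abs(sparse).max(), 1e-9)
        q = np.round(sparse * scale).astype(np.int32)               # quantize
        values, counts = np.unique(q, return_counts=True)
        p = counts / counts.sum()
        bits_per_symbol = float(-(p * np.log2(p)).sum())            # empirical entropy
        return bits_per_symbol * q.size

    act = np.random.randn(64, 56, 56).astype(np.float32)            # fake feature map
    print(compress_activation_estimate(act) / 8 / 1024, "KiB (estimated)")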

53 citations


Proceedings ArticleDOI
14 Aug 2019
TL;DR: In this article, a 3D autoencoder with a discrete latent space and an autoregressive prior is used for entropy coding, which is similar to the ELBO used in variational autoencoders.
Abstract: In this paper we present a deep generative model for lossy video compression. We employ a model that consists of a 3D autoencoder with a discrete latent space and an autoregressive prior used for entropy coding. Both autoencoder and prior are trained jointly to minimize a rate-distortion loss, which is closely related to the ELBO used in variational autoencoders. Despite its simplicity, we find that our method outperforms the state-of-the-art learned video compression networks based on motion compensation or interpolation. We systematically evaluate various design choices, such as the use of frame-based or spatio-temporal autoencoders, and the type of autoregressive prior. In addition, we present three extensions of the basic method that demonstrate the benefits over classical approaches to compression. First, we introduce semantic compression, where the model is trained to allocate more bits to objects of interest. Second, we study adaptive compression, where the model is adapted to a domain with limited variability, e.g., videos taken from an autonomous car, to achieve superior compression on that domain. Finally, we introduce multimodal compression, where we demonstrate the effectiveness of our model in joint compression of multiple modalities captured by non-standard imaging sensors, such as quad cameras. We believe that this opens up novel video compression applications, which have not been feasible with classical codecs.

53 citations


Journal ArticleDOI
TL;DR: A high performance architecture to implement the aforementioned 2-D transform types for 4×4, 8×8, 16×16, and 32×32 sizes, synthesized for low-, medium-, and high-end FPGA chips, with a moderate consumption of hardware resources.
Abstract: Versatile video coding (VVC) will be the next generation video coding standard, which is expected to replace HEVC in CE devices, such as tablets, smartphones, and TV sets beyond 2020. The new standard will still be based on transform, quantization, and entropy coding, but a multiple transform selection scheme has been proposed, involving three different types of 2-D Discrete Sine/Cosine transforms (DCT-II, DCT-VIII, and DST-VII), and the transform unit sizes range from 4×4 to 64×64. To handle the computational complexity of these algorithms, it is useful to explore hardware solutions that could be employed as accelerators. In this paper, a high performance architecture to implement the aforementioned 2-D transform types for 4×4, 8×8, 16×16, and 32×32 sizes is proposed. The design has been synthesized for low-, medium-, and high-end FPGA chips, being able to process up to 23 fps at 3840×2160 for 32×32 transform sizes and up to 86 fps at 3840×2160 for pictures containing an even distribution of the four block sizes. Moreover, these performance results have been obtained with a moderate consumption of hardware resources.
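The separable structure that such hardware exploits computes the 2-D transform as a 1-D transform over rows followed by one over columns; a floating-point sketch is below (the standardized transforms are fixed-point integer approximations of DCT-II, DCT-VIII and DST-VII, which this sketch does not reproduce):

    import numpy as np

    def dct2_matrix(n):
        # Orthonormal DCT-II matrix of size n x n.
        k = np.arange(n)[:, None]
        i = np.arange(n)[None, :]
        c = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
        c[0, :] *= 1 / np.sqrt(2)
        return c * np.sqrt(2.0 / n)

    def transform_2d(block, c):
        # Row-column decomposition: one 1-D transform over rows, one over columns.
        return c @ block @ c.T

    c32 = dct2_matrix(32)
    block = np.random.randint(-128, 128, size=(32, 32)).astype(float)
    coeffs = transform_2d(block, c32)
    assert np.allclose(c32.T @ coeffs @ c32, block)   # inverse recovers the block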

35 citations


Journal ArticleDOI
Yanjie Song1, Zhiliang Zhu1, Wei Zhang1, Li Guo1, Xue Yang1, Hai Yu1 
TL;DR: Experimental and analytical results illustrate the superiority of the proposed joint scheme compared with the existing compression–encryption schemes and JPEG, as well as good encryption performance.
Abstract: Recently, compressive sensing (CS)-based joint compression–encryption schemes have been widely investigated due to their high efficiency and good security for images. However, the existing schemes typically have a lower compression ratio (CR), and there may be a flaw during their compression processes. Therefore, in this paper, according to the intrinsic features of images, we propose a novel compression architecture to enhance the CR. Meanwhile, based on this architecture, a joint image compression–encryption scheme using entropy coding and CS is designed to implement a complete compression and encryption process. In this joint scheme, a presented bit-level lossless compression–encryption algorithm based on entropy coding for the higher bit-planes is incorporated to improve the quality of the reconstructed image and ensure the security. In addition, this joint scheme contains an improved CS-based lossy compression–encryption algorithm for the lower bit-planes, which can guarantee the efficiency and security. Through the cooperation between the proposed lossless and lossy coding, higher reconstruction performance can be achieved. SHA-256 is combined with all initial keys in the proposed joint scheme to generate the updated keys for the chaos cryptosystem to maintain high security and resist some common attacks. Experimental and analytical results illustrate the superiority of the proposed joint scheme compared with the existing compression–encryption schemes and JPEG, as well as good encryption performance.

35 citations


Proceedings ArticleDOI
26 Mar 2019
TL;DR: It is shown that the coding efficiency of transform coding can be improved by replacing scalar quantization with trellis-coded quantization (TCQ) and using advanced entropy coding techniques for coding the quantization indexes.
Abstract: In state-of-the-art video coding, the prediction error signals are transmitted using transform coding, which consists of an orthogonal transform, scalar quantization, and entropy coding of the quantization indexes. We show that the coding efficiency of transform coding can be improved by replacing scalar quantization with trellis-coded quantization (TCQ) and using advanced entropy coding techniques for coding the quantization indexes. The proposed approach was implemented into the first test model (VTM-1) of the new standardization project Versatile Video Coding (VVC). Our coding experiments yielded average bit-rate savings of 4.9% for intra-only coding and 3.3% for typical random access configurations, where bit-rate savings of 3.5% (intra-only) and 2.4% (random access) can be attributed to the usage of TCQ. These coding gains are obtained at a 5-10% increase in encoder run time and without any change in decoder run time.
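For orientation (a generic formulation, not one quoted from the paper): both scalar quantization and TCQ can be viewed as picking the quantization-index sequence q that minimizes a rate-distortion Lagrangian, with TCQ letting each reconstruction depend on a trellis state driven by the previous indexes and carrying out the minimization with a Viterbi-style search:

    q^{*} = \arg\min_{q} \sum_{i}\big(t_i - r(q_i, s_i)\big)^2 + \lambda\, R(q),
    \qquad s_{i+1} = \delta(s_i, q_i),

where t_i are the transform coefficients, r(q_i, s_i) is the state-dependent reconstruction level, R(q) is the entropy-coded rate of the index sequence, and \delta is the trellis transition function.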

29 citations


Journal ArticleDOI
TL;DR: This paper proposes two additional steps that further improve the compression rate of Gaussian random projections: a decimation preprocessing step tailored to attenuate frequency components in which PRNU traces are already suppressed in JPEG compressed images, and a dead-zone quantizer that enables an entropy coding scheme to save bitrate when storing PRNU fingerprints or sending residuals over a communication channel.
Abstract: In the last decade, the extremely rapid proliferation of digital devices capable of acquiring and sharing images over the Web has significantly increased the amount of digital images publicly accessible by everyone with Internet access. Despite the obvious benefits of such technological improvements, it is becoming mandatory to verify the origin and trustworthiness of such shared pictures. Photo response non-uniformity (PRNU) is the reference signal for forensic investigators when it comes to verifying or identifying which camera device shot a picture under analysis. In spite of this, PRNU is almost a white-shaped noise, thus being very difficult to compress for storage or large scale search purposes, which are frequent investigation scenarios. To overcome the issue, the forensic community has developed a series of compression algorithms. Lately, Gaussian random projections have proved to achieve state-of-the-art performance. In this paper, we propose two additional steps that further improve the compression rate of Gaussian random projections: 1) a decimation preprocessing step tailored to attenuate frequency components in which PRNU traces are already suppressed in JPEG compressed images and 2) a dead-zone quantizer (rather than the commonly used binary one) that enables an entropy coding scheme to save bitrate when storing PRNU fingerprints or sending residuals over a communication channel. Reported results show the effectiveness of the proposed improvements, both under controlled JPEG compression and in a real case scenario.
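A dead-zone quantizer of the kind mentioned above maps small-magnitude values to zero and quantizes the rest uniformly; a minimal sketch follows (parameter names and values are illustrative, not taken from the paper):

    import numpy as np

    def deadzone_quantize(x, step, deadzone):
        # |x| <= deadzone maps to index 0; larger magnitudes are quantized
        # uniformly with the given step. Indexes are then entropy coded.
        mag = np.maximum(np.abs(x) - deadzone, 0.0)
        return (np.sign(x) * np.ceil(mag / step)).astype(np.int64)

    def deadzone_dequantize(q, step, deadzone):
        # Reconstruct at the centre of each nonzero quantization cell.
        return np.sign(q) * (deadzone + (np.abs(q) - 0.5) * step) * (q != 0)

    residual = np.random.laplace(scale=2.0, size=10_000)   # stand-in residual signal
    q = deadzone_quantize(residual, step=1.0, deadzone=1.5)
    rec = deadzone_dequantize(q, step=1.0, deadzone=1.5)
    print("zero fraction:", np.mean(q == 0), " MSE:", np.mean((rec - residual) ** 2))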

28 citations


Journal ArticleDOI
11 Mar 2019-Sensors
TL;DR: A detailed comparative analysis of EXPer with other state-of-the-art encryption algorithms confirms that EXPer provides significant confidentiality with a small computational cost and a negligible encryption bitrate overhead, demonstrating that the proposed security scheme is a suitable choice for constrained devices in an Internet of Multimedia Things environment.
Abstract: Within an Internet of Multimedia Things, the risk of disclosing streamed video content, such as that arising from video surveillance, is of heightened concern. This leads to the encryption of that content. To reduce the overhead and the lack of flexibility arising from full encryption of the content, a good number of selective-encryption algorithms have been proposed in the last decade. Some of them have limitations, in terms of: significant delay due to computational cost, or excess memory utilization, or, despite being energy efficient, not providing a satisfactory level of confidentiality, due to their simplicity. To address such limitations, this paper presents a lightweight selective encryption scheme, in which encoder syntax elements are encrypted with the innovative EXPer (extended permutation with exclusive OR). The selected syntax elements are taken from the final stage of video encoding, that is, the entropy coding stage. As a diagnostic tool, the Encryption Space Ratio measures the encoding complexity of the video relative to the level of encryption so as to judge the success of the encryption process, according to the entropy coder employed. A detailed comparative analysis of EXPer with other state-of-the-art encryption algorithms confirms that EXPer provides significant confidentiality with a small computational cost and a negligible encryption bitrate overhead. Thus, the results demonstrate that the proposed security scheme is a suitable choice for constrained devices in an Internet of Multimedia Things environment.

26 citations


Proceedings ArticleDOI
02 Jul 2019
TL;DR: In this article, a deep learning based channel state matrix compression scheme, called DeepCMC, composed of convolutional layers followed by quantization and entropy coding blocks, was proposed for massive MIMO systems.
Abstract: Massive multiple-input multiple-output (MIMO) systems require downlink channel state information (CSI) at the base station (BS) to better utilize the available spatial diversity and multiplexing gains. However, in a frequency division duplex (FDD) massive MIMO system, the huge CSI feedback overhead becomes restrictive and degrades the overall spectral efficiency. In this paper, we propose a deep learning based channel state matrix compression scheme, called DeepCMC, composed of convolutional layers followed by quantization and entropy coding blocks. Simulation results demonstrate that DeepCMC significantly outperforms the state of the art compression schemes in terms of the reconstruction quality of the channel state matrix for the same compression rate, measured in bits per channel dimension.

Journal ArticleDOI
TL;DR: Two new joint encryption and compression schemes are proposed, where one scheme emphasizes compression performance and the other highlights protection performance; performance evaluations using various criteria show that the first scheme has better compression efficiency, while the second scheme has better defense ability against the statistical attack.

Journal ArticleDOI
TL;DR: The proposed FreeCast achieves graceful video quality that improves with wireless channel quality under a low overhead requirement, by exploiting a fitting function based on a multidimensional Gaussian Markov random field model for overhead reduction to mitigate the rate and power loss caused by large overhead.
Abstract: Wireless multi-view plus depth (MVD) video streaming enables free viewpoint video playback on wireless devices, where a viewer can freely synthesize any preferred virtual viewpoint from the received MVD frames. Existing schemes of wireless MVD streaming use digital-based compression to achieve better coding efficiency. However, the digital-based schemes have an issue called the cliff effect, where the video quality is a step function in terms of wireless channel quality. In addition, parameter optimization to assign quantization levels and transmission power across MVD frames is cumbersome. To realize high-quality wireless MVD video streaming, we propose a novel graceful video delivery scheme, called FreeCast. FreeCast directly transmits linear-transformed signals based on 5-D discrete cosine transform, without digital quantization and entropy coding operations. In addition, we exploit a fitting function based on a multidimensional Gaussian Markov random field model for overhead reduction to mitigate rate and power loss due to large overhead. The proposed FreeCast achieves graceful video quality with the improvement of wireless channel quality under a low overhead requirement. In addition, the parameter optimization to achieve the highest video quality can be simplified by only controlling the transmission power assignment. Performance results with several test MVD video sequences show that FreeCast yields better video quality in band-limited environments by significantly decreasing the amount of overhead. For instance, the structural similarity (SSIM) performance of FreeCast is approximately 0.127 higher than that of the existing graceful video delivery schemes across wireless channel quality, i.e., signal-to-noise ratio, of 0–25 dB at a transmission symbol rate of 37.5 Msymbols/s.

Journal ArticleDOI
TL;DR: A sequence statistical code based data compression algorithm is proposed to improve the energy efficiency of sensors, using SDC and FOST codes to achieve a better compression ratio.
Abstract: Sensors play an integral part in the technologically advanced real world. Wireless sensors are powered by batteries with limited capacity. Hence, energy efficiency is one of the major issues with wireless sensors. Many techniques have been proposed in order to improve sensor efficiency. This paper discusses improving the energy efficiency of sensors through data compression. A sequence statistical code based data compression algorithm is proposed to improve the energy efficiency of sensors. SDC and FOST codes are used in this algorithm in order to achieve a better compression ratio. The simulation results were compared with arithmetic data compression techniques. The computation process of the proposed algorithm is much simpler than that of arithmetic data compression techniques.

Posted Content
Siwei Dong1, Lin Zhu1, Daoyuan Xu1, Yonghong Tian1, Tiejun Huang1 
TL;DR: This work proposes an intensity-based measurement for spike train distance and designs an efficient coding method to meet the challenge, by investigating the spatiotemporal distribution of the spikes.
Abstract: Recently, a novel bio-inspired spike camera has been proposed, which continuously accumulates luminance intensity and fires spikes when the dispatch threshold is reached. Compared to conventional frame-based cameras and the emerging dynamic vision sensors, the spike camera has shown great advantages in capturing fast-moving scenes in a frame-free manner with full texture reconstruction capabilities. However, it is difficult to transmit or store the large amount of spike data. To address this problem, we first investigate the spatiotemporal distribution of inter-spike intervals and propose an intensity-based measurement of spike train distance. Then, we design an efficient spike coding method, which integrates the techniques of adaptive temporal partitioning, intra-/inter-pixel prediction, quantization and entropy coding into a unified lossy coding framework. Finally, we construct a PKU-Spike dataset captured by the spike camera to evaluate the compression performance. The experimental results on the dataset demonstrate that the proposed approach is effective in compressing such spike data while maintaining the fidelity.

Proceedings ArticleDOI
01 Nov 2019
TL;DR: This paper proposes a learning based Semantically Structured Coding (SSC) framework to generate a Semantically Structured Bit-stream (SSB), where each part of the bit-stream represents a certain object and can be directly used for the aforementioned tasks.
Abstract: With the development of 5G and edge computing, it is increasingly important to offload intelligent media computing to edge devices. Traditional media coding schemes code the media into one binary stream without a semantic structure, which prevents many important intelligent applications from operating directly at the bit-stream level, including semantic analysis, parsing specific content, media editing, etc. Therefore, in this paper, we propose a learning based Semantically Structured Coding (SSC) framework to generate a Semantically Structured Bit-stream (SSB), where each part of the bit-stream represents a certain object and can be directly used for the aforementioned tasks. Specifically, we integrate an object detection module in our compression framework to locate and align the object in the feature domain. After applying quantization and entropy coding, the features are re-organized according to the detected and aligned objects to form a bit-stream. In addition, different from existing learning-based compression schemes that individually train models for specific bit-rates, we share most of the model parameters among various bit-rates to significantly reduce the model size for variable-rate compression. Experimental results demonstrate that, only at the cost of negligible overhead, objects can be completely reconstructed from a partial bit-stream. We also verified that classification and pose estimation can be directly performed on a partial bit-stream without performance degradation.

Posted Content
TL;DR: A deep learning based channel state matrix compression scheme, called DeepCMC, composed of convolutional layers followed by quantization and entropy coding blocks is proposed, which significantly outperforms the state of the art compression schemes in terms of the reconstruction quality of theChannel state matrix for the same compression rate.
Abstract: Coded caching provides significant gains over conventional uncoded caching by creating multicasting opportunities among distinct requests. Massive multiple-input multiple-output (MIMO) systems require downlink channel state information (CSI) at the base station (BS) to better utilize the available spatial diversity and multiplexing gains. However, in a frequency division duplex (FDD) massive MIMO system, the huge CSI feedback overhead becomes restrictive and degrades the overall spectral efficiency. In this paper, we propose a deep learning based channel state matrix compression scheme, called DeepCMC, composed of convolutional layers followed by quantization and entropy coding blocks. In comparison with previous works, the main contributions of DeepCMC are two-fold: i) DeepCMC is fully convolutional, and it can be used in a wide range of scenarios with various numbers of sub-channels and transmit antennas; ii) DeepCMC includes quantization and entropy coding blocks and minimizes a cost function that accounts for both the rate of compression and the reconstruction quality of the channel matrix at the BS. Simulation results demonstrate that DeepCMC significantly outperforms the state of the art compression schemes in terms of the reconstruction quality of the channel state matrix for the same compression rate, measured in bits per channel dimension.

Proceedings ArticleDOI
26 May 2019
TL;DR: This paper focuses on the syntax elements of inter prediction information that consists of merge flag, merge index, reference index, motion vector difference and motion vector prediction index in HEVC under low-delay P (LDP) setting.
Abstract: Entropy coding is a fundamental technique in video coding to remove the statistical redundancy in syntax elements. Currently, context-adaptive binary arithmetic coding (CABAC) is used as the entropy coding tool in HEVC. Considering that the manually designed binarization and context models are not flexible enough to estimate the probability of the syntax elements, we use neural networks to estimate the probability of the syntax elements; the estimated probabilities, together with the values of the syntax elements, are then fed into an arithmetic coding engine to perform entropy coding. In this paper, we focus on the syntax elements of inter prediction information, which consist of the merge flag, merge index, reference index, motion vector difference and motion vector prediction index in HEVC under the low-delay P (LDP) setting. Compared with the previous work on neural network-based arithmetic coding for intra prediction modes and intra DC coefficients, there are three new characteristics in this paper. First, surrounding syntax elements are directly fed into the neural network without converting to reconstructed pixels. Second, unified neural networks are designed for different prediction block sizes. Finally, dependency among the syntax elements in the current prediction unit is omitted to improve parallelism. Experimental results show that, compared with HEVC, our proposed method achieves up to 0.5% and on average 0.3% BD-rate reduction in the LDP configuration.
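To see why probability estimation drives the bit cost (an illustrative sketch, not the authors' network or coding engine): an ideal binary arithmetic coder fed an estimated probability p that a bin equals 1 spends about -log2(p) bits on a 1 and -log2(1-p) bits on a 0, so sharper estimates of the syntax-element bins translate directly into rate savings.

    import math

    def bin_cost_bits(bins, probs_of_one):
        # Ideal cost, in bits, of coding binary syntax-element bins with a
        # binary arithmetic coder driven by per-bin probability estimates
        # (e.g., produced by a neural network instead of CABAC context models).
        total = 0.0
        for b, p in zip(bins, probs_of_one):
            p = min(max(p, 1e-6), 1 - 1e-6)      # keep probabilities valid
            total += -math.log2(p if b == 1 else 1 - p)
        return total

    # Hypothetical merge-flag bins and the probabilities a model assigned to them.
    print(bin_cost_bits([1, 0, 0, 1, 0], [0.8, 0.1, 0.3, 0.9, 0.2]), "bits")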

Posted Content
Haojie Liu, Tong Chen, Ming Lu, Qiu Shen, Zhan Ma 
TL;DR: A neural video compression framework, leveraging the spatial and temporal priors, independently and jointly to exploit the correlations in intra texture, optical flow based temporal motion and residuals and to accurately model the signal distribution for entropy coding.
Abstract: The pursuit of higher compression efficiency continuously drives the advances of video coding technologies. Fundamentally, we wish to find better "predictions" or "priors" that are reconstructed previously to remove the signal dependency efficiently and to accurately model the signal distribution for entropy coding. In this work, we propose a neural video compression framework, leveraging the spatial and temporal priors, independently and jointly, to exploit the correlations in intra texture, optical flow based temporal motion and residuals. Spatial priors are generated using downscaled low-resolution features, while temporal priors (from previous reference frames and residuals) are captured using a convolutional neural network based long short-term memory (ConvLSTM) structure in a temporal recurrent fashion. All of these parts are connected and trained jointly towards the optimal rate-distortion performance. Compared with the High-Efficiency Video Coding (HEVC) Main Profile (MP), our method has demonstrated an average 38% Bjontegaard-Delta Rate (BD-Rate) improvement using standard common test sequences, where the distortion metric is multi-scale structural similarity (MS-SSIM).

Proceedings ArticleDOI
01 Sep 2019
TL;DR: For improving coding efficiency relative to the state-of-the-art video coding standard HEVC, the following modifications are proposed: Replacing scalar quantization with trellis-coded quantization; and utilizing additional statistical dependencies between quantization indexes for entropy coding.
Abstract: One key component of all block-based hybrid video codecs is transform coding of prediction residues, which consists of an orthogonal block transform, scalar quantization of transform coefficients, and entropy coding of the resulting quantization indexes. For improving coding efficiency relative to the state-of-the-art video coding standard HEVC, we propose the following modifications: (1) Replacing scalar quantization with trellis-coded quantization; and (2) utilizing additional statistical dependencies between quantization indexes for entropy coding. The proposed approach was integrated into the first test model VTM-1 for the new standardization project Versatile Video Coding (VVC). Our coding experiments showed average bit-rate savings of 4.9 % for intra-only, 3.4 % for random access, and 2.8 % for low-delay configurations.

Journal ArticleDOI
TL;DR: This work proposes a novel lossless recompression approach to eliminate statistical redundancy in the coded video bit stream by increasing the probability of repetitive patterns of continuous "0" or "1" symbol strings in the binary symbol sequence through rearranging the sequence.
Abstract: With the increasing amount of HD or even UHD video, video streaming transmission over the mobile network is confronted with various quality of experience issues. To facilitate the delivery of compressed video over the network, it is extremely helpful to further compress the coded video bit stream without any information loss, so-called lossless recompression. However, the existing lossless video/image compression algorithms aim to remove the pixel-domain (e.g., lossless coding mode of HEVC) or the coefficient-domain (e.g., entropy coding) redundancy rather than the bit-stream-domain redundancy. To this end, we propose a novel lossless recompression approach to eliminate statistical redundancy in the coded video bit stream. The core idea is to increase the probability of repetitive patterns of continuous "0" or "1" symbol strings in the binary symbol sequence by rearranging the sequence, including value mapping and position aggregation. In particular, a Fibonacci-based mapping rule is used to convert the original sequence (those binary symbols in Fibonacci positions) into a new one with more continuous "0" or "1" symbols. Besides, we aggregate as many identical symbols as possible together to further increase the probability of the occurrence of repetitive patterns. Finally, the lossless compression algorithm and the asymptotic lossless compression scheme are designed to achieve a compact representation of the rearranged symbol sequence. We also regulate formal syntactic structures for the proposed mapping rule, aggregation algorithm, and recompression scheme so as to deterministically revert from the recoded version to the original bit stream. The experimental results on compressed H.264/AVC and H.265/HEVC video data show that our approach can considerably reduce the bit rate of streaming video.

Journal ArticleDOI
TL;DR: The experimental results show that the proposed approach outperforms JPEG, JPEG2000, BPG, and some mainstream neural network-based image compression methods, and produces better visual quality with clearer details and textures because more high-frequency coefficients can be preserved, thanks to the high-frequency prediction.
Abstract: In this paper, we propose to use deep neural networks for image compression in the wavelet transform domain. When the input image is transformed from the spatial pixel domain to the wavelet transform domain, one low-frequency sub-band (LF sub-band) and three high-frequency sub-bands (HF sub-bands) are generated. The low-frequency sub-band is first used to predict each high-frequency sub-band to eliminate redundancy between the sub-bands, after which the sub-bands are fed into different auto-encoders to do the encoding. In order to further improve the compression efficiency, we use a conditional probability model to estimate the context-dependent prior probability of the encoded codes, which can be used for entropy coding. The entire training process is unsupervised, and the auto-encoders and the conditional probability model are trained jointly. The experimental results show that the proposed approach outperforms JPEG, JPEG2000, BPG, and some mainstream neural network-based image compression methods. Furthermore, it produces better visual quality with clearer details and textures because more high-frequency coefficients can be preserved, thanks to the high-frequency prediction.
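As a small illustration of the sub-band split that the method operates on (assuming the PyWavelets package; the LF-to-HF prediction network and the auto-encoders themselves are not reproduced here):

    import numpy as np
    import pywt

    # One-level 2-D DWT: the image splits into a low-frequency sub-band (LL)
    # and three high-frequency detail sub-bands. In the paper's pipeline the
    # LL band would be used to predict each HF band before the sub-bands are
    # fed to separate auto-encoders; that learned part is not shown here.
    image = np.random.rand(256, 256)
    ll, (detail_h, detail_v, detail_d) = pywt.dwt2(image, "haar")
    print(ll.shape, detail_h.shape, detail_v.shape, detail_d.shape)   # each 128 x 128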

Journal ArticleDOI
Jianyu Lin1
TL;DR: A new lossless compression scheme that compresses the initially-acquired continuous-intensity images with a lossy compression algorithm to obtain higher compression efficiency is proposed, and the entropy-constrained scalar quantization is implemented using a novel and simple thresholding method.
Abstract: A new lossless compression scheme that compresses the initially-acquired continuous-intensity images with a lossy compression algorithm to obtain higher compression efficiency is proposed. Even though a lossy algorithm is employed, there is no loss of data for the decoded original images, in the same sense as the conventional lossless scheme. To realize the new idea, the compression efficiency of the existing lossy subband compression algorithm is improved at high bitrates. For the entropy coding part, a run-length based, symbol-grouping entropy coding method is introduced. For the quantization part, the entropy-constrained scalar quantization is implemented using a novel and simple thresholding method. Coding results show that the bit savings of the proposed lossless scheme, which employs a lossy algorithm, over the conventional lossless scheme achieve a maximum of 27.2% and an average of 11.4% in our test.
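The abstract does not spell out the symbol-grouping scheme, but as generic background, run-length coding groups runs of zeros with the following nonzero level; a minimal sketch:

    import numpy as np

    def zero_run_length(coeffs):
        # Convert a 1-D array of quantized coefficients into (run, level) pairs:
        # `run` counts preceding zeros, `level` is the next nonzero value. A
        # trailing run of zeros is emitted as a single end marker (run, 0).
        pairs, run = [], 0
        for c in coeffs:
            if c == 0:
                run += 1
            else:
                pairs.append((run, int(c)))
                run = 0
        if run:
            pairs.append((run, 0))
        return pairs

    q = np.array([5, 0, 0, -2, 0, 0, 0, 1, 0, 0])
    print(zero_run_length(q))   # [(0, 5), (2, -2), (3, 1), (2, 0)]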

Journal ArticleDOI
03 Jul 2019-Sensors
TL;DR: This work shows that the proposed analog JSCC system exhibits a performance similar to that of the digital scheme based on JPEG compression, with noticeably less visual degradation to the human eye, a lower computational complexity, and a negligible delay.
Abstract: An analog joint source-channel coding (JSCC) system designed for the transmission of still images is proposed and its performance is compared to that of two digital alternatives which differ in the source encoding operation: Joint Photographic Experts Group (JPEG) and JPEG without entropy coding (JPEGw/oEC), respectively, both relying on an optimized channel encoder–modulator tandem. Apart from a visual comparison, the figures of merit considered in the assessment are the structural similarity (SSIM) index and the time required to transmit an image through additive white Gaussian noise (AWGN) and Rayleigh channels. This work shows that the proposed analog system exhibits a performance similar to that of the digital scheme based on JPEG compression, with noticeably less visual degradation to the human eye, a lower computational complexity, and a negligible delay. These results confirm the suitability of analog JSCC for the transmission of still images in scenarios with severe constraints on power consumption, computational capabilities, and for real-time applications. For these reasons the proposed system is a good candidate for surveillance systems, low-constrained devices, Internet of things (IoT) applications, etc.

Journal ArticleDOI
TL;DR: In this paper, an alternative approach based on processing the residual block with integer-to-integer (i2i) transforms was proposed, which can map integer pixels to integer transform coefficients without increasing the dynamic range and can be used for lossless compression.
Abstract: Video coding standards are primarily designed for efficient lossy compression, but it is also desirable to support efficient lossless compression within video coding standards using small modifications to the lossy coding architecture. A simple approach is to skip transform and quantization, and simply entropy code the prediction residual. However, this approach is inefficient at compression. A more efficient and popular approach is to skip transform and quantization but also process the residual block in some modes with differential pulse code modulation (DPCM), along the horizontal or vertical direction, prior to entropy coding. This paper explores an alternative approach based on processing the residual block with integer-to-integer (i2i) transforms. I2i transforms can map integer pixels to integer transform coefficients without increasing the dynamic range and can be used for lossless compression. We focus on lossless intra coding and develop novel i2i approximations of the odd type-3 discrete sine transform (ODST-3). Experimental results with the high efficiency video coding (HEVC) reference software show that when the developed i2i approximations of the ODST-3 are used along the DPCM method of HEVC, an average 2.7% improvement of lossless intra frame compression efficiency is achieved over HEVC version 2, which uses only the DPCM method, without a significant increase in computational complexity.
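For context on the DPCM baseline that the i2i transforms are compared against (my sketch, not the HEVC reference software): horizontal residual DPCM replaces each residual sample by its difference from its left neighbour, which is exactly invertible and therefore usable for lossless coding.

    import numpy as np

    def horizontal_dpcm(block):
        # Each sample becomes its difference from the sample to its left
        # (first column kept as-is). Exactly invertible, hence lossless.
        out = block.astype(np.int64).copy()
        out[:, 1:] -= block[:, :-1]
        return out

    def inverse_horizontal_dpcm(dpcm):
        return np.cumsum(dpcm, axis=1)

    block = np.random.randint(0, 256, size=(8, 8))
    assert np.array_equal(inverse_horizontal_dpcm(horizontal_dpcm(block)), block)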

Proceedings ArticleDOI
12 May 2019
TL;DR: A multi-channel ECG lossless compression scheme which uses adaptive linear prediction for intra- and inter-channel decorrelation; the coefficients of the linear predictor and the Golomb-Rice codec are self-adjusted during the process.
Abstract: Electrocardiogram (ECG) is the recording of the heart's electrical activity and is used to diagnose heart disease nowadays. The diagnosis normally requires a large amount of time to acquire enough multi-channel data. Thus, storage and transmission of 12-lead ECG data result in massive cost. In this work, we propose a multi-channel ECG lossless compression scheme which uses adaptive linear prediction for intra- and inter-channel decorrelation. The proposed technique is based on an adaptive Golomb-Rice codec for entropy coding with adaptive linear prediction. Thus, the coefficients of the linear predictor and the Golomb-Rice codec are self-adjusted during the process. Finally, we evaluate the proposed algorithm with the MIT-BIH Arrhythmia database for single-channel compression and the PTB database for multi-channel compression.
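A minimal Golomb-Rice encoder of the kind referenced above is sketched below (the paper's adaptive selection of the Rice parameter and its adaptive linear predictor are not reproduced; k is fixed here):

    def golomb_rice_encode(residuals, k):
        # Encode signed prediction residuals with a Golomb-Rice code of
        # parameter k: map each residual to a non-negative integer (zigzag),
        # then emit the quotient in unary and the remainder in k bits.
        out = []
        for r in residuals:
            u = 2 * r if r >= 0 else -2 * r - 1        # zigzag mapping
            q, rem = u >> k, u & ((1 << k) - 1)
            out.append("1" * q + "0" + format(rem, "b").zfill(k))
        return "".join(out)

    # Toy usage on a few prediction residuals; in the paper's codec, k would
    # be adapted to the local residual statistics.
    print(golomb_rice_encode([0, -1, 3, -4, 2], k=1))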

Proceedings ArticleDOI
20 May 2019
TL;DR: This work proposes a novel scheme of point cloud delivery, called HoloCast, to gracefully improve the reconstruction quality with the improvement of wireless channel quality, and shows that HoloCast yields better 3D reconstruction quality compared to digital-based methods in noisy wireless environment.
Abstract: In conventional point cloud delivery, a sender uses octree-based digital video compression to stream three-dimensional (3D) points and the corresponding color attributes over band-limited links, e.g., wireless channels, for 3D scene reconstructions. However, the digital-based delivery schemes have an issue called the cliff effect, where the 3D reconstruction quality is a step function in terms of wireless channel quality. We propose a novel scheme of point cloud delivery, called HoloCast, to gracefully improve the reconstruction quality with the improvement of wireless channel quality. HoloCast regards the 3D points and color components as graph signals and directly transmits linear-transformed signals based on the graph Fourier transform (GFT), without digital quantization and entropy coding operations. One of the main contributions of HoloCast is that the use of the GFT can deal with non-ordered and non-uniformly distributed multidimensional signals, such as holographic data, unlike conventional delivery schemes. Performance results with point cloud data show that HoloCast yields better 3D reconstruction quality compared to digital-based methods in noisy wireless environments.
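As a small illustration of the graph Fourier transform that HoloCast builds on (my sketch; HoloCast's actual graph construction, power allocation, and analog transmission chain are not shown): build a neighbourhood graph over the 3D points, form the graph Laplacian, and project a colour attribute onto its eigenvectors.

    import numpy as np

    def gft_basis(points, sigma=0.5, radius=1.0):
        # GFT basis for a small point cloud: connect points within `radius`
        # with Gaussian weights, form the combinatorial Laplacian L = D - W,
        # and return its eigenvectors (columns ordered by graph frequency).
        d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
        w = np.exp(-d2 / (2 * sigma**2)) * (d2 <= radius**2)
        np.fill_diagonal(w, 0.0)
        lap = np.diag(w.sum(axis=1)) - w
        _, eigvecs = np.linalg.eigh(lap)
        return eigvecs

    points = np.random.rand(64, 3)                 # toy point cloud
    color = np.random.rand(64)                     # one colour channel per point
    basis = gft_basis(points)
    gft_coeffs = basis.T @ color                   # forward GFT of the attribute
    assert np.allclose(basis @ gft_coeffs, color)  # inverse GFT recovers it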

Patent
09 Jul 2019
TL;DR: In this article, a variable code rate image coding system and method based on deep learning was proposed, and the system comprises a forward multi-scale decomposition transformation network module which decomposes an input original image into image features of multiple scales; a quantization module quantizes the image features into integers; a self-adaptive code rate distribution module is used for carrying out block-level code rate distributions on image features quantized into integers according to a given target code rate.
Abstract: The invention discloses a variable code rate image coding system and method based on deep learning. The system comprises a forward multi-scale decomposition transformation network module, which decomposes an input original image into image features of multiple scales; a quantization module, which quantizes the image features into integers; a self-adaptive code rate distribution module, which carries out block-level code rate distribution on the quantized image features according to a given target code rate; and an entropy encoding and decoding module, which encodes the image features subjected to code rate distribution into binary code streams. Meanwhile, the invention provides a variable code rate image decoding system and method for decoding the codes formed by the encoding system and the encoding method. Forward and inverse multi-scale decomposition transforms are constructed using a deep convolutional neural network, training is carried out with a large amount of data to obtain optimal model parameters, and variable code rate image coding and decoding can be realized in practical applications in combination with an adaptive code rate distribution method based on image complexity.

Patent
04 Jan 2019
TL;DR: In this paper, a large-capacity hiding method for reversible data in a bitstream encryption domain of a JPEG image was proposed, where a user performs exclusive or encryption on extension bits of entropy coding in each of image blocks according to an encryption key in JPEG bitstreams, performs pseudo-random scrambling on remaining AC coefficients except a last non-zero AC coefficient, and then performs the pseudorandom scrambling of all the image blocks; a cloud performs embedding of extraneous information by a histogram shifting method on the AC coefficients in bitstream ciphertext according to a hiding key
Abstract: The invention discloses a large-capacity hiding method for reversible data in the bitstream encryption domain of a JPEG image. According to the method, a user performs exclusive-or encryption on the extension bits of the entropy coding in each of the image blocks according to an encryption key in the JPEG bitstreams, performs pseudo-random scrambling on the remaining AC coefficients except the last non-zero AC coefficient, and then performs pseudo-random scrambling on the entropy coding of all the image blocks; a cloud performs embedding of extraneous information by a histogram shifting method on the AC coefficients in the bitstream ciphertext according to a hiding key, and can perform non-destructive extraction; and a receiver performs pseudo-random scrambling resumption on the entropy coding of the ciphertext image blocks in the bitstream ciphertext according to the encryption key, and then the pseudo-random scrambling resumption and exclusive-or decryption of the AC coefficients are performed in the entropy coding of each of the ciphertext image blocks to obtain decrypted bitstreams the same as the original JPEG bitstreams. The method not only achieves encryption of the extension bits of the entropy coding, but also encrypts the Huffman coding of the entropy coding; the steganographic capacity is large, and the safety is high.

Posted Content
TL;DR: In this article, a graph Fourier transform (GFT) based point cloud delivery scheme is proposed to improve the 3D reconstruction quality with the improvement of wireless channel quality in noisy wireless environment.
Abstract: In conventional point cloud delivery, a sender uses octree-based digital video compression to stream three-dimensional (3D) points and the corresponding color attributes over band-limited links, e.g., wireless channels, for 3D scene reconstructions. However, the digital-based delivery schemes have an issue called the cliff effect, where the 3D reconstruction quality is a step function in terms of wireless channel quality. We propose a novel scheme of point cloud delivery, called HoloCast, to gracefully improve the reconstruction quality with the improvement of wireless channel quality. HoloCast regards the 3D points and color components as graph signals and directly transmits linear-transformed signals based on the graph Fourier transform (GFT), without digital quantization and entropy coding operations. One of the main contributions of HoloCast is that the use of the GFT can deal with non-ordered and non-uniformly distributed multi-dimensional signals, such as holographic data, unlike conventional delivery schemes. Performance results with point cloud data show that HoloCast yields better 3D reconstruction quality compared to digital-based methods in noisy wireless environments.