Journal ArticleDOI

HEVC: The New Gold Standard for Video Compression: How Does HEVC Compare with H.264/AVC?

TL;DR: MPEG and VCEG established the Joint Collaborative Team on Video Coding (JCT-VC) with the objective of developing a new high-performance video coding standard; this article examines how the resulting HEVC standard compares with H.264/AVC.
Abstract: Digital video has become ubiquitous in our everyday lives; everywhere we look, there are devices that can display, capture, and transmit video. Recent advances in technology have made it possible to capture and display video material with ultrahigh definition (UHD) resolution. However, current Internet and broadcasting networks do not have sufficient capacity to transmit large amounts of HD content, let alone UHD. The need for an improved transmission system is more pronounced in the mobile sector because of the introduction of lightweight HD resolutions (such as 720p) for mobile applications. The limitations of current technologies prompted the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group (MPEG) and the International Telecommunication Union-Telecommunication Standardization Sector (ITU-T) Video Coding Experts Group (VCEG) to establish the Joint Collaborative Team on Video Coding (JCT-VC), with the objective of developing a new high-performance video coding standard.
Citations
Journal ArticleDOI
TL;DR: A comprehensive survey of the evolution of video quality assessment methods is given, analyzing their characteristics, advantages, and drawbacks, and identifying future research directions for QoE.
Abstract: Quality of experience (QoE) is the perceptual quality of service (QoS) from the users' perspective. For video services, the relationship between QoE and QoS (such as coding parameters and network statistics) is complicated because users' perceptual video quality is subjective and varies across environments. Traditionally, QoE is obtained from subjective tests, where human viewers evaluate the quality of tested videos in a laboratory environment. To avoid the high cost and offline nature of such tests, objective quality models have been developed to predict QoE from objective QoS parameters, but this is still an indirect way to estimate QoE. With the rising popularity of video streaming over the Internet, data-driven QoE analysis models have recently emerged owing to the availability of large-scale data. In this paper, we give a comprehensive survey of the evolution of video quality assessment methods, analyzing their characteristics, advantages, and drawbacks. We also introduce QoE-based video applications and, finally, identify the future research directions of QoE.
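
As a concrete illustration of the objective-model approach the survey describes, the sketch below fits a simple logistic mapping from QoS parameters (bitrate and packet-loss rate) to a MOS-like QoE score. The model form, parameter names, and subjective-test numbers are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of an objective QoE model: map measurable QoS parameters
# (here, bitrate and packet-loss rate) to a MOS-like score on a 1-5 scale.
# The model form and all numbers below are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit

def qoe_model(qos, a, b, c):
    """Logistic mapping from (bitrate_kbps, loss_pct) to a predicted MOS."""
    bitrate, loss = qos
    x = a * np.log(bitrate) - b * loss + c
    return 1.0 + 4.0 / (1.0 + np.exp(-x))   # squashed into the MOS range [1, 5]

# Hypothetical subjective-test data: QoS conditions and the MOS viewers gave.
bitrate = np.array([300, 700, 1500, 3000, 6000], dtype=float)   # kbps
loss    = np.array([2.0, 1.0, 0.5, 0.2, 0.0])                   # packet loss %
mos     = np.array([1.8, 2.6, 3.4, 4.1, 4.6])                   # subjective scores

params, _ = curve_fit(qoe_model, (bitrate, loss), mos, p0=[1.0, 1.0, -5.0])
print("Predicted MOS at 2 Mbps, 0.5% loss:",
      qoe_model((np.array([2000.0]), np.array([0.5])), *params))
```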

296 citations

Proceedings ArticleDOI
01 Oct 2019
TL;DR: This work presents a new algorithm for video coding, learned end-to-end for the low-latency mode, which outperforms all existing video codecs across nearly the entire bitrate range, and is the first ML-based method to do so.
Abstract: We present a new algorithm for video coding, learned end-to-end for the low-latency mode. In this setting, our approach outperforms all existing video codecs across nearly the entire bitrate range. To our knowledge, this is the first ML-based method to do so. We evaluate our approach on standard video compression test sets of varying resolutions, and benchmark against all mainstream commercial codecs in the low-latency mode. On standard-definition videos, HEVC/H.265, AVC/H.264 and VP9 typically produce codes up to 60% larger than our algorithm. On high-definition 1080p videos, H.265 and VP9 typically produce codes up to 20% larger, and H.264 up to 35% larger. Furthermore, our approach does not suffer from blocking artifacts and pixelation, and thus produces videos that are more visually pleasing. We propose two main contributions. The first is a novel architecture for video compression, which (1) generalizes motion estimation to perform any learned compensation beyond simple translations, (2) rather than strictly relying on previously transmitted reference frames, maintains a state of arbitrary information learned by the model, and (3) enables jointly compressing all transmitted signals (such as optical flow and residual). Secondly, we present a framework for ML-based spatial rate control --- a mechanism for assigning variable bitrates across space for each frame. This is a critical component for video coding, which to our knowledge had not been developed within a machine learning setting.
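
The spatial rate control contribution can be pictured with a much simpler stand-in: give each block of a frame a share of the frame's bit budget in proportion to its local residual energy. The sketch below is a fixed heuristic meant only to illustrate the idea of spatially varying bit allocation; the paper learns this assignment with a network, and the block size, budget, and weighting rule here are assumptions.

```python
# Sketch of spatially varying bit allocation across a frame: each block gets a
# share of the frame's bit budget proportional to its local residual energy.
import numpy as np

def spatial_bit_allocation(residual, frame_budget_bits, block=16, floor_bits=8.0):
    """Return a per-block bit budget summing (approximately) to frame_budget_bits."""
    h, w = residual.shape
    bh, bw = h // block, w // block
    energy = np.zeros((bh, bw))
    for by in range(bh):
        for bx in range(bw):
            blk = residual[by*block:(by+1)*block, bx*block:(bx+1)*block]
            energy[by, bx] = np.var(blk) + 1e-6     # avoid zero weights
    weights = energy / energy.sum()
    bits = floor_bits + weights * (frame_budget_bits - floor_bits * bh * bw)
    return bits

rng = np.random.default_rng(0)
residual = rng.normal(scale=np.linspace(1, 8, 64), size=(64, 64))  # detail grows left to right
print(spatial_bit_allocation(residual, frame_budget_bits=4096).round(1))
```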

196 citations

Journal ArticleDOI
TL;DR: A novel 8-point DCT approximation that requires only 14 addition operations and no multiplications is introduced and is compared to state-of-the-art DCT approximations in terms of both algorithm complexity and peak signal-to-noise ratio.
Abstract: The demand for low-energy video processing systems such as HEVC in the multimedia market has led to extensive development of fast algorithms for the efficient approximation of 2-D DCT transforms. The DCT is employed in a multitude of compression standards due to its remarkable energy compaction properties. Multiplier-free approximate DCT transforms have been proposed that offer superior compression performance at very low circuit complexity. Such approximations can be realized in digital VLSI hardware using additions and subtractions only, leading to significant reductions in chip area and power consumption compared to conventional DCTs and integer transforms. In this paper, we introduce a novel 8-point DCT approximation that requires only 14 addition operations and no multiplications. The proposed transform possesses low computational complexity and is compared to state-of-the-art DCT approximations in terms of both algorithm complexity and peak signal-to-noise ratio. The proposed DCT approximation is a candidate for reconfigurable video standards such as HEVC. The proposed transform and several other DCT approximations are mapped to systolic-array digital architectures and physically realized as digital prototype circuits using FPGA technology and mapped to 45 nm CMOS technology.
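
The paper's specific 14-addition transform is not reproduced here, but the multiplier-free idea is easy to demonstrate with the well-known signed DCT (the element-wise sign of the exact 8-point DCT matrix, whose entries are all +1 or -1 and can therefore be applied with additions and subtractions only). The sketch below compares its energy compaction with the exact DCT on a smooth block; the test block and the 2x2 low-frequency measure are illustrative choices.

```python
# Illustrative sketch of a multiplier-free DCT approximation. The 14-addition
# transform proposed in the paper is not reproduced here; the signed DCT is
# used as a stand-in, since it too needs only additions and subtractions.
import numpy as np

N = 8
n = np.arange(N)
# Exact orthonormal 8-point DCT-II matrix.
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
C[0, :] = 1.0 / np.sqrt(N)

T = np.sign(C)                      # low-complexity approximation: only +/-1 entries

# Smooth test block (typical of natural-image content after prediction).
x = np.outer(np.linspace(50, 80, N), np.linspace(40, 60, N))

def low_freq_energy_fraction(M, block, k=2):
    """Fraction of 2-D transform energy captured by the k x k low-frequency corner."""
    coeffs = M @ block @ M.T
    return (coeffs[:k, :k] ** 2).sum() / (coeffs ** 2).sum()

print("exact DCT  :", low_freq_energy_fraction(C, x))
print("signed DCT :", low_freq_energy_fraction(T, x))
```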

112 citations

Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed CU size decision algorithm significantly reduces computational complexity, by 69% on average with a 2.99% Bjøntegaard difference bitrate increase for random access, performs well for sequences with various characteristics, and outperforms two previous state-of-the-art works.
Abstract: High Efficiency Video Coding (HEVC) employs a coding unit (CU), prediction unit (PU), and transform unit (TU) based on the quadtree coding tree unit (CTU) structure to improve coding efficiency. However, the computational complexity increases greatly because the rate-distortion (RD) optimization process must be performed for all CUs, PUs, and TUs to obtain the optimal CTU partition. In this paper, a fast CU size decision algorithm is proposed to reduce the encoder complexity of HEVC. Based on statistical analysis, three approaches are considered: SKIP mode decision (SMD), CU skip estimation (CUSE), and early CU termination (ECUT). SMD determines whether the remaining modes other than SKIP mode are evaluated. CUSE and ECUT determine whether larger and smaller CU sizes, respectively, are coded. Thresholds for SMD, CUSE, and ECUT are designed based on Bayes' rule with a complexity factor. An update process estimates the statistical parameters for SMD, CUSE, and ECUT, taking into account the characteristics of the RD cost. The experimental results demonstrate that the proposed CU size decision algorithm significantly reduces computational complexity by 69% on average with a 2.99% Bjøntegaard difference bitrate (BDBR) increase for random access. The complexity reduction and BDBR increase for low delay are 68% and 2.46%, respectively. The experimental results also show that our proposed scheme performs well for sequences with various characteristics and outperforms two previous state-of-the-art works.
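
The flavor of a Bayes-rule threshold with a complexity factor can be sketched as follows: model the RD cost of the "terminate here" and "keep splitting" classes as Gaussians whose parameters are updated from already-coded CUs, and terminate when the weighted posterior of termination wins. The class model, the online update, and the numbers below are illustrative assumptions rather than the paper's exact derivation.

```python
# Hedged sketch of a Bayes-rule early-termination test of the kind the paper
# builds on. All parameters below are hypothetical.
import math

class GaussianClass:
    """Running mean/variance of RD costs observed for one decision class."""
    def __init__(self, mean, var):
        self.mean, self.var, self.n = mean, var, 1

    def update(self, rd_cost):
        self.n += 1
        delta = rd_cost - self.mean
        self.mean += delta / self.n
        self.var += (delta * (rd_cost - self.mean) - self.var) / self.n

    def likelihood(self, rd_cost):
        var = max(self.var, 1e-6)
        return math.exp(-(rd_cost - self.mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def terminate_cu(rd_cost, term_cls, split_cls, prior_term=0.5, complexity_factor=1.0):
    """Return True if coding should stop at the current CU size."""
    p_term  = prior_term * term_cls.likelihood(rd_cost)
    p_split = (1.0 - prior_term) * split_cls.likelihood(rd_cost)
    # complexity_factor > 1 biases the test toward early termination (less RD search).
    return complexity_factor * p_term > p_split

term_cls  = GaussianClass(mean=1200.0, var=250.0 ** 2)   # hypothetical statistics
split_cls = GaussianClass(mean=4500.0, var=900.0 ** 2)
print(terminate_cu(1500.0, term_cls, split_cls))   # low RD cost -> likely terminate
print(terminate_cu(5200.0, term_cls, split_cls))   # high RD cost -> keep splitting
```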

88 citations

Journal ArticleDOI
TL;DR: This paper studies the relationship between 3D quality and bitrate at different frame rates; this question had previously been addressed for 2D video but not for 3D.
Abstract: Increasing the frame rate of a 3D video generally results in improved Quality of Experience (QoE). However, higher frame rates involve a higher degree of complexity in capturing, transmission, storage, and display. The question that arises here is what frame rate guarantees high viewing quality of experience given the existing/required 3D devices and technologies (3D cameras, 3D TVs, compression, transmission bandwidth, and storage capacity). This question has already been addressed for the case of 2D video, but not for 3D. The objective of this paper is to study the relationship between 3D quality and bitrate at different frame rates. Our performance evaluations show that increasing the frame rate of 3D videos beyond 60 fps may not be visually distinguishable. In addition, our experiments show that when the available bandwidth is reduced, the highest possible 3D quality of experience can be achieved by adjusting (decreasing) the frame rate instead of increasing the compression ratio. The results of our study are of particular interest to network providers for rate adaptation in variable bitrate channels.
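
The adaptation rule suggested by these findings can be stated as a small sketch: cap the frame rate at 60 fps, and when the channel rate drops, step the frame rate down before increasing the compression ratio. The candidate frame rates and the constant bits-per-frame model below are assumptions for illustration.

```python
# Sketch of frame-rate-first rate adaptation, per the paper's findings: rates
# above 60 fps are reported as visually indistinguishable, and lowering the
# frame rate is preferred over compressing each frame harder.
def choose_frame_rate(available_kbps, bits_per_frame_kbit=80.0,
                      candidates=(60, 50, 30, 25, 24)):
    """Pick the highest frame rate (<= 60 fps) that fits the channel without
    increasing the per-frame compression ratio."""
    for fps in candidates:                      # highest first
        if fps * bits_per_frame_kbit <= available_kbps:
            return fps
    return candidates[-1]                       # compress harder only as a last resort

print(choose_frame_rate(6000))   # ample bandwidth   -> 60 fps
print(choose_frame_rate(2500))   # reduced bandwidth -> 30 fps
```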

88 citations

References
Journal ArticleDOI
TL;DR: An overview of the technical features of H.264/AVC is provided, profiles and applications for the standard are described, and the history of the standardization process is outlined.
Abstract: H.264/AVC is the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goals of the H.264/AVC standardization effort have been enhanced compression performance and provision of a "network-friendly" video representation addressing "conversational" (video telephony) and "nonconversational" (storage, broadcast, or streaming) applications. H.264/AVC has achieved a significant improvement in rate-distortion efficiency relative to existing standards. This article provides an overview of the technical features of H.264/AVC, describes profiles and applications for the standard, and outlines the history of the standardization process.

8,646 citations

Proceedings ArticleDOI
01 Dec 2011
TL;DR: A new video coding tool, sample adaptive offset (SAO), is introduced that classifies reconstructed pixels into different categories and then reduces distortion by simply adding an offset for each category of pixels.
Abstract: A new video coding tool, sample adaptive offset (SAO), is introduced in this paper. SAO has been adopted into the Working Draft of the new video coding standard, High-Efficiency Video Coding (HEVC). The SAO is located after deblocking in the video coding loop. The concept of SAO is to classify reconstructed pixels into different categories and then reduce the distortion by simply adding an offset for each category of pixels. The pixel intensity and edge properties are used for pixel classification. To further improve the coding efficiency, a picture can be divided into regions for localization of offset parameters. Simulation results show that SAO can achieve on average 2% bit rate reduction and up to 6% bit rate reduction. The run time increases for encoders and decoders are only 2%.
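
A minimal sketch of the band-offset flavor of SAO is given below: pixels are classified by intensity band, and each band receives the mean reconstruction error of its pixels as an offset. The band count and offset derivation are simplified relative to the HEVC design, which also includes an edge-offset mode and signals offsets for only a subset of bands.

```python
# Minimal band-offset SAO sketch: classify each reconstructed pixel by its
# intensity band, then add one offset per band (the mean original-minus-
# reconstructed error of that band's pixels).
import numpy as np

def sao_band_offset(original, reconstructed, num_bands=32, bit_depth=8):
    band_size = (1 << bit_depth) // num_bands
    bands = np.clip(reconstructed // band_size, 0, num_bands - 1)
    out = reconstructed.astype(np.float64).copy()
    for b in range(num_bands):
        mask = bands == b
        if mask.any():
            offset = np.round((original[mask] - reconstructed[mask]).mean())
            out[mask] += offset                 # encoder would signal this offset
    return np.clip(out, 0, (1 << bit_depth) - 1).astype(reconstructed.dtype)

rng = np.random.default_rng(1)
orig = rng.integers(0, 256, size=(16, 16)).astype(np.int32)
recon = np.clip(orig + rng.integers(-4, 5, size=orig.shape), 0, 255).astype(np.int32)
filtered = sao_band_offset(orig, recon)
print("MSE before SAO:", ((orig - recon) ** 2).mean())
print("MSE after  SAO:", ((orig - filtered) ** 2).mean())
```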

137 citations

Proceedings ArticleDOI
14 Mar 2010
TL;DR: This paper proposes a new approach to combined spatial (Intra) prediction and adaptive transform coding in block-based video and image compression, which is implemented within the H.264/AVC intra mode, and is shown in experiments to significantly outperform the standard intra modes, and achieve significant reduction of the blocking effect.
Abstract: This paper proposes a new approach to combined spatial (Intra) prediction and adaptive transform coding in block-based video and image compression. Context-adaptive spatial prediction from available, previously decoded boundaries of the block, is followed by optimal transform coding of the prediction residual. The derivation of both the prediction and the adaptive transform for the prediction error, assumes a separable first-order Gauss-Markov model for the image signal. The resulting optimal transform is shown to be a close relative of the sine transform with phase and frequencies such that basis vectors tend to vanish at known boundaries and maximize energy at unknown boundaries. The overall scheme switches between the above sine-like transform and discrete cosine transform (per direction, horizontal or vertical) depending on the prediction and boundary information. It is implemented within the H.264/AVC intra mode, is shown in experiments to significantly outperform the standard intra mode, and achieve significant reduction of the blocking effect.
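
The sine-like transform described here is closely related to the DST-VII, the variant later adopted in HEVC for 4x4 intra luma residuals, whose basis vectors are near zero at the spatial position adjacent to the known (predicted) boundary and largest at the far, unknown boundary. The sketch below builds the 4-point DST-VII matrix and checks these properties; it illustrates the transform family rather than the paper's exact derivation.

```python
# Build the 4-point DST-VII basis and verify it is orthonormal and that its
# basis vectors grow from the known boundary (n = 0) toward the unknown one.
import numpy as np

def dst_vii(N):
    k = np.arange(N)[:, None]        # frequency index
    n = np.arange(N)[None, :]        # spatial index (0 = next to the predicted boundary)
    return 2.0 / np.sqrt(2 * N + 1) * np.sin(np.pi * (2 * k + 1) * (n + 1) / (2 * N + 1))

S = dst_vii(4)
print(np.round(S * 128).astype(int))        # close to HEVC's integer 4x4 DST matrix
print(np.allclose(S @ S.T, np.eye(4)))      # the basis is orthonormal
print(S[0])                                 # lowest-frequency vector: small at n=0, large at n=3
```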

117 citations

Proceedings ArticleDOI
Chia-Yang Tsai, Ching-Yeh Chen, Chih-Ming Fu, Yu-Wen Huang, Shaw-Min Lei
29 Dec 2011
TL;DR: A method to estimate filtering distortion without performing the real filter operation is proposed for the adaptive loop filter in High Efficiency Video Coding (HEVC), reducing the number of encoding passes from 16 to 1.
Abstract: In this paper, a one-pass encoding algorithm is proposed for the adaptive loop filter (ALF) in High Efficiency Video Coding (HEVC). ALF can improve both subjective and objective video quality, but it also requires many encoding passes (i.e., picture buffer accesses) that significantly increase external memory access, encoding latency, and power consumption. Therefore, we propose a method to estimate filtering distortion without performing the real filter operation. The number of encoding passes can be effectively reduced from 16 to 1. Combined with an initial guess of filter-on/off blocks obtained using time-delayed filters, the proposed one-pass algorithm induces an average BD-rate increase of only 0.17%.
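
The key trick, estimating filtering distortion without running the filter, can be illustrated with the standard Wiener-filter-style estimate: after one pass that accumulates the autocorrelation of the reconstructed neighborhood and its cross-correlation with the source, the SSD of any candidate linear filter follows from those statistics alone. The 1-D toy below shows the identity; the paper's exact estimator and filter structure may differ.

```python
# Estimate the distortion of a linear filter w without filtering: gather the
# autocorrelation matrix R of reconstructed neighborhood vectors x and their
# cross-correlation p with the source s once, then SSD = sum(s^2) - 2 w.p + w.R.w.
import numpy as np

rng = np.random.default_rng(2)
src = rng.normal(size=2000)                       # original samples s
rec = src + rng.normal(scale=0.4, size=src.size)  # reconstructed samples

taps = 3
# Neighborhood vectors x_i = (rec[i-1], rec[i], rec[i+1]) for each interior sample.
X = np.stack([rec[i:src.size - taps + 1 + i] for i in range(taps)], axis=1)
s = src[taps // 2: src.size - taps // 2]

# One pass over the picture: gather statistics.
R = X.T @ X                 # autocorrelation matrix
p = X.T @ s                 # cross-correlation with the source
s2 = float(s @ s)

def estimated_ssd(w):
    """Filtered-distortion estimate from statistics only (no filtering performed)."""
    return s2 - 2.0 * w @ p + w @ R @ w

w = np.array([0.2, 0.6, 0.2])                     # a candidate filter
actual_ssd = float(((s - X @ w) ** 2).sum())      # ground truth, for comparison
print(estimated_ssd(w), actual_ssd)               # identical up to rounding
```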

15 citations