Journal ArticleDOI

A Technical Overview of AV1

26 Feb 2021 - Vol. 109, Iss. 9, pp. 1435-1462
TL;DR: A technical overview of the AV1 codec design that enables the compression performance gains with considerations for hardware feasibility is provided.
Abstract: The AV1 video compression format is developed by the Alliance for Open Media consortium. It achieves more than a 30% reduction in bit rate compared to its predecessor VP9 for the same decoded video quality. This article provides a technical overview of the AV1 codec design that enables the compression performance gains with considerations for hardware feasibility.


Citations
Journal ArticleDOI
16 Jan 2021
TL;DR: This article presents an end-to-end neural video coding framework that takes advantage of the stacked DNNs to efficiently and compactly code input raw videos via fully data-driven learning.
Abstract: Significant advances in video compression systems have been made in the past several decades to satisfy the near-exponential growth of Internet-scale video traffic. From the application perspective, we have identified three major functional blocks, including preprocessing, coding, and postprocessing, which have been continuously investigated to maximize the end-user quality of experience (QoE) under a limited bit rate budget. Recently, artificial intelligence (AI)-powered techniques have shown great potential to further increase the efficiency of the aforementioned functional blocks, both individually and jointly. In this article, we review recent technical advances in video compression systems extensively, with an emphasis on deep neural network (DNN)-based approaches, and then present three comprehensive case studies. On preprocessing, we show a switchable texture-based video coding example that leverages DNN-based scene understanding to extract semantic areas for the improvement of a subsequent video coder. On coding, we present an end-to-end neural video coding framework that takes advantage of the stacked DNNs to efficiently and compactly code input raw videos via fully data-driven learning. On postprocessing, we demonstrate two neural adaptive filters to, respectively, facilitate the in-loop and postfiltering for the enhancement of compressed frames. Finally, a companion website hosting the contents developed in this work can be accessed publicly at https://purdueviper.github.io/dnn-coding/ .

35 citations


Cites background from "A Technical Overview of AV1"

  • ...H.26x series [9]–[13], AVS series [14]–[16], as well as the AV1 [17], [18] from the Alliance for Open Media (AOM) [19])....

Proceedings ArticleDOI
01 Jun 2021
TL;DR: RDONet as discussed by the authors proposes to use different hierarchical levels of the autoencoder to adaptively transform the received latent space back to a reconstructed image, which can save up to 20% rate over comparable non-hierarchical models.
Abstract: Deep-learning-based compressive autoencoders consist of a single non-linear function mapping the image to a latent space, which is quantized and transmitted. A second non-linear function then transforms the received latent space back to a reconstructed image. This method achieves higher quality than many traditional image coders, owing to a non-linear generalization of the linear transforms used in traditional coders. However, modern image and video coders achieve large coding gains by applying rate-distortion optimization to dynamic block partitioning. In this paper, we present RDONet, a novel approach that achieves similar effects in compression with full-image autoencoders by using different hierarchical levels, which are transmitted adaptively after performing an external rate-distortion optimization. Using our model, we are able to save up to 20% rate over comparable non-hierarchical models while maintaining the same quality.

11 citations

Proceedings ArticleDOI
23 Aug 2021
TL;DR: In this paper, a hardware architecture named AE-AV1 is presented that entirely executes the arithmetic encoding process of the AV1 codec while achieving a throughput rate sufficient for ultra-high performance (i.e., 8K@120fps real-time coding).
Abstract: With the emerging interest in video-on-demand systems, streaming service providers must adapt their systems to decrease the global Internet infrastructure impact caused by video. Video coding standards are a powerful but complex solution to this problem. Hence, to tackle these tools' complexity and allow a better coding flow, hardware designs arise as options for reducing the bottleneck of video-on-demand systems. This paper presents a hardware architecture, named AE-AV1, that aims to entirely execute the arithmetic encoding process of the AV1 codec while achieving a throughput rate sufficient for ultra-high performance (i.e., 8K@120fps real-time coding). Moreover, this document also proposes the LP-AE-AV1 architecture, a low-power version of AE-AV1.

8 citations

Journal ArticleDOI
01 Sep 2022
TL;DR: This paper presents FPX-NIC, an FPGA-accelerated NIC framework designed for hardware encoding, which consists of a novel NIC scheme and an energy-efficient neural network (NN) deployment method that is able to improve both processing speed and energy efficiency.
Abstract: The recent trend in neural image compression (NIC) research can generally be grouped into two categories: analysis-synthesis transform network improvements and entropy estimation optimization. They promote the compression efficiency of NIC by leveraging more expressive network structures and advanced entropy models, respectively. From a different but more systematic viewpoint, we extend the horizon of NIC from software- to hardware-based lossy compression using more resource-constrained platforms, such as field programmable gate arrays (FPGAs) or deep-learning processor units (DPUs). In this paper, we propose a novel hardware-oriented NIC system for real-time edge-computing video services. We present, for the first time, FPX-NIC, an FPGA-accelerated NIC framework designed for hardware encoding, which consists of a novel NIC scheme and an energy-efficient neural network (NN) deployment method. The former contribution is a block-based adaptive NIC approach based on local content characteristics. Essential side-information is also signalled to realize adaptive patch representation. The critical advantage of our latter contribution lies in the network-reconfigurable framework plus a fixed-precision weight quantization method that uses a quantization-aware post-training procedure to compensate for the performance degradation caused by quantization error. Therefore, it is able to improve both processing speed and energy efficiency. We finally establish an intelligent video coding system using the proposed scheme, enabling visual capturing, neural encoding, decoding, and display, realizing 4K ultra-high-definition (UHD) all-intra neural video coding on edge-computing devices.

5 citations

Journal ArticleDOI
TL;DR: A search-free resizing framework that can further improve the rate-distortion tradeoff of recent learned image compression models is described, which suggests that “compression friendly” downsampled representations can be quickly determined during encoding by using an auxiliary network and differentiable image warping.
Abstract: We describe a search-free resizing framework that can further improve the rate-distortion tradeoff of recent learned image compression models. Our approach is simple: compose a pair of differentiable downsampling/upsampling layers that sandwich a neural compression model. To determine resize factors for different inputs, we utilize another neural network jointly trained with the compression model, with the end goal of minimizing the rate-distortion objective. Our results suggest that "compression friendly" downsampled representations can be quickly determined during encoding by using an auxiliary network and differentiable image warping. By conducting extensive experimental tests on existing deep image compression models, we show that our new resizing parameter estimation framework can provide a Bjøntegaard-Delta rate (BD-rate) improvement of about 10% against leading perceptual quality engines. We also carried out a subjective quality study, the results of which show that our new approach yields favorable compressed images. To facilitate reproducible research in this direction, the implementation used in this paper is being made freely available online at: https://github.com/treammm/ResizeCompression.

4 citations

References
Journal ArticleDOI
TL;DR: An overview of the technical features of H.264/AVC is provided, profiles and applications for the standard are described, and the history of the standardization process is outlined.
Abstract: H.264/AVC is the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goals of the H.264/AVC standardization effort have been enhanced compression performance and provision of a "network-friendly" video representation addressing "conversational" (video telephony) and "nonconversational" (storage, broadcast, or streaming) applications. H.264/AVC has achieved a significant improvement in rate-distortion efficiency relative to existing standards. This article provides an overview of the technical features of H.264/AVC, describes profiles and applications for the standard, and outlines the history of the standardization process.

8,646 citations


"A Technical Overview of AV1" refers background in this paper

  • ...H.264/AVC [4] and were quickly adopted by the industry....

Journal ArticleDOI
TL;DR: The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards-in the range of 50% bit-rate reduction for equal perceptual video quality.
Abstract: High Efficiency Video Coding (HEVC) is currently being prepared as the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards-in the range of 50% bit-rate reduction for equal perceptual video quality. This paper provides an overview of the technical features and characteristics of the HEVC standard.

7,383 citations


"A Technical Overview of AV1" refers background in this paper

  • ...VP9 [2] and HEVC [3], both debuted in 2013, achieved in the range of 50% higher compression performance than the prior codec H.264/AVC [4] and were quickly adopted by the industry....

  • ...This is repeated up to 4 times, which effectively covers the range [3, 14]....

  • ...If V ∈ [3, 5], this LR symbol will be able to cover its value and complete the coding....

Journal ArticleDOI
TL;DR: In this article, a discrete cosine transform (DCT) is defined and an algorithm to compute it using the fast Fourier transform is developed, which can be used in the area of digital processing for the purposes of pattern recognition and Wiener filtering.
Abstract: A discrete cosine transform (DCT) is defined and an algorithm to compute it using the fast Fourier transform is developed. It is shown that the discrete cosine transform can be used in the area of digital processing for the purposes of pattern recognition and Wiener filtering. Its performance is compared with that of a class of orthogonal transforms and is found to compare closely to that of the Karhunen-Loeve transform, which is known to be optimal. The performances of the Karhunen-Loeve and discrete cosine transforms are also found to compare closely with respect to the rate-distortion criterion.

4,481 citations
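The cited paper defines the DCT and computes it via the fast Fourier transform. As a minimal, illustrative sketch of that relationship (using Makhoul's even-odd reordering, a standard construction that is not necessarily the exact algorithm of the cited paper), the following Python checks an FFT-based DCT-II against the naive O(N²) definition:

```python
import numpy as np

def dct2_via_fft(x):
    """DCT-II of a real signal via a single FFT (Makhoul's reordering)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    # Interleave: even-indexed samples forward, odd-indexed samples reversed.
    v = np.concatenate([x[::2], x[1::2][::-1]])
    V = np.fft.fft(v)
    k = np.arange(N)
    # A phase rotation recovers the cosine basis from the complex exponentials.
    return np.real(np.exp(-1j * np.pi * k / (2 * N)) * V)

def dct2_direct(x):
    """Naive O(N^2) DCT-II straight from the definition, for reference."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    n = np.arange(N)
    return np.array([np.sum(x * np.cos(np.pi * (2 * n + 1) * k / (2 * N)))
                     for k in range(N)])

x = np.array([1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0])
assert np.allclose(dct2_via_fft(x), dct2_direct(x))
```

The FFT route turns the O(N²) transform into O(N log N), which is what made the DCT practical for image and video coding.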

Journal ArticleDOI
TL;DR: The state of the art in data compression is arithmetic coding, not the better-known Huffman method, which gives greater compression, is faster for adaptive models, and clearly separates the model from the channel encoding.
Abstract: The state of the art in data compression is arithmetic coding, not the better-known Huffman method. Arithmetic coding gives greater compression, is faster for adaptive models, and clearly separates the model from the channel encoding.

3,188 citations


"A Technical Overview of AV1" refers methods in this paper

  • ...The M-ary symbol arithmetic coding largely follows [37] with all the floating-point data scaled by 2^15 and represented by 15-bit unsigned integers....

  • ...The arithmetic coding directly uses the CDFs to compress symbols [37]....

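To make the cited mechanism concrete, here is a minimal sketch of CDF-driven M-ary arithmetic coding with probabilities scaled to 2^15, echoing AV1's 15-bit representation. The 4-symbol alphabet and CDF values are hypothetical, and exact rational arithmetic stands in for the renormalizing fixed-point range coder a real codec uses:

```python
from fractions import Fraction

# Hypothetical 4-symbol alphabet; cumulative counts sum to 2**15 = 32768,
# mirroring AV1's 15-bit probability scaling.
CDF = [0, 16384, 24576, 28672, 32768]
TOTAL = 32768

def encode(symbols):
    """Shrink [low, low + width) to each symbol's CDF slice in turn."""
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        low += width * CDF[s] / TOTAL
        width *= Fraction(CDF[s + 1] - CDF[s], TOTAL)
    return low + width / 2  # any point inside the final interval decodes correctly

def decode(code, n):
    """Recover n symbols by locating the code point's CDF slice at each step."""
    out = []
    low, width = Fraction(0), Fraction(1)
    for _ in range(n):
        target = (code - low) / width * TOTAL
        s = max(i for i in range(len(CDF) - 1) if CDF[i] <= target)
        out.append(s)
        low += width * CDF[s] / TOTAL
        width *= Fraction(CDF[s + 1] - CDF[s], TOTAL)
    return out

msg = [0, 1, 0, 2, 3, 0]
assert decode(encode(msg), len(msg)) == msg
```

The final interval width equals the product of the symbol probabilities, so the code length approaches the message entropy, which is the advantage over Huffman coding that the cited paper argues.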
Journal ArticleDOI
TL;DR: A Fast Discrete Cosine Transform algorithm has been developed which provides a factor of six improvement in computational complexity when compared to conventional DiscreteCosine Transform algorithms using the Fast Fourier Transform.
Abstract: A Fast Discrete Cosine Transform algorithm has been developed which provides a factor of six improvement in computational complexity when compared to conventional Discrete Cosine Transform algorithms using the Fast Fourier Transform. The algorithm is derived in the form of matrices and illustrated by a signal-flow graph, which may be readily translated to hardware or software implementations.

1,301 citations


"A Technical Overview of AV1" refers background in this paper

  • ...The butterfly structure [35] allows substantial reduction in multiplication operations over plain matrix multiplication, i.e., ....

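As an illustration of why butterfly factorizations reduce multiplications, the sketch below computes an unnormalized 4-point DCT-II with 5 multiplications instead of the 16 needed by direct matrix evaluation. It is a generic textbook factorization, not necessarily the exact signal-flow graph of [35]:

```python
import math

C1 = math.cos(math.pi / 8)
C2 = math.cos(math.pi / 4)
C3 = math.cos(3 * math.pi / 8)

def dct4_butterfly(x):
    """4-point unnormalized DCT-II using butterflies: 5 multiplications."""
    a, b = x[0] + x[3], x[1] + x[2]   # stage 1: sums (no multiplies)
    c, d = x[0] - x[3], x[1] - x[2]   # stage 1: differences (no multiplies)
    return [a + b,                    # X0: 0 multiplies
            C1 * c + C3 * d,          # X1: 2 multiplies
            C2 * (a - b),             # X2: 1 multiply
            C3 * c - C1 * d]          # X3: 2 multiplies

def dct4_matrix(x):
    """Plain matrix form of the same transform: 16 multiplications."""
    return [sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / 8) for n in range(4))
            for k in range(4)]

x = [1.0, 2.0, 3.0, 4.0]
assert all(abs(a - b) < 1e-9 for a, b in zip(dct4_butterfly(x), dct4_matrix(x)))
```

The same idea scales recursively to larger transform sizes, which is why hardware transform units in codecs like AV1 are built around such factorizations rather than full matrix multiplies.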