Journal ArticleDOI

A Technical Overview of AV1

26 Feb 2021 - Vol. 109, Iss. 9, pp. 1435-1462
TL;DR: A technical overview of the AV1 codec design that enables the compression performance gains with considerations for hardware feasibility is provided.
Abstract: The AV1 video compression format is developed by the Alliance for Open Media consortium. It achieves more than a 30% reduction in bit rate compared to its predecessor VP9 for the same decoded video quality. This article provides a technical overview of the AV1 codec design that enables the compression performance gains with considerations for hardware feasibility.


Citations
Journal ArticleDOI
16 Jan 2021
TL;DR: This article presents an end-to-end neural video coding framework that takes advantage of the stacked DNNs to efficiently and compactly code input raw videos via fully data-driven learning.
Abstract: Significant advances in video compression systems have been made in the past several decades to satisfy the near-exponential growth of Internet-scale video traffic. From the application perspective, we have identified three major functional blocks, including preprocessing, coding, and postprocessing, which have been continuously investigated to maximize the end-user quality of experience (QoE) under a limited bit rate budget. Recently, artificial intelligence (AI)-powered techniques have shown great potential to further increase the efficiency of the aforementioned functional blocks, both individually and jointly. In this article, we review recent technical advances in video compression systems extensively, with an emphasis on deep neural network (DNN)-based approaches, and then present three comprehensive case studies. On preprocessing, we show a switchable texture-based video coding example that leverages DNN-based scene understanding to extract semantic areas for the improvement of a subsequent video coder. On coding, we present an end-to-end neural video coding framework that takes advantage of the stacked DNNs to efficiently and compactly code input raw videos via fully data-driven learning. On postprocessing, we demonstrate two neural adaptive filters to, respectively, facilitate the in-loop and postfiltering for the enhancement of compressed frames. Finally, a companion website hosting the contents developed in this work can be accessed publicly at https://purdueviper.github.io/dnn-coding/ .

35 citations


Cites background from "A Technical Overview of AV1"

  • ...H.26x series [9]–[13], AVS series [14]–[16], as well as the AV1 [17], [18] from the Alliance for Open Media (AOM) [19])....

Proceedings ArticleDOI
01 Jun 2021
TL;DR: RDONet as discussed by the authors proposes to use different hierarchical levels of the autoencoder to adaptively transform the received latent space back to a reconstructed image, which can save up to 20% rate over comparable non-hierarchical models.
Abstract: Deep-learning-based compressive autoencoders consist of a single non-linear function mapping the image to a latent space, which is quantized and transmitted. A second non-linear function then transforms the received latent space back to a reconstructed image. This method achieves higher quality than many traditional image coders, owing to a non-linear generalization of the linear transforms used in traditional coders. However, modern image and video coders achieve large coding gains by applying rate-distortion optimization to dynamic block partitioning. In this paper, we present RDONet, a novel approach that achieves similar effects in compression with full-image autoencoders by using different hierarchical levels, which are transmitted adaptively after performing an external rate-distortion optimization. Using our model, we are able to save up to 20% rate over comparable non-hierarchical models while maintaining the same quality.

11 citations

Proceedings ArticleDOI
23 Aug 2021
TL;DR: In this paper, a hardware architecture named AE-AV1 is presented that entirely executes the arithmetic encoding process of the AV1 codec while achieving a throughput rate sufficient for ultra-high performance (i.e., 8K@120fps real-time coding).
Abstract: With the emerging interest in video-on-demand systems, streaming service providers must adapt their systems to decrease the global Internet infrastructure impact caused by video. Video coding standards are a powerful but complex solution to this problem. Hence, to tackle these tools' complexity and allow a better coding flow, hardware designs arise as options for reducing the bottleneck of video-on-demand systems. This paper presents a hardware architecture, named AE-AV1, that aims to entirely execute the arithmetic encoding process of the AV1 codec while achieving a throughput rate sufficient for ultra-high performance (i.e., 8K@120fps real-time coding). Moreover, this document also proposes the LP-AE-AV1 architecture, a low-power version of AE-AV1.

8 citations

Journal ArticleDOI
01 Sep 2022
TL;DR: This paper presents FPX-NIC, an FPGA-accelerated NIC framework designed for hardware encoding, which consists of a novel NIC scheme and an energy-efficient neural network (NN) deployment method that is able to improve both processing speed and energy efficiency.
Abstract: The recent trend in neural image compression (NIC) research can generally be grouped into two categories: analysis-synthesis transform network improvements and entropy estimation optimization. They promote the compression efficiency of NIC by leveraging more expressive network structures and advanced entropy models, respectively. From a different but more systematic viewpoint, we extend the horizon of NIC from software- to hardware-based lossy compression using more resource-constrained platforms, such as field programmable gate arrays (FPGAs) or deep-learning processor units (DPUs). In this paper, we propose a novel hardware-oriented NIC system for real-time edge-computing video services. We present, for the first time, FPX-NIC, an FPGA-accelerated NIC framework designed for hardware encoding, which consists of a novel NIC scheme and an energy-efficient neural network (NN) deployment method. The former contribution is a block-based adaptive NIC approach based on local content characteristics. Essential side-information is also signalled to realize adaptive patch representation. The critical advantage of our latter contribution lies in the network-reconfigurable framework plus a fixed-precision weight quantization method that uses a quantization-aware post-training procedure to compensate for the performance degradation caused by quantization error. Therefore, it is able to improve both processing speed and energy efficiency. We finally establish an intelligent video coding system using the proposed scheme, enabling visual capturing, neural encoding, decoding, and display, realizing 4K ultra-high-definition (UHD) all-intra neural video coding on edge-computing devices.

5 citations

Journal ArticleDOI
TL;DR: A search-free resizing framework that can further improve the rate-distortion tradeoff of recent learned image compression models is described, which suggests that “compression friendly” downsampled representations can be quickly determined during encoding by using an auxiliary network and differentiable image warping.
Abstract: We describe a search-free resizing framework that can further improve the rate-distortion tradeoff of recent learned image compression models. Our approach is simple: compose a pair of differentiable downsampling/upsampling layers that sandwich a neural compression model. To determine resize factors for different inputs, we utilize another neural network jointly trained with the compression model, with the end goal of minimizing the rate-distortion objective. Our results suggest that "compression friendly" downsampled representations can be quickly determined during encoding by using an auxiliary network and differentiable image warping. By conducting extensive experimental tests on existing deep image compression models, we show that our new resizing parameter estimation framework can provide a Bjøntegaard-Delta rate (BD-rate) improvement of about 10% against leading perceptual quality engines. We also carried out a subjective quality study, the results of which show that our new approach yields favorable compressed images. To facilitate reproducible research in this direction, the implementation used in this paper is being made freely available online at: https://github.com/treammm/ResizeCompression.

4 citations

References
Journal ArticleDOI
TL;DR: An overview of the technical features of H.264/AVC is provided, profiles and applications for the standard are described, and the history of the standardization process is outlined.
Abstract: H.264/AVC is the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goals of the H.264/AVC standardization effort have been enhanced compression performance and provision of a "network-friendly" video representation addressing "conversational" (video telephony) and "nonconversational" (storage, broadcast, or streaming) applications. H.264/AVC has achieved a significant improvement in rate-distortion efficiency relative to existing standards. This article provides an overview of the technical features of H.264/AVC, describes profiles and applications for the standard, and outlines the history of the standardization process.

8,646 citations


"A Technical Overview of AV1" refers background in this paper

  • ...H.264/AVC [4] and were quickly adopted by the industry....

Journal ArticleDOI
TL;DR: The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards-in the range of 50% bit-rate reduction for equal perceptual video quality.
Abstract: High Efficiency Video Coding (HEVC) is currently being prepared as the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards-in the range of 50% bit-rate reduction for equal perceptual video quality. This paper provides an overview of the technical features and characteristics of the HEVC standard.

7,383 citations


"A Technical Overview of AV1" refers background in this paper

  • ...VP9 [2] and HEVC [3], both debuted in 2013, achieved in the range of 50% higher compression performance than the prior codec H.264/AVC [4] and were quickly adopted by the industry....

  • ...This is repeated up to 4 times, which effectively covers the range [3, 14]....

  • ...If V ∈ [3, 5], this LR symbol will be able to cover its value and complete the coding....

Journal ArticleDOI
TL;DR: In this article, a discrete cosine transform (DCT) is defined and an algorithm to compute it using the fast Fourier transform is developed, which can be used in the area of digital processing for the purposes of pattern recognition and Wiener filtering.
Abstract: A discrete cosine transform (DCT) is defined and an algorithm to compute it using the fast Fourier transform is developed. It is shown that the discrete cosine transform can be used in the area of digital processing for the purposes of pattern recognition and Wiener filtering. Its performance is compared with that of a class of orthogonal transforms and is found to compare closely to that of the Karhunen-Loeve transform, which is known to be optimal. The performances of the Karhunen-Loeve and discrete cosine transforms are also found to compare closely with respect to the rate-distortion criterion.

4,481 citations
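The cited paper defines the DCT and computes it via the fast Fourier transform. As a minimal, illustrative sketch of that relationship (using Makhoul's even-odd reordering, a standard construction that is not necessarily the exact algorithm of the cited paper), the following Python checks an FFT-based DCT-II against the naive O(N²) definition:

```python
import numpy as np

def dct2_via_fft(x):
    """DCT-II of a real signal via a single FFT (Makhoul's reordering)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    # Interleave: even-indexed samples forward, odd-indexed samples reversed.
    v = np.concatenate([x[::2], x[1::2][::-1]])
    V = np.fft.fft(v)
    k = np.arange(N)
    # A phase rotation recovers the cosine basis from the complex exponentials.
    return np.real(np.exp(-1j * np.pi * k / (2 * N)) * V)

def dct2_direct(x):
    """Naive O(N^2) DCT-II straight from the definition, for reference."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    n = np.arange(N)
    return np.array([np.sum(x * np.cos(np.pi * (2 * n + 1) * k / (2 * N)))
                     for k in range(N)])

x = np.array([1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0])
assert np.allclose(dct2_via_fft(x), dct2_direct(x))
```

The FFT route turns the O(N²) transform into O(N log N), which is what made the DCT practical for image and video coding.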

Journal ArticleDOI
TL;DR: The state of the art in data compression is arithmetic coding, not the better-known Huffman method, which gives greater compression, is faster for adaptive models, and clearly separates the model from the channel encoding.
Abstract: The state of the art in data compression is arithmetic coding, not the better-known Huffman method. Arithmetic coding gives greater compression, is faster for adaptive models, and clearly separates the model from the channel encoding.

3,188 citations


"A Technical Overview of AV1" refers methods in this paper

  • ...The M-ary symbol arithmetic coding largely follows [37] with all the floating-point data scaled by 2^15 and represented by 15-bit unsigned integers....

  • ...The arithmetic coding directly uses the CDFs to compress symbols [37]....

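To make the cited mechanism concrete, here is a minimal sketch of CDF-driven M-ary arithmetic coding with probabilities scaled to 2^15, echoing AV1's 15-bit representation. The 4-symbol alphabet and CDF values are hypothetical, and exact rational arithmetic stands in for the renormalizing fixed-point range coder a real codec uses:

```python
from fractions import Fraction

# Hypothetical 4-symbol alphabet; cumulative counts sum to 2**15 = 32768,
# mirroring AV1's 15-bit probability scaling.
CDF = [0, 16384, 24576, 28672, 32768]
TOTAL = 32768

def encode(symbols):
    """Shrink [low, low + width) to each symbol's CDF slice in turn."""
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        low += width * CDF[s] / TOTAL
        width *= Fraction(CDF[s + 1] - CDF[s], TOTAL)
    return low + width / 2  # any point inside the final interval decodes correctly

def decode(code, n):
    """Recover n symbols by locating the code point's CDF slice at each step."""
    out = []
    low, width = Fraction(0), Fraction(1)
    for _ in range(n):
        target = (code - low) / width * TOTAL
        s = max(i for i in range(len(CDF) - 1) if CDF[i] <= target)
        out.append(s)
        low += width * CDF[s] / TOTAL
        width *= Fraction(CDF[s + 1] - CDF[s], TOTAL)
    return out

msg = [0, 1, 0, 2, 3, 0]
assert decode(encode(msg), len(msg)) == msg
```

The final interval width equals the product of the symbol probabilities, so the code length approaches the message entropy, which is the advantage over Huffman coding that the cited paper argues.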
Journal ArticleDOI
TL;DR: A Fast Discrete Cosine Transform algorithm has been developed which provides a factor of six improvement in computational complexity when compared to conventional DiscreteCosine Transform algorithms using the Fast Fourier Transform.
Abstract: A Fast Discrete Cosine Transform algorithm has been developed which provides a factor of six improvement in computational complexity when compared to conventional Discrete Cosine Transform algorithms using the Fast Fourier Transform. The algorithm is derived in the form of matrices and illustrated by a signal-flow graph, which may be readily translated to hardware or software implementations.

1,301 citations


"A Technical Overview of AV1" refers background in this paper

  • ...The butterfly structure [35] allows substantial reduction in multiplication operations over plain matrix multiplication, i.e., ....

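As an illustration of why butterfly factorizations reduce multiplications, the sketch below computes an unnormalized 4-point DCT-II with 5 multiplications instead of the 16 needed by direct matrix evaluation. It is a generic textbook factorization, not necessarily the exact signal-flow graph of [35]:

```python
import math

C1 = math.cos(math.pi / 8)
C2 = math.cos(math.pi / 4)
C3 = math.cos(3 * math.pi / 8)

def dct4_butterfly(x):
    """4-point unnormalized DCT-II using butterflies: 5 multiplications."""
    a, b = x[0] + x[3], x[1] + x[2]   # stage 1: sums (no multiplies)
    c, d = x[0] - x[3], x[1] - x[2]   # stage 1: differences (no multiplies)
    return [a + b,                    # X0: 0 multiplies
            C1 * c + C3 * d,          # X1: 2 multiplies
            C2 * (a - b),             # X2: 1 multiply
            C3 * c - C1 * d]          # X3: 2 multiplies

def dct4_matrix(x):
    """Plain matrix form of the same transform: 16 multiplications."""
    return [sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / 8) for n in range(4))
            for k in range(4)]

x = [1.0, 2.0, 3.0, 4.0]
assert all(abs(a - b) < 1e-9 for a, b in zip(dct4_butterfly(x), dct4_matrix(x)))
```

The same idea scales recursively to larger transform sizes, which is why hardware transform units in codecs like AV1 are built around such factorizations rather than full matrix multiplies.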