Author

Ching-Han Chiang

Bio: Ching-Han Chiang is an academic researcher from Google. The author has contributed to research in the topics of codecs and data compression. The author has an h-index of 6 and has co-authored 17 publications receiving 240 citations.

Papers
Proceedings ArticleDOI
24 Jun 2018
TL;DR: A brief technical overview of key coding techniques in AV1 is provided, along with a preliminary compression performance comparison against VP9 and HEVC.
Abstract: AV1 is an emerging open-source and royalty-free video compression format, which was jointly developed and finalized in early 2018 by the Alliance for Open Media (AOMedia) industry consortium. The main goal of AV1 development is to achieve substantial compression gain over state-of-the-art codecs while maintaining practical decoding complexity and hardware feasibility. This paper provides a brief technical overview of key coding techniques in AV1, along with a preliminary compression performance comparison against VP9 and HEVC.

260 citations

Journal ArticleDOI
26 Feb 2021
TL;DR: A technical overview of the AV1 codec design that enables the compression performance gains with considerations for hardware feasibility is provided.
Abstract: The AV1 video compression format was developed by the Alliance for Open Media consortium. It achieves more than a 30% reduction in bit rate compared to its predecessor VP9 for the same decoded video quality. This article provides a technical overview of the AV1 codec design that enables these compression performance gains while taking hardware feasibility into consideration.

95 citations

Journal ArticleDOI
23 Feb 2020
TL;DR: A technical overview of key coding techniques in AV1 is provided and the coding performance gains are validated by video compression tests performed with the libaom AV1 encoder against the libvpx VP9 encoder.
Abstract: In 2018, the Alliance for Open Media (AOMedia) finalized its first video compression format, AV1, which was jointly developed by an industry consortium of leading video technology companies. The main goal of AV1 is to provide an open-source and royalty-free video coding format that substantially outperforms state-of-the-art codecs available on the market in compression efficiency, while maintaining practical decoding complexity and being optimized for hardware feasibility and scalability on modern devices. To give detailed insights into how the targeted performance and feasibility are realized, this paper provides a technical overview of key coding techniques in AV1. In addition, the coding performance gains are validated by video compression tests performed with the libaom AV1 encoder against the libvpx VP9 encoder. A preliminary comparison with two leading HEVC encoders, x265 and HM, and the reference software of VVC is also conducted on AOMedia's common test set and an open 4K set.

44 citations
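Codec comparisons like the libaom-versus-libvpx tests above are conventionally summarized as a Bjøntegaard-delta (BD) rate: the average bit-rate difference between two rate-distortion curves at equal quality. The Python sketch below shows the classic BD-rate computation (cubic fit of log-rate against PSNR, integrated over the overlapping PSNR range); the function name and the sample rate/PSNR points are illustrative assumptions, not results from the paper.

```python
import numpy as np

def bd_rate(rates_ref, psnrs_ref, rates_test, psnrs_test):
    """Bjontegaard-delta rate: average percent bit-rate change of the test
    codec relative to the reference at equal PSNR (negative = savings)."""
    log_ref, log_test = np.log(rates_ref), np.log(rates_test)

    # Fit cubic polynomials of log-rate as a function of PSNR.
    p_ref = np.polyfit(psnrs_ref, log_ref, 3)
    p_test = np.polyfit(psnrs_test, log_test, 3)

    # Integrate both fits over the overlapping PSNR interval.
    lo = max(min(psnrs_ref), min(psnrs_test))
    hi = min(max(psnrs_ref), max(psnrs_test))
    int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)

    # Average log-rate difference, converted to a percentage rate difference.
    avg_diff = (int_test - int_ref) / (hi - lo)
    return (np.exp(avg_diff) - 1.0) * 100.0

# Hypothetical (bitrate kbps, PSNR dB) points for two encoders at four operating points.
vp9 = ([1800, 3200, 5600, 9800], [36.1, 38.4, 40.6, 42.5])
av1 = ([1300, 2400, 4300, 7700], [36.2, 38.5, 40.7, 42.6])
print(f"BD-rate of AV1 vs VP9: {bd_rate(*vp9, *av1):.2f}%")
```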

Journal Article
TL;DR: This paper targets the problem of learning a rate control policy to select the quantization parameters (QP) in the encoding process of libvpx, an open source VP9 video compression library widely used by popular video-on-demand (VOD) services.
Abstract: Video streaming usage has seen a significant rise as entertainment, education, and business increasingly rely on online video. Optimizing video compression has the potential to increase access and quality of content to users, and reduce energy use and costs overall. In this paper, we present an application of the MuZero algorithm to the challenge of video compression. Specifically, we target the problem of learning a rate control policy to select the quantization parameters (QP) in the encoding process of libvpx, an open source VP9 video compression library widely used by popular video-on-demand (VOD) services. We treat this as a sequential decision making problem to maximize the video quality with an episodic constraint imposed by the target bitrate. Notably, we introduce a novel self-competition based reward mechanism to solve constrained RL with variable constraint satisfaction difficulty, which is challenging for existing constrained RL methods. We demonstrate that the MuZero-based rate control achieves an average 6.28% reduction in size of the compressed videos for the same delivered video quality level (measured as PSNR BD-rate) compared to libvpx's two-pass VBR rate control policy, while having better constraint satisfaction behavior.

18 citations
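The self-competition mechanism described above can be pictured as rewarding the agent for beating its own historical score on the same clip, with bitrate-constraint violations dominating quality. The sketch below is one interpretation of that idea, not the authors' implementation; the scoring rule, names, and thresholds are assumptions made for illustration.

```python
from collections import defaultdict

# Historical per-clip benchmark (here a simple moving average of past episode
# scores); it plays the role of the "opponent" the agent must beat.
history = defaultdict(lambda: None)

def episode_score(psnr, bitrate, target_bitrate):
    """Constraint-first score: any bitrate overshoot is penalized before
    PSNR matters; within budget, higher PSNR is better."""
    overshoot = max(0.0, bitrate - target_bitrate)
    if overshoot > 0:
        return -overshoot
    return psnr

def self_competition_reward(clip_id, psnr, bitrate, target_bitrate, alpha=0.1):
    """+1 if this episode beats the clip's historical benchmark, else -1.
    The benchmark adapts over time, so the difficulty of 'winning' tracks
    the agent's own improving performance."""
    score = episode_score(psnr, bitrate, target_bitrate)
    benchmark = history[clip_id]
    reward = 1.0 if benchmark is None or score > benchmark else -1.0
    # Update the moving benchmark for future episodes on this clip.
    history[clip_id] = score if benchmark is None else (1 - alpha) * benchmark + alpha * score
    return reward

# Example: one episode on a hypothetical clip with a 2000 kbps target.
print(self_competition_reward("clip_042", psnr=41.3, bitrate=1950.0, target_bitrate=2000.0))
```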

Proceedings ArticleDOI
01 Sep 2017
TL;DR: A level map approach to transform coefficient coding is proposed, which decomposes the coding of coefficient magnitudes into consecutive runs of binary map coding, each of which corresponds to whether a coefficient is equal to or greater than a given level.
Abstract: Transform coding is widely used in video and image codecs to remove spatial correlation. The magnitude of a transform coefficient is weakly correlated with a number of factors, including its frequency band, the neighboring coefficient magnitudes, the luma/chroma plane, etc. To exploit such correlations for efficient entropy coding, one would build a probability model conditioned on the available contexts. However, the interaction of these factors creates a high-dimensional space, and using it directly easily leads to over-fitting. How to construct a compact context set that effectively captures the underlying correlations remains a major challenge in video and image compression. Prior research primarily relies on bucketizing the previously coded coefficients into a small number of categories to serve as the context model for the next coefficient. Certain information loss is inevitable in this classification process. To fully exploit the available context within a limited model space, a level map approach is proposed in this work. It decomposes the coding of coefficient magnitudes into consecutive runs of binary map coding, each of which corresponds to whether a coefficient is equal to or greater than a given level. Under a Markov assumption across the levels, nearly all the reference symbols available to each level map can be approximated as binary random variables. This allows the context model to account for all the surrounding coefficient information provided by the lower level maps while retaining a reasonably compact size. Experimental evidence demonstrates that the proposed coding scheme provides considerable compression performance gains consistently across a wide range of test settings.

11 citations
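The level map decomposition itself is straightforward to illustrate: each quantized coefficient magnitude is expressed as a stack of binary maps, where map k records whether the magnitude exceeds level k, and the already-coded lower maps of neighboring coefficients supply the context for coding the next map. The Python sketch below shows the decomposition and its inverse on a toy block; it illustrates the idea only and is not the codec's entropy-coding code.

```python
import numpy as np

def to_level_maps(coeffs, num_levels):
    """Decompose |coeffs| into binary level maps:
    level_maps[k][i, j] == 1  iff  |coeffs[i, j]| > k."""
    mags = np.abs(coeffs)
    return [(mags > k).astype(np.uint8) for k in range(num_levels)]

def from_level_maps(level_maps, signs):
    """Reconstruct coefficients (up to the coded levels) from the maps:
    the magnitude equals the number of maps the coefficient exceeds."""
    mags = np.sum(level_maps, axis=0)
    return signs * mags

# Toy 4x4 block of quantized transform coefficients.
coeffs = np.array([[ 3, -1, 0, 0],
                   [ 2,  1, 0, 0],
                   [-1,  0, 0, 0],
                   [ 0,  0, 0, 0]])

maps = to_level_maps(coeffs, num_levels=3)
recon = from_level_maps(maps, np.sign(coeffs))
assert np.array_equal(recon, coeffs)

# When coding map k, the already-decoded maps 0..k-1 of the surrounding
# coefficients provide (nearly binary) context for the probability model.
for k, m in enumerate(maps):
    print(f"level > {k}:\n{m}")
```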


Cited by
Proceedings ArticleDOI
24 Jun 2018
TL;DR: A brief technical overview of key coding techniques in AV1 is provided, along with a preliminary compression performance comparison against VP9 and HEVC.
Abstract: AV1 is an emerging open-source and royalty-free video compression format, which was jointly developed and finalized in early 2018 by the Alliance for Open Media (AOMedia) industry consortium. The main goal of AV1 development is to achieve substantial compression gain over state-of-the-art codecs while maintaining practical decoding complexity and hardware feasibility. This paper provides a brief technical overview of key coding techniques in AV1, along with a preliminary compression performance comparison against VP9 and HEVC.

260 citations

Posted Content
TL;DR: CompressAI is presented, a platform that provides custom operations, layers, models, and tools to research, develop, and evaluate end-to-end compression codecs; it currently implements models for still-picture compression and is intended to be extended to the video compression domain.
Abstract: This paper presents CompressAI, a platform that provides custom operations, layers, models, and tools to research, develop, and evaluate end-to-end image and video compression codecs. In particular, CompressAI includes pre-trained models and evaluation tools to compare learned methods with traditional codecs. Multiple models from the state of the art in learned end-to-end compression have thus been reimplemented in PyTorch and trained from scratch. We also report objective comparison results using PSNR and MS-SSIM metrics versus bit-rate, using the Kodak image dataset as the test set. Although this framework currently implements models for still-picture compression, it is intended to be extended soon to the video compression domain.

175 citations
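As a usage illustration, the sketch below runs one of CompressAI's pretrained image models on a dummy tensor and estimates the rate from the returned likelihoods. It assumes the compressai.zoo interface and the bmshj2018_hyperprior model behave as documented, and that a pretrained checkpoint can be downloaded on first use.

```python
import torch
from compressai.zoo import bmshj2018_hyperprior

# Pretrained hyperprior model at quality level 3 (checkpoint fetched on demand).
net = bmshj2018_hyperprior(quality=3, pretrained=True).eval()

# Stand-in for an RGB image normalized to [0, 1]; replace with a real image tensor.
x = torch.rand(1, 3, 256, 256)

with torch.no_grad():
    out = net(x)  # forward pass: reconstruction plus latent likelihoods

x_hat = out["x_hat"].clamp(0, 1)
num_pixels = x.size(0) * x.size(2) * x.size(3)

# Rate estimate (bits per pixel) from the entropy model's likelihoods.
bpp = sum(-torch.log2(l).sum() / num_pixels for l in out["likelihoods"].values())

# Distortion as PSNR of the reconstruction against the input.
mse = torch.mean((x - x_hat) ** 2)
psnr = 10 * torch.log10(1.0 / mse)
print(f"estimated rate: {bpp.item():.3f} bpp, PSNR: {psnr.item():.2f} dB")
```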

Posted Content
David Minnen, Saurabh Singh
TL;DR: In this article, channel-conditioning and latent residual prediction are introduced to improve the performance of the entropy-constrained autoencoder with an entropy model that uses both forward and backward adaptation.
Abstract: In learning-based approaches to image compression, codecs are developed by optimizing a computational model to minimize a rate-distortion objective. Currently, the most effective learned image codecs take the form of an entropy-constrained autoencoder with an entropy model that uses both forward and backward adaptation. Forward adaptation makes use of side information and can be efficiently integrated into a deep neural network. In contrast, backward adaptation typically makes predictions based on the causal context of each symbol, which requires serial processing that prevents efficient GPU / TPU utilization. We introduce two enhancements, channel-conditioning and latent residual prediction, that lead to network architectures with better rate-distortion performance than existing context-adaptive models while minimizing serial processing. Empirically, we see an average rate savings of 6.7% on the Kodak image set and 11.4% on the Tecnick image set compared to a context-adaptive baseline model. At low bit rates, where the improvements are most effective, our model saves up to 18% over the baseline and outperforms hand-engineered codecs like BPG by up to 25%.

129 citations
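The channel-conditioning idea can be sketched compactly: the latents are split into slices along the channel dimension, and the entropy parameters for each slice are predicted from the hyperprior features together with the slices already decoded, so only a short chain of slice-level steps remains serial. The PyTorch sketch below illustrates that structure (latent residual prediction is omitted); the layer sizes and module names are assumptions for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn

class ChannelConditionedEntropyModel(nn.Module):
    """Illustrative channel-conditioned entropy parameter prediction:
    latents are split into slices; slice i's (mu, sigma) are predicted from
    hyperprior features plus all previously decoded slices."""

    def __init__(self, latent_channels=192, hyper_channels=192, num_slices=4):
        super().__init__()
        self.num_slices = num_slices
        self.slice_size = latent_channels // num_slices
        self.param_nets = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(hyper_channels + i * self.slice_size, 128, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(128, 2 * self.slice_size, 3, padding=1),  # mu and log-sigma
            )
            for i in range(num_slices)
        )

    def forward(self, y, hyper_features):
        y_slices = torch.chunk(y, self.num_slices, dim=1)
        decoded, means, scales = [], [], []
        for i, y_i in enumerate(y_slices):
            # Context = hyperprior features plus previously decoded slices.
            ctx = torch.cat([hyper_features, *decoded], dim=1)
            mu, log_sigma = torch.chunk(self.param_nets[i](ctx), 2, dim=1)
            # Quantization stand-in: round the residual around the predicted mean.
            y_hat_i = torch.round(y_i - mu) + mu
            decoded.append(y_hat_i)
            means.append(mu)
            scales.append(log_sigma.exp())
        return (torch.cat(decoded, dim=1),
                torch.cat(means, dim=1),
                torch.cat(scales, dim=1))

# Example with hypothetical tensor shapes.
model = ChannelConditionedEntropyModel()
y = torch.randn(1, 192, 16, 16)        # latents from an encoder
z_feats = torch.randn(1, 192, 16, 16)  # upsampled hyperprior features
y_hat, mu, sigma = model(y, z_feats)
print(y_hat.shape, mu.shape, sigma.shape)
```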

Journal ArticleDOI
26 Feb 2021
TL;DR: A technical overview of the AV1 codec design that enables the compression performance gains with considerations for hardware feasibility is provided.
Abstract: The AV1 video compression format was developed by the Alliance for Open Media consortium. It achieves more than a 30% reduction in bit rate compared to its predecessor VP9 for the same decoded video quality. This article provides a technical overview of the AV1 codec design that enables these compression performance gains while taking hardware feasibility into consideration.

95 citations

Proceedings ArticleDOI
David Minnen, Saurabh Singh
01 Oct 2020
TL;DR: This work introduces two enhancements, channel-conditioning and latent residual prediction, that lead to network architectures with better rate-distortion performance than existing context-adaptive models while minimizing serial processing.
Abstract: In learning-based approaches to image compression, codecs are developed by optimizing a computational model to minimize a rate-distortion objective. Currently, the most effective learned image codecs take the form of an entropy-constrained autoencoder with an entropy model that uses both forward and backward adaptation. Forward adaptation makes use of side information and can be efficiently integrated into a deep neural network. In contrast, backward adaptation typically makes predictions based on the causal context of each symbol, which requires serial processing that prevents efficient GPU / TPU utilization. We introduce two enhancements, channel-conditioning and latent residual prediction, that lead to network architectures with better rate-distortion performance than existing context-adaptive models while minimizing serial processing. Empirically, we see an average rate savings of 6.7% on the Kodak image set and 11.4% on the Tecnick image set compared to a context-adaptive baseline model. At low bit rates, where the improvements are most effective, our model saves up to 18% over the baseline and outperforms hand-engineered codecs like BPG by up to 25%.

91 citations