scispace - formally typeset
Open AccessJournal ArticleDOI

Learning for Video Compression

TLDR
The proposed PixelMotionCNN (PMCNN) which includes motion extension and hybrid prediction networks can model spatiotemporal coherence to effectively perform predictive coding inside the learning network and provides a possible new direction to further improve compression efficiency and functionalities of future video coding.
Abstract
One key challenge to learning-based video compression is that motion predictive coding, a very effective tool for video compression, can hardly be trained into a neural network. In this paper, we propose the concept of PixelMotionCNN (PMCNN) which includes motion extension and hybrid prediction networks. PMCNN can model spatiotemporal coherence to effectively perform predictive coding inside the learning network. On the basis of PMCNN, we further explore a learning-based framework for video compression with additional components of iterative analysis/synthesis and binarization. The experimental results demonstrate the effectiveness of the proposed scheme. Although entropy coding and complex configurations are not employed in this paper, we still demonstrate superior performance compared with MPEG-2 and achieve comparable results with H.264 codec. The proposed learning-based scheme provides a possible new direction to further improve compression efficiency and functionalities of future video coding.

read more

Citations
More filters
Proceedings ArticleDOI

DVC: An End-To-End Deep Video Compression Framework

TL;DR: This paper proposes the first end-to-end video compression deep model that jointly optimizes all the components for video compression, and shows that the proposed approach can outperform the widely used video coding standard H.264 in terms of PSNR and be even on par with the latest standard MS-SSIM.
Journal ArticleDOI

Image and Video Compression With Neural Networks: A Review

TL;DR: The evolution and development of neural network-based compression methodologies are introduced for images and video respectively and the joint compression on semantic and visual information is tentatively explored to formulate high efficiency signal representation structure for both human vision and machine vision.
Proceedings ArticleDOI

Video Compression With Rate-Distortion Autoencoders

TL;DR: A deep generative model for lossy video compression is presented that outperforms the state-of-the-art learned video compression networks based on motion compensation or interpolation and opens up novel video compression applications, which have not been feasible with classical codecs.
Journal ArticleDOI

Nonlinear Transform Coding

TL;DR: A novel variant of entropy-constrained vector quantization, based on artificial neural networks, as well as learned entropy models, is introduced to assess the empirical rate–distortion performance of nonlinear transform coding methods.
Journal ArticleDOI

An End-to-End Learning Framework for Video Compression

TL;DR: This paper proposes the first end-to-end deep video compression framework that can outperform the widely used video coding standard H.264 and be even on par with the latest standard H265.
References
More filters
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Posted Content

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

TL;DR: This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.
Proceedings ArticleDOI

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

TL;DR: In this paper, a Parametric Rectified Linear Unit (PReLU) was proposed to improve model fitting with nearly zero extra computational cost and little overfitting risk, which achieved a 4.94% top-5 test error on ImageNet 2012 classification dataset.
Related Papers (5)