Jan Hendrik Vorwerk
Bio: Jan Hendrik Vorwerk is an academic researcher from RWTH Aachen University. The author has contributed to research in topics: Motion estimation & Motion compensation. The author has an h-index of 1 and has co-authored 1 publication receiving 23 citations.
19 Apr 2015
TL;DR: A higher-order motion compensation framework for HEVC is proposed, motivated by the larger maximum block size introduced by the High Efficiency Video Coding (HEVC) standard and built on a suitable combination of motion parameter estimation, interpolation, and coding.
Abstract: In recent video compression standards, only translational motion is accurately compensated, forcing any higher-order motion to be approximated by splitting blocks into smaller translational units. The objective of this paper is to improve the coding efficiency for video sequences containing complex motion. Various higher-order motion models are considered and evaluated to this end. Motivated by the larger maximum block size introduced by the High Efficiency Video Coding (HEVC) standard and a suitable combination of motion parameter estimation, interpolation, and coding, a higher-order motion compensation framework for HEVC is proposed. Through this, an average data reduction of 2.9% as well as an increase in average block size is achieved.
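The contrast the abstract draws can be made concrete with a small sketch: under a 6-parameter affine model, the motion vector varies across the block, so a single large block can represent a rotation that a translational model could only mimic by splitting. The parameter layout below is illustrative, not the HEVC bitstream syntax.

```python
import math

def affine_mv(params, x, y):
    """Per-pixel motion vector under a 6-parameter affine model:
    vx = a*x + b*y + c,  vy = d*x + e*y + f.
    (Parameter layout is illustrative, not a standard's syntax.)"""
    a, b, c, d, e, f = params
    return a * x + b * y + c, d * x + e * y + f

# A 2-degree rotation about the block origin plus a 1-pixel horizontal
# shift, expressed as affine parameters:
theta = math.radians(2.0)
rot = (math.cos(theta) - 1.0, -math.sin(theta), 1.0,
       math.sin(theta), math.cos(theta) - 1.0, 0.0)

# At the origin only the translational part remains; at the far corner
# of a 64x64 block the rotation contributes a clearly different vector,
# which a single translational MV could only approximate by splitting.
print(affine_mv(rot, 0, 0))    # (1.0, 0.0)
print(affine_mv(rot, 63, 63))
```

The growing spread of per-pixel vectors with block size is why the framework benefits from HEVC's larger maximum block size.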
TL;DR: A simplified affine motion model-based coding framework is studied that overcomes the limitations of a translational motion model while maintaining low computational complexity.
Abstract: In this paper, we study a simplified affine motion model-based coding framework to overcome the limitations of a translational motion model and maintain low computational complexity. The proposed framework mainly has three key contributions. First, we propose to reduce the number of affine motion parameters from 6 to 4. The proposed four-parameter affine motion model can not only handle most of the complex motions in natural videos, but also save the bits for two parameters. Second, to efficiently encode the affine motion parameters, we propose two motion prediction modes, i.e., an advanced affine motion vector prediction scheme combined with a gradient-based fast affine motion estimation algorithm, and an affine model merge scheme, where the latter attempts to reuse the affine motion parameters (instead of the motion vectors) of neighboring blocks. Third, we propose two fast affine motion compensation algorithms. One is the one-step sub-pixel interpolation, which reduces the computations of each interpolation. The other is the interpolation-precision-based adaptive block size motion compensation, which performs motion compensation at the block level rather than the pixel level to reduce the number of interpolations. Our proposed techniques have been implemented based on the state-of-the-art High Efficiency Video Coding standard, and the experimental results show that the proposed techniques altogether achieve, on average, 11.1% and 19.3% bit savings for random access and low-delay configurations, respectively, on typical video sequences that have rich rotation or zooming motions. Meanwhile, the increase in computational complexity of both the encoder and the decoder remains within an acceptable range.
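The four-parameter reduction works because rotation and zoom share coefficients in a similarity transform: one coefficient couples to zoom, one to rotation, and two carry the translation. The layout below is a common textbook form of such a model, given for illustration; the paper's exact syntax and fixed-point precision differ.

```python
def four_param_mv(a, b, c, d, x, y):
    """Four-parameter affine (similarity) motion vector:
        vx = a*x - b*y + c
        vy = b*x + a*y + d
    a couples to zoom, b to rotation, (c, d) is the translation.
    (Illustrative layout; not the paper's exact syntax or precision.)"""
    return a * x - b * y + c, b * x + a * y + d

# A 1% zoom with no rotation: every pixel's MV scales with its distance
# from the origin -- captured with four parameters instead of six.
vx, vy = four_param_mv(0.01, 0.0, 0.0, 0.0, 32, 16)
print(vx, vy)  # 0.32 0.16
```

Dropping two parameters saves their signaling cost on every affine block, which is where part of the reported bit savings comes from.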
TL;DR: An integrated framework is developed to handle the geometry distortion of different projection formats for 360° video and improve coding efficiency; it can be seamlessly integrated into the latest video coding standard, High Efficiency Video Coding.
Abstract: 360° video compression faces two main challenges due to projection distortions, namely, geometry distortion and face boundary discontinuity. There are tradeoffs between selecting the equi-rectangular projection (ERP) and polyhedron projections. In ERP, the geometry distortion is more severe than the face boundary discontinuity, while for the polyhedron projections, the face boundary discontinuity is more severe than the geometry distortion. These two distortions have side effects on motion compensation and undermine the compression efficiency of 360° video. In this paper, an integrated framework is developed to handle these two problems and improve coding efficiency. The proposed framework mainly has two key contributions. First, we derive a unified advanced spherical motion model to handle the geometry distortion of different projection formats for 360° video. When fitting the projection between the various projection formats and the sphere into the unified framework, a specific solution can be obtained for each projection format. Second, we propose a local 3D padding method to handle the face boundary discontinuity between neighboring faces in various projection formats of 360° video. The local 3D padding method can be applied to different projection formats by setting different angles between neighboring faces. These two methods are independent of each other and can also be combined into an integrated framework to achieve better rate-distortion performance. The proposed framework can be seamlessly integrated into the latest video coding standard, High Efficiency Video Coding. The experimental results demonstrate that introducing the proposed coding tools achieves significant bitrate savings compared with the current state-of-the-art method.
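The geometry distortion the abstract describes comes from projecting the sphere onto a plane: a spherical motion model has to move through angular coordinates rather than the distorted pixel grid. A minimal sketch of the ERP-to-sphere mapping that underlies such a model (conventions such as orientation and half-pixel offsets vary between implementations; this is not the paper's exact formulation):

```python
import math

def erp_to_sphere(u, v, width, height):
    """Map an ERP pixel (u, v) to (longitude, latitude) on the unit
    sphere: longitude in [-pi, pi), latitude in [-pi/2, pi/2]."""
    lon = (u / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - v / height) * math.pi
    return lon, lat

def sphere_to_erp(lon, lat, width, height):
    """Inverse mapping from spherical angles back to ERP pixels."""
    u = (lon / (2.0 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v

# The image centre of a 4K ERP frame maps to the sphere's equator:
print(erp_to_sphere(1920, 1080, 3840, 2160))  # (0.0, 0.0)
```

Because the same angular step spans fewer scene metres near the poles than at the equator, motion expressed in this angular domain absorbs the stretching that a planar translational model cannot.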
01 Oct 2017
TL;DR: This paper uses a translational object motion model that accounts for the spherical geometry of the imaging system to design a new block-matching algorithm for sequences of panoramic frames produced by the equirectangular projection.
Abstract: This paper presents an extension of block-based motion estimation for omnidirectional videos, based on a translational object motion model that accounts for the spherical geometry of the imaging system. We use this model to design a new algorithm to perform block matching in sequences of panoramic frames that result from the equirectangular projection. Experimental results demonstrate that significant gains can be achieved with respect to the classical exhaustive block matching algorithm in terms of accuracy of motion prediction. In particular, average quality improvements of up to approximately 6 dB in terms of Peak Signal-to-Noise Ratio (PSNR), 0.043 in terms of Structural SIMilarity index (SSIM), and 2 dB in terms of spherical PSNR can be achieved on the predicted frames.
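A way to see what "block matching on the sphere" changes in practice: candidate positions are generated by displacing the block centre in angular coordinates and re-projecting to the ERP frame, so candidates can legitimately wrap around the frame edge. The sketch below is a simplified stand-in for the paper's model, not its exact algorithm.

```python
import math

def spherical_candidate(u, v, dlon, dlat, width, height):
    """ERP location of a block centre after an angular displacement
    (dlon, dlat) on the sphere -- a simplified illustration of
    spherically-aware candidate generation."""
    lon = (u / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - v / height) * math.pi
    lon = (lon + dlon + math.pi) % (2.0 * math.pi) - math.pi   # wrap around
    lat = max(-math.pi / 2.0, min(math.pi / 2.0, lat + dlat))  # clamp poles
    u2 = (lon / (2.0 * math.pi) + 0.5) * width
    v2 = (0.5 - lat / math.pi) * height
    return u2, v2

# A candidate pushed past the right frame edge wraps to the left edge --
# a position a classical exhaustive search in the ERP plane never tries.
print(spherical_candidate(3800, 1080, math.radians(10), 0.0, 3840, 2160))
```

Searching over angular displacements like this keeps the candidate set consistent with how objects actually move across an equirectangular frame, which is where the reported prediction gains originate.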
TL;DR: The technical details of each coding tool are presented, the design elements are highlighted with typical hardware implementations in mind, and the resulting visual quality improvement is demonstrated and analyzed.
Abstract: Efficient representation and coding of fine-granular motion information is one of the key research areas for exploiting inter-frame correlation in video coding. Representative techniques in this direction are affine motion compensation (AMC), decoder-side motion vector refinement (DMVR), and subblock-based temporal motion vector prediction (SbTMVP). Fine-granular motion information is derived at the subblock level for all three coding tools. In addition, the obtained inter prediction can be further refined by two optical flow-based coding tools: bi-directional optical flow (BDOF) for bi-directional inter prediction, and prediction refinement with optical flow (PROF), which is used exclusively in combination with AMC. The aforementioned five coding tools have been extensively studied and finally adopted in the Versatile Video Coding (VVC) standard. This paper presents the technical details of each tool and highlights the design elements with typical hardware implementations in mind. Following the common test conditions defined by the Joint Video Experts Team (JVET) for the development of VVC, the five tools achieve an average bitrate reduction of 5.7%. For test sequences characterized by large and complex motion, up to 13.4% bitrate reduction is observed. Additionally, visual quality improvement is demonstrated and analyzed.
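The optical-flow-based refinements (BDOF and PROF) share one core idea: a first-order Taylor correction that adjusts a prediction by its spatial gradients weighted with a small per-pixel motion offset. A floating-point sketch of that idea follows; the actual VVC tools use integer gradient filters, clipping, and per-subblock derivations, none of which are reproduced here.

```python
import numpy as np

def refine_prediction(pred, dvx, dvy):
    """Optical-flow-style refinement in the spirit of PROF/BDOF:
    I_refined = I + gx*dvx + gy*dvy  (first-order Taylor expansion).
    Floating-point sketch only -- not the VVC integer arithmetic."""
    gy, gx = np.gradient(pred.astype(np.float64))  # d/dy, d/dx
    return pred + gx * dvx + gy * dvy

# A horizontal intensity ramp "moved" by a quarter pixel: the gradient
# term reproduces the sub-pixel shift without re-interpolating samples.
ramp = np.tile(np.arange(8.0), (8, 1))
refined = refine_prediction(ramp, dvx=0.25, dvy=0.0)
```

For a linear signal the correction is exact, which is why these tools can sharpen sub-pixel motion at low cost: the gradients are computed once and reused for every per-pixel offset.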
01 Sep 2016
TL;DR: Efficient encoding of higher-order motion parameters predicted from a neighboring block must account for how the block-to-block parameter difference depends on the spatial relation between the two block centers; an algorithm for this correction is introduced.
Abstract: Conventionally, complex motion in video sequences is approximated by smaller block units in order to be representable by a translational motion model. This approximation results in a fine block partitioning and a high prediction error, both at the cost of more data rate than potentially necessary. A worthwhile data reduction has been shown to be achievable by adding a higher-order motion model to the most recent video coding standard, High Efficiency Video Coding (HEVC). The benefit of this additional option for inter-frame prediction is due to the more accurate motion compensation as well as the use of larger block sizes. This paper deals with more efficient encoding of higher-order motion parameters in this context. The geometrically accurate prediction of higher-order motion parameters from a neighboring block needs to consider the dependency of the block-to-block parameter difference on the spatial relation between the two block centers. An algorithm is introduced for correcting the translational component and reducing the difference between the actual and the predicted motion when determining higher-order parameters from neighboring blocks. Additionally, a further increase of the maximum block size up to 512×512 pixels is investigated.
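The translational-component correction can be illustrated with a small sketch: when a 6-parameter affine model anchored at a neighboring block's centre is inherited, the higher-order coefficients carry over unchanged, but the translation must be re-evaluated at the current block's centre. This is an illustration of the geometric idea, not the paper's exact algorithm or syntax.

```python
def reanchor_affine(params, neigh_center, cur_center):
    """Re-anchor a 6-parameter affine model (a, b, c, d, e, f), where
    vx = a*x + b*y + c and vy = d*x + e*y + f, from a neighbouring
    block's centre to the current block's centre.
    (Hypothetical helper illustrating the translational correction.)"""
    a, b, c, d, e, f = params
    dx = cur_center[0] - neigh_center[0]
    dy = cur_center[1] - neigh_center[1]
    # Shifting the anchor by (dx, dy) folds a*dx + b*dy into the
    # translational term of vx (and d*dx + e*dy into that of vy);
    # the higher-order coefficients themselves are unchanged.
    return a, b, c + a * dx + b * dy, d, e, f + d * dx + e * dy

# Pure translation is unaffected by the centre offset:
print(reanchor_affine((0, 0, 2.0, 0, 0, 1.0), (32, 32), (96, 32)))
# (0, 0, 2.0, 0, 0, 1.0)
```

Without this correction, a rotating or zooming neighbor would predict a translational component valid at the wrong position, inflating the parameter difference that must be coded.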