Author
Yang Haitao
Bio: Yang Haitao is an academic researcher at Huawei. The author has contributed to research on the topics of Motion compensation and Terminal (electronics), has an h-index of 4, and has co-authored 9 publications receiving 71 citations.
Papers
24 May 2015
TL;DR: A fast gradient based affine motion estimation algorithm is proposed to decrease the encoder complexity and is implemented into the test model of the newest video coding standard High Efficiency Video Coding (HEVC).
Abstract: As the translational motion model used in recent video coding standards cannot represent complex motion such as rotation and zooming well, a simple local affine motion compensation framework supporting multiple reference frames is proposed in this paper to characterize such complex motion. In addition, since the commonly used fast motion estimation for the affine motion model is still quite complex, a fast gradient-based affine motion estimation algorithm is proposed to reduce encoder complexity. The proposed algorithm is implemented in the test model of the newest video coding standard, High Efficiency Video Coding (HEVC). Experimental results show that the bit rate reduction for sequences with complex motion can reach up to 16.8%.
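At its core, the affine model described above replaces a single translational vector per block with a position-dependent motion field. A minimal sketch of the general six-parameter form (the parameter names and sample values below are illustrative, not the paper's notation):

```python
def affine_mv(x, y, a, b, c, d, e, f):
    """Six-parameter affine motion model: maps a pixel position (x, y)
    to a motion vector (mvx, mvy). Rotation, zooming, and shearing are
    all special cases of this linear map plus translation (c, f)."""
    mvx = a * x + b * y + c
    mvy = d * x + e * y + f
    return mvx, mvy

# Pure translation: a = b = d = e = 0, so every pixel moves by (c, f).
print(affine_mv(10, 5, 0, 0, 2.0, 0, 0, -1.0))  # (2.0, -1.0)

# A small rotation/zoom produces position-dependent motion vectors.
print(affine_mv(10, 5, 0.1, -0.05, 0.0, 0.05, 0.1, 0.0))
```

The translational model of earlier standards is the degenerate case where only c and f are signaled, which is why rotation and zooming defeat it.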
31 citations
Patent
16 Dec 2015
TL;DR: In this article, the authors propose a method and a device for image prediction, comprising the step of obtaining a first reference unit of an image unit, where the image unit and the first reference unit obtain their respective predicted images using the same affine model.
Abstract: The invention provides a method and a device for image prediction. The method comprises the steps of: obtaining a first reference unit of an image unit, where the image unit and the first reference unit obtain their respective predicted images using the same affine model; obtaining the motion information of the basic motion compensation units at at least two preset positions of the first reference unit; and obtaining the motion information of the basic motion compensation units of the image unit. By reusing the motion information of the first reference unit, which adopts the same affine prediction model, a more accurate motion vector of the current image unit is obtained; prediction accuracy is improved without increasing coding and decoding complexity, and coding and decoding performance is thereby improved.
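The reuse idea in this patent resembles what codecs now call affine inheritance: rather than copying a neighbor's motion vectors directly, the current block evaluates the neighbor's affine model at its own control-point positions. A hypothetical sketch under a four-parameter model (the coordinates, block sizes, and function names are made up for illustration):

```python
def inherit_mv(pos, neigh_origin, neigh_width, mv0, mv1):
    """Evaluate a neighboring block's four-parameter affine model at an
    absolute pixel position `pos`. mv0 and mv1 are the neighbor's
    top-left and top-right control-point motion vectors."""
    x = pos[0] - neigh_origin[0]
    y = pos[1] - neigh_origin[1]
    a = (mv1[0] - mv0[0]) / neigh_width  # horizontal gradient of mvx
    b = (mv1[1] - mv0[1]) / neigh_width  # horizontal gradient of mvy
    mvx = a * x - b * y + mv0[0]
    mvy = b * x + a * y + mv0[1]
    return mvx, mvy

# The current block reuses the neighbor's model at its own control points:
cp_top_left  = inherit_mv((16, 16), (0, 0), 16, (1.0, 0.0), (2.0, 0.5))  # (1.5, 1.5)
cp_top_right = inherit_mv((32, 16), (0, 0), 16, (1.0, 0.0), (2.0, 0.5))  # (2.5, 2.0)
```

Because the model, not the vectors, is multiplexed, the derived field stays consistent with the reference unit's rotation or zoom, which is the claimed accuracy gain.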
23 citations
Posted Content
TL;DR: A co-projection-plane based 3-D padding method is proposed to project the reference pixels in the neighboring face onto the current face to guarantee exact texture continuity, and experiments demonstrate that the texture discontinuity at the face boundary is well handled by the proposed algorithm.
Abstract: The polyhedron projection for 360-degree video is becoming more and more popular, since it leads to much less geometry distortion than the equirectangular projection. In the polyhedron projection, however, an obvious texture discontinuity can be observed near the face boundaries. Such a discontinuity may cause serious quality degradation when motion compensation crosses a discontinuous face boundary. To solve this problem, this paper first proposes filling the corresponding neighboring faces into suitable positions as an extension of the current face, preserving approximate texture continuity. A co-projection-plane based 3-D padding method is then proposed to project the reference pixels in the neighboring face onto the current face, guaranteeing exact texture continuity. Under the proposed scheme, the reference pixel is always projected onto the same plane as the current pixel when performing motion compensation, so the texture discontinuity problem is resolved. The proposed scheme is implemented in the reference software of High Efficiency Video Coding. Compared with the existing method, the proposed algorithm significantly improves rate-distortion performance, and the experimental results demonstrate that the texture discontinuity at the face boundaries is well handled.
12 citations
10 Jul 2017
TL;DR: In this article, a co-projection-plane based 3-D padding method is proposed to project the reference pixels in the neighboring face to the current face to guarantee exact texture continuity.
Abstract: The polyhedron projection for 360-degree video is becoming more and more popular, since it leads to much less geometry distortion than the equirectangular projection. In the polyhedron projection, however, an obvious texture discontinuity can be observed near the face boundaries. Such a discontinuity may cause serious quality degradation when motion compensation crosses a discontinuous face boundary. To solve this problem, this paper first proposes filling the corresponding neighboring faces into suitable positions as an extension of the current face, preserving approximate texture continuity. A co-projection-plane based 3-D padding method is then proposed to project the reference pixels in the neighboring face onto the current face, guaranteeing exact texture continuity. Under the proposed scheme, the reference pixel is always projected onto the same plane as the current pixel when performing motion compensation, so the texture discontinuity problem is resolved. The proposed scheme is implemented in the reference software of High Efficiency Video Coding. Compared with the existing method, the proposed algorithm significantly improves rate-distortion performance, and the experimental results demonstrate that the texture discontinuity at the face boundaries is well handled.
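The co-projection-plane idea can be illustrated with a cube of half-width 1 centered at the sphere center: a reference sample beyond the current face's boundary is not read from the neighboring face's own plane, but from the ray through the sphere center intersected with the neighboring face, keeping everything on one projection plane. A geometric sketch (the coordinate convention and face layout here are assumptions, not the paper's exact setup):

```python
def coproject_to_right_face(x, y):
    """A padding pixel on the extended plane of the front face (z = 1)
    sits at (x, y, 1) with x > 1, i.e. outside the face. The ray from
    the cube/sphere center (the origin) through (x, y, 1) intersects
    the right face's plane x = 1 at (1, y/x, 1/x); sampling the right
    face there keeps the reference texture continuous with the front
    face instead of duplicating the neighbor's distorted geometry."""
    assert x > 1, "only meaningful beyond the front face's right boundary"
    return (1.0, y / x, 1.0 / x)

# A padding sample one unit beyond the boundary of the front face:
print(coproject_to_right_face(2.0, 0.5))  # (1.0, 0.25, 0.5)
```

Directly copying the neighboring face would skip this re-projection, which is exactly where the visible seam comes from.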
9 citations
Patent
17 Jun 2015
TL;DR: In this article, the authors propose a video stream acquisition method and device in the field of information technology. First, a terminal acquires the minimal bit rate corresponding to its own parameters and sends a video stream acquisition request to a server; when the server receives the request, it returns an MPD, and the terminal acquires the video stream according to the MPD and the minimal bit rate.
Abstract: The embodiment of the invention discloses a video stream acquisition method and device, which relate to the field of information technology and can improve the utilization of network bandwidth. The method comprises the following steps: first, a terminal acquires the minimal bit rate corresponding to its own parameters; the terminal then sends a video stream acquisition request to a server; when the server receives the request, it sends an MPD to the terminal; the terminal receives the MPD and acquires the video stream according to the MPD and the minimal bit rate. The embodiment of the invention enables a user to download video data via the terminal.
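The selection step the patent describes, picking a stream from the MPD subject to the terminal's minimal bit rate, can be sketched as choosing the cheapest advertised representation that still satisfies that floor. The function name, bit-rate list, and fallback rule below are assumptions for illustration, not the patent's claims:

```python
def select_bitrate(mpd_bitrates, minimal_rate):
    """Given the bit rates advertised in an MPD, return the lowest one
    that is still >= the terminal's minimal bit rate, so bandwidth is
    not wasted on an unnecessarily high-rate stream; fall back to the
    highest available rate if none qualifies."""
    eligible = [r for r in sorted(mpd_bitrates) if r >= minimal_rate]
    return eligible[0] if eligible else max(mpd_bitrates)

print(select_bitrate([500, 1200, 3000, 8000], 1000))  # 1200
print(select_bitrate([500, 1200], 5000))              # 1200 (fallback)
```

This mirrors the bandwidth-utilization claim: the terminal never requests more bits than its own parameters justify.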
2 citations
Cited by
TL;DR: Recent advances, such as different projection methods benefiting video coding, specialized video quality evaluation metrics and optimized methods for transmission, are all presented and classified in this paper.
Abstract: In this paper, we review recent advances in the omnidirectional video processing pipeline, including projection and evaluation. Unlike traditional video, omnidirectional video, also called panoramic video or 360-degree video, lies in the spherical domain, so specialized tools are necessary. For this type of video, each picture must be projected onto a 2-D plane for encoding and decoding, adapting it to the input of existing video coding systems; hence the coding impact of the projection and the accuracy of the evaluation method are very important in this pipeline. Recent advances, such as projection methods that benefit video coding, specialized video quality evaluation metrics, and optimized transmission methods, are presented and classified in this paper. In addition, the coding performance under different projection methods is specified, and future trends of omnidirectional video processing are discussed.
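The projection step this review centers on is simplest in the equirectangular case, where longitude and latitude map linearly to picture columns and rows. A sketch of that mapping (the picture size and angle conventions are assumed, not taken from the paper):

```python
import math

def equirect_project(lon, lat, width, height):
    """Map spherical coordinates (longitude in [-pi, pi], latitude in
    [-pi/2, pi/2]) to equirectangular picture coordinates. Columns are
    linear in longitude and rows linear in latitude, which is why the
    poles are heavily oversampled, a known coding inefficiency that
    motivates the alternative projections surveyed in the paper."""
    u = (lon + math.pi) / (2 * math.pi) * width
    v = (math.pi / 2 - lat) / math.pi * height
    return u, v

print(equirect_project(0.0, 0.0, 4096, 2048))              # picture center
print(equirect_project(-math.pi, math.pi / 2, 4096, 2048)) # top-left corner
```

Polyhedral projections replace this single stretched plane with several flat faces, trading pole oversampling for the face-boundary discontinuities discussed elsewhere on this page.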
90 citations
TL;DR: This paper studies a simplified affine motion model-based coding framework that overcomes the limitations of the translational motion model while maintaining low computational complexity.
Abstract: In this paper, we study a simplified affine motion model-based coding framework to overcome the limitations of the translational motion model while maintaining low computational complexity. The proposed framework makes three key contributions. First, we propose reducing the number of affine motion parameters from six to four. The proposed four-parameter affine motion model can not only handle most of the complex motions in natural videos but also save the bits for two parameters. Second, to efficiently encode the affine motion parameters, we propose two motion prediction modes: an advanced affine motion vector prediction scheme combined with a gradient-based fast affine motion estimation algorithm, and an affine model merge scheme, which reuses the affine motion parameters (instead of the motion vectors) of neighboring blocks. Third, we propose two fast affine motion compensation algorithms. One is one-step sub-pixel interpolation, which reduces the computation of each interpolation. The other is interpolation-precision-based adaptive block size motion compensation, which performs motion compensation at the block level rather than the pixel level to reduce the number of interpolations. Our proposed techniques have been implemented in the state-of-the-art High Efficiency Video Coding standard, and the experimental results show that the proposed techniques together achieve, on average, 11.1% and 19.3% bit savings for random access and low-delay configurations, respectively, on typical video sequences with rich rotation or zooming motion. Meanwhile, the computational complexity increases of both the encoder and the decoder are within an acceptable range.
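The four parameters this paper keeps correspond to rotation, zoom, and a 2-D translation; dropping the two shear parameters of the full affine model is what saves the signaling bits. A sketch of that rotation-zoom parameterization (the parameter names here are illustrative, not the paper's signaled syntax):

```python
import math

def rotation_zoom_mv(x, y, rho, theta, tx, ty):
    """Four-parameter (rotation + zoom + translation) motion field:
    the warped position is a uniform scaling rho and rotation theta
    of (x, y) plus a translation (tx, ty); the motion vector is the
    displacement of the warped position from (x, y)."""
    xp = rho * (x * math.cos(theta) - y * math.sin(theta)) + tx
    yp = rho * (x * math.sin(theta) + y * math.cos(theta)) + ty
    return xp - x, yp - y

# Pure 10% zoom about the origin: pixels move radially outward,
# farther pixels moving proportionally more.
mv = rotation_zoom_mv(10, 0, 1.1, 0.0, 0.0, 0.0)
```

A shear would need independent scaling along two non-orthogonal axes, which this restricted model cannot express; the paper argues such motion is rare enough in natural video for the trade-off to pay off.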
84 citations
TL;DR: This survey presents the current literature related to 360° video streaming and presents the video and viewer datasets, which may be used to drive large-scale simulations and experiments.
Abstract: Head-mounted displays and 360° videos have become increasingly more popular, delivering a more immersive viewing experience to end users. Streaming 360° videos over the best-effort Internet, however, faces tremendous challenges, because of the high resolution and the short response time requirements. This survey presents the current literature related to 360° video streaming. We start with 360° video streaming systems built for real experiments to investigate the practicality and efficiency of 360° video streaming. We then present the video and viewer datasets, which may be used to drive large-scale simulations and experiments. Different optimization tools in various stages of the 360° video streaming pipeline are discussed in detail. We also present various applications enabled by 360° video streaming. In the appendices, we review the off-the-shelf hardware available at the time of writing and the open research problems.
83 citations
Patent
27 Feb 2017
TL;DR: In this article, the affine motion vectors are derived from three different neighboring coded blocks of the current block, and an affine motion model is derived from those motion vectors if the first affine candidate is selected.
Abstract: An encoding or decoding method with affine motion compensation includes receiving input data associated with a current block in a current picture, and deriving a first affine candidate for the current block including three affine motion vectors for predicting motion vectors at control points of the current block if the current block is coded or to be coded in affine Merge mode. The affine motion vectors are derived from three different neighboring coded blocks of the current block. An affine motion model is derived according to the affine motion vectors if the first affine candidate is selected. Moreover, the method includes encoding or decoding the current block by locating a reference block in a reference picture according to the affine motion model. The current block is restricted to be coded in uni-directional prediction if the current block is coded or to be coded in affine Inter mode.
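The three control-point motion vectors this patent derives determine a full six-parameter model: the top-left, top-right, and bottom-left vectors fix the horizontal and vertical gradients of the field. A sketch of the standard derivation (the block size and vectors below are made up for illustration):

```python
def six_param_mv(x, y, w, h, mv0, mv1, mv2):
    """Six-parameter affine field for a w x h block, given the top-left
    (mv0), top-right (mv1), and bottom-left (mv2) control-point motion
    vectors. Three control points allow shearing in addition to the
    rotation and zooming a two-point model can express."""
    mvx = (mv1[0] - mv0[0]) / w * x + (mv2[0] - mv0[0]) / h * y + mv0[0]
    mvy = (mv1[1] - mv0[1]) / w * x + (mv2[1] - mv0[1]) / h * y + mv0[1]
    return mvx, mvy

# Evaluated at the control points, the model reproduces them exactly:
cp = ((1.0, 0.0), (2.0, 1.0), (0.0, 2.0))
print(six_param_mv(0, 0, 16, 16, *cp))   # (1.0, 0.0)
print(six_param_mv(16, 0, 16, 16, *cp))  # (2.0, 1.0)
print(six_param_mv(0, 16, 16, 16, *cp))  # (0.0, 2.0)
```

In the Merge-mode case the abstract describes, the decoder can rebuild this model from neighboring coded blocks without any explicitly signaled motion parameters.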
73 citations
TL;DR: This paper describes technologies relevant to 360° video for VVC, including projection formats, pre- and post-processing methods, and 360°-video specific coding tool modifications in these proposals.
Abstract: Augmented reality (AR) and virtual reality (VR) applications have seen rising popularity in recent years. Omnidirectional 360° video is a video format often used in AR and VR applications. To address the industry needs, a new HEVC edition recently published includes several supplemental enhancement information (SEI) messages to enable the carriage of omnidirectional video using HEVC. However, further improvement in 360° video compression efficiency is needed. In order to address this challenge, the Joint Video Exploration Team (JVET) of ITU-T VCEG and ISO/IEC MPEG has been investigating 360° video coding technologies, including projection formats, pre- and post-processing technologies, as well as 360°-video-specific coding tools since 2016. The joint call for proposals (CfP) recently issued by ITU-T VCEG and ISO/IEC MPEG on video compression technologies beyond HEVC included a category on 360° video. Twelve CfP responses in the 360° video category were received. This paper describes technologies relevant to 360° video for VVC. A summary of projection formats, pre- and post-processing methods, and 360°-video specific coding tool modifications in these proposals is provided.
57 citations