Author

Ping An

Other affiliations: Chinese Ministry of Education
Bio: Ping An is an academic researcher from Shanghai University. The author has contributed to research in topics including Depth map and Image quality, has an h-index of 19, and has co-authored 176 publications receiving 1,574 citations. Previous affiliations of Ping An include the Chinese Ministry of Education.


Papers
Journal ArticleDOI
TL;DR: A fast CU size and mode decision algorithm is proposed for HEVC intra coding that saves 21% of computational complexity on average with negligible loss of coding efficiency.
Abstract: The emerging international standard of High Efficiency Video Coding (HEVC) is a successor to H.264/AVC. In the joint model of HEVC, a tree-structured coding unit (CU) is adopted, which allows recursive splitting into four equally sized blocks. At each depth level, up to 34 intra prediction modes are enabled. The intra mode decision process in HEVC evaluates all possible depth levels and prediction modes to find the one with the least rate-distortion (RD) cost using a Lagrange multiplier. This achieves the highest coding efficiency but requires very high computational complexity. In this paper, we propose a fast CU size decision and mode decision algorithm for HEVC intra coding. Since the optimal CU depth level is highly content-dependent, it is not efficient to use a fixed CU depth range for a whole image. Therefore, we can skip specific depth levels that are rarely used in spatially nearby CUs. Meanwhile, there are RD cost and prediction mode correlations among different depth levels and spatially nearby CUs. By fully exploiting these correlations, we can skip prediction modes that are rarely used in the parent CUs at the upper depth levels or in spatially nearby CUs. Experimental results demonstrate that the proposed algorithm saves 21% of computational complexity on average with negligible loss of coding efficiency.

295 citations
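The depth-range restriction and mode skipping described in the abstract above can be illustrated with a short sketch. This is a hypothetical Python outline, not code from the HM reference software; the CU objects, their used_depths and best_mode attributes, and the one-level slack margin are all illustrative assumptions.

```python
# Hypothetical sketch of the content-adaptive CU depth range: only depth levels
# observed in spatially nearby, already-coded CUs (plus one level of slack) are
# searched, and candidate intra modes are seeded from the parent CU and neighbors.

def candidate_depths(left_cu, above_cu, min_depth=0, max_depth=3):
    """Depth levels worth evaluating for the current CU (0 = 64x64 ... 3 = 8x8)."""
    observed = {d for cu in (left_cu, above_cu) if cu is not None
                for d in cu.used_depths}
    if not observed:                                  # no coded neighbors yet
        return list(range(min_depth, max_depth + 1))  # fall back to the full range
    lo = max(min_depth, min(observed) - 1)            # one level of slack downward
    hi = min(max_depth, max(observed) + 1)            # one level of slack upward
    return list(range(lo, hi + 1))

def candidate_modes(parent_cu, left_cu, above_cu, planar=0, dc=1):
    """Intra modes kept for full RD evaluation, exploiting mode correlations."""
    modes = {planar, dc}
    for cu in (parent_cu, left_cu, above_cu):
        if cu is not None:
            modes.add(cu.best_mode)                   # reuse correlated best modes
    return sorted(modes)
```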

Journal ArticleDOI
TL;DR: A fast intra-coding algorithm is proposed, consisting of a low-complexity coding tree unit (CTU) structure decision and a fast intra mode decision; it reduces complexity by up to 70% compared to the VVC reference software and achieves an average encoding time saving of 63%.
Abstract: The quadtree with nested multi-type tree (QTMT) partition structure is an efficient improvement in Versatile Video Coding (VVC) over the quadtree (QT) structure of the preceding High Efficiency Video Coding (HEVC) standard. In addition to the recursive QT partition structure, a recursive multi-type tree partition is applied to each leaf node, which generates more flexible block sizes. Moreover, the intra prediction modes are extended from 35 to 67 to cover a wider range of texture patterns. These newly developed techniques achieve high coding efficiency but also result in very high computational complexity. To tackle this problem, we propose a fast intra-coding algorithm consisting of a low-complexity coding tree unit (CTU) structure decision and a fast intra mode decision. The contributions of the proposed algorithm lie in the following aspects: 1) new block size and coding mode distribution features are first explored for a reasonable fast coding scheme; 2) a novel fast QTMT partition decision framework is developed, which determines the partition decision on both the QT and the multi-type tree with a novel cascade decision structure; and 3) a fast intra mode decision with gradient descent search is introduced, where the best initial search point and search step are also investigated. Simulation results show that the complexity reduction of the proposed algorithm is up to 70% compared to the VVC reference software (VTM), and an average encoding time saving of 63% is achieved with a 1.93% BD-BR increase. These results demonstrate that our method yields superior performance in terms of computational complexity and compression quality compared to state-of-the-art methods.

166 citations
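A gradient-descent style mode search of the kind mentioned in the abstract above can be sketched as follows. This is a simplified Python illustration under assumed conventions (angular modes 2..66, an rd_cost callback standing in for the encoder's rate-distortion evaluation, an initial mode taken from a coarse pre-selection); it is not taken from VTM, and the initial step size is an assumption.

```python
# Illustrative only: descend toward lower RD cost among the angular intra modes,
# halving the search step when no neighboring candidate improves on the current best.

def gradient_mode_search(rd_cost, init_mode, step=8, lo=2, hi=66):
    """rd_cost(mode) -> float; init_mode is the best initial search point."""
    best_mode, best_cost = init_mode, rd_cost(init_mode)
    while step >= 1:
        moved = False
        for cand in (best_mode - step, best_mode + step):   # probe both directions
            if lo <= cand <= hi:
                cost = rd_cost(cand)
                if cost < best_cost:
                    best_mode, best_cost, moved = cand, cost, True
        if not moved:
            step //= 2                                       # refine around the best mode
    return best_mode, best_cost
```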

Journal ArticleDOI
Liquan Shen1, Zhi Liu1, Tao Yan1, Zhaoyang Zhang1, Ping An1 
TL;DR: A fast ME and DE algorithm that adaptively utilizes inter-view correlation is proposed, which saves 85% of computational complexity on average with negligible loss of coding efficiency.
Abstract: The emerging international standard for multiview video coding (MVC) is an extension of H.264/advanced video coding. In the joint model of MVC, both motion estimation (ME) and disparity estimation (DE) are included in the encoding process. This achieves the highest coding efficiency but requires very high computational complexity. In this letter, we propose a fast ME and DE algorithm that adaptively utilizes the inter-view correlation. The coding mode complexity and the motion homogeneity of a macroblock (MB) are first analyzed according to the coding modes and motion vectors of the corresponding MBs in the neighboring views, which are located by means of a global disparity vector. Based on the coding mode complexity and the motion homogeneity, the proposed algorithm adjusts the search strategies for different types of MBs so as to perform a precise search according to the video content. Experimental results demonstrate that the proposed algorithm saves 85% of computational complexity on average, with negligible loss of coding efficiency.

85 citations
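A rough Python sketch of the classification step described above: the corresponding macroblock in the neighboring view is located with the global disparity vector, and its coding mode and motion vectors decide how aggressive the ME/DE search may be. The strategy names, thresholds, simple-mode set, and the neighbor_view.mb_at accessor are illustrative assumptions, not the paper's actual decision rules.

```python
# Illustrative only: search-strategy selection driven by the inter-view
# corresponding macroblock (MB), found via the global disparity vector (GDV).

SIMPLE_MODES = {"SKIP", "DIRECT", "INTER_16x16"}      # assumed "simple" mode set

def choose_search_strategy(mb_x, mb_y, gdv, neighbor_view, motion_thresh=1.0):
    ref_mb = neighbor_view.mb_at(mb_x + gdv[0], mb_y + gdv[1])   # inter-view MB
    mode_is_simple = ref_mb.mode in SIMPLE_MODES
    # Average motion magnitude of the corresponding MB as a homogeneity proxy.
    motion = sum(abs(mx) + abs(my) for mx, my in ref_mb.mvs) / max(len(ref_mb.mvs), 1)
    motion_is_low = motion < motion_thresh

    if mode_is_simple and motion_is_low:
        return "small_range_me_skip_de"     # cheap search, DE refinement skipped
    if mode_is_simple or motion_is_low:
        return "reduced_range_me_de"        # reduced-range ME and DE
    return "full_me_de"                     # complex content: full ME and DE
```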

Journal ArticleDOI
Liquan Shen1, Zhi Liu1, Suxing Liu1, Zhaoyang Zhang1, Ping An1 
TL;DR: A fast DE and ME algorithm based on motion homogeneity is proposed to reduce MVC computational complexity; simulation results show that it saves 63% of computational complexity on average, with negligible loss of coding efficiency.
Abstract: Multi-view video coding (MVC) is an ongoing standard in which variable-size disparity estimation (DE) and motion estimation (ME) are both employed to select the best coding mode for each macroblock (MB). This technique achieves the highest possible coding efficiency, but it results in an extremely long encoding time, which hinders its practical use. In this paper, a fast DE and ME algorithm based on motion homogeneity is proposed to reduce MVC computational complexity. The basic idea of the method is to utilize the spatial properties of the motion field to predict where DE and variable-size ME are needed, and to enable them only in those regions. The motion field is generated from the corresponding motion vectors (MVs) in a spatial window. Simulation results show that the proposed algorithm saves 63% of computational complexity on average, with negligible loss of coding efficiency.

76 citations
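The motion-homogeneity test at the core of the method above can be illustrated with a brief sketch. The variance-based measure, the window contents, and the threshold are assumptions made for the example; the paper's actual criterion may differ.

```python
# Illustrative measure of motion homogeneity over the motion vectors (MVs) of
# spatially neighboring macroblocks; DE and variable-size ME would be enabled
# only where the motion field is not homogeneous.

import statistics

def motion_homogeneity(mv_window):
    """mv_window: list of (mvx, mvy) pairs from the spatial neighborhood."""
    if len(mv_window) < 2:
        return float("inf")                  # not enough context: be conservative
    var_x = statistics.pvariance([mv[0] for mv in mv_window])
    var_y = statistics.pvariance([mv[1] for mv in mv_window])
    return var_x + var_y                     # small value -> homogeneous motion

def enable_de_and_variable_size_me(mv_window, threshold=4.0):
    # Homogeneous regions keep only large-block ME; others get the full tool set.
    return motion_homogeneity(mv_window) >= threshold
```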

Journal ArticleDOI
TL;DR: A data-driven approach is proposed, based on a deep convolutional neural network with global and local residual learning, that restores the depth structure from coarse to fine via multi-scale frequency synthesis.
Abstract: Depth maps obtained by consumer-level sensors are always noisy and in the low-resolution (LR) domain. Existing methods for guided depth super-resolution, which are based on pre-defined local and global models (e.g., the joint bilateral filter and Markov random fields), perform well in general cases. However, such model-based methods may fail to describe the potential relationship between RGB-D image pairs. To solve this problem, this paper proposes a data-driven approach based on a deep convolutional neural network with global and local residual learning. It progressively upsamples the LR depth map under the guidance of the high-resolution intensity image across multiple scales. Global residual learning is adopted to learn the difference between the ground truth and the coarsely upsampled depth map, and local residual learning is introduced in each scale-dependent reconstruction sub-network. This scheme restores the depth structure from coarse to fine via multi-scale frequency synthesis. In addition, batch normalization layers are used to improve the performance of depth map denoising. Our method is evaluated in both noise-free and noisy cases, with a comprehensive comparison against 17 state-of-the-art methods. The experimental results show that the proposed method converges faster and achieves improved performance in both qualitative and quantitative evaluations.

55 citations
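As a rough illustration of the architecture described in the abstract above, the following PyTorch sketch progressively upsamples the LR depth under intensity guidance, uses a local residual skip inside each scale's sub-network, and adds the coarsely upsampled input back as the global residual. Layer widths, the number of scales, and the 2x-per-stage factor are assumptions; this is a sketch, not the authors' network.

```python
# A hedged sketch, not the paper's network: multi-scale guided depth
# super-resolution with local residual learning inside each sub-network and a
# global residual added to the coarsely upsampled LR depth map.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SubNet(nn.Module):
    """One scale-dependent reconstruction sub-network with a local residual skip."""
    def __init__(self, ch=64):
        super().__init__()
        self.head = nn.Conv2d(2, ch, 3, padding=1)          # depth + intensity guide
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch),
        )
        self.tail = nn.Conv2d(ch, 1, 3, padding=1)

    def forward(self, depth, guide):
        feat = self.head(torch.cat([depth, guide], dim=1))
        feat = F.relu(feat + self.body(feat))               # local residual learning
        return self.tail(feat)

class MultiScaleDepthSR(nn.Module):
    """Coarse-to-fine upsampling (2x per stage) guided by the HR intensity image."""
    def __init__(self, n_scales=3):
        super().__init__()
        self.subnets = nn.ModuleList(SubNet() for _ in range(n_scales))

    def forward(self, lr_depth, hr_intensity):
        x = lr_depth
        for subnet in self.subnets:
            x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
            guide = F.interpolate(hr_intensity, size=x.shape[-2:],
                                  mode="bilinear", align_corners=False)
            x = subnet(x, guide)                            # per-scale refinement
        coarse = F.interpolate(lr_depth, size=x.shape[-2:],
                               mode="bilinear", align_corners=False)
        return coarse + x                                   # global residual learning
```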


Cited by
Journal ArticleDOI


08 Dec 2001-BMJ
TL;DR: There is, I think, something ethereal about i, the square root of minus one; it seemed an odd beast at the time, an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance, and it describes numerous important application areas such as image-based rendering and digital libraries.
Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and sufficient detail to build useful applications. Readers learn techniques that have proven useful in first-hand experience, along with a wide range of mathematical methods. A CD-ROM included with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image-based rendering and digital libraries. Many important algorithms are broken down and illustrated in pseudocode. The book is appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.

3,627 citations

01 Jan 2006

3,012 citations

Book
02 Jan 1991

1,377 citations