3D High-Efficiency Video Coding for Multi-View Video and Depth Data

doi:10.1109/TIP.2013.2264820

Journal Article•DOI•

3D High-Efficiency Video Coding for Multi-View Video and Depth Data

Karsten Müller¹, Heiko Schwarz¹, Detlev Marpe¹, Christian Bartnik¹, Sebastian Bosse¹, Heribert Brust¹, Tobias Hinz¹, Haricharan Lakshman¹, Philipp Merkle¹, F. H. Rhee¹, Gerhard Tech¹, Martin Winken¹, Thomas Wiegand¹•Institutions (1)

Heinrich Hertz Institute¹

01 Sep 2013-IEEE Transactions on Image Processing (IEEE)-Vol. 22, Iss: 9, pp 3366-3378

TL;DR: This paper describes an extension of the high efficiency video coding (HEVC) standard for coding of multi-view video and depth data, and develops and integrates a novel encoder control that guarantees that high-quality intermediate views can be generated from the decoded data.

Abstract: This paper describes an extension of the high efficiency video coding (HEVC) standard for coding of multi-view video and depth data. In addition to the known concept of disparity-compensated prediction, inter-view motion parameter prediction and inter-view residual prediction for coding of the dependent video views are developed and integrated. Furthermore, for depth coding, new intra coding modes, a modified motion compensation and motion vector coding, as well as the concept of motion parameter inheritance are part of the HEVC extension. A novel encoder control uses view synthesis optimization, which guarantees that high-quality intermediate views can be generated based on the decoded data. The bitstream format supports the extraction of partial bitstreams, so that conventional 2D video, stereo video, and the full multi-view video plus depth format can be decoded from a single bitstream. Objective and subjective results are presented, demonstrating that the proposed approach provides 50% bit rate savings in comparison with HEVC simulcast and 20% in comparison with a straightforward multi-view extension of HEVC without the newly developed coding tools.

Citations

Journal Article•DOI•

Overview of the Multiview and 3D Extensions of High Efficiency Video Coding

Gerhard Tech¹, Ying Chen², Karsten Müller¹, Jens-Rainer Ohm³, Anthony Vetro⁴, Ye-Kui Wang²•Institutions (4)

Heinrich Hertz Institute¹, Qualcomm², RWTH Aachen University³, Mitsubishi Electric Research Laboratories⁴

01 Jan 2016-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: The more advanced 3D video extension, 3D-HEVC, targets a coded representation consisting of multiple views and associated depth maps, as required for generating additional intermediate views in advanced 3D displays.

Abstract: The High Efficiency Video Coding (HEVC) standard has recently been extended to support efficient representation of multiview video and depth-based 3D video formats. The multiview extension, MV-HEVC, allows efficient coding of multiple camera views and associated auxiliary pictures, and can be implemented by reusing single-layer decoders without changing the block-level processing modules since block-level syntax and decoding processes remain unchanged. Bit rate savings compared with HEVC simulcast are achieved by enabling the use of inter-view references in motion-compensated prediction. The more advanced 3D video extension, 3D-HEVC, targets a coded representation consisting of multiple views and associated depth maps, as required for generating additional intermediate views in advanced 3D displays. Additional bit rate reduction compared with MV-HEVC is achieved by specifying new block-level video coding tools, which explicitly exploit statistical dependencies between video texture and depth and specifically adapt to the properties of depth maps. The technical concepts and features of both extensions are presented in this paper.

385 citations

Journal Article•DOI•

Design, Implementation, and Evaluation of a Point Cloud Codec for Tele-Immersive Video

Rufael Mekuria¹, Kees Blom², Pablo Cesar²•Institutions (2)

VU University Amsterdam¹, Centrum Wiskunde & Informatica²

01 Apr 2017-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: A subjective study in a state-of-the-art mixed reality system shows that introduced prediction distortions are negligible compared with the original reconstructed point clouds and shows the benefit of reconstructed point cloud video as a representation in the 3D virtual world.

Abstract: We present a generic and real-time time-varying point cloud codec for 3D immersive video. This codec is suitable for mixed reality applications in which 3D point clouds are acquired at a fast rate. In this codec, intra frames are coded progressively in an octree subdivision. To further exploit inter-frame dependencies, we present an inter-prediction algorithm that partitions the octree voxel space into $N \times N \times N$ macroblocks ( $N=8,16,32$ ). The algorithm codes points in these blocks in the predictive frame as a rigid transform applied to the points in the intra-coded frame. The rigid transform is computed using the iterative closest point algorithm and compactly represented in a quaternion quantization scheme. To encode the color attributes, we define a mapping of per-vertex color attributes in the traversed octree to an image grid and use a legacy image coding method based on JPEG. As a result, a generic compression framework suitable for real-time 3D tele-immersion is developed. This framework has been optimized to run in real time on commodity hardware for both the encoder and decoder. Objective evaluation shows that a higher rate-distortion performance is achieved compared with available point cloud codecs. A subjective study in a state-of-the-art mixed reality system shows that introduced prediction distortions are negligible compared with the original reconstructed point clouds. In addition, it shows the benefit of reconstructed point cloud video as a representation in the 3D virtual world. The codec is available as open source for integration in immersive and augmented communication applications and serves as a base reference software platform in JTC1/SC29/WG11 (MPEG) for the further development of standardized point-cloud compression solutions.
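
The macroblock partitioning and rigid-transform prediction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of plain integer voxel coordinates, and the representation of the transform as a 3x3 rotation matrix plus a translation (rather than a quantized quaternion) are assumptions made for illustration only.

```python
from collections import defaultdict


def partition_into_macroblocks(voxels, n=16):
    """Group integer voxel coordinates into n x n x n macroblocks.

    Hypothetical helper: the block key is the voxel coordinate
    divided (integer division) by the macroblock size n.
    """
    blocks = defaultdict(list)
    for x, y, z in voxels:
        blocks[(x // n, y // n, z // n)].append((x, y, z))
    return dict(blocks)


def apply_rigid_transform(points, rotation, translation):
    """Predict points as rotation * p + translation.

    rotation is a 3x3 nested list and translation a 3-tuple; the
    abstract's codec stores the rotation as a quantized quaternion,
    which is not modeled here.
    """
    predicted = []
    for p in points:
        predicted.append(tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return predicted


# Example: two voxels fall in macroblock (0, 0, 0), one in (1, 0, 0).
blocks = partition_into_macroblocks([(0, 0, 0), (15, 15, 15), (16, 0, 0)], n=16)

# Example: identity rotation with a translation shifts every point.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
shifted = apply_rigid_transform([(0, 0, 0)], identity, (1, 2, 3))
```

In a real encoder the transform for each block would be estimated (the abstract names the iterative closest point algorithm) and the block would fall back to intra coding when the prediction is poor; neither step is shown here.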

346 citations

Cites methods from "3D High-Efficiency Video Coding for..."

...3) Multiview Plus Depth Compression: Multiview plus depth representation was considered for storing video and depth maps from multiple cameras [7], [8]....
[...]
...Existing video coding standards, such as Advanced Video Coding (AVC) Multi View Video (MVV) [7] and MVV-D [8], can support these functionalities via techniques from (depth) image-based rendering (DIBR)....
[...]

Proceedings Article•DOI•

Cutting the Cord: Designing a High-quality Untethered VR System with Low Latency Remote Rendering

Luyang Liu¹, Ruiguang Zhong², Wuyang Zhang¹, Yunxin Liu³, Jiansong Zhang⁴, Lintao Zhang³, Marco Gruteser¹•Institutions (4)

Rutgers University¹, Beijing University of Posts and Telecommunications², Microsoft³, Alibaba Group⁴

10 Jun 2018

TL;DR: This paper introduces an end-to-end untethered VR system design and open platform that can meet virtual reality latency and quality requirements at 4K resolution over a wireless link and introduces a Remote VSync Driven Rendering technique to minimize display latency.

Abstract: This paper introduces an end-to-end untethered VR system design and open platform that can meet virtual reality latency and quality requirements at 4K resolution over a wireless link. High-quality VR systems generate graphics data at a data rate much higher than that supported by existing wireless-communication products such as Wi-Fi and 60GHz wireless communication. The necessary image encoding makes it challenging to maintain the stringent VR latency requirements. To achieve the required latency, our system employs a Parallel Rendering and Streaming mechanism to reduce the add-on streaming latency by pipelining the rendering, encoding, transmission and decoding procedures. Furthermore, we introduce a Remote VSync Driven Rendering technique to minimize display latency. To evaluate the system, we implement an end-to-end remote rendering platform on commodity hardware over a 60GHz wireless network. Results show that the system can support current 2160x1200 VR resolution at 90Hz with less than 16ms end-to-end latency, and 4K resolution with 20ms latency, while keeping a visually lossless image quality to the user.
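
To give a feel for the data rates this abstract refers to, the following Python sketch computes the uncompressed bit rate and the frame period for the 2160x1200, 90Hz configuration it mentions. The 24 bits per pixel figure and the function names are assumptions for illustration; the abstract does not state the pixel format.

```python
def raw_bitrate_bps(width, height, fps, bits_per_pixel=24):
    """Uncompressed video bit rate in bits per second.

    bits_per_pixel=24 (8-bit RGB) is an assumed pixel format.
    """
    return width * height * bits_per_pixel * fps


def frame_period_ms(fps):
    """Time between consecutive frames in milliseconds."""
    return 1000.0 / fps


rate = raw_bitrate_bps(2160, 1200, 90)   # 5,598,720,000 bit/s, about 5.6 Gbit/s
period = frame_period_ms(90)             # about 11.1 ms per frame
print(f"raw rate: {rate / 1e9:.2f} Gbit/s, frame period: {period:.2f} ms")
```

Under this assumption the raw stream is about 5.6 Gbit/s, which is why the frames have to be encoded before transmission, and the 90Hz refresh leaves roughly 11 ms per frame for the pipeline stages.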

112 citations

Journal Article•DOI•

Advanced Media-Based Smart Big Data on Intelligent Cloud Systems

Kostas E. Psannis¹, Christos Stergiou¹, Brij B. Gupta²•Institutions (2)

University of Macedonia¹, National Institute of Technology, Kurukshetra²

01 Jan 2019

TL;DR: The proposed encoding algorithm outperforms the conventional HEVC standard, as demonstrated by the performance evaluations, and could be used and integrated into HEVC, as a Smart Big Data, without violating the standard.

Abstract: Today's advanced media technology heralds an exciting era that will strongly affect daily life. Moreover, the rapid rise of wireless communications and networking will ultimately bring advanced media to our lives anytime, anywhere, and on any device. According to the National Institute of Standards and Technology (NIST), Cloud Computing (CC) is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (for example networks, applications, storage, servers, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This paper proposes an efficient algorithm for advanced scalable Media-based Smart Big Data (3D, Ultra HD) on Intelligent Cloud Computing systems. The proposed encoding algorithm outperforms the conventional HEVC standard, as demonstrated by the performance evaluations. In addition, a comparative study has been carried out to validate the proposed approach. The proposed method could be used and integrated into HEVC, as a Smart Big Data, without violating the standard.

91 citations

Cites methods from "3D High-Efficiency Video Coding for..."

...During the last years, several techniques for HEVC-media have been contrived [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38],...
[...]
...a fast CU size decision algorithm for HM was proposed by the authors of [25]....
[...]
...The [25] delineates an extension of HEVC...
[...]
...out the newly developed coding tools demonstrated in [25] by the objective and subjective results presented....
[...]

Journal Article•DOI•

Fast Mode Decision Based on Grayscale Similarity and Inter-View Correlation for Depth Map Coding in 3D-HEVC

Jianjun Lei¹, Duan Jinhui¹, Feng Wu², Nam Ling³, Chunping Hou¹•Institutions (3)

Tianjin University¹, University of Science and Technology of China², Santa Clara University³

01 Mar 2018-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: A novel fast mode decision algorithm for depth map coding based on the grayscale similarity and inter-view correlation that achieves considerable time saving with negligible degradation of coding performance is presented.

Abstract: The 3D extension of High Efficiency Video Coding significantly improves the coding efficiency of 3D video at the expense of computational complexity. This paper presents a novel fast mode decision algorithm for depth map coding based on the grayscale similarity and inter-view correlation. First, depth map grayscale similarity is adopted to judge whether the reference frame could assist the coding of the current frame. When the difference in the average grayscale between the co-located coding unit (CU) and the current CU is smaller than the similarity threshold, the depth level of the current CU will be restricted by that of the coded reference CU. Second, the grayscale similarity and inter-view correlation are jointly used for dependent views to achieve early decision on the best prediction unit (PU) mode. The mode decision procedure will be terminated early when the co-located CU, which has a grayscale similarity with the current CU, selects Merge or Inter $2N \times 2N$ as the best prediction mode. Moreover, when the corresponding CU in the independent view selects Merge or Inter $2N \times 2N$ as the best prediction mode, the current CU will skip checking the other PU modes based on the strong inter-view correlation. Finally, different strategies are proposed for the P-frames and B-frames of dependent views in view of the characteristics of different prediction structures. For B-frames, the PU mode information of the coded independent view is utilized as reference to skip the unnecessary mode decision processes. For P-frames, the spatial–temporal correlation is considered in the process of early mode decision to determine whether to choose the Merge mode or Inter $2N \times 2N$ as the best mode. Experimental results show that our proposed scheme achieves considerable time saving with negligible degradation of coding performance.
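
The first step of this abstract, the grayscale similarity test that limits the CU depth, can be sketched in Python as follows. This is an illustrative reading, not the authors' code: the function names, the threshold value of 4.0, the default maximum depth of 3, and the interpretation of "restricted by" as an upper bound on the depth level are all assumptions.

```python
def mean_gray(block):
    """Average grayscale value of a CU given as a list of pixel rows."""
    pixels = [p for row in block for p in row]
    return sum(pixels) / len(pixels)


def max_depth_for_cu(current_cu, colocated_cu, colocated_depth,
                     threshold=4.0, full_max_depth=3):
    """Return the deepest CU depth level the encoder should test.

    If the current CU and the co-located CU in the reference frame have
    similar average grayscale (difference below the assumed threshold),
    the search is capped at the depth of the co-located CU. Otherwise
    the full depth range is searched.
    """
    difference = abs(mean_gray(current_cu) - mean_gray(colocated_cu))
    if difference < threshold:
        return min(colocated_depth, full_max_depth)
    return full_max_depth


# Similar depth blocks: the search is capped at the co-located CU's depth.
similar = max_depth_for_cu([[100, 100], [100, 100]],
                           [[101, 101], [101, 101]], colocated_depth=1)

# Dissimilar depth blocks: the full depth range is searched.
dissimilar = max_depth_for_cu([[100, 100], [100, 100]],
                              [[150, 150], [150, 150]], colocated_depth=1)
```

The time saving comes from the first case: when the blocks are similar, the encoder skips rate-distortion checks for the deeper quadtree levels.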

88 citations

Cites methods from "3D High-Efficiency Video Coding for..."

...3D-HEVC introduces new prediction techniques and coding tools to improve the efficiency of MVD data coding [13]....
[...]

References

Journal Article•DOI•

A mathematical theory of communication

Claude E. Shannon

01 Jul 1948-Bell System Technical Journal

TL;DR: This final installment of the paper considers the case where the signals or the messages or both are continuously variable, in contrast with the discrete nature assumed until now.

Abstract: In this final installment of the paper we consider the case where the signals or the messages or both are continuously variable, in contrast with the discrete nature assumed until now. To a considerable extent the continuous case can be obtained through a limiting process from the discrete case by dividing the continuum of messages and signals into a large but finite number of small regions and calculating the various parameters involved on a discrete basis. As the size of the regions is decreased these parameters in general approach as limits the proper values for the continuous case. There are, however, a few new effects that appear and also a general change of emphasis in the direction of specialization of the general results to particular cases.

65,425 citations

Journal Article•DOI•

Overview of the H.264/AVC video coding standard

Thomas Wiegand¹, Gary J. Sullivan², G. Bjontegaard, Ajay Luthra³•Institutions (3)

Heinrich Hertz Institute¹, Microsoft², Motorola³

01 Jul 2003-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: An overview of the technical features of H.264/AVC is provided, profiles and applications for the standard are described, and the history of the standardization process is outlined.

Abstract: H.264/AVC is the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goals of the H.264/AVC standardization effort have been enhanced compression performance and provision of a "network-friendly" video representation addressing "conversational" (video telephony) and "nonconversational" (storage, broadcast, or streaming) applications. H.264/AVC has achieved a significant improvement in rate-distortion efficiency relative to existing standards. This article provides an overview of the technical features of H.264/AVC, describes profiles and applications for the standard, and outlines the history of the standardization process.

8,646 citations

"3D High-Efficiency Video Coding for..." refers methods in this paper

...As a result, multi-view video coding (MVC) was standardized as an extension of H.264/MPEG-4 Advanced Video Coding (AVC) [6], [20], [47], [52], [53]....
[...]

Journal Article•DOI•

A taxonomy and evaluation of dense two-frame stereo correspondence algorithms

Daniel Scharstein¹, Richard Szeliski², Ramin Zabih³•Institutions (3)

Middlebury College¹, Microsoft², Cornell University³

09 Dec 2001-International Journal of Computer Vision

TL;DR: This paper presents a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms.

Abstract: Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can be easily extended to include new algorithms. We have also produced several new multiframe stereo data sets with ground truth, and are making both the code and data sets available on the Web.

7,458 citations

"3D High-Efficiency Video Coding for..." refers methods in this paper

...The depth information can be provided through different methods, including direct recording by special time-of-flight cameras [28], extraction from computer animated video material from the inherent 3D geometry representation [19], or disparity estimation [1], [42]....
[...]

Journal Article•DOI•

Overview of the High Efficiency Video Coding (HEVC) Standard

Gary J. Sullivan¹, Jens-Rainer Ohm², Woo-Jin Han³, Thomas Wiegand⁴•Institutions (4)

Microsoft¹, RWTH Aachen University², Gachon University³, Fraunhofer Society⁴

01 Dec 2012-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards, in the range of 50% bit-rate reduction for equal perceptual video quality.

Abstract: High Efficiency Video Coding (HEVC) is currently being prepared as the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards, in the range of 50% bit-rate reduction for equal perceptual video quality. This paper provides an overview of the technical features and characteristics of the HEVC standard.

7,383 citations

"3D High-Efficiency Video Coding for..." refers background or methods in this paper

...Objective and subjective results are presented in Section VII and conclusions are drawn in Section VIII....
[...]
...For a better representation of such edges, we added four new intra prediction modes for depth coding....
[...]