scispace - formally typeset
Search or ask a question
Journal ArticleDOI

3D High-Efficiency Video Coding for Multi-View Video and Depth Data

TL;DR: This paper describes an extension of the high efficiency video coding (HEVC) standard for coding of multi-view video and depth data, and develops and integrated a novel encoder control that guarantees that high quality intermediate views can be generated based on the decoded data.
Abstract: This paper describes an extension of the high efficiency video coding (HEVC) standard for coding of multi-view video and depth data. In addition to the known concept of disparity-compensated prediction, inter-view motion parameter, and inter-view residual prediction for coding of the dependent video views are developed and integrated. Furthermore, for depth coding, new intra coding modes, a modified motion compensation and motion vector coding as well as the concept of motion parameter inheritance are part of the HEVC extension. A novel encoder control uses view synthesis optimization, which guarantees that high quality intermediate views can be generated based on the decoded data. The bitstream format supports the extraction of partial bitstreams, so that conventional 2D video, stereo video, and the full multi-view video plus depth format can be decoded from a single bitstream. Objective and subjective results are presented, demonstrating that the proposed approach provides 50% bit rate savings in comparison with HEVC simulcast and 20% in comparison with a straightforward multi-view extension of HEVC without the newly developed coding tools.
Citations
More filters
Journal ArticleDOI
TL;DR: The more advanced 3D video extension, 3D-HEVC, targets a coded representation consisting of multiple views and associated depth maps, as required for generating additional intermediate views inAdvanced 3D displays.
Abstract: The High Efficiency Video Coding (HEVC) standard has recently been extended to support efficient representation of multiview video and depth-based 3D video formats. The multiview extension, MV-HEVC, allows efficient coding of multiple camera views and associated auxiliary pictures, and can be implemented by reusing single-layer decoders without changing the block-level processing modules since block-level syntax and decoding processes remain unchanged. Bit rate savings compared with HEVC simulcast are achieved by enabling the use of inter-view references in motion-compensated prediction. The more advanced 3D video extension, 3D-HEVC, targets a coded representation consisting of multiple views and associated depth maps, as required for generating additional intermediate views in advanced 3D displays. Additional bit rate reduction compared with MV-HEVC is achieved by specifying new block-level video coding tools, which explicitly exploit statistical dependencies between video texture and depth and specifically adapt to the properties of depth maps. The technical concepts and features of both extensions are presented in this paper.

385 citations

Journal ArticleDOI
TL;DR: A subjective study in a state-of-the-art mixed reality system shows that introduced prediction distortions are negligible compared with the original reconstructed point clouds and shows the benefit of reconstructed point cloud video as a representation in the 3D virtual world.
Abstract: We present a generic and real-time time-varying point cloud codec for 3D immersive video. This codec is suitable for mixed reality applications in which 3D point clouds are acquired at a fast rate. In this codec, intra frames are coded progressively in an octree subdivision. To further exploit inter-frame dependencies, we present an inter-prediction algorithm that partitions the octree voxel space in $N \times N \times N$ macroblocks ( $N=8,16,32$ ). The algorithm codes points in these blocks in the predictive frame as a rigid transform applied to the points in the intra-coded frame. The rigid transform is computed using the iterative closest point algorithm and compactly represented in a quaternion quantization scheme. To encode the color attributes, we defined a mapping of color per vertex attributes in the traversed octree to an image grid and use legacy image coding method based on JPEG. As a result, a generic compression framework suitable for real-time 3D tele-immersion is developed. This framework has been optimized to run in real time on commodity hardware for both the encoder and decoder. Objective evaluation shows that a higher rate-distortion performance is achieved compared with available point cloud codecs. A subjective study in a state-of-the-art mixed reality system shows that introduced prediction distortions are negligible compared with the original reconstructed point clouds. In addition, it shows the benefit of reconstructed point cloud video as a representation in the 3D virtual world. The codec is available as open source for integration in immersive and augmented communication applications and serves as a base reference software platform in JTC1/SC29/WG11 (MPEG) for the further development of standardized point-cloud compression solutions.

346 citations


Cites methods from "3D High-Efficiency Video Coding for..."

  • ...3) Multiview Plus Depth Compression: Multiview plus depth representation was considered for storing video and depth maps from multiple cameras [7], [8]....

    [...]

  • ...Existing video coding standards, such as Advanced Video Coding (AVC) Multi View Video (MVV) [7] and MVV-D [8], can support these functionalities via techniques from (depth) image-based rendering (DIBR)....

    [...]

Proceedings ArticleDOI
10 Jun 2018
TL;DR: This paper introduces an end-to-end untethered VR system design and open platform that can meet virtual reality latency and quality requirements at 4K resolution over a wireless link and introduces a Remote VSync Driven Rendering technique to minimize display latency.
Abstract: This paper introduces an end-to-end untethered VR system design and open platform that can meet virtual reality latency and quality requirements at 4K resolution over a wireless link. High-quality VR systems generate graphics data at a data rate much higher than those supported by existing wireless-communication products such as Wi-Fi and 60GHz wireless communication. The necessary image encoding, makes it challenging to maintain the stringent VR latency requirements. To achieve the required latency, our system employs a Parallel Rendering and Streaming mechanism to reduce the add-on streaming latency, by pipelining the rendering, encoding, transmission and decoding procedures. Furthermore, we introduce a Remote VSync Driven Rendering technique to minimize display latency. To evaluate the system, we implement an end-to-end remote rendering platform on commodity hardware over a 60Ghz wireless network. Results show that the system can support current 2160x1200 VR resolution at 90Hz with less than 16ms end-to-end latency, and 4K resolution with 20ms latency, while keeping a visually lossless image quality to the user.

112 citations

Journal ArticleDOI
01 Jan 2019
TL;DR: The proposed encoding algorithm outperforms the conventional HEVC standard which demonstrated by the performance evaluations and could be used and integrated into HEVC, as a Smart Big Data, without violating the standard.
Abstract: Today's advanced media technology preaches an enthralling time that will enormously bear on daily life. Moreover, the rapid raise of wireless communications and networking will ultimately bring advanced media to our lives anytime, anywhere, and on any device. According to the National Institute of Standards and Technology (NIST), Cloud Computing (CC) is a scheme for enabling convenient, on-demand network access to a shared pool of configurable computing pores (for example networks, applications, storage, servers, and services) which could be promptly foresighted and delivered with minimal management effort or service provider interaction. This paper proposed an efficient algorithm for advanced scalable Media-basedSmart Big Data (3D, Ultra HD) on Intelligent Cloud Computing systems. The proposed encoding algorithmoutperforms the conventional HEVC standard which demonstrated by the performance evaluations. In order to ratify the proposed approach, in addition, a relative study has been carried out. The proposed method could be used and integrated into HEVC, as a Smart Big Data, without violating the standard.

91 citations


Cites methods from "3D High-Efficiency Video Coding for..."

  • ...During the last years, several techniques for HEVC-media have been contrived [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38],...

    [...]

  • ...a fast CU size decision algorithm for HM was proposed by the authors of [25]....

    [...]

  • ...The [25] delineates an extension of HEVC...

    [...]

  • ...out the newly developed coding tools demonstrated in [25] by the objective and subjective results presented....

    [...]

Journal ArticleDOI
TL;DR: A novel fast mode decision algorithm for depth map coding based on the grayscale similarity and inter-view correlation that achieves considerable time saving with negligible degradation of coding performance is presented.
Abstract: The 3D extension of High Efficiency Video Coding significantly improves the coding efficiency of 3D video at the expense of computational complexity. This paper presents a novel fast mode decision algorithm for depth map coding based on the grayscale similarity and inter-view correlation. First, depth map grayscale similarity is adopted to judge whether the reference frame could assist the coding of the current frame. When the difference in the average grayscale between the co-located coding unit (CU) and the current CU is smaller than the similarity threshold, the depth level of the current CU will be restricted by that of the coded reference CU. Second, the grayscale similarity and inter-view correlation are jointly used for dependent views to achieve early decision on the best prediction unit (PU) mode. The mode decision procedure will be determined early when the co-located CU, which has a grayscale similarity with the current CU, selects Merge or Inter $2N \times 2N$ as the best prediction mode. Moreover, when the corresponding CU in the independent view selects Merge or Inter $2N \times 2N$ as the best prediction mode, the current CU will skip other PU modes checking based on the strong inter-view correlation. Finally, different strategies are proposed for the P-frames and B-frames of dependent views in view of the characteristics of different prediction structures. For B frames, the PU mode information of the coded independent view is utilized as reference to skip the unnecessary mode decision processes. For P frames, the spatial–temporal correlation is considered in the process of early mode decision to determine whether to choose the Merge mode or Inter $2N \times 2N$ as the best mode. Experimental results show that our proposed scheme achieves considerable time saving with negligible degradation of coding performance.

88 citations


Cites methods from "3D High-Efficiency Video Coding for..."

  • ...3D-HEVC introduces new prediction techniques and coding tools to improve the efficiency of MVD data coding [13]....

    [...]

References
More filters
Journal ArticleDOI
TL;DR: This final installment of the paper considers the case where the signals or the messages or both are continuously variable, in contrast with the discrete nature assumed until now.
Abstract: In this final installment of the paper we consider the case where the signals or the messages or both are continuously variable, in contrast with the discrete nature assumed until now. To a considerable extent the continuous case can be obtained through a limiting process from the discrete case by dividing the continuum of messages and signals into a large but finite number of small regions and calculating the various parameters involved on a discrete basis. As the size of the regions is decreased these parameters in general approach as limits the proper values for the continuous case. There are, however, a few new effects that appear and also a general change of emphasis in the direction of specialization of the general results to particular cases.

65,425 citations

Journal ArticleDOI
TL;DR: An overview of the technical features of H.264/AVC is provided, profiles and applications for the standard are described, and the history of the standardization process is outlined.
Abstract: H.264/AVC is newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goals of the H.264/AVC standardization effort have been enhanced compression performance and provision of a "network-friendly" video representation addressing "conversational" (video telephony) and "nonconversational" (storage, broadcast, or streaming) applications. H.264/AVC has achieved a significant improvement in rate-distortion efficiency relative to existing standards. This article provides an overview of the technical features of H.264/AVC, describes profiles and applications for the standard, and outlines the history of the standardization process.

8,646 citations


"3D High-Efficiency Video Coding for..." refers methods in this paper

  • ...As a result, multi-view video coding (MVC) was standardized as an extension of H.264/MPEG-4 Advanced Video Coding (AVC) [6], [20], [47], [52], [53]....

    [...]

Journal ArticleDOI
TL;DR: This paper has designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms.
Abstract: Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can be easily extended to include new algorithms. We have also produced several new multiframe stereo data sets with ground truth, and are making both the code and data sets available on the Web.

7,458 citations


"3D High-Efficiency Video Coding for..." refers methods in this paper

  • ...The depth information can be provided through different methods, including direct recording by special time-of-flight cameras [28], extraction from computer animated video material from the inherent 3D geometry representation [19], or disparity estimation [1], [42]....

    [...]

Journal ArticleDOI
TL;DR: The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards-in the range of 50% bit-rate reduction for equal perceptual video quality.
Abstract: High Efficiency Video Coding (HEVC) is currently being prepared as the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards-in the range of 50% bit-rate reduction for equal perceptual video quality. This paper provides an overview of the technical features and characteristics of the HEVC standard.

7,383 citations


"3D High-Efficiency Video Coding for..." refers background or methods in this paper

  • ...Objective and subjective results are presented in Section VII and conclusions are drawn in Section VIII....

    [...]

  • ...For a better representation of such edges, we added four new intra prediction modes for depth coding....

    [...]