This paper provides an overview of the algorithmic design used for extending H.264/MPEG-4 AVC towards MVC, together with a summary of the coding performance achieved by MVC for both stereo and multiview video.
Abstract:
Significant improvements in video compression capability have been demonstrated with the introduction of the H.264/MPEG-4 advanced video coding (AVC) standard. Since developing this standard, the Joint Video Team of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) has also standardized an extension of that technology that is referred to as multiview video coding (MVC). MVC provides a compact representation for multiple views of a video scene, such as multiple synchronized video cameras. Stereo-paired video for 3-D viewing is an important special case of MVC. The standard enables inter-view prediction to improve compression capability, as well as supporting ordinary temporal and spatial prediction. It also supports backward compatibility with existing legacy systems by structuring the MVC bitstream to include a compatible “base view.” Each other view is encoded at the same picture resolution as the base view. In recognition of its high-quality encoding capability and support for backward compatibility, the Stereo High profile of the MVC extension was selected by the Blu-ray Disc Association as the coding format for 3-D video with high-definition resolution. This paper provides an overview of the algorithmic design used for extending H.264/MPEG-4 AVC towards MVC. The basic approach of MVC for enabling inter-view prediction and view scalability in the context of H.264/MPEG-4 AVC is reviewed. Related supplemental enhancement information (SEI) metadata is also described. Various “frame compatible” approaches for support of stereo-view video as an alternative to MVC are also discussed. A summary of the coding performance achieved by MVC for both stereo- and multiview video is also provided. Future directions and challenges related to 3-D video are also briefly discussed.
TL;DR: This paper describes efficient coding methods for video and depth data and presents view synthesis methods that mitigate errors from depth estimation and coding.
TL;DR: The design for these extensions represents the latest state of the art for video coding and its applications, including work on range extensions for color format and bit depth enhancement, embedded-bitstream scalability, and 3D video.
TL;DR: The more advanced 3D video extension, 3D-HEVC, targets a coded representation consisting of multiple views and associated depth maps, as required for generating additional intermediate views in advanced 3D displays.
TL;DR: This paper describes an extension of the high efficiency video coding (HEVC) standard for coding of multi-view video and depth data, and develops and integrates a novel encoder control that guarantees that high-quality intermediate views can be generated based on the decoded data.
TL;DR: A subjective study in a state-of-the-art mixed reality system shows that introduced prediction distortions are negligible compared with the original reconstructed point clouds and shows the benefit of reconstructed point cloud video as a representation in the 3D virtual world.
TL;DR: An overview of the technical features of H.264/AVC is provided, profiles and applications for the standard are described, and the history of the standardization process is outlined.
TL;DR: An overview of the basic concepts for extending H.264/AVC towards SVC is provided, and the basic tools for providing temporal, spatial, and quality scalability are described in detail and experimentally analyzed regarding their efficiency and complexity.
Levels impose constraints on the bitstreams produced by MVC encoders, to establish bounds on the necessary decoder resources and complexity.
Q2. What is the purpose of the MVC design?
For applications in which random access or view switching is important, the prediction structure can be designed to minimize access delay, and the MVC design provides a way for an encoder to describe the prediction structure for this purpose.
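As a rough illustration of why the prediction structure matters for view switching, the sketch below counts how many views must be decoded before a target view can be displayed. The two dependency maps and the helper `views_needed` are hypothetical examples chosen for illustration; they are not structures defined by the standard.

```python
def views_needed(target: int, refs: dict[int, list[int]]) -> set[int]:
    """Return the target view plus every view it depends on, directly or indirectly.

    `refs` maps a view id to the view ids it uses for inter-view prediction.
    """
    needed: set[int] = set()
    stack = [target]
    while stack:
        view = stack.pop()
        if view in needed:
            continue
        needed.add(view)
        stack.extend(refs.get(view, []))
    return needed


# Hypothetical chain: each view predicts from its neighbour (0 is the base view).
chain = {0: [], 1: [0], 2: [1], 3: [2]}
# Hypothetical star: every non-base view predicts only from the base view.
star = {0: [], 1: [0], 2: [0], 3: [0]}

print(len(views_needed(3, chain)))  # 4 views to decode
print(len(views_needed(3, star)))   # 2 views to decode
```

In this toy setup, switching to view 3 requires decoding all four views under the chain structure but only two under the star structure, which is the kind of access-delay trade-off an encoder can weigh when choosing a prediction structure.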
Q3. When did MPEG issue its Call for Proposals for efficient multiview video coding?
Considering recent advancements in video compression technology and the anticipated needs for state-of-the-art coding of multiview video, MPEG issued a Call for Proposals (CfP) for efficient multiview video coding technology in October of 2005.
Q4. What is the main consequence of not requiring changes to lower levels of the syntax?
A major consequence of not requiring changes to lower levels of the syntax (at the macroblock level and below it) is that MVC is compatible with existing hardware for decoding single-view video with H.264/MPEG-4 AVC.
Q5. What is the average bit-rate reduction for the dependent view of stereo movie content?
In other studies [50], an average reduction of 20-30% of the bit rate for the second (dependent) view of typical stereo movie content was reported, with a peak reduction for an individual test sequence of 43% of the bit rate of the dependent view.
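To put these figures in context, the following sketch converts a saving on the dependent view into a saving on the total two-view bit rate. It assumes, purely for illustration, that independent (simulcast) coding would spend the same bit rate on each view; that assumption and the function name `total_stereo_saving` are not from the paper.

```python
def total_stereo_saving(dependent_view_saving: float) -> float:
    """Fraction of the total two-view bit rate saved relative to simulcast.

    Assumption (not from the paper): in simulcast both views cost the same
    rate R, so the total is 2R. With MVC the base view still costs R and the
    dependent view costs (1 - s) * R, so the total saving is s / 2.
    """
    if not 0.0 <= dependent_view_saving <= 1.0:
        raise ValueError("saving must be a fraction between 0 and 1")
    simulcast_total = 2.0
    mvc_total = 1.0 + (1.0 - dependent_view_saving)
    return (simulcast_total - mvc_total) / simulcast_total


for s in (0.20, 0.30, 0.43):
    print(f"dependent-view saving {s:.0%} -> total saving {total_stereo_saving(s):.1%}")
```

Under this equal-rate assumption, a 20-30% reduction on the dependent view corresponds to roughly 10-15% of the total stereo bit rate, and the 43% peak corresponds to about 21.5%.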
Q6. What are the types of 3D display systems that require glasses?
There are many types of 3D display systems [14], ranging from classic stereo systems that require special-purpose glasses to more sophisticated multiview auto-stereoscopic displays that do not require glasses [15].
Q7. How can asymmetrical coding further reduce the bit rate?
Prior studies on asymmetrical coding of stereo video, in which one of the views is encoded with lower quality than the other, suggest that further substantial savings in bit rate for the non-base view could be achieved using that technique.
Q8. What are the main aspects of the MVC design?
Several other aspects of the MVC design were further elaborated on in [44], including random access and view switching, extraction of operation points (sets of coded views at particular levels of a nested temporal referencing structure) of an MVC bitstream for adaptation to network and device constraints, parallel processing, and a description of several newly adopted SEI messages that are relevant for multiview video bitstreams.
Proceedings of the IEEE (2011): Vetro, Wiegand, Sullivan
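The operation-point extraction mentioned above can be pictured as a filter over coded units. The sketch below assumes each unit carries a view identifier and a temporal level, and that the caller already knows which views the output views depend on. The `CodedUnit` class and `extract_operation_point` function are illustrative names, not part of the MVC syntax or of any real library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedUnit:
    view_id: int       # which view the unit belongs to
    temporal_id: int   # level in the nested temporal referencing structure
    payload: bytes = b""


def extract_operation_point(units, required_views, max_temporal_id):
    """Keep units of the required views at or below the given temporal level.

    `required_views` must already include any views that the output views
    reference for inter-view prediction.
    """
    required = set(required_views)
    return [
        u for u in units
        if u.view_id in required and u.temporal_id <= max_temporal_id
    ]


# Toy stream: 3 views x 3 temporal levels.
stream = [CodedUnit(v, t) for t in range(3) for v in range(3)]
subset = extract_operation_point(stream, required_views={0, 1}, max_temporal_id=1)
print(len(stream), len(subset))  # 9 4
```

Dropping higher temporal levels lowers the frame rate and dropping views lowers the view count, which is how such a subset could be matched to network or device constraints.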