TL;DR: A novel data hiding strategy is proposed to integrate disparity data, needed for 3D visualization based on depth image based rendering, into a single H.264 format video bitstream, with the potential of being imperceptible and efficient in terms of the rate-distortion trade-off.
Abstract: In this paper, a novel data hiding strategy is proposed to integrate disparity data, needed for 3D visualization based on depth image based rendering, into a single H.264 format video bitstream. The proposed method has the potential of being imperceptible and efficient in terms of the rate-distortion trade-off. Depth information is embedded in some of the quantized transform coefficients (QTCs) while taking into account the reconstruction loop. This provides a high payload for embedding depth information in the texture data, with a negligible decrease in PSNR. To maintain synchronization, the embedding is carried out while taking into account the correspondence of video frames. Three different benchmark video sequences containing different combinations of motion, texture, and objects are used for the experimental evaluation of the proposed algorithm.
There exist several methods to generate 3D content.
It can take the form of stereoscopic video, which consists of two separate video bitstreams: one for the left eye and one for the right eye.
Another example of 3D content generation is proposed in [1], which consists of monoscopic color video (also known as texture) and associated per-pixel depth information.
Using this data representation, a 3D view of a real-world scene can then be generated at the receiver side by means of depth image based rendering (DIBR) techniques.
For example, 2D video can be converted into 3D by inferring approximate depth information from the relative movements of automatically tracked image segments [2].
2. PRELIMINARIES
The texture video is accompanied by a depth-image sequence with the same spatio-temporal resolution.
Depth information is an 8-bit gray value, with gray level 0 specifying the furthest point and gray level 255 the closest, as shown in Fig.
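As a concrete illustration, here is a minimal sketch of recovering metric depth from this 8-bit convention, assuming the common inverse-depth quantization with near and far clipping planes (the plane values below are illustrative assumptions, not values from the paper):

```python
def gray_to_depth(v, z_near=0.5, z_far=10.0):
    """Map an 8-bit depth gray value to metric depth.

    Assumes the common inverse-depth quantization used in
    video-plus-depth formats: gray 255 -> z_near (closest),
    gray 0 -> z_far (furthest). The clipping-plane values are
    illustrative assumptions, not values from the paper.
    """
    return 1.0 / ((v / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far)
```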
First of all, x(i, j) is predicted from its neighboring blocks to obtain the residual block.
Both A_q(u, v) and E_q(u, v) are indexed by QP, while F_q(u, v) is the rounding factor from the quantization rounding-factor matrix.
This ŷ(u, v) is entropy coded and sent to the decoder side.
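For context, a minimal sketch of this quantization step, assuming the standard H.264 integer form in which a QP-indexed multiplier, a QP-indexed right-shift, and a rounding offset play the roles of A_q(u, v), E_q(u, v), and F_q(u, v); the exact correspondence to the paper's matrices is our assumption:

```python
def quantize_coeff(y, a_q, e_q, f_q):
    """H.264-style integer quantization of one transform coefficient.

    y   : integer transform coefficient y(u, v)
    a_q : QP-indexed multiplier A_q(u, v)
    e_q : QP-indexed right-shift E_q(u, v)
    f_q : rounding factor F_q(u, v)

    Returns the quantized coefficient y_hat(u, v), which is then
    entropy coded and sent to the decoder.
    """
    sign = 1 if y >= 0 else -1
    return sign * ((abs(y) * a_q + f_q) >> e_q)
```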
3. THE PROPOSED ALGORITHM
The authors propose to embed the depth information in the texture data during the H.264/AVC encoding process for 3D video.
Embedding is performed in the QTCs while taking into account the reconstruction loop, to avoid a mismatch between the encoder and decoder sides.
Moreover, the hidden message is embedded only in those QTCs whose magnitude is above a certain threshold.
For embedding a watermark bit of '1' in the LSB, the embedding function f is given in Algorithm 1.
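Since Algorithm 1 itself is not reproduced in this excerpt, the following is a minimal sketch of threshold-gated LSB embedding consistent with the description above; the default threshold and the handling of the sign are our assumptions:

```python
def embed_bit_lsb(qtc, bit, threshold=1):
    """Embed one message bit in the LSB of a quantized transform
    coefficient (QTC), but only if its magnitude exceeds the
    threshold; smaller coefficients are passed through untouched.

    Returns (possibly modified QTC, True if the bit was embedded).
    """
    if abs(qtc) <= threshold:
        return qtc, False              # not a carrier, bit not consumed
    sign = 1 if qtc >= 0 else -1
    mag = (abs(qtc) & ~1) | bit        # force the magnitude's LSB to `bit`
    return sign * mag, True
```

With an odd threshold, forcing the LSB can never push a carrier's magnitude back to or below the threshold, so the decoder can identify carrier coefficients unambiguously and extract each bit as abs(qtc) & 1.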
In the third and final step, this subsampled and compressed depth information is embedded in the texture data during the encoding process of the texture data.
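Putting the three steps together, a hedged end-to-end sketch that reuses embed_bit_lsb() from the previous block; the 4x down-sampling factor and the use of zlib as a stand-in for the paper's depth codec are our assumptions:

```python
import zlib

def depth_to_payload(depth, factor=4):
    """Steps 1-2: subsample the 8-bit depth map (a 2-D numpy uint8
    array) and compress it, then flatten the compressed bytes into
    a bit list ready for embedding."""
    small = depth[::factor, ::factor]            # step 1: subsample
    payload = zlib.compress(small.tobytes())     # step 2: compress
    return [(byte >> k) & 1 for byte in payload for k in range(8)]

def embed_payload(qtcs, bits, threshold=1):
    """Step 3: walk the frame's QTCs in coding order and hide the
    payload bit by bit in the carrier coefficients."""
    out, it = [], iter(bits)
    bit = next(it, None)
    for c in qtcs:
        if bit is None:                  # payload exhausted
            out.append(c)
            continue
        c2, embedded = embed_bit_lsb(c, bit, threshold)
        out.append(c2)
        if embedded:
            bit = next(it, None)
    return out, bit is None              # True if the whole payload fit
```

Embedding each frame's depth payload into that frame's own QTCs is what keeps texture and depth synchronized, which is the property the paper targets.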
4. EXPERIMENTAL RESULTS
For the experimental results, three benchmark 3D video sequences, namely interview, orbi, and cg, have been used for the analysis, at a resolution of 720 × 576 [7].
Table 1 compares the PSNR of the original compressed videos with that of the videos containing embedded depth information, for the three video sequences at a QP value of 18.
Frame-wise analysis of the decoded and up-scaled depth information is presented in Fig.
Generally, the RMSE is lower for simple scenes and relatively higher for complex scenes.
For visual analysis, Fig. 4 shows the watermarked video frames along with the depth information that was embedded in them.
5. CONCLUSION
The experimental results have shown that the authors can embed the depth information in the texture data of the respective frame in a synchronous manner, while maintaining a good RMSE for the depth data at a minimal rate-distortion (RD) cost.
First, the present algorithm will be extended to inter frames (P and B frames).
Second, the depth data will be compressed in a scalable manner to embed the highest possible amount of depth information.
TL;DR: A novel joint coding scheme is proposed for 3D media content including stereo images and multiview-plus-depth (MVD) video for the purpose of depth information hiding by a reversible watermarking algorithm called Quantized DCT Expansion (QDCTE).
Abstract: In this paper, a novel joint coding scheme is proposed for 3D media content including stereo images and multiview-plus-depth (MVD) video for the purpose of depth information hiding. The depth information is an image or image channel which reveals the distance of scene objects' surfaces from a viewpoint. With the concern of copyright protection, access control and coding efficiency for 3D content, we propose to hide the depth information into the texture image/video by a reversible watermarking algorithm called Quantized DCT Expansion (QDCTE). Considering the crucial importance of depth information for depth-image-based rendering (DIBR), full resolution depth image/video is compressed and embedded into the texture image/video, and it can be extracted without extra quality degradation other than compression itself. The reversibility of the proposed algorithm guarantees that texture image/video quality will not suffer from the watermarking process even if high payload (i.e. depth information) is embedded into the cover image/video. In order to control the size increase of watermarked image/video, the embedding function is carefully selected and the entropy coding process is also customized according to watermarking strength. Huffman and content-adaptive variable-length coding (CAVLC), which are respectively used for JPEG image and H.264 video entropy encoding, are analyzed and customized. After depth information embedding, we propose a new method to update the entropy codeword table with high efficiency and low computational complexity according to watermark embedding strength. By using our proposed coding scheme, the depth information can be hidden into the compressed texture image/video with little bitstream size overhead while the quality degradation of original cover image/video from watermarking can be completely removed at the receiver side.
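The QDCTE embedding rule itself is not spelled out in this abstract. As a rough illustration of the reversible expansion-embedding family it belongs to, here is a minimal sketch on integer coefficients; this is generic coefficient expansion under our own assumptions, not necessarily the exact QDCTE function:

```python
def expand_embed(c, bit):
    """Reversibly embed one bit by expanding an integer coefficient:
    c -> 2*c + bit. Works for negative coefficients as well, because
    Python's floor division inverts it exactly."""
    return 2 * c + bit

def expand_extract(cw):
    """Invert the expansion: recover (original coefficient, bit)."""
    return cw // 2, cw % 2
```

The doubling step roughly doubles coefficient magnitudes, which is exactly why the abstract stresses customizing the Huffman/CAVLC codeword tables to contain the resulting bitstream size increase.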
TL;DR: Simulation results demonstrate that the proposed distributed packet protection mechanism can effectively mitigate packet loss in a mesh-based P2P network.
Abstract: This paper proposes a distributed packet protection mechanism that can minimize the packet loss probability for mesh-based P2P video streaming systems. The proposed scheme combines a peer selection method with forward error correction (FEC) codes. The parent peers select, from among the candidate child peers, those that achieve the minimal packet loss probability, and transmit the FEC redundant substream to them. Moreover, the proposed scheme utilizes a packet loss model to estimate the packet loss probability in a mesh-based P2P network. The packet loss propagation among peers is modeled through a Markov random field (MRF). Simulation results demonstrate that our scheme can effectively mitigate packet loss in a mesh-based P2P network.
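As a concrete reading of the FEC component, a minimal sketch of the residual loss probability of a systematic (n, k) block code; the per-packet independence assumption below is ours and is deliberately simpler than the paper's MRF loss-propagation model:

```python
from math import comb

def residual_loss_prob(n, k, p):
    """Probability that an (n, k) FEC block cannot be recovered,
    i.e. more than n - k of its n packets are lost, assuming
    independent per-packet loss probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(n - k + 1, n + 1))
```

For example, residual_loss_prob(10, 8, 0.05) is about 0.012, showing how two redundant packets already push the block loss probability well below the raw 5% per-packet loss rate.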
TL;DR: A novel reversible data hiding scheme is proposed to integrate depth maps into corresponding texture video bitstreams to achieve better video rendering quality and coding efficiency compared with existing related schemes.
Abstract: To support 3-D video and free-viewpoint video applications, efficient coding of texture videos and depth maps must be addressed. In this paper, a novel reversible data hiding scheme is proposed to integrate depth maps into the corresponding texture video bitstreams. At the sender end, the depth video bitstream obtained by depth down-sampling and compression is embedded in the residual coefficients of the corresponding texture video. The data embedding is implemented by the histogram shifting technique. At the receiver end, the depth maps can be retrieved with scalable quality after data extraction, video decoding, and texture-based depth reconstruction. Due to the attractive property of reversible data hiding, the texture video bitstream can be perfectly recovered. Experimental results demonstrate that the proposed scheme can achieve better video rendering quality and coding efficiency compared with existing related schemes.
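For reference, a minimal sketch of the histogram shifting technique mentioned above, applied to integer residual coefficients; the choice of peak bin (0 here) and the zero-padding when the payload runs out are illustrative assumptions:

```python
def hs_embed(coeffs, bits, peak=0):
    """Histogram shifting over a list of integer coefficients.

    Coefficients above `peak` are shifted up by 1 to open an empty
    bin at peak + 1; each coefficient equal to `peak` then carries
    one bit (stay at peak for 0, move to peak + 1 for 1). In
    practice a length header marks where the real payload ends.
    """
    out, it = [], iter(bits)
    for c in coeffs:
        if c > peak:
            out.append(c + 1)             # shift to make room
        elif c == peak:
            out.append(c + next(it, 0))   # embed one bit
        else:
            out.append(c)
    return out

def hs_extract(coeffs, peak=0):
    """Invert hs_embed: recover (bits, original coefficients)."""
    bits, orig = [], []
    for c in coeffs:
        if c == peak:
            bits.append(0); orig.append(peak)
        elif c == peak + 1:
            bits.append(1); orig.append(peak)
        else:
            orig.append(c - 1 if c > peak + 1 else c)
    return bits, orig
```

The reversibility claimed in the abstract corresponds to hs_extract() returning the coefficient list exactly as it was before embedding.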
Cites background or methods from "Synchronization of texture and depth map by data hiding for 3D H.264 video"
...’s scheme [15] is used for data embedding and bilinear interpolation is used for depth reconstruction....
[...]
...[15] proposed to embed the depth video bitstream obtained by depth down-sampling and compression in quantized DCT coefficients of corresponding texture video using LSB replacement....
[...]
...’s scheme [15] and our proposed scheme respectively....
[...]
...’s scheme [15], the decoded down-sampled depth maps are reconstructed using nearest-neighbor interpolation, bilinear interpolation, Wildeboer et al....
[...]
...’s scheme [15] and our proposed scheme is listed in Table 2, in which bold digits mean that the corresponding scheme can achieve the smallest video bit rate increment....
TL;DR: The authors provide comprehensive background material on the geometric principles of scene reconstruction, explain how to represent objects algebraically so they can be computed and applied, and show how to apply the methods and implement the algorithms directly in a unified framework.
Abstract: From the Publisher:
A basic problem in computer vision is to understand the structure of a real world scene given several images of it. Recent major developments in the theory and practice of scene reconstruction are described in detail in a unified framework. The book covers the geometric principles and how to represent objects algebraically so they can be computed and applied. The authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly.
14,282 citations
Additional excerpts
...For example, 2D video can also be converted into 3D video using structure from motion techniques [2]....
TL;DR: This research derives an image-warping equation that maps the visible points in a reference image to their correct positions in any desired view, together with a new visibility algorithm that determines a drawing order for the image warp.
Abstract: The conventional approach to three-dimensional computer graphics produces images from geometric scene descriptions by simulating the interaction of light with matter. My research explores an alternative approach that replaces the geometric scene description with perspective images and replaces the simulation process with data interpolation.
I derive an image-warping equation that maps the visible points in a reference image to their correct positions in any desired view. This mapping from reference image to desired image is determined by the center-of-projection and pinhole-camera model of the two images and by a generalized disparity value associated with each point in the reference image. This generalized disparity value, which represents the structure of the scene, can be determined from point correspondences between multiple reference images.
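For reference, the warping equation described in this paragraph can be written as follows; this is a sketch of the standard planar pinhole-camera formulation in our own notation, not a verbatim reproduction from the dissertation:

```latex
% \bar{x}_1: pixel in the reference image;  \bar{x}_2: pixel in the desired view
% P_1, P_2: 3x3 pinhole matrices;  \dot{C}_1, \dot{C}_2: centers of projection
% \delta(\bar{x}_1): generalized disparity of the reference pixel
\bar{x}_2 \;\cong\; \delta(\bar{x}_1)\, P_2^{-1}\!\left(\dot{C}_1 - \dot{C}_2\right) \;+\; P_2^{-1} P_1\, \bar{x}_1
```

Here \cong denotes equality up to a scale factor, and since several reference pixels can land on the same desired-view pixel, the visibility ordering of the next paragraph is needed.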
The image-warping equation alone is insufficient to synthesize desired images because multiple reference-image points may map to a single point. I derive a new visibility algorithm that determines a drawing order for the image warp. This algorithm results in correct visibility for the desired image independent of the reference image's contents.
The utility of the image-based approach can be enhanced with a more general pinhole-camera model. I provide several generalizations of the warping equation's pinhole-camera model and discuss how to build an image-based representation when information about the reference image's center-of-projection and camera model is unavailable.
474 citations
"Synchronization of texture and dept..." refers methods in this paper
...Depth image based rendering (DIBR) is used for creating virtual scenes using a still or moving frame along with associated depth information [4]....
TL;DR: This paper discusses an advanced approach for a 3DTV service, which is based on the concept of video-plus-depth data representations, and provides a modular and flexible system architecture supporting a wide range of multi-view structures.
Abstract: Due to enormous progress in the areas of auto-stereoscopic 3D displays, digital video broadcast and computer vision algorithms, 3D television (3DTV) has reached a high technical maturity and many people now believe in its readiness for marketing. Experimental prototypes of entire 3DTV processing chains have been demonstrated successfully during the last few years, and the motion picture experts group (MPEG) of ISO/IEC has launched related ad hoc groups and standardization efforts envisaging the emerging market segment of 3DTV. In this context the paper discusses an advanced approach for a 3DTV service, which is based on the concept of video-plus-depth data representations. It particularly considers aspects of interoperability and multi-view adaptation for the case that different multi-baseline geometries are used for multi-view capturing and 3D display. Furthermore it presents algorithmic solutions for the creation of depth maps and depth image-based rendering related to this framework of multi-view adaptation. In contrast to other proposals, which are more focused on specialized configurations, the underlying approach provides a modular and flexible system architecture supporting a wide range of multi-view structures.
434 citations
"Synchronization of texture and dept..." refers background in this paper
...Another example of 3D content generation is proposed in [1], which consists of monoscopic color video (also known as texture) and associated per-pixel depth information....
Q1. What are the contributions mentioned in the paper "Synchronization of texture and depth map by data hiding for 3D H.264 video"?
In this paper, a novel data hiding strategy is proposed to integrate disparity data, needed for 3D visualization based on depth image based rendering, into a single H.264 format video bitstream. The method provides a high payload for embedding depth information in the texture data, with a negligible decrease in PSNR, and has the potential of being imperceptible and efficient in terms of the rate-distortion trade-off.
Q2. What have the authors stated for future works in "Synchronization of texture and depth map by data hiding for 3D H.264 video"?
In future the authors will extend their work in two aspects. First, the present algorithm will be extended to inter frames (P and B frames). Second, the depth data will be compressed in a scalable manner to embed the highest possible amount of depth information.