
Showing papers on "View synthesis published in 2014"


Journal ArticleDOI
TL;DR: The problem of view synthesis is formulated as a continuous inverse problem, which allows us to correctly take into account foreshortening effects caused by scene geometry transformations, and all optimization problems are solved with state-of-the-art convex relaxation techniques.
Abstract: We develop a continuous framework for the analysis of 4D light fields, and describe novel variational methods for disparity reconstruction as well as spatial and angular super-resolution. Disparity maps are estimated locally using epipolar plane image analysis without the need for expensive matching cost minimization. The method works fast and with inherent subpixel accuracy since no discretization of the disparity space is necessary. In a variational framework, we employ the disparity maps to generate super-resolved novel views of a scene, which corresponds to increasing the sampling rate of the 4D light field in both the spatial and angular directions. In contrast to previous work, we formulate the problem of view synthesis as a continuous inverse problem, which allows us to correctly take into account foreshortening effects caused by scene geometry transformations. All optimization problems are solved with state-of-the-art convex relaxation techniques. We test our algorithms on a number of real-world examples as well as our new benchmark data set for light fields, and compare results to a multiview stereo method. The proposed method is both faster and more accurate. Data sets and source code are provided online for additional evaluation.
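
A minimal sketch of the epipolar-plane-image (EPI) idea behind the disparity step: iso-intensity lines in an EPI have slope equal to the disparity, so a structure-tensor orientation estimate yields sub-pixel disparity with no disparity-space discretization. This is an illustrative reconstruction, not the authors' code; the function name, parameters, and axis convention (rows = angular coordinate s, columns = spatial coordinate x) are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def epi_disparity(epi, sigma_grad=0.8, sigma_smooth=2.0):
    """Local disparity from one 2-D EPI slice via the structure tensor."""
    # Gaussian-derivative gradients along the spatial (x) and angular (s) axes.
    Ix = gaussian_filter(epi, sigma_grad, order=(0, 1))
    Is = gaussian_filter(epi, sigma_grad, order=(1, 0))
    # Structure tensor entries, averaged over a local neighbourhood.
    Jxx = gaussian_filter(Ix * Ix, sigma_smooth)
    Jss = gaussian_filter(Is * Is, sigma_smooth)
    Jxs = gaussian_filter(Ix * Is, sigma_smooth)
    # Orientation of the dominant local structure; the slope dx/ds of the
    # EPI lines gives the disparity.  Sign and axis conventions depend on
    # the light-field parameterization and may need flipping.
    phi = 0.5 * np.arctan2(2.0 * Jxs, Jxx - Jss)
    return np.tan(phi)
```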

575 citations


Proceedings ArticleDOI
23 Jun 2014
TL;DR: This paper presents a surround view camera solution that consists of three key algorithm components: geometric alignment, photometric alignment, and composite view synthesis, which together produce a seamlessly stitched bird's-eye view of the vehicle from four cameras.
Abstract: The automotive surround view camera system is an emerging ADAS (Advanced Driver Assistance System) technology that assists the driver in parking the vehicle safely by providing a top-down view of the vehicle's 360-degree surroundings. Such a system normally consists of four to six wide-angle (fish-eye lens) cameras mounted around the vehicle, each facing a different direction. From these camera inputs, a composite bird's-eye view of the vehicle is synthesized and shown to the driver in real time during parking. In this paper, we present a surround view camera solution that consists of three key algorithm components: geometric alignment, photometric alignment, and composite view synthesis. Our solution produces a seamlessly stitched bird's-eye view of the vehicle from four cameras. It runs in real time on a DSP C66x, producing 880x1080 output video at 30 fps.
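
A rough sketch of the composite-view step under simplifying assumptions: fisheye distortion is treated as already corrected, the ground-plane homographies as already estimated by the geometric-alignment stage, and overlaps are blended by plain averaging instead of the paper's photometric alignment. All names are illustrative.

```python
import cv2
import numpy as np

def compose_birdseye(frames, homographies, out_size):
    """Warp each (undistorted) camera frame onto the ground plane with its
    calibration homography and average the overlap regions."""
    h, w = out_size
    acc = np.zeros((h, w, 3), np.float32)
    weight = np.zeros((h, w, 1), np.float32)
    for img, H in zip(frames, homographies):
        warped = cv2.warpPerspective(img, H, (w, h))
        mask = (warped.sum(axis=2, keepdims=True) > 0).astype(np.float32)
        acc += warped.astype(np.float32) * mask
        weight += mask
    # Plain averaging in overlaps; the paper's photometric alignment would
    # additionally equalize brightness/color so the seams disappear.
    return (acc / np.maximum(weight, 1.0)).astype(np.uint8)
```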

83 citations


Journal ArticleDOI
TL;DR: The temporal correlation of texture and depth information is exploited to generate a background reference image that is then used to fill the holes associated with the dynamic parts of the scene, whereas for static parts the traditional inpainting method is used.
Abstract: Depth-image-based rendering is a key technique for realizing free viewpoint television. However, one critical problem in these systems is filling the disocclusions caused by the 3-D warping process. This paper exploits the temporal correlation of texture and depth information to generate a background reference image. This image is then used to fill the holes associated with the dynamic parts of the scene, whereas for static parts the traditional inpainting method is used. To generate the background reference image, a Gaussian mixture model is employed on the texture information, whereas depth map information is used to detect moving objects so as to enhance the background reference image. The proposed hole filling approach is particularly useful for the single-view-plus-depth format, where, contrary to the multi-view-plus-depth format, only information from one view can be used for this task. The experimental results show that objective and subjective gains can be achieved, with gains ranging from 1 to 3 dB over the inpainting method.
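
A minimal sketch of the background-modelling step, using OpenCV's Gaussian-mixture background subtractor as a stand-in for the paper's texture-side GMM; the depth-guided enhancement of the background reference is omitted, and names are illustrative.

```python
import cv2

def build_background_reference(video_path, history=200):
    """Accumulate a background reference image over time with a per-pixel
    Gaussian mixture model; the result can fill disocclusion holes behind
    moving objects after 3-D warping."""
    subtractor = cv2.createBackgroundSubtractorMOG2(history=history,
                                                    detectShadows=False)
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        subtractor.apply(frame)  # update the per-pixel Gaussian mixtures
    cap.release()
    # Most probable background colour per pixel.
    return subtractor.getBackgroundImage()
```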

66 citations


Proceedings ArticleDOI
23 Jun 2014
TL;DR: This paper contributes a new physics-based generative model and the corresponding Maximum a Posteriori estimate, providing the desired unification between heuristics-based methods and a Bayesian formulation and shows that the novel Bayesian model significantly improves the quality of novel views, in particular if the scene geometry estimate is inaccurate.
Abstract: In this paper, we address the problem of synthesizing novel views from a set of input images. State-of-the-art methods, such as the Unstructured Lumigraph, have been using heuristics to combine information from the original views, often using an explicit or implicit approximation of the scene geometry. While the proposed heuristics have been extensively explored and proven to work effectively, a Bayesian formulation was recently introduced, formalizing some of the previously proposed heuristics and pointing out which physical phenomena could lie behind each. However, some important heuristics were still not taken into account and lacked proper formalization. We contribute a new physics-based generative model and the corresponding Maximum a Posteriori estimate, providing the desired unification between heuristics-based methods and a Bayesian formulation. The key point is to systematically consider the error induced by the uncertainty in the geometric proxy. We provide an extensive discussion, analyzing how the obtained equations explain the heuristics developed in previous methods. Furthermore, we show that our novel Bayesian model significantly improves the quality of novel views, in particular if the scene geometry estimate is inaccurate.
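
As a rough illustration of the kind of estimator such a formulation yields (not the paper's exact derivation), the MAP novel view minimizes an energy in which each input view is weighted by the per-pixel uncertainty induced by the geometric proxy:

```latex
\hat{u} = \arg\min_{u} \sum_{i} \int_{\Omega_i}
  \frac{1}{\sigma_i^2(x)} \bigl( v_i(x) - u(\tau_i(x)) \bigr)^2 \, dx
  \;+\; \lambda \,\mathrm{TV}(u)
```

Here the v_i are the input views, the tau_i the warps given by the geometric proxy, sigma_i^2(x) a variance that grows with the proxy error at x (the term that formalizes the heuristics), and TV(u) a smoothness prior; all symbols are assumptions for illustration.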

58 citations


Journal ArticleDOI
TL;DR: A novel zero-synthesized view difference (ZSVD) model is devised which jointly accounts for the distortion of the synthesized view induced by the compound impact of depth-disparity mapping, texture adaptation, and occlusion in the view synthesis process and can remarkably reduce the coding computational complexity with negligible performance loss.
Abstract: In this correspondence, we explore a low-complexity adaptive view synthesis optimization (VSO) scheme in the upcoming high-efficiency video coding (HEVC)-based 3-D video coding standard. We first devise a novel zero-synthesized view difference (ZSVD) model which jointly accounts for the distortion of the synthesized view induced by the compound impact of depth-disparity mapping, texture adaptation, and occlusion in the view synthesis process. This model can efficiently estimate the maximum allowable depth distortion in synthesizing a virtual view without introducing any geometry distortion. Then, an adaptive ZSVD-aware VSO scheme is proposed by incorporating the ZSVD model into the rate-distortion optimization process, which is developed by pruning the conventional view synthesis algorithm. Extensive experimental results confirm that the proposed model accurately predicts the zero distortion of the synthesized view, and show that the proposed ZSVD-aware VSO scheme can remarkably reduce the coding computational complexity with negligible performance loss.
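
A toy illustration of the zero-geometric-distortion idea, assuming the standard 8-bit linear-in-1/Z depth-level quantization used in 3-D video and a fixed sub-pixel rendering precision; the function names and parameters are invented for the sketch.

```python
import numpy as np

def disparity_of_level(v, f_times_b, z_near, z_far):
    """Map 8-bit depth level(s) v to disparity via the usual 1/Z relation."""
    inv_z = v / 255.0 * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return f_times_b * inv_z

def allowable_depth_levels(v, f_times_b, z_near, z_far, precision=0.25):
    """Depth levels whose disparity rounds to the same rendering position
    as level v: distortions inside this interval cause zero geometric
    error in the synthesized view, which is the intuition behind ZSVD."""
    quantize = lambda d: np.round(d / precision)
    target = quantize(disparity_of_level(v, f_times_b, z_near, z_far))
    levels = np.arange(256)
    same = levels[quantize(disparity_of_level(levels, f_times_b,
                                              z_near, z_far)) == target]
    return same.min(), same.max()
```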

40 citations


Journal ArticleDOI
TL;DR: The simulation results show that the proposed scheme can achieve approximately 5.4% and 10.2% coding gains for AVC- and HEVC-compatible 3-D coding, respectively, and the results show the remarkable complexity reduction of the scheme compared to the view synthesis optimization method currently used in 3-D-HEVC.
Abstract: This paper presents an efficient view synthesis distortion estimation method for 3-D video. It also introduces the application of this method to Advanced Video Coding (AVC)- and High Efficiency Video Coding (HEVC)-compatible 3-D video coding. Although the proposed view synthesis distortion scheme is generic, its use in actual 3-D video codec systems raises many issues caused by different video-coding formats and restrictions; solutions for these issues are proposed herein. The simulation results show that the proposed scheme can achieve approximately 5.4% and 10.2% coding gains for AVC- and HEVC-compatible 3-D coding, respectively. In addition, the results show the remarkable complexity reduction of the scheme compared to the view synthesis optimization method currently used in 3-D-HEVC. The proposed method has been adopted into the currently developing AVC- and HEVC-compatible test model reference software.

37 citations


Proceedings ArticleDOI
19 Mar 2014
TL;DR: A rate adaptation logic based on sampled rate-distortion (R-D) values, which relate the distortion of synthesized view to the bit rates of the texture and depth components of the reference views, is proposed to maximize the quality of rendered virtual views.
Abstract: We present an interactive free-viewpoint video (FVV) streaming system based on the dynamic adaptive streaming over HTTP (DASH) standard. The system uses standard HTTP Web servers to achieve scalability with a large number of users and performs view synthesis and rate adaptation at the client side to achieve fast response times. We propose a rate adaptation logic based on sampled rate-distortion (R-D) values, which relate the distortion of the synthesized view to the bit rates of the texture and depth components of the reference views, to maximize the quality of rendered virtual views. Initial results indicate that the proposed R-D-based rate adaptation strategy outperforms equal bit rate allocation among the reference stream components.
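
A minimal sketch of the sampled R-D selection logic, assuming the (texture rate, depth rate, distortion) triples have been measured offline as the paper describes; names are illustrative.

```python
def select_rates(rd_samples, budget):
    """Pick the (texture_rate, depth_rate) pair minimizing the sampled
    synthesized-view distortion subject to a total bit-rate budget.

    rd_samples: iterable of (texture_rate, depth_rate, distortion) tuples.
    """
    feasible = [(dist, rt, rd) for rt, rd, dist in rd_samples
                if rt + rd <= budget]
    if not feasible:
        raise ValueError("no representation fits the budget")
    dist, rt, rd = min(feasible)  # lowest distortion among feasible pairs
    return rt, rd
```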

35 citations


Journal ArticleDOI
TL;DR: The PBR is effective in suppressing flicker artifacts of virtual video rendering although no temporal aspect is considered, and it is shown that the depth map itself calculated from the RWR-based method (by simply choosing the most probable matching point) is also comparable with that of the state-of-the-art local stereo matching methods.
Abstract: In this paper, a probability-based rendering (PBR) method is described for reconstructing an intermediate view with a steady-state matching probability (SSMP) density function. Conventionally, given multiple reference images, the intermediate view is synthesized via the depth-image-based rendering technique, in which geometric information (e.g., depth) is explicitly leveraged, leading to serious rendering artifacts on the synthesized view even with small depth errors. We address this problem by formulating the rendering process as an image fusion in which the textures of all probable matching points are adaptively blended with the SSMP representing the likelihood that points among the input reference images are matched. The PBR hence becomes more robust against depth estimation errors than existing view synthesis approaches. The matching probability (MP) in the steady state, the SSMP, is inferred for each pixel via the random walk with restart (RWR). The RWR always guarantees a visually consistent MP, as opposed to conventional optimization schemes (e.g., diffusion or filtering-based approaches), whose accuracy depends heavily on the parameters used. Experimental results demonstrate the superiority of the PBR over existing view synthesis approaches both qualitatively and quantitatively. In particular, the PBR is effective in suppressing flicker artifacts in virtual video rendering even though no temporal aspect is considered. Moreover, the depth map calculated from our RWR-based method (by simply choosing the most probable matching point) is comparable with those of state-of-the-art local stereo matching methods.
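
A minimal sketch of the random-walk-with-restart iteration at the heart of the method, assuming a column-stochastic transition matrix over candidate matching points; the fixed point is the steady-state matching probability used to weight each candidate's texture during blending. Names are illustrative.

```python
import numpy as np

def rwr_steady_state(W, seed, restart=0.15, iters=100):
    """Iterate p <- (1 - c) * W @ p + c * seed to its steady state.

    W:    column-stochastic transition matrix between candidate matches.
    seed: restart distribution, e.g. the initial matching likelihood.
    """
    p = seed.copy()
    for _ in range(iters):
        p = (1.0 - restart) * (W @ p) + restart * seed
    return p
```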

34 citations


Proceedings ArticleDOI
23 Jun 2014
TL;DR: This work generates novel synthetic views of people based on a 3D appearance tensor indexed by images, viewpoints, and image positions and shows that the inferred views are both visually and quantitatively accurate.
Abstract: We pose unseen view synthesis as a probabilistic tensor completion problem. Given images of people organized by their rough viewpoint, we form a 3D appearance tensor indexed by images (pose examples), viewpoints, and image positions. After discovering the low-dimensional latent factors that approximate that tensor, we can impute its missing entries. In this way, we generate novel synthetic views of people -- even when they are observed from just one camera viewpoint. We show that the inferred views are both visually and quantitatively accurate. Furthermore, we demonstrate their value for recognizing actions in unseen views and estimating viewpoint in novel images. While existing methods are often forced to choose between data that is either realistic or multi-view, our virtual views offer both, thereby allowing greater robustness to viewpoint in novel images.
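
A simplified stand-in for the completion step, using iterated truncated SVD on an unfolding of the appearance tensor instead of the paper's probabilistic factorization; all names are assumptions.

```python
import numpy as np

def impute_low_rank(T, observed, rank=10, iters=50):
    """Fill missing entries of an unfolded appearance tensor by alternating
    a rank-r SVD projection with re-imposing the observed entries.

    T:        2-D array, e.g. the tensor unfolded as views x (poses*pixels).
    observed: boolean mask of known entries of T.
    """
    X = np.where(observed, T, T[observed].mean())  # initialize holes
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # project to rank r
        X[observed] = T[observed]                  # keep known data fixed
    return X
```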

32 citations


Journal ArticleDOI
TL;DR: Objective and subjective results prove the suitability of the proposed time-of-flight super-resolution approach for depth scenery capture, based on the combination of depth and texture sources.
Abstract: Acquiring scenery depth is a fundamental task in computer vision, with many applications in manufacturing, surveillance, or robotics relying on accurate scenery information. Time-of-flight cameras can provide depth information in real time and overcome the shortcomings of traditional stereo analysis. However, they provide limited spatial resolution, so sophisticated upscaling algorithms are sought after. In this paper, we present a sensor fusion approach to time-of-flight super resolution based on the combination of depth and texture sources. Unlike other texture-guided approaches, we interpret the depth upscaling process as a weighted energy optimization problem. Three different weights are introduced, employing different available sensor data. The individual weights address object boundaries in depth, depth sensor noise, and temporal consistency. Applied in consecutive order, they form three weighting strategies for time-of-flight super resolution. Objective evaluations show advantages in depth accuracy and for depth-image-based rendering compared with state-of-the-art depth upscaling. Subjective view synthesis evaluation shows a significant increase in viewer preference, by a factor of four, in stereoscopic viewing conditions. To the best of our knowledge, this is the first extensive subjective test performed on time-of-flight depth upscaling. Objective and subjective results prove the suitability of our time-of-flight super-resolution approach for depth scenery capture.

30 citations


Patent
24 Mar 2014
TL;DR: In this paper, a first depth map is generated from stereo images by stereo-matching and a disparity fallback selects a second depth map from a single view without stereo matching, preventing stereo matching errors from producing visible artifacts or flickering.
Abstract: Multi-view images are generated with reduced flickering. A first depth map is generated from stereo images by stereo-matching. When stereo-matching is poor or varies too much from frame to frame, disparity fallback selects a second depth map that is generated from a single view without stereo-matching, preventing stereo-matching errors from producing visible artifacts or flickering. Flat or textureless regions can use the second depth map, while regions with good stereo-matching use the first depth map. Depth maps are generated with a one-frame delay and buffered. Low-cost temporal coherence reduces costs used for stereo-matching when the pixel location selected as the lowest-cost disparity is within a distance threshold of the same pixel in a last frame. Hybrid view synthesis uses forward mapping for smaller numbers of views, and backward mapping from the forward-mapping results for larger numbers of views. Rotated masks are generated on-the-fly for backward mapping.

Patent
07 Apr 2014
TL;DR: In this article, the type of prediction used for a reference picture index may be signaled in the video bit-stream, and the omission of motion vectors from the video bits-stream for a certain image element may also be signaled; signaling may indicate to the decoder that motion vectors used in prediction are to be construed at decoding.
Abstract: There are disclosed various methods, apparatuses and computer program products for video encoding. The type of prediction used for a reference picture index may be signaled in the video bit-stream. The omission of motion vectors from the video bit-stream for a certain image element may also be signaled; signaling may indicate to the decoder that motion vectors used in prediction are to be constructed at the decoder. The construction of motion vectors may take place by using disparity information that has been obtained from depth information of the picture being used as a reference.

Proceedings ArticleDOI
29 Oct 2014
TL;DR: Experimental results show that the proposed method significantly improves the inter-view consistency for multiview images and depth maps, compared to those of previous methods.
Abstract: This paper proposes a new inter-view consistent hole filling method in view extrapolation for multi-view image generation. In stereopsis, inter-view consistency regarding structure, color, and luminance is one of the crucial factors that affect the overall viewing quality of three-dimensional image contents. In particular, inter-view inconsistency can induce visual stress on the human visual system. To ensure inter-view consistency, the proposed method fills holes in order from the view nearest to the reference view to the farthest, propagating the filled color information from each preceding view. In addition, a novel depth map filling method is incorporated to achieve inter-view consistency. Experimental results show that the proposed method significantly improves the inter-view consistency of multiview images and depth maps, compared to previous methods.

Patent
10 Oct 2014
TL;DR: In this article, a technique for computing an image of a virtual view based on a plurality of camera views is presented, where two or three camera views that at least partially overlap and overlap with the virtual view are selected among the plurality of views.
Abstract: A technique for computing an image of a virtual view based on a plurality of camera views is presented. One or more cameras provide the plurality of camera views. As to a method aspect of the technique, two or three camera views that at least partially overlap and that at least partially overlap with the virtual view are selected among the plurality of camera views. The image of the virtual view is computed based on objects in the selected camera views using a multilinear relation that relates the selected camera views and the virtual view.

Journal ArticleDOI
TL;DR: The experimental results show that the proposed scheme not only achieves high view synthesis performance, but also reduces the computational complexity of encoding.
Abstract: In 3-D video, view synthesis with depth-image-based rendering is employed to generate any virtual view between available camera views. Distortions in the depth map induce geometry changes in the virtual views and thus degrade the performance of view synthesis. This paper proposes a depth map coding method to improve the performance of view synthesis based on distortion analyses. The major technical innovation of this paper is to formulate the maximum tolerable depth distortion (MTDD) and the depth disocclusion mask (DDM), since such depth sensitivity for view synthesis and inter-view redundancy can be well utilized in coding. To be more specific, we define two different encoders (i.e., a base encoder and a side encoder) for the depth maps in the left and right views, respectively. For base encoding, different types of coding units are extracted based on the distribution of MTDD and assigned different quantization parameters for coding. For side encoding, a warped-SKIP mode is designed to remove inter-view redundancy based on the distribution of DDM. The experimental results show that the proposed scheme not only achieves high view synthesis performance, but also reduces the computational complexity of encoding.

Journal ArticleDOI
TL;DR: An allowable depth distortion (ADD) model is presented for 3D depth map coding, and an ADD-based rate-distortion model is proposed for mode decision and motion/disparity estimation modules aiming at minimizing view synthesis distortion at a given bit rate constraint.
Abstract: Depth video is used as the geometrical information of 3D world scenes in 3D view synthesis. Due to the mismatch between the number of depth levels and disparity levels in the view synthesis, the relationship between depth distortion and rendering position error can be modeled as a many-to-one mapping function, in which different depth distortion values might be projected to the same geometrical distortion in the synthesized virtual view image. Based on this property, we present an allowable depth distortion (ADD) model for 3D depth map coding. Then, an ADD-based rate-distortion model is proposed for mode decision and motion/disparity estimation modules aiming at minimizing view synthesis distortion at a given bit rate constraint. In addition, an ADD-based depth bit reduction algorithm is proposed to further reduce the depth bit rate while maintaining the qualities of the synthesized images. Experimental results in intra depth coding show that the proposed overall algorithm achieves Bjontegaard delta peak signal-to-noise ratio gains of 1.58 and 2.68 dB on average for half and integer-pixel rendering precisions, respectively. In addition, the proposed algorithms are also highly efficient for inter depth coding when evaluated with different metrics.

Journal ArticleDOI
TL;DR: The proposed signal representation improves the interactivity of dense point-based methods, making them appropriate for modeling the scene semantics and free-viewpoint 3DTV applications, and a "selective" warping technique is proposed that takes advantage of temporal coherence to reduce the computational overhead.

Journal ArticleDOI
Xinchen Ye, Jingyu Yang, Hao Huang, Chunping Hou, Yao Wang
TL;DR: Experimental results show that the proposed lightweight multiview imaging approach with Kinect, a handheld integrated depth-color camera, under the depth-image-based rendering framework restores high quality depth maps even for large missing areas, and synthesizes natural multiview images from restored depth maps.
Abstract: The lack of 3-D content has become a bottleneck for the advancement of three-dimensional television (3-DTV), but conventional multicamera arrays for multiview imaging are expensive to set up and cumbersome to use. This paper proposes a lightweight multiview imaging approach with Kinect, a handheld integrated depth-color camera, under the depth-image-based rendering framework. The proposed method consists of two components: depth restoration from noisy and incomplete depth measurements and view synthesis from depth-color pairs. In depth restoration, we propose a moving 2-D polynomial approximation via least squares to suppress quantization errors in the acquired depth values, and propose a progressive edge-guided trilateral filter to fill missing areas of the depth map. Edges extracted from the color image are used to predict the locations of depth discontinuities in missing areas and to guide the proposed trilateral filter to avoid filtering across discontinuities. In view synthesis, we propose a low-rank matrix restoration model to inpaint disocclusion regions, fully exploiting the nonlocal correlations in images, and devise an efficient algorithm under the augmented Lagrange multiplier (ALM) framework. Disocclusion areas are inpainted progressively from the boundaries of disocclusion with an estimated priority consisting of four terms: warping term, reliability term, texture term, and depth term. Experimental results show that our method restores high quality depth maps even for large missing areas, and synthesizes natural multiview images from restored depth maps. Strong 3-D visual experiences are observed when the synthesized multiview images are shown in two types of stereoscopic displays.
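
A bare-bones sketch of the depth-image-based forward warping that produces the disocclusions the paper inpaints, assuming a rectified, purely horizontal camera shift; the parameter names are illustrative.

```python
import numpy as np

def forward_warp(color, depth, baseline_times_focal):
    """Shift each pixel horizontally by its disparity (baseline*focal / Z);
    a z-buffer keeps the nearest surface where pixels collide.  Pixels left
    unfilled are the disocclusion holes to be inpainted."""
    h, w, _ = color.shape
    out = np.zeros_like(color)
    zbuf = np.full((h, w), np.inf)
    disp = baseline_times_focal / np.maximum(depth, 1e-6)
    for y in range(h):
        for x in range(w):
            xt = int(round(x - disp[y, x]))
            if 0 <= xt < w and depth[y, x] < zbuf[y, xt]:
                zbuf[y, xt] = depth[y, x]
                out[y, xt] = color[y, x]
    return out
```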

Proceedings ArticleDOI
01 Jan 2014
TL;DR: This work extends novel-view image synthesis from the common diffuse and opaque image formation model to the reflective and refractive case, using a ray tree of RGBZ images, where each node contains one RGB light path which is to be warped differently depending on the depth Z and the type of path.
Abstract: We extend novel-view image synthesis from the common diffuse and opaque image formation model to the reflective and refractive case. Our approach uses a ray tree of RGBZ images, where each node contains one RGB light path that is warped differently depending on the depth Z and the type of path. At the core of our approach are two efficient procedures for reflective and refractive warping. Different from the diffuse and opaque case, no simple direct solution exists for general geometry. Instead, a per-pixel optimization in combination with informed initial guesses warps an HD image with reflections and refractions in 18 ms on a current mobile GPU. The key application is latency avoidance in remote rendering, in particular for head-mounted displays. Other applications are single-pass stereo or multi-view, motion blur, and depth-of-field rendering, as well as their combinations.

Proceedings ArticleDOI
02 Jul 2014
TL;DR: MPEG started the third phase of FTV standardization in August 2013, targeting super multiview and free navigation applications, which need more flexible camera arrangement, more efficient video coding and better view synthesis.
Abstract: FTV (Free-viewpoint TV) enables users to view a 3D world by freely changing the viewpoint. MPEG has been developing FTV standards since 2001. MVC (Multiview Video Coding) was the first phase of FTV, which enabled the efficient coding of multiview video; view synthesis is not considered in MVC. 3DV (3D Video) is the second phase of FTV, which enables the efficient coding of multiview video and its depth maps for multiview displays; view synthesis between linearly arranged cameras is considered in 3DV. Based on recent developments in 3D technology, MPEG started the third phase of FTV in August 2013, targeting super multiview and free navigation applications. These applications need more flexible camera arrangements, more efficient video coding, and better view synthesis. The vision of this FTV standardization is to establish a new FTV framework that revolutionizes the viewing of 3D scenes.

Journal ArticleDOI
TL;DR: An efficient bit allocation algorithm based on a novel view synthesis distortion model is proposed for the rate-distortion optimized coding of multiview video plus depth sequences in this paper, which can optimally divide a limited bit budget between the texture and depth data.
Abstract: An efficient bit allocation algorithm based on a novel view synthesis distortion model is proposed for the rate-distortion optimized coding of multiview video plus depth sequences in this paper. We decompose an input frame into nonedge blocks and edge blocks. For each nonedge block, we linearly approximate its texture and disparity values, and derive a view synthesis distortion model, which quantifies the impacts of the texture and depth distortions on the qualities of synthesized virtual views. On the other hand, for each edge block, we use its texture and disparity gradients for the distortion model. In addition, we formulate a bit-rate allocation problem in terms of the quantization parameters for texture and depth data. By solving the problem, we can optimally divide a limited bit budget between the texture and depth data, in order to maximize the qualities of synthesized virtual views, as well as those of encoded real views. Experimental results demonstrate that the proposed algorithm yields the average PSNR gains of 1.98 and 2.04 dB in two-view and three-view scenarios, respectively, as compared with a benchmark conventional algorithm.
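
A brute-force stand-in for the allocation step, assuming callables that return the encoded rate and the model-estimated synthesis distortion for a texture/depth QP pair; the paper derives the split from its distortion model rather than by exhaustive search, and all names here are illustrative.

```python
from itertools import product

def allocate_qp(rate_of, distortion_of, budget, qp_range=range(20, 45)):
    """Search texture/depth QP pairs and keep the feasible pair with the
    lowest predicted synthesized-view distortion.

    rate_of(qt, qd), distortion_of(qt, qd): assumed user-supplied callables.
    """
    best = None
    for qt, qd in product(qp_range, qp_range):
        if rate_of(qt, qd) <= budget:
            d = distortion_of(qt, qd)
            if best is None or d < best[0]:
                best = (d, qt, qd)
    return None if best is None else best[1:]  # (texture QP, depth QP)
```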

Proceedings ArticleDOI
01 Nov 2014
TL;DR: It is demonstrated that encoding inter-coded depth block residuals with quantization at pixel domain is more efficient than the intra-coding techniques relying on explicit edge preservation.
Abstract: With the growing demand for 3D and multi-view video content, efficient depth data coding has become a vital issue in the image and video coding area. In this paper, we propose a simple depth coding scheme using multiple prediction modes that exploit the temporal correlation of depth maps. Current depth coding techniques mostly depend on intra-coding modes that cannot take advantage of the temporal redundancy in depth maps or of the higher spatial redundancy in inter-predicted depth residuals. Depth maps are characterized by smooth regions with sharp edges that play an important role in the view synthesis process. As depth maps are highly sensitive to coding errors, the use of transforms or the approximation of edges by explicit edge modelling has an impact on view synthesis quality. Moreover, lossy compression of depth maps brings additional geometrical distortion to the synthesized view. We demonstrate that encoding inter-coded depth block residuals with quantization in the pixel domain is more efficient than intra-coding techniques relying on explicit edge preservation. On standard 3D video sequences, the proposed depth coding achieves superior image quality of synthesized views against the new 3D-HEVC standard for depth map bit rates of 0.25 bpp or higher.

Proceedings ArticleDOI
01 Oct 2014
TL;DR: It is shown that the visual quality of the interpolated view can be significantly improved by enforcing prior knowledge on the admissible deformations of edges under projective transformation; results show that the proposed approach is very effective.
Abstract: Depth image based rendering is a well-known technology for generating virtual views in between a limited set of views acquired by a camera array. Intermediate views are rendered by warping image pixels based on their depth. Nonetheless, depth maps are usually imperfect, as they need to be estimated through stereo matching algorithms; moreover, for representation and transmission requirements, depth values are quantized. Such depth representation errors translate into a warping error when generating intermediate views, thus impacting the rendered image quality. We observe that depth errors turn out to be very critical when they affect object contours, since in such cases they cause significant structural distortion in the warped objects. This paper presents an algorithm to improve the visual quality of the synthesized views by enforcing the shape of the edges in the presence of erroneous depth estimates. We show that it is possible to significantly improve the visual quality of the interpolated view by enforcing prior knowledge on the admissible deformations of edges under projective transformation. Both visual and objective results show that the proposed approach is very effective.

Proceedings ArticleDOI
01 Sep 2014
TL;DR: Analyzing the performance of several commonly used objective quality metrics on FVV sequences, which were synthesized from decompressed depth data, using subjective scores as ground truth showed that commonly used metrics were not reliable predictors of perceived image quality when different contents and distortions were considered.
Abstract: Free-viewpoint television is expected to create a more natural and interactive viewing experience by providing the ability to interactively change the viewpoint to enjoy a 3D scene. To render new virtual viewpoints, free-viewpoint systems rely on view synthesis. However, it is known that most objective metrics fail at predicting perceived quality of synthesized views. Therefore, it is legitimate to question the reliability of commonly used objective metrics to assess the quality of free-viewpoint video (FVV) sequences. In this paper, we analyze the performance of several commonly used objective quality metrics on FVV sequences, which were synthesized from decompressed depth data, using subjective scores as ground truth. Statistical analyses showed that commonly used metrics were not reliable predictors of perceived image quality when different contents and distortions were considered. However, the correlation improved when considering individual conditions, which indicates that the artifacts produced by some view synthesis algorithms might not be correctly handled by current metrics.
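
A minimal sketch of the kind of statistical analysis described, correlating an objective metric's outputs with subjective mean opinion scores; names are illustrative.

```python
from scipy.stats import pearsonr, spearmanr

def metric_reliability(objective_scores, mos):
    """Pearson (linear) and Spearman (rank-order) correlation between an
    objective quality metric and subjective scores; low values flag a
    metric that is unreliable on synthesized views."""
    plcc, _ = pearsonr(objective_scores, mos)
    srocc, _ = spearmanr(objective_scores, mos)
    return plcc, srocc
```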

Patent
02 Apr 2014
TL;DR: In this article, a method and apparatus for a three-dimensional encoding or decoding system incorporating view synthesis prediction (VSP) with reduced computational complexity and/or memory access bandwidth are disclosed.
Abstract: A method and apparatus for a three-dimensional encoding or decoding system incorporating view synthesis prediction (VSP) with reduced computational complexity and/or memory access bandwidth are disclosed. The system applies the VSP process to the texture data only and applies a non-VSP process to the depth data. Therefore, when a current texture block in a dependent view is coded according to VSP by backward warping the current texture block to the reference picture using an associated depth block, and the motion parameter inheritance (MPI) mode is selected for the corresponding depth block in the dependent view, the corresponding depth block in the dependent view is encoded or decoded using non-VSP inter-view prediction based on motion information inherited from the current texture block.

Proceedings ArticleDOI
02 Jul 2014
TL;DR: An experimental multiview video production, processing, and delivery chain developed at Poznan University of Technology for research on free-viewpoint television; no cabling is needed in the system, which is important for shooting real-world events.
Abstract: The paper describes an experimental multi-view video production, processing, and delivery chain developed at Poznan University of Technology for research on free-viewpoint television. The multiview-video acquisition system consists of HD camera units with wireless synchronization, wireless control, video storage, and power supply units. Therefore, no cabling is needed in the system, which is very important for shooting real-world events. The system is mostly used for a nearly circular setup of cameras, but the locations of the cameras are arbitrary, and procedures for system calibration and multiview video correction are considered. The paper also deals with adapting the techniques implemented in the Depth Estimation Reference Software and View Synthesis Reference Software to circular camera arrangements.

Patent
Ying Chen, Ye-Kui Wang, Li Zhang
10 Jan 2014
TL;DR: A method of decoding video data includes determining whether a reference index for a current block corresponds to an inter-view reference picture and, when it does, obtaining data indicating a view synthesis prediction (VSP) mode for the current block.
Abstract: In an example, a method of decoding video data includes determining whether a reference index for a current block corresponds to an inter-view reference picture, and when the reference index for the current block corresponds to the inter-view reference picture, obtaining, from an encoded bitstream, data indicating a view synthesis prediction (VSP) mode of the current block, where the VSP mode for the reference index indicates whether the current block is predicted with view synthesis prediction from the inter-view reference picture.

Journal ArticleDOI
TL;DR: Visto, a novel view synthesis method, uses a reference input view to generate synthesized views at nearby viewpoints and tends to implicitly inherit the image characteristics of the reference view without the explicit use of image priors or texture modeling.
Abstract: In this paper, we present a novel view synthesis method named Visto, which uses a reference input view to generate synthesized views in nearby viewpoints. We formulate the problem as a joint optimization of inter-view texture and depth map similarity, a framework that is significantly different from other traditional approaches. As such, Visto tends to implicitly inherit the image characteristics from the reference view without the explicit use of image priors or texture modeling. Visto assumes that each patch is available in both the synthesized and reference views and thus can be applied to the common area between the two views but not the out-of-region area at the border of the synthesized view. Visto uses a Gauss-Seidel-like iterative approach to minimize the energy function. Simulation results suggest that Visto can generate seamless virtual views and outperform other state-of-the-art methods.

Proceedings ArticleDOI
02 Jul 2014
TL;DR: This work proposes to improve the view synthesis step by per-pixel selection of the projection method and can improve depth-based super resolution upsampling by 0.64% of dBR on average for the total coded bitrate and 0.55% of dBR for synthesized views.
Abstract: Advances in 3D video technology expand the availability of hardware for 3D video generation and display. Some 3D capture and coding arrangements take advantage of a mixed resolution setup. In contrast to typical 3D multiview video, the mixed resolution scenario assumes that a subset of views is coded at reduced spatial resolution. After decoding, the low resolution views have to be upsampled in order to preserve a homogeneous resolution of the rendered 3D video. We improve the depth-based super resolution technique that uses a view synthesis process as an essential step. We propose to improve the view synthesis step by per-pixel selection of the projection method. The method is tested on data coded using a mixed resolution 3D video codec implemented on top of the 3DV-ATM reference software and evaluated under the JCT-3V common test conditions. Simulation results show that our method can improve the depth-based super resolution upsampling by 0.64% of dBR on average for the total coded bitrate and 0.55% of dBR on average for synthesized views. The aggregated coding gain with respect to the full resolution scenario is improved by 7.71% of dBR on average for the total coded bitrate and 1.11% of dBR for synthesized views.

Patent
Jin-Young Lee, Byeong-Doo Choi, Min-Woo Park, Ho-Cheon Wey, Jae-Won Yoon, Yong-jin Cho
17 Apr 2014
TL;DR: A multi-view video decoding apparatus and method and a multi-view encoding apparatus and method are presented, in which the inclusion of view synthesis prediction candidates in the merge candidate list is determined based on whether view synthesis prediction is performed on an adjacent block of the current block and on the current block.
Abstract: Provided are a multi-view video decoding apparatus and method and a multi-view encoding apparatus and method. The decoding method includes: determining whether a prediction mode of a current block being decoded is a merge mode; when the prediction mode is determined to be the merge mode, forming a merge candidate list including at least one of an inter-view candidate, a spatial candidate, a disparity candidate, a view synthesis prediction candidate, and a temporal candidate; and predicting the current block by selecting a merge candidate for predicting the current block from the merge candidate list, wherein whether to include, in the merge candidate list, at least one of a view synthesis prediction candidate for an adjacent block of the current block and a view synthesis prediction candidate for the current block is determined based on whether view synthesis prediction is performed on the adjacent block and the current block.