
Showing papers on "View synthesis published in 2007"


Journal ArticleDOI
TL;DR: This paper takes the first step towards constructing the surface layout, a labeling of the image into geometric classes, by learning appearance-based models of these classes, which coarsely describe the 3D scene orientation of each image region.
Abstract: Humans have an amazing ability to instantly grasp the overall 3D structure of a scene--ground orientation, relative positions of major landmarks, etc.--even from a single image. This ability is completely missing in most popular recognition algorithms, which pretend that the world is flat and/or view it through a patch-sized peephole. Yet it seems very likely that having a grasp of this "surface layout" of a scene should be of great assistance for many tasks, including recognition, navigation, and novel view synthesis. In this paper, we take the first step towards constructing the surface layout, a labeling of the image into geometric classes. Our main insight is to learn appearance-based models of these geometric classes, which coarsely describe the 3D scene orientation of each image region. Our multiple segmentation framework provides robust spatial support, allowing a wide variety of cues (e.g., color, texture, and perspective) to contribute to the confidence in each geometric label. In experiments on a large set of outdoor images, we evaluate the impact of the individual cues and design choices in our algorithm. We further demonstrate the applicability of our method to indoor images, describe potential applications, and discuss extensions to a more complete notion of surface layout.
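The multiple-segmentation idea lends itself to a compact sketch: per-segment label confidences from several candidate segmentations are accumulated into per-pixel confidences, and each pixel takes the most confident geometric class. A hypothetical Python illustration (function names and data layout are ours, not the authors'):

    import numpy as np

    def fuse_geometric_labels(segmentations, segment_confidences, n_labels):
        """Fuse per-segment label confidences from several segmentations
        into a per-pixel geometric-class map (illustrative sketch only).

        segmentations: list of (H, W) int arrays, one segmentation map each
        segment_confidences: list of dicts {segment_id: (n_labels,) array}
        """
        h, w = segmentations[0].shape
        pixel_conf = np.zeros((h, w, n_labels))
        for seg_map, seg_conf in zip(segmentations, segment_confidences):
            for seg_id, conf in seg_conf.items():
                pixel_conf[seg_map == seg_id] += conf  # accumulate evidence
        pixel_conf /= len(segmentations)               # average over segmentations
        return pixel_conf.argmax(axis=2)               # most likely geometric class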

735 citations


Patent
09 Jan 2007
TL;DR: In this paper, a method for synthesizing a particular view of a multiview video is presented, in which each video is acquired by a corresponding camera arranged at a particular pose, and in which the view of each camera overlaps with the view of at least one other camera.
Abstract: A method processes multiview videos of a scene, in which each video is acquired by a corresponding camera arranged at a particular pose, and in which the view of each camera overlaps with the view of at least one other camera. Side information for synthesizing a particular view of the multiview video is obtained in either an encoder or decoder. A synthesized multiview video is synthesized from the multiview videos and the side information. A reference picture list is maintained for each current frame of each of the multiview videos; the reference picture list indexes temporal reference pictures and spatial reference pictures of the acquired multiview videos and the synthesized reference pictures of the synthesized multiview video. Each current frame of the multiview videos is predicted according to reference pictures indexed by the associated reference picture list with a skip mode and a direct mode, whereby the side information is inferred from the synthesized reference picture.
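A minimal sketch of the reference-list bookkeeping the patent describes might look as follows; the class layout and field names are assumptions for illustration, not the patented format:

    from dataclasses import dataclass, field

    @dataclass
    class ReferencePictureList:
        """Illustrative reference list mixing temporal, spatial (inter-view)
        and synthesized reference pictures for one current frame."""
        temporal: list = field(default_factory=list)     # past/future frames, same view
        spatial: list = field(default_factory=list)      # frames from neighboring views
        synthesized: list = field(default_factory=list)  # virtual views + side information

        def all_references(self):
            # a single index space spanning all three reference types
            return self.temporal + self.spatial + self.synthesized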

208 citations


Journal ArticleDOI
TL;DR: An object-based IBR system is described to illustrate the techniques involved and their potential application in view synthesis and processing; stereo matching, an important technique for depth estimation and view synthesis, is briefly explained and some of the top-ranked methods are highlighted.
Abstract: One of the most important applications in multiview imaging (MVI) is the development of advanced immersive viewing or visualization systems using, for instance, 3DTV. With the introduction of multiview TVs, it is expected that a new age of 3DTV systems will arrive in the near future. Image-based rendering (IBR) refers to a collection of techniques and representations that allow 3-D scenes and objects to be visualized in a realistic way without full 3-D model reconstruction. IBR uses images as the primary substrate. The potential for photorealistic visualization has tremendous appeal, and it has been receiving increasing attention over the years. Applications such as video games, virtual travel, and E-commerce stand to benefit from this technology. This article serves as a tutorial introduction and brief review of this important technology. First, the classification, principles, and key research issues of IBR are discussed. Then, an object-based IBR system is explained to illustrate the techniques involved and its potential applications in view synthesis and processing. Stereo matching, which is an important technique for depth estimation and view synthesis, is briefly explained and some of the top-ranked methods are highlighted. Finally, the challenging problem of interactive IBR is explained. Possible solutions and some state-of-the-art systems are also reviewed.

150 citations


01 Jan 2007
TL;DR: A new algorithm is proposed for efficient stereo and novel view synthesis: given the video streams acquired by two synchronized cameras, it synthesises images from a virtual camera in an arbitrary position near the physical cameras, based on an improved dynamic-programming stereo algorithm for efficient novel view generation.
Abstract: A new algorithm is proposed for efficient stereo and novel view synthesis. Given the video streams acquired by two synchronized cameras, the proposed algorithm synthesises images from a virtual camera in an arbitrary position near the physical cameras. The new technique is based on an improved, dynamic-programming, stereo algorithm for efficient novel view generation. The two main contributions of this paper are: i) a new four-state matching graph for dense stereo dynamic programming that supports accurate occlusion labelling; ii) a compact geometric derivation for novel view synthesis by direct projection of the minimum-cost surface. Furthermore, the paper presents an algorithm for the temporal maintenance of a background model to enhance the rendering of occlusions and reduce temporal artefacts (flicker), and a cost aggregation algorithm that acts directly in the three-dimensional matching cost space. The proposed algorithm has been designed to work with input images with a large disparity range, a common practical situation. The enhanced occlusion-handling capabilities of the new dynamic programming algorithm are evaluated against those of the most powerful state-of-the-art dynamic programming and graph-cut techniques. Four-state DP is also evaluated against the disparity-based Middlebury error metrics, and its performance is found to be amongst the best of the efficient algorithms. A number of examples demonstrate the robustness of four-state DP to artefacts in stereo video streams. This includes demonstrations of cyclopean view synthesis in extended conversational sequences, synthesis from a freely translating virtual camera and, finally, basic 3D scene editing.
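The paper's four-state matching graph with explicit occlusion states is richer than what fits here, but the underlying scan-line dynamic programming can be sketched with a generic single-state formulation (a toy illustration, not the authors' algorithm; expects float image rows):

    import numpy as np

    def scanline_dp_stereo(left_row, right_row, max_disp, smooth=4.0):
        """Minimal single-scanline DP stereo: per-pixel matching cost plus
        a penalty on disparity jumps between neighbouring pixels."""
        n = len(left_row)
        cost = np.full((n, max_disp), np.inf)
        back = np.zeros((n, max_disp), dtype=int)
        for d in range(max_disp):
            if d < n:
                cost[d:, d] = np.abs(left_row[d:] - right_row[:n - d])
        acc = cost.copy()
        for x in range(1, n):
            for d in range(max_disp):
                # transition cost penalizes disparity discontinuities
                prev = acc[x - 1] + smooth * np.abs(np.arange(max_disp) - d)
                back[x, d] = np.argmin(prev)
                acc[x, d] += prev[back[x, d]]
        disp = np.zeros(n, dtype=int)
        disp[-1] = np.argmin(acc[-1])
        for x in range(n - 2, -1, -1):   # backtrack the optimal path
            disp[x] = back[x + 1, disp[x + 1]]
        return disp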

118 citations


Journal ArticleDOI
TL;DR: In this article, a new four-state matching graph for dense stereo dynamic programming is proposed to support accurate occlusion labelling, together with a compact geometric derivation for novel view synthesis by direct projection of the minimum-cost surface.
Abstract: A new algorithm is proposed for efficient stereo and novel view synthesis. Given the video streams acquired by two synchronized cameras, the proposed algorithm synthesises images from a virtual camera in an arbitrary position near the physical cameras. The new technique is based on an improved, dynamic-programming, stereo algorithm for efficient novel view generation. The two main contributions of this paper are: (i) a new four-state matching graph for dense stereo dynamic programming that supports accurate occlusion labelling; (ii) a compact geometric derivation for novel view synthesis by direct projection of the minimum-cost surface. Furthermore, the paper presents an algorithm for the temporal maintenance of a background model to enhance the rendering of occlusions and reduce temporal artefacts (flicker), and a cost aggregation algorithm that acts directly in the three-dimensional matching cost space. The proposed algorithm has been designed to work with input images with a large disparity range, a common practical situation. The enhanced occlusion-handling capabilities of the new dynamic programming algorithm are evaluated against those of the most powerful state-of-the-art dynamic programming and graph-cut techniques. Four-state DP is also evaluated against the disparity-based Middlebury error metrics, and its performance is found to be amongst the best of the efficient algorithms. A number of examples demonstrate the robustness of four-state DP to artefacts in stereo video streams. This includes demonstrations of cyclopean view synthesis in extended conversational sequences, synthesis from a freely translating virtual camera and, finally, basic 3D scene editing.

106 citations


Journal ArticleDOI
TL;DR: This paper presents a novel method for synthesizing a novel view from two sets of differently focused images taken by an aperture camera array for a scene consisting of two approximately constant depths.
Abstract: This paper presents a novel method for synthesizing a novel view from two sets of differently focused images taken by an aperture camera array for a scene consisting of two approximately constant depths. The proposed method consists of two steps. The first step is a view interpolation to reconstruct an all-in-focus dense light field of the scene. The second step is to synthesize a novel view by a light-field rendering technique from the reconstructed dense light field. The view interpolation in the first step can be achieved simply by linear filters that are designed to shift different object regions separately, without region segmentation. The proposed method can effectively create a dense array of pin-hole cameras (i.e., all-in-focus images), so that the novel view can be synthesized with better quality.
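As a rough illustration of the two-depth setting, assuming the two layer disparities are known, an intermediate view at position alpha can be approximated by shifting and blending; the real method instead designs linear filters that handle the two object regions jointly, without segmentation:

    import numpy as np

    def interpolate_two_depth_view(img_a, img_b, d_near, d_far, alpha):
        """Toy intermediate-view interpolation for a scene with two roughly
        constant depths. d_near/d_far are horizontal disparities in pixels;
        alpha in [0, 1] positions the virtual view between the two inputs.
        np.roll wraps around at the borders; a real implementation would pad."""
        def shift(img, dx):
            return np.roll(img, int(round(dx)), axis=1)
        # views rendered as if the whole scene sat at each of the two depths
        near = 0.5 * (shift(img_a, alpha * d_near) + shift(img_b, -(1 - alpha) * d_near))
        far = 0.5 * (shift(img_a, alpha * d_far) + shift(img_b, -(1 - alpha) * d_far))
        return 0.5 * (near + far)  # crude fusion of the two depth hypotheses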

92 citations


Journal ArticleDOI
TL;DR: This paper considers the problem of reconstructing visually realistic 3D models of dynamic semitransparent scenes, such as fire, from a very small set of simultaneous views, and reduces reconstruction to a convex combination of sheet-like density fields, each of which is derived from the density sheet of two input views.
Abstract: This paper considers the problem of reconstructing visually realistic 3D models of dynamic semitransparent scenes, such as fire, from a very small set of simultaneous views (even two). We show that this problem is equivalent to a severely underconstrained computerized tomography problem, for which traditional methods break down. Our approach is based on the observation that every pair of photographs of a semitransparent scene defines a unique density field, called a density sheet, that 1) concentrates all its density on one connected, semitransparent surface, 2) reproduces the two photos exactly, and 3) is the most spatially compact density field that does so. From this observation, we reduce reconstruction to the convex combination of sheet-like density fields, each of which is derived from the density sheet of two input views. We have applied this method specifically to the problem of reconstructing 3D models of fire. Experimental results suggest that this method enables high-quality view synthesis without overfitting artifacts.
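In symbols (notation ours, following the abstract), the reconstruction step is a convex combination of sheet-like density fields:

    D(x) = Σ_i w_i D_i(x),   with w_i ≥ 0 and Σ_i w_i = 1

where each D_i is derived from the density sheet defined by one pair of input views, and the weights are chosen so that the combined field reproduces the photographs.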

68 citations


Journal ArticleDOI
TL;DR: A novel method for virtual view synthesis that allows viewers to virtually fly through real soccer scenes, which are captured by multiple cameras in a stadium, by view interpolation of real camera images near the chosen viewpoints.
Abstract: This paper presents a novel method for virtual view synthesis that allows viewers to virtually fly through real soccer scenes, which are captured by multiple cameras in a stadium. The proposed method generates images of arbitrary viewpoints by view interpolation of real camera images near the chosen viewpoints. In this method, cameras do not need to be strongly calibrated since projective geometry between cameras is employed for the interpolation. For avoiding the complex and unreliable process of 3-D recovery, object scenes are segmented into several regions according to the geometric property of the scene. Dense correspondence between real views, which is necessary for intermediate view generation, is automatically obtained by applying projective geometry to each region. By superimposing intermediate images for all regions, virtual views for the entire soccer scene are generated. The efforts for camera calibration are reduced and correspondence matching requires no manual operation; hence, the proposed method can be easily applied to dynamic events in a large space. An application for fly-through observations of soccer match replays is introduced along with the algorithm of view synthesis and experimental results. This is a new approach for providing arbitrary views of an entire dynamic event.
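A much-simplified sketch of per-region interpolation with projective geometry, using OpenCV; the linear blending of homographies below is a crude stand-in for the paper's projective interpolation, and all names are ours:

    import cv2
    import numpy as np

    def interpolate_region(img_a, img_b, H_ab, alpha):
        """H_ab maps view A to view B for one scene region (e.g. the ground).
        alpha = 0 reproduces view A, alpha = 1 reproduces view B."""
        eye = np.eye(3)
        H_a = (1 - alpha) * eye + alpha * H_ab                  # warp A toward new view
        H_b = (1 - alpha) * np.linalg.inv(H_ab) + alpha * eye   # warp B toward new view
        h, w = img_a.shape[:2]
        warp_a = cv2.warpPerspective(img_a, H_a, (w, h))
        warp_b = cv2.warpPerspective(img_b, H_b, (w, h))
        # cross-fade the two warped real views
        return cv2.addWeighted(warp_a, 1 - alpha, warp_b, alpha, 0)

Superimposing such interpolated images for all regions yields the full virtual view, as the abstract describes.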

64 citations


Proceedings ArticleDOI
01 Jan 2007
TL;DR: It is shown that application of modern multiview stereo techniques to the new-view synthesis (NVS) problem introduces a number of non-trivial complexities, and a combination of the two approaches is presented which yields good results on difficult image sequences.
Abstract: We show that application of modern multiview stereo techniques to the new-view synthesis (NVS) problem introduces a number of non-trivial complexities. By simultaneously solving for the colour and depth of the new-view pixels we can eliminate the visual artefacts that conventional NVS-via-stereo suffers from. The global occlusion reasoning which has led to considerable improvements in recent stereo algorithms can easily be included in the new algorithm, using a recently improved graph-cut-based optimizer for general multi-label conditional random fields (CRFs). However, the CRF priors that are important to success in stereo cannot be easily applied if the reconstruction is to be computed in the reference frame of the novel view. We address this problem by extending recent work on the fast optimization of texture priors in NVS to model the image edge structure, yielding a combination of the two approaches that gives good results on difficult image sequences.
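The optimization target is the standard multi-label CRF energy (notation ours):

    E(x) = Σ_i φ_i(x_i) + Σ_{(i,j) ∈ N} ψ_ij(x_i, x_j)

where x_i is the joint colour/depth label of new-view pixel i, the unary term φ_i scores photo-consistency with the input views, and the pairwise term ψ_ij over neighbouring pixels N carries the edge-aware texture prior; the graph-cut-based optimizer approximately minimizes E.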

39 citations


Proceedings ArticleDOI
07 May 2007
TL;DR: This work presents several improvements to the reference block-based depth estimation approach and demonstrates that the proposed method of depth estimation is not only efficient for view synthesis prediction, but also produces depth maps that require far fewer bits to code.
Abstract: The compression of multiview video in an end-to-end 3D system is required to reduce the amount of visual information. Since multiple cameras usually have a common field of view, high compression ratios can be achieved if both the temporal and inter-view redundancy are exploited. View synthesis prediction is a new coding tool for multiview video that essentially generates virtual views of a scene using images from neighboring cameras and estimated depth values. In this work, we consider depth estimation for view synthesis in multiview video encoding. We focus on generating smooth and accurate depth maps, which can be efficiently coded. We present several improvements to the reference block-based depth estimation approach and demonstrate that the proposed method of depth estimation is not only efficient for view synthesis prediction, but also produces depth maps that require far fewer bits to code.
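A baseline block-based depth search of the kind the paper improves on can be sketched as follows; this is a generic illustration with a simple smoothness bias, and the parameter names and regularizer are our assumptions:

    import numpy as np

    def block_depth_search(target, reference, block=16, max_disp=32, lam=0.1):
        """Per-block disparity/depth search between two grayscale float views.
        A small term lam * |d - d_prev| nudges neighbouring blocks toward
        similar depths, giving smoother, more codable depth maps."""
        h, w = target.shape
        depth = np.zeros((h // block, w // block), dtype=int)
        for by in range(h // block):
            d_prev = 0
            for bx in range(w // block):
                y, x = by * block, bx * block
                patch = target[y:y + block, x:x + block]
                best, best_cost = 0, np.inf
                for d in range(min(max_disp, x + 1)):  # stay inside the image
                    cand = reference[y:y + block, x - d:x - d + block]
                    cost = np.abs(patch - cand).mean() + lam * abs(d - d_prev)
                    if cost < best_cost:
                        best, best_cost = d, cost
                depth[by, bx] = d_prev = best
        return depth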

33 citations


Proceedings ArticleDOI
12 Nov 2007
TL;DR: A new approach for realistic stereo view synthesis (RSVS) of existing 2D video material is presented, which is based on structure-from-motion techniques and uses image-based rendering to reconstruct the desired stereo views for each video frame.
Abstract: In the past years, 3D display technology has become a booming branch of research with fast technical progress. Hence, the 3D conversion of already existing 2D video material is becoming more and more popular. In this paper, a new approach for realistic stereo view synthesis (RSVS) of existing 2D video material is presented. The intention of our work is not a real-time conversion of existing video material with a reduction in stereo perception, but rather a more realistic off-line conversion with high accuracy. Our approach is based on structure-from-motion techniques and uses image-based rendering to reconstruct the desired stereo views for each video frame. The algorithm is tested on several TV broadcast videos, as well as on sequences captured with a single handheld camera. Finally, some simulation results show the remarkable performance of this approach.

Proceedings ArticleDOI
21 Aug 2007
TL;DR: A new approach for generation of super-resolution stereoscopic and multi-view video from monocular video, an extension of the realistic stereo view synthesis (RSVS) approach which is based on structure from motion techniques and image-based rendering to generate the desired stereoscopic views for each point in time.
Abstract: This paper presents a new approach for generation of super-resolution stereoscopic and multi-view video from monocular video. Such multi-view video is used for instance with multi-user 3D displays or auto-stereoscopic displays with head-tracking to create a depth impression of the observed scenery. Our approach is an extension of the realistic stereo view synthesis (RSVS) approach, which is based on structure-from-motion techniques and image-based rendering to generate the desired stereoscopic views for each point in time. The extension relies on an additional super-resolution mode which utilizes a number of frames of the original video sequence to generate a virtual stereo frame with higher resolution. The algorithm is tested on several TV broadcast videos, as well as on sequences captured with a single handheld camera and sequences from the well-known BBC documentary "Planet Earth". Finally, some simulation results show that RSVS is quite suitable for super-resolution 2D-3D conversion.

Book ChapterDOI
18 Nov 2007
TL;DR: This work addresses the problem of super-resolved generation of novel views of a 3D scene from reference images obtained by cameras in general positions, using a reconstruction-based approach in an MRF-MAP formalism solved with graph-cut optimization.
Abstract: We address the problem of super-resolved generation of novel views of a 3D scene from reference images obtained by cameras in general positions, a problem which has not been tackled before in the context of super resolution and which is also of importance to the field of image-based rendering. We formulate the problem as one of estimating the color at each pixel in the high-resolution novel view without explicit and accurate depth recovery. We employ a reconstruction-based approach using an MRF-MAP formalism and solve it using graph-cut optimization. We also give an effective method to handle occlusion. We present compelling results on real images.

Journal ArticleDOI
TL;DR: This paper proposes an automatic method for specifying the virtual camera position and orientation in an uncalibrated setting, based on the interpolation and extrapolation of the motion among the reference views.
Abstract: This paper deals with the view synthesis problem and proposes an automatic method for specifying the virtual camera position and orientation in an uncalibrated setting, based on the interpolation and extrapolation of the motion among the reference views. Novel images can be rendered from virtual cameras moving on parametric trajectories. Synthetic and real experiments illustrate the approach.
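The interpolation/extrapolation idea can be illustrated with a toy parametric trajectory over camera positions; the paper works with motion between uncalibrated views rather than explicit 3D positions, so this only shows the underlying idea:

    import numpy as np

    def camera_on_trajectory(poses, t):
        """Piecewise-linear interpolation (t in [0, 1]) and linear
        extrapolation (t outside that range) of camera centres given
        at the reference views. poses: (n, 3) array, n >= 2."""
        poses = np.asarray(poses, dtype=float)
        n = len(poses) - 1
        s = t * n
        i = int(np.clip(np.floor(s), 0, n - 1))  # segment index
        u = s - i                                # u < 0 or > 1: extrapolation
        return (1 - u) * poses[i] + u * poses[i + 1]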

Proceedings ArticleDOI
Sehoon Yea, Anthony Vetro
12 Nov 2007
TL;DR: A rate-distortion optimized framework that incorporates view synthesis for improved prediction in multiview video coding and employs variable block-size depth/motion search, optimal mode decision including view synthesis prediction, and CABAC encoding of depth and correction vectors is proposed.
Abstract: We propose a rate-distortion optimized framework that incorporates view synthesis for improved prediction in multiview video coding. In the proposed scheme, block-based depth and correction vectors are encoded and used at the decoder to generate the view synthesis prediction data. The proposed method employs variable block-size depth/motion search, optimal mode decision including view synthesis prediction, and CABAC encoding of depth and correction vectors. A sub-pixel reference matching technique is also introduced to improve prediction accuracy of the view synthesis prediction. Novel variants of the skip and direct modes are presented, which infer the depth and correction vector information from neighboring blocks in a synthesized reference picture to reduce the bits needed for the view synthesis prediction mode. Experimental results demonstrate improved coding efficiency with the proposed techniques.
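Mode decision in such a framework follows the classic Lagrangian rate-distortion rule: choose the mode minimizing J = D + λR over all candidates, including view synthesis prediction and its skip/direct variants. A minimal sketch (the candidate names below are invented):

    def choose_mode(candidates, lam):
        """Pick the mode minimizing J = D + lambda * R.
        candidates: dict mapping mode name -> (distortion, rate_bits)."""
        return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])

    # e.g. choose_mode({'inter': (120.0, 96), 'view_synth': (110.0, 140),
    #                   'vsp_skip': (125.0, 8)}, lam=10.0)  ->  'vsp_skip'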

Proceedings ArticleDOI
01 Oct 2007
TL;DR: This paper presents an efficient image-based rendering system capable of performing online stereo matching and view synthesis at high speed, completely on the graphics processing unit (GPU).
Abstract: This paper presents an efficient image-based rendering system capable of performing online stereo matching and view synthesis at high speed, completely on the graphics processing unit (GPU). Given two rectified stereo images, our algorithm first extracts the disparity map with a stream-centric dense depth estimation approach. For high-quality view synthesis, multi-label masks are then automatically generated to postprocess occlusions and ambiguously estimated regions adaptively. To allow even faster interactive view generation, an alternative forward warping method is also integrated. The experiments show that photorealistic intermediate views of high image quality are yielded by our algorithm. The optimized implementation also provides the state-of-the-art stereo analysis and view synthesis speed, achieving over 47 fps with 450x375 stereo images and 60 disparity levels on an Nvidia GeForce 7900 graphics card.
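The alternative forward-warping path can be sketched in a few lines; this is a CPU illustration of the general technique, not the paper's GPU implementation:

    import numpy as np

    def forward_warp(src, disparity, alpha):
        """Splat each source pixel alpha * disparity pixels along the
        baseline to form an intermediate view. Disocclusions are left
        as zeros and would need postprocessing, as in the paper's
        multi-label masks. src: (H, W) or (H, W, 3); disparity: (H, W)."""
        h, w = src.shape[:2]
        out = np.zeros_like(src)
        xs = np.arange(w)
        for y in range(h):
            xt = np.round(xs + alpha * disparity[y]).astype(int)
            valid = (xt >= 0) & (xt < w)
            out[y, xt[valid]] = src[y, xs[valid]]
        return out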

Proceedings ArticleDOI
12 Nov 2007
TL;DR: This work proposes a method for delivering error-resilient video from wireless camera networks in a distributed fashion over lossy channels based on distributed source coding that exploits inter-view correlation among cameras with overlapping views.
Abstract: We propose a method for delivering error-resilient video from wireless camera networks in a distributed fashion over lossy channels. Our scheme is based on distributed source coding that exploits inter-view correlation among cameras with overlapping views. The main focus in this work is on robustness, which is urgently needed in a wireless setting. The proposed approach has low encoding complexity, is robust while satisfying tight latency constraints, and requires no inter-camera communication. Our system is built on and is a multi-camera extension of PRISM [1], an earlier proposed single-camera distributed video compression system. Decoder motion search, a key attribute of single-camera PRISM, is extended to the multi-view setting by using estimated scene depth information when it is available. In particular, dense stereo correspondence and view synthesis are utilized to generate side information. When combined with decoder motion search, our proposed method can be made insensitive to small errors in camera calibration, disparity estimation and view synthesis. In experiments over a simulated wireless channel, the proposed approach achieves up to 2.1 dB gain in PSNR over a system using H.263+ with forward error correction.

01 Jan 2007
TL;DR: 3DTV and FTV are some of the most important applications of MVI and are new types of media that expand the user experience beyond what is offered by traditional media.
Abstract: Multi-view imaging (MVI) has attracted increasing attention recently, thanks to the rapidly dropping cost of digital cameras. This opens a wide variety of interesting new research topics and applications, such as virtual view synthesis, high-performance imaging, image/video segmentation, object tracking/recognition, environmental surveillance, remote education, industrial inspection, 3DTV, and Free Viewpoint TV (FTV) [9], [10]. While some of these tasks can be handled with conventional single-view images/video, the availability of multiple views of the scene significantly broadens the field of applications, while enhancing performance and user experience. 3DTV and FTV are some of the most important applications of MVI and are new types of media that expand the user experience beyond what is offered by traditional media. They have been developed by the convergence of new technologies from computer graphics, computer vision, multimedia, and related fields. 3DTV, also referred to as stereo TV, offers a 3D depth impression of the observed scene, while FTV allows for an interactive selection of viewpoint and direction within a certain operating range. 3DTV and FTV are not mutually exclusive. On the contrary, they can be very well combined within a single system, since they are both based on a suitable 3D scene representation. In other words, given a 3D representation of a scene, if a stereo pair of images corresponding to the human eyes can be rendered, the functionality of 3DTV is provided. If a virtual view (i.e., not an actual camera view) corresponding to an arbitrary viewpoint and viewing direction can be rendered, the functionality of FTV is provided. As seen in the movie The Matrix, successive switching of multiple real images captured at different angles can give the sensation of a flying viewpoint. In a similar way, Eye Vision [11] realized a flying virtual camera for a scene in a Super Bowl game. It used 33 cameras arranged around the stadium and controlled the camera directions mechanically to track the target scene. In these systems, however, no new virtual images are generated, and the movement of the viewpoint is limited to the predefined original camera positions.

Journal ArticleDOI
TL;DR: An algorithm for efficient depth calculation and view synthesis that refines its initial result by global optimisation, applying a min-cut/max-flow algorithm on a graph, implemented on the CPU.

Journal ArticleDOI
TL;DR: A mixed reality presentation system for a soccer match is constructed that can overlay soccer scenes captured with multiple cameras in a soccer stadium onto a desktop field model with HMD to demonstrate the utility of the proposed method of free viewpoint video synthesis.
Abstract: This paper presents a new framework for observation of sporting events using HMD by applying the method of free viewpoint video synthesis for dynamic events to mixed reality. According to the viewpoint position of an observer, a virtual view image is generated by view interpolation among multiple sporting videos captured in a stadium, and then overlaid onto the real world via HMD. This makes the observer feel as if the match is played in front of his/her eyes. The proposed method performs virtual view synthesis and geometric registration between the real world and the virtual view images using projective geometry between cameras, which can be estimated from correspondence of natural feature points. It does not require calibration of multiple cameras imaging the sporting match and HMD camera capturing the real world. In this paper, we have constructed a mixed reality presentation system for a soccer match in order to demonstrate the utility of the proposed method. This system can overlay soccer scenes captured with multiple cameras in a soccer stadium onto a desktop field model with HMD. © 2006 Wiley Periodicals, Inc. Electron Comm Jpn Pt 3, 90(2): 40–49, 2007; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ecjc.20311

Proceedings ArticleDOI
27 May 2007
TL;DR: It is demonstrated by experimental results that the proposed algorithm offers better virtual view quality at a much lower complexity than existing methods.
Abstract: A new scheme for disparity vector (DV) estimation and virtual view synthesis to generate 3D video display from a pair of stereo video inputs is investigated in this work. Two performance metrics are considered for the algorithmic evaluation; i.e. quality and complexity. To enhance the overall performance, a two-stage algorithm for accurate and fast DV estimation and occlusion handling is first presented. Then, a preprocessing algorithm and a synthesis method are described. The proposed preprocessing algorithm can remove false matched regions for DV refinement effectively. The new synthesis method can reduce blurring and ghostly effects greatly. It is demonstrated by experimental results that the proposed algorithm offers better virtual view quality at a much lower complexity than existing methods.

Proceedings ArticleDOI
05 Aug 2007
TL;DR: A system to visualize urban structure that is a function of the time selected, thereby allowing virtual navigation in space and time, and a 4D view synthesis technique for rendering large-scale 3D structures evolving in time, given a sparse sample of historical images.
Abstract: In this sketch, we present a 4D view synthesis technique for rendering large-scale 3D structures evolving in time, given a sparse sample of historical images. We built a system to visualize urban structure that is a function of the time selected, thereby allowing virtual navigation in space and time. While there is a rich literature on image-based rendering of static 3D environments, e.g., the Facade system [Debevec et al. 1996] and Photo Tourism [Snavely et al. 2006], little has been done to address the temporal aspect (e.g., occlusion due to temporal change). We construct time-dependent geometry to handle the sparse sampling. To render, we use time- and view-dependent texture mapping and reason about visibility both in time and space. Figure 1 shows the result of view synthesis based on time-dependent geometry.

Proceedings ArticleDOI
10 Sep 2007
TL;DR: A complete pipeline that, starting with uncalibrated images, produces a virtual sequence with viewpoint control that is based on the relative affine structure is described.
Abstract: This paper deals with the process of view synthesis based on the relative affine structure. It describes a complete pipeline that, starting with uncalibrated images, produces a virtual sequence with viewpoint control. Experiments illustrate the approach.
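The reprojection at the heart of this approach is the standard relative-affine-structure relation (written from the general literature, not copied from this paper):

    p' ≅ H_π p + γ e'

where p and p' are corresponding image points, H_π is the homography induced by a reference plane between the two views, e' is the epipole in the second view, and γ is the relative affine structure of the point. A virtual view is rendered by substituting the H_π and e' of a virtual camera while keeping γ fixed per point.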

Journal ArticleDOI
TL;DR: The essence of the method is to perform necessary depth estimation up to the level required by the minimal joint image-geometry sampling rate using off-the-shelf graphics hardware, so that real-time anti-aliased light field rendering is achieved even if the image samples are insufficient.
Abstract: It is known that the pure light field approach for view synthesis relies on a large number of image samples to produce anti-aliased renderings. Otherwise, the insufficiency of image sampling needs to be compensated for by geometry sampling. Currently, geometry estimation is done either offline or using dedicated hardware. Our solution to this dilemma is based on three key ideas: a formal analysis of the equivalency between light field rendering and plane-based warping, multi-focus imaging in a multi-camera system by plane sweeping, and the fusion of the multi-focus images using multi-view stereo. The essence of our method is to perform the necessary depth estimation up to the level required by the minimal joint image-geometry sampling rate using off-the-shelf graphics hardware. As a result, real-time anti-aliased light field rendering is achieved even if the image samples are insufficient.

Proceedings ArticleDOI
02 Jul 2007
TL;DR: Experimental results show that the newly developed algorithm can improve image quality of synthesized virtual views with a PSNR gain of up to 0.65 dB.
Abstract: A framework for virtual view synthesis based on multiple images is presented in this paper. Compared to conventional view synthesis based on stereoscopic image pairs, a postprocessing algorithm for disparity refinement is added to exploit information contained in multiple images captured with a multi-view camera configuration. The principle for disparity refinement is examined, leading to the development of a novel algorithm. Experimental results show that the newly developed algorithm can improve image quality of synthesized virtual views with a PSNR gain of up to 0.65 dB.

01 Jan 2007
TL;DR: This thesis implements a statistical model combining distance in feature space (DIFS) and distance from feature space (DFFS) for a pair of poses, and models the relationship between the poses using a Bayesian network, which more accurately predicts small and localized features.
Abstract: Face view synthesis involves using one view of a face to artificially render another view. It is an interesting problem in computer vision and computer graphics, and can be applied in the entertainment industry for animated movies and video games. The fact that the input is only a single image makes the problem very difficult. Previous approaches learn a linear model on pairs of poses from 2D training data and then predict the unknown pose in the test example. Such 2D approaches are much more practical than approaches requiring 3D data and more computationally efficient. However, they perform inadequately when dealing with large angles between poses. In this thesis, we seek to improve performance through better choices in probabilistic modeling. As a first step, we have implemented a statistical model combining distance in feature space (DIFS) and distance from feature space (DFFS) for such pairs of poses. Such a representation leads to better performance. As a second step, we model the relationship between the poses using a Bayesian network. This representation takes advantage of the sparse statistical structure of faces. In particular, we have observed that a given pixel is often statistically correlated with only a small number of other pixel variables. The Bayesian network provides a concise representation for this behavior, reducing the susceptibility to over-fitting. Compared with the linear method, the Bayesian network more accurately predicts small and localized features.

Journal Article
TL;DR: In order to synthesize high-quality intermediate views, a new stereo matching algorithm based on adaptive weights is proposed, and a disparity smoothness constraint term is introduced in the matching cost function.
Abstract: In order to synthesize high-quality intermediate views, a new stereo matching algorithm based on adaptive weights is proposed, and a disparity smoothness constraint term is introduced in the matching cost function. After obtaining a reliable and dense disparity map, an intermediate view is synthesized by searching for the pixels in the left and right images that correspond to each intermediate-view pixel. Experimental results show that the proposed algorithm provides good disparity maps and obtains intermediate views of high quality.
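The adaptive-weight idea can be sketched for a single pixel and disparity hypothesis, in the spirit of adaptive support weights; the paper's added disparity-smoothness term is omitted here, and all parameter values are illustrative:

    import numpy as np

    def adaptive_weight_cost(left, right, p, d, radius=3, gamma_c=7.0, gamma_s=9.0):
        """Adaptive support-weight matching cost: neighbours close in colour
        and space to the window centre get larger weights. Grayscale float
        images; p = (row, col); d = disparity hypothesis. Assumes the
        window lies fully inside both images (x - d - radius >= 0)."""
        y, x = p
        win_l = left[y - radius:y + radius + 1, x - radius:x + radius + 1]
        win_r = right[y - radius:y + radius + 1, x - d - radius:x - d + radius + 1]
        dy, dx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
        w_space = np.exp(-np.sqrt(dy ** 2 + dx ** 2) / gamma_s)   # spatial proximity
        w_color = np.exp(-np.abs(win_l - left[y, x]) / gamma_c)   # colour similarity
        w = w_space * w_color
        return (w * np.abs(win_l - win_r)).sum() / w.sum()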

01 Jan 2007
TL;DR: In this work, a novel motion segmentation algorithm for video sequences in general motion is developed, based on differential properties in the spatio-temporal domain, and a differential occlusion detector is presented which detects corner-like features that are indicative of motion boundaries.
Abstract: In this work I investigate spatio-temporal information in a video sequence. The advantage of considering a video sequence as a 3D spatio-temporal function with temporal continuity (rather than merely a discrete collection of 2D images) is demonstrated by two computer vision techniques which I have developed.

View Synthesis: Each frame of the video sequence is an intersection of the spatio-temporal video volume with a spatial plane. When a video sequence conforms to certain geometrical constraints, intersecting the video volume with other planes or surfaces can be used to easily produce new views of the scene. This powerful view synthesis technique is based solely on captured data and does not require scene reconstruction, as the constraint on the input camera motion makes it invariant to the scene structure in some respects. The technique is demonstrated with real sequences, giving visually appealing results. It gives rise to a novel projection model, the Crossed-Slits projection, that can be seen as a generalization of the perspective projection and several other models. A Crossed-Slits camera is defined by two lines which all rays must intersect. Here I study this new projection model and its epipolar geometry, which are shown to be quadratic equivalents of the perspective model. Crossed-Slits images are not perspective, and thus they appear distorted. These distortions are studied, and two frameworks are developed for handling them: first, assuming that a coarse approximation of the scene structure is known (which is used to create a real-time omnidirectional virtual environment); second, without any knowledge about the scene, based only on the set of rays. In both cases distortion is reduced by approximating the perspective projection. The work on view synthesis and the Crossed-Slits projection, presented in Chapters 3 and 4, is based on work published in [1-6].

Motion Segmentation: Analysis of an unconstrained video sequence in general motion reveals a highly regular spatio-temporal structure, where moving objects appear as continuous structures in the temporal domain, broken by occlusion. Based on this observation, I developed a novel motion segmentation algorithm for video sequences in general motion, which is based on differential properties in the spatio-temporal domain. I present a differential occlusion detector, which detects corner-like features that are indicative of motion boundaries. Segmentation is achieved by integrating the response of this detector in scale space. The algorithm is shown to give good results on real sequences taken in general motion. Experiments with synthetic data show robustness to high levels of noise and illumination changes; the experiments also include cases where no intensity edge exists at the location of the motion boundary, or where no parametric motion model can describe the data. Next I describe two algorithms to determine depth ordering from two- and three-frame sequences, based on observations about the scale-space characteristics of the motion boundary. An interesting property of this method is its ability to compute depth ordering from only two frames, even when no edge can be detected in a single frame. Finally, experiments show that people, like my algorithm, can compute depth ordering from only two frames, even when the boundary between the layers is not visible in a single frame. The work on motion segmentation and depth ordering, presented in Chapter 5, is based on [7, 8].
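The volume-slicing view synthesis described above can be illustrated in a few lines: for a sideways-translating camera, sampling a column that advances with time re-slices the spatio-temporal volume into a new, X-slit/pushbroom-like view with no scene reconstruction (a toy sketch, not the thesis' implementation):

    import numpy as np

    def slit_view(video):
        """Re-slice a video volume into a new view by pasting together a
        column strip that advances linearly with time. video: (T, H, W)
        or (T, H, W, 3); returns an image with time as the horizontal axis."""
        t_dim, h, w = video.shape[:3]
        cols = np.linspace(0, w - 1, t_dim).astype(int)  # column advances with t
        return np.stack([video[t, :, cols[t]] for t in range(t_dim)], axis=1)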