
Showing papers on "View synthesis published in 2004"


Journal ArticleDOI
01 Aug 2004
TL;DR: This paper shows how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms, and develops a novel temporal two-layer compressed representation that handles matting.
Abstract: The ability to interactively control viewpoint while watching a video is an exciting application of image-based rendering. The goal of our work is to render dynamic scenes with interactive viewpoint control using a relatively small number of video cameras. In this paper, we show how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms. Once these video streams have been processed, we can synthesize any intermediate view between cameras at any time, with the potential for space-time manipulation.In our approach, we first use a novel color segmentation-based stereo algorithm to generate high-quality photoconsistent correspondences across all camera views. Mattes for areas near depth discontinuities are then automatically extracted to reduce artifacts during view synthesis. Finally, a novel temporal two-layer compressed representation that handles matting is developed for rendering at interactive rates.
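
As a rough sketch of the blending step shared by this family of video-based renderers (illustrative only, not the paper's two-layer matting renderer; the function and parameter names are assumptions), two neighboring views already warped into the virtual viewpoint can be combined like this:

```python
import numpy as np

def blend_warped_views(warp1, warp2, valid1, valid2, alpha):
    """Blend two neighboring camera views that have already been
    forward-warped into the virtual viewpoint.

    warp1, warp2 : (H, W, 3) float arrays, the warped color images
    valid1, valid2 : (H, W) bool arrays, True where the warp produced a pixel
    alpha : float in [0, 1], position of the virtual camera between
            camera 1 (alpha=0) and camera 2 (alpha=1)
    """
    w1 = (1.0 - alpha) * valid1            # weight each view by proximity...
    w2 = alpha * valid2                    # ...and zero it where it has no data
    total = (w1 + w2)[..., None]
    out = w1[..., None] * warp1 + w2[..., None] * warp2
    np.divide(out, total, out=out, where=total > 0)  # leave true holes black
    return out
```

Disoccluded pixels visible in only one camera fall back to that camera automatically, since the other view's weight is zero there.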

1,677 citations


Proceedings ArticleDOI
TL;DR: Details of a system that allows for an evolutionary introduction of depth perception into the existing 2D digital TV framework are presented, together with a comparison against the classical approach of "stereoscopic" video.
Abstract: This paper presents details of a system that allows for an evolutionary introduction of depth perception into the existing 2D digital TV framework. The work is part of the European Information Society Technologies (IST) project “Advanced Three-Dimensional Television System Technologies” (ATTEST), an activity in which industries, research centers and universities have joined forces to design a backwards-compatible, flexible and modular broadcast 3D-TV system. At the very heart of the described new concept is the generation and distribution of a novel data representation format, which consists of monoscopic color video and associated per-pixel depth information. From these data, one or more “virtual” views of a real-world scene can be synthesized in real-time at the receiver side (i.e., a 3D-TV set-top box) by means of so-called depth-image-based rendering (DIBR) techniques. This publication provides: (1) a detailed description of the fundamentals of this new approach to 3D-TV; (2) a comparison with the classical approach of “stereoscopic” video; (3) a short introduction to DIBR techniques in general; (4) the development of a specific DIBR algorithm that can be used for the efficient generation of high-quality “virtual” stereoscopic views; (5) a number of implementation details that are specific to the current state of the development; (6) research on the backwards-compatible compression and transmission of 3D imagery using state-of-the-art MPEG (Moving Pictures Expert Group) tools.
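
A minimal sketch of the core DIBR warping step described above, assuming a rectified setup where virtual views differ only by a horizontal baseline (the function and its parameters are illustrative, not the ATTEST implementation):

```python
import numpy as np

def dibr_warp(color, depth, focal, baseline):
    """Render a 'virtual' stereo view from monoscopic color + per-pixel depth.

    color : (H, W, 3) uint8 image
    depth : (H, W) float array of metric depths Z > 0
    focal : focal length in pixels; baseline : virtual camera offset in meters
    Returns the warped view plus a hole mask (disocclusions to be filled).
    """
    H, W, _ = color.shape
    out = np.zeros_like(color)
    zbuf = np.full((H, W), np.inf)
    disparity = focal * baseline / depth        # horizontal parallax in pixels
    xs = np.arange(W)
    for y in range(H):
        xt = np.round(xs - disparity[y]).astype(int)   # shift each pixel
        ok = (xt >= 0) & (xt < W)
        for x in np.where(ok)[0]:
            if depth[y, x] < zbuf[y, xt[x]]:           # keep the nearest surface
                zbuf[y, xt[x]] = depth[y, x]
                out[y, xt[x]] = color[y, x]
    holes = np.isinf(zbuf)
    return out, holes
```

The hole mask marks disocclusions; DIBR systems typically fill these by depth-aware inpainting or by pre-smoothing the depth map before warping.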

1,560 citations


Proceedings ArticleDOI
08 Aug 2004
TL;DR: This paper presents a self-reconfigurable camera array system that captures video sequences from an array of mobile cameras, renders novel views on the fly and reconfigures the camera positions to achieve better rendering quality.
Abstract: This paper presents a self-reconfigurable camera array system that captures video sequences from an array of mobile cameras, renders novel views on the fly and reconfigures the camera positions to achieve better rendering quality. The system is composed of 48 cameras mounted on mobile platforms. The contribution of this paper is twofold. First, we propose an efficient algorithm that is capable of rendering high-quality novel views from the captured images. The algorithm reconstructs a view-dependent multi-resolution 2D mesh model of the scene geometry on the fly and uses it for rendering. The algorithm combines region of interest (ROI) identification, JPEG image decompression, lens distortion correction, scene geometry reconstruction and novel view synthesis seamlessly on a single Intel Xeon 2.4 GHz processor, which is capable of generating novel views at 4–10 frames per second (fps). Second, we present a view-dependent adaptive capturing scheme that moves the cameras in order to show even better rendering results. Such camera reconfiguration naturally leads to a nonuniform arrangement of the cameras on the camera plane, which is both view-dependent and scene-dependent.

191 citations


Proceedings ArticleDOI
21 Jun 2004
TL;DR: This paper presents a self-reconfigurable camera array system that captures video sequences from an array of mobile cameras, renders novel views on the fly and reconfigures the camera positions to achieve better rendering quality.
Abstract: This paper presents a self-reconfigurable camera array system that captures video sequences from an array of mobile cameras, renders novel views on the fly and reconfigures the camera positions to achieve better rendering quality. The system is composed of 48 cameras mounted on mobile platforms. The contribution of this paper is twofold. First, we propose an efficient algorithm that is capable of rendering high-quality novel views from the captured images. The algorithm reconstructs a view-dependent multi-resolution 2D mesh model of the scene geometry on the fly and uses it for rendering. The algorithm combines region of interest (ROI) identification, JPEG image decompression, lens distortion correction, scene geometry reconstruction and novel view synthesis seamlessly on a single Intel Xeon 2.4 GHz processor, which is capable of generating novel views at 4-10 frames per second (fps). Second, we present a view-dependent adaptive capturing scheme that moves the cameras in order to show even better rendering results. Such camera reconfiguration naturally leads to a nonuniform arrangement of the cameras on the camera plane, which is both view-dependent and scene-dependent.

116 citations


Journal ArticleDOI
TL;DR: This paper synthesizes, using graphics hardware, a virtual video that maintains eye contact, based on stereo analysis combined with rich domain knowledge (a personalized face model); the system is able to generate an eye-gaze-corrected video stream at five frames per second on a commodity 1 GHz PC.
Abstract: The lack of eye contact in desktop video teleconferencing substantially reduces the effectiveness of video contents. While expensive and bulky hardware is available on the market to correct eye gaze, researchers have been trying to provide a practical software-based solution to bring video-teleconferencing one step closer to the mass market. This paper presents a novel approach: based on stereo analysis combined with rich domain knowledge (a personalized face model), we synthesize, using graphics hardware, a virtual video that maintains eye contact. A 3D stereo head tracker with a personalized face model is used to compute initial correspondences across two views. More correspondences are then added through template and feature matching. Finally, all the correspondence information is fused together for view synthesis using view morphing techniques. The combined methods greatly enhance the accuracy and robustness of the synthesized views. Our current system is able to generate an eye-gaze corrected video stream at five frames per second on a commodity 1 GHz PC.
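
The interpolation step of view morphing, applied after the two views have been prewarped into a parallel (rectified) configuration, amounts to linearly blending corresponding positions and colors. A minimal sketch under that assumption (point-wise scattering stands in for the mesh-based rendering a real system would use; all names are illustrative):

```python
import numpy as np

def morph_step(pts0, pts1, cols0, cols1, s, out_shape):
    """View-morphing interpolation between two prewarped views.

    pts0, pts1 : (N, 2) matched (x, y) positions in views 0 and 1
    cols0, cols1 : (N, 3) colors sampled at those points
    s : float in [0, 1], virtual view position between the two views
    out_shape : (H, W, 3) shape of the output image
    """
    pts = (1 - s) * pts0 + s * pts1        # linear position interpolation
    cols = (1 - s) * cols0 + s * cols1     # linear color blend
    out = np.zeros(out_shape, dtype=np.float32)
    xy = np.round(pts).astype(int)
    ok = (xy[:, 0] >= 0) & (xy[:, 0] < out_shape[1]) & \
         (xy[:, 1] >= 0) & (xy[:, 1] < out_shape[0])
    out[xy[ok, 1], xy[ok, 0]] = cols[ok]   # scatter blended samples
    return out
```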

98 citations


Journal ArticleDOI
01 Sep 2004
TL;DR: An efficient hardware-accelerated method for novel view synthesis from a set of images or videos, based on the photo hull representation (the maximal photo-consistent shape); it generates more realistic rendering results than methods based on visual hulls.
Abstract: This paper presents an efficient hardware-accelerated method for novel view synthesis from a set of images or videos. Our method is based on the photo hull representation, which is the maximal photo-consistent shape. We avoid the explicit reconstruction of photo hulls by adopting a view-dependent plane-sweeping strategy. From the target viewpoint, slicing planes are rendered with the reference views projected onto them. Graphics hardware is exploited to verify the photo-consistency of each rasterized fragment. Visibilities with respect to reference views are properly modeled, and only photo-consistent fragments are kept and colored in the target view. We present experiments with real images and animation sequences. Thanks to the more accurate shape of the photo hull representation, our method generates more realistic rendering results than methods based on visual hulls. Currently, we achieve rendering frame rates of 2-3 fps. Compared to a pure software implementation, the performance of our hardware-accelerated method is approximately 7 times faster.
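
The per-plane consistency test at the heart of such a sweep can be sketched in a few lines; this CPU version with a simple variance threshold is only an illustration of the idea (the paper runs the test per fragment on the GPU and also models visibility):

```python
import numpy as np
import cv2

def sweep_plane(ref_views, homographies, thresh=200.0):
    """Photo-consistency test for one slicing plane of a plane sweep.

    ref_views : list of (H, W, 3) reference images
    homographies : list of 3x3 plane-induced homographies mapping each
                   reference view into the target view (cv2 convention)
    Returns the mean color and a mask of photo-consistent target pixels.
    """
    H, W = ref_views[0].shape[:2]
    projected = [cv2.warpPerspective(v, M, (W, H))
                 for v, M in zip(ref_views, homographies)]
    stack = np.stack(projected).astype(np.float32)   # (K, H, W, 3)
    variance = stack.var(axis=0).sum(axis=-1)        # color spread across views
    consistent = variance < thresh                   # photo-consistent fragments
    return stack.mean(axis=0), consistent
```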

40 citations


Journal ArticleDOI
01 Feb 2004
TL;DR: A method for arbitrary view synthesis from an uncalibrated multiple-camera system, targeting large spaces such as soccer stadiums, together with a method for merging the synthesized images with the virtual background scene in the PGS.
Abstract: We propose a method for arbitrary view synthesis from an uncalibrated multiple camera system, targeting large spaces such as soccer stadiums. In Projective Grid Space (PGS), which is a three-dimensional space defined by epipolar geometry between two basis cameras in the camera system, we reconstruct three-dimensional shape models from silhouette images. Using the three-dimensional shape models reconstructed in the PGS, we obtain a dense map of the point correspondence between reference images. The obtained correspondence allows us to synthesize images of arbitrary views between the reference images. We also propose a method for merging the synthesized images with the virtual background scene in the PGS. We apply the proposed methods to image sequences taken by a multiple camera system installed in a large concert hall. The synthesized image sequences of the virtual camera have sufficient quality to demonstrate the effectiveness of the proposed method.

40 citations


Journal ArticleDOI
TL;DR: This work presents a new method for using commodity graphics hardware to achieve real-time, on-line, 2D view synthesis or 3D depth estimation from two or more calibrated cameras; the method combines a 3D plane-sweeping approach with 2D multi-resolution color consistency tests.
Abstract: We present a new method for using commodity graphics hardware to achieve real-time, on-line, 2D view synthesis or 3D depth estimation from two or more calibrated cameras. Our method combines a 3D plane-sweeping approach with 2D multi-resolution color consistency tests. We project camera imagery onto each plane, compute measures of color consistency throughout the plane at multiple resolutions, and then choose the color or depth (corresponding plane) that is most consistent. The key to achieving real-time performance is our use of the advanced features included with recent commodity computer graphics hardware to implement the computations simultaneously (in parallel) across all reference image pixels on a plane. Our method is relatively simple to implement, and flexible in terms of the number and placement of cameras. With two cameras and
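
The consistency score driving such a sweep is typically a sum-of-squared-differences around the per-plane mean; written generically (this is the standard form, not necessarily the paper's exact measure):

```latex
C(x, d) = \sum_{k=1}^{K} \bigl\| I_k\bigl(P_k(x, d)\bigr) - \bar{I}(x, d) \bigr\|^2,
\qquad
\bar{I}(x, d) = \frac{1}{K} \sum_{k=1}^{K} I_k\bigl(P_k(x, d)\bigr)
```

Here P_k(x, d) projects target pixel x via plane d into reference view k; the rendered color or depth at x is taken from the plane minimizing C(x, d), with the multi-resolution variant evaluating C at several pyramid levels.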

38 citations


Journal ArticleDOI
TL;DR: An integrated system consisting of multiple omni-directional vision sensors, developed to address two specific surveillance tasks: robust tracking and profiling of human activities, and dynamic synthesis of virtual views for observing the environment from arbitrary vantage points.

36 citations


Proceedings ArticleDOI
23 Aug 2004
TL;DR: An algorithm creating consistent, dense disparity maps from incomplete disparity data generated by a conventional stereo system used in a wide-baseline configuration; spline-based disparity interpolation is performed within nonoverlapping regions defined by discontinuity boundaries identified in the incomplete disparity map.
Abstract: We propose an algorithm creating consistent, dense disparity maps from incomplete disparity data generated by a conventional stereo system used in a wide-baseline configuration. The reference application is IBR-oriented immersive videoconferencing, in which disparities are used by a view synthesis module to create instantaneous views of remote speakers consistent with the local speaker's viewpoint. We perform spline-based disparity interpolation within nonoverlapping regions defined by discontinuity boundaries identified in the incomplete disparity map. We demonstrate very good results on significantly incomplete disparity data computed by a conventional correlation-based stereo algorithm on a real wide-baseline stereo pair acquired by an immersive videoconferencing system.
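
The region-wise fill step can be approximated with an off-the-shelf scattered-data interpolator; a sketch assuming the region and reliability masks are given (griddata's 'cubic' mode provides a smooth spline-like surface, though it is not necessarily the authors' spline formulation):

```python
import numpy as np
from scipy.interpolate import griddata

def fill_region_disparity(disp, reliable, region):
    """Interpolate missing disparities inside one region.

    disp : (H, W) disparity map, partially unreliable
    reliable : (H, W) bool mask of trusted disparities
    region : (H, W) bool mask of one segment bounded by discontinuity
             boundaries; interpolating per region keeps depth edges sharp.
    """
    src = reliable & region
    dst = region & ~reliable
    pts, vals = np.argwhere(src), disp[src]       # row-major orders match
    query = np.argwhere(dst)
    est = griddata(pts, vals, query, method='cubic')
    nn = griddata(pts, vals, query, method='nearest')
    est = np.where(np.isnan(est), nn, est)        # fallback outside the hull
    filled = disp.copy()
    filled[dst] = est
    return filled
```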

31 citations


Journal ArticleDOI
TL;DR: This work attempts to freely control the baseline-stretch of a stereoscopic camera by synthesizing virtual views at the desired position between the two cameras, based on stereo matching and view synthesis techniques.
Abstract: In stereoscopic television, there is a trade-off between visual comfort and 3-dimensional (3D) impact with respect to the baseline-stretch of a 3DTV camera. It is necessary to adjust the baseline-stretch to an appropriate distance depending on the contents of a scene if we want to obtain subjectively optimal image quality. However, it is very hard to obtain a small baseline-stretch using commercially available cameras of broadcasting quality, where the sizes of the lens and CCD module are large. In order to overcome this limitation, we attempt to freely control the baseline-stretch of a stereoscopic camera by synthesizing virtual views at the desired position between the two cameras. The proposed technique is based on stereo matching and view synthesis techniques. We first obtain a dense disparity map using hierarchical stereo matching with edge-adaptive multiple shifted windows. Then, we synthesize the virtual views using the disparity map. Simulation results with various stereoscopic images demonstrate the effectiveness of the proposed technique.
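
The baseline-control idea reduces to scaling disparities: a virtual camera at fraction alpha of the baseline sees each left-image pixel shifted by alpha times its disparity. A minimal forward-mapping sketch (painter's order stands in for proper occlusion handling; names are illustrative):

```python
import numpy as np

def virtual_baseline_view(left, disp, alpha):
    """Synthesize a view at fraction alpha of the baseline
    (alpha=0 reproduces the left image; alpha=1 approaches the right).

    left : (H, W, 3) left image; disp : (H, W) left-to-right disparities
    """
    H, W, _ = left.shape
    out = np.zeros_like(left)
    order = np.argsort(disp, axis=1)     # draw far pixels (small disparity)
    for y in range(H):                   # first so near ones overwrite them
        for x in order[y]:
            xt = int(round(x - alpha * disp[y, x]))
            if 0 <= xt < W:
                out[y, xt] = left[y, x]
    return out
```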

Proceedings ArticleDOI
02 Sep 2004
TL;DR: In this article, a framework for arbitrary view synthesis and presentation of sporting events for mixed reality entertainment is presented, in which a virtual view image of the sporting scene is generated by view interpolation among multiple videos captured at a real stadium.
Abstract: This paper presents a new framework for arbitrary view synthesis and presentation of sporting events for mixed reality entertainment. In accordance with the viewpoint position of an observer, a virtual view image of the sporting scene is generated by view interpolation among multiple videos captured at a real stadium. The synthesized sporting scene is then overlaid onto a desktop stadium model in the real world via an HMD, making it possible to watch the event unfold in front of the observer. Projective geometry between cameras is used for virtual view generation of the dynamic scene and for geometric registration between the real world and the virtual view image of the sporting scene. The proposed method does not need to calibrate the multiple video cameras capturing the event or the HMD camera. Therefore it can be applied even to dynamic events in a large space and enables observation with an immersive impression. The proposed approach leads to a new type of mixed reality entertainment for sporting events.

Proceedings ArticleDOI
27 Jun 2004
TL;DR: This work proposes a method called boundary matting, which represents each occlusion boundary as a 3D curve, and suggests that this method enables high-quality view synthesis with reduced matting artifacts.
Abstract: In the last few years, new view synthesis has emerged as an important application of 3D stereo reconstruction. While the quality of stereo has improved, it is still imperfect, and a unique depth is typically assigned to every pixel. This is problematic at object boundaries, where the pixel colors are mixtures of foreground and background colors. Interpolating views without explicitly accounting for this effect results in objects with a "cut-out" appearance. To produce seamless view interpolation, we propose a method called boundary matting, which represents each occlusion boundary as a 3D curve. We show how this method exploits multiple views to perform fully automatic alpha matting and to simultaneously refine stereo depths at the boundaries. The key to our approach is the unifying 3D representation of occlusion boundaries estimated to sub-pixel accuracy. Starting from an initial estimate derived from stereo, we optimize the curve parameters and the foreground colors near the boundaries. Our objective function maximizes consistency with the input images, favors boundaries aligned with strong edges, and damps large perturbations of the curves. Experimental results suggest that this method enables high-quality view synthesis with reduced matting artifacts.
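
The mixed-pixel model that matting resolves, and which the paper optimizes along 3D boundary curves, is the standard compositing equation:

```latex
C(x) \;=\; \alpha(x)\,F(x) \;+\; \bigl(1 - \alpha(x)\bigr)\,B(x),
\qquad \alpha(x) \in [0, 1]
```

where C is the observed color, F and B are the foreground and background colors, and alpha is the fractional foreground coverage of the pixel.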

Proceedings ArticleDOI
24 Oct 2004
TL;DR: An image-based approach to photo-realistic view synthesis is proposed by integrating field morphing and view morphing in a single framework, which provides a unified technique for synthesizing new images that include both viewpoint changes and object deformations.
Abstract: In this paper, we propose an image-based approach to photo-realistic view synthesis by integrating field morphing and view morphing in a single framework. We thus provide a unified technique for synthesizing new images that include both viewpoint changes and object deformations. For view morphing, we relax the requirement of monotonicity along epipolar lines to piecewise monotonicity, by incorporating a segmentation stage prior to interpolation. This allows for dealing with occlusions and visibility issues, and hence alleviates the "ghosting effects" that typically occur when morphing is performed between distant viewpoints. We have particularly applied our approach to the synthesis of human facial expressions, while allowing for wide change of viewing positions and directions.

Book ChapterDOI
18 Sep 2004
TL;DR: In this article, a bio-inspired approach to the problem of stereo image matching is presented, based on an artificial epidemic process called "the infection algorithm".
Abstract: We present a new bio-inspired approach applied to a problem of stereo image matching. This approach is based on an artificial epidemic process, which we call “the infection algorithm.” The problem at hand is a basic one in computer vision for 3D scene reconstruction. It has many complex aspects and is known to be extremely difficult. The aim is to match the contents of two images in order to obtain 3D information which allows the generation of simulated projections from a viewpoint that is different from those of the initial photographs. This process is known as view synthesis. The algorithm we propose exploits the image contents in order to produce only the necessary 3D depth information, while saving computational time. It is based on a set of distributed rules that propagate like an artificial epidemic over the images. Experiments on a pair of real images are presented, and realistic reprojected images have been generated.

Proceedings ArticleDOI
24 Oct 2004
TL;DR: Two methods using projection onto convex sets (POCS) and inverse filtering are presented that effectively integrate the focused regions in each view into a novel view.
Abstract: This paper presents a new approach for virtual view synthesis that does not require any information of scene geometry. Our approach first generates multiple virtual views at the same position based on multiple depths by the conventional view interpolation method. The interpolated views suffer from blurring and ghosting artifacts due to the pixel mis-correspondence. Secondly, the multiple views are integrated into a novel view where all regions are focused. This integration problem can be formulated as the problem of solving a set of linear equations that relates the multiple views. To solve this set of equations, two methods using projection onto convex sets (POCS) and inverse filtering are presented that effectively integrate the focused regions in each view into a novel view. Experimental results using real images show the validity of our methods.
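
As a toy illustration of the POCS idea (greatly simplified: here each constraint set just pins the estimate to one view inside its in-focus mask, whereas the paper's sets encode the blur relations between the views; all names are assumptions):

```python
import numpy as np

def pocs_fuse(views, focus_masks, n_iter=20):
    """Fuse interpolated views, each sharp only in its own region,
    by cyclically projecting onto the constraint sets
    {x : x == views[k] on focus_masks[k]}.

    views : list of (H, W) float images rendered with different depths
    focus_masks : list of (H, W) bool arrays marking in-focus regions
    """
    x = np.mean(views, axis=0)                # any starting point works
    for _ in range(n_iter):
        for v, m in zip(views, focus_masks):
            x = np.where(m, v, x)             # projection onto set k
    return x
```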

Proceedings ArticleDOI
06 Sep 2004
TL;DR: A unified representation for all aspects of IBR using the space-time (x-y-t) volume is proposed; the approach is very robust and allows IBR to be used in general conditions, even with a hand-held camera.
Abstract: Image based rendering (IBR) consists of several steps: (i) calibration (or ego-motion computation) of all input images; (ii) determination of regions in the input images used to synthesize the new view; (iii) interpolation of the new view from the selected areas of the input images. We propose a unified representation for all these aspects of IBR using the space-time (x-y-t) volume. The presented approach is very robust and allows IBR to be used in general conditions, even with a hand-held camera. To take care of (i), the space-time volume is constructed by placing frames at locations along the time axis so that image features create straight lines in the EPI (epipolar plane images). Different slices of the space-time volume are used to produce new views, taking care of (ii). Step (iii) is done by interpolating between image samples using the feature lines in the EPI images. IBR examples are shown for various cases: sequences taken from a driving car, from a hand-held camera, or when using a tripod.
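
The EPI-line construction rests on a standard relation: for a camera translating along the x-axis at speed v with focal length f, a scene point at depth Z traces a straight line in the x-t slice of the space-time volume,

```latex
x(t) = x_0 - \frac{f\,v}{Z}\,t
\quad\Longrightarrow\quad
\frac{dx}{dt} = -\frac{f\,v}{Z},
```

so the line's slope encodes inverse depth, which is why placing frames correctly along the time axis straightens feature trajectories.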

01 Jan 2004
TL;DR: This dissertation deals with the image-based approach to synthesize a virtual scene using sparse images or a video sequence without the use of 3D models, and develops a robust and novel approach to automatically extract a set of affine or projective transformations induced by these regions.
Abstract: This dissertation deals with the image-based approach to synthesizing a virtual scene using sparse images or a video sequence, without the use of 3D models. In our scenario, a real dynamic or static scene is captured by a set of uncalibrated images from different viewpoints. After automatically recovering the geometric transformations between these images, a series of photo-realistic virtual views can be rendered and a virtual environment covered by these several static cameras can be synthesized. This image-based approach has applications in object recognition, object transfer, video synthesis and video compression. In this dissertation, I have contributed to several sub-problems related to image-based view synthesis. Before image-based view synthesis can be performed, images need to be segmented into individual objects. Assuming that a scene can approximately be described by multiple planar regions, I have developed a robust and novel approach to automatically extract a set of affine or projective transformations induced by these regions, correctly detect the occlusion pixels over multiple consecutive frames, and accurately segment the scene into several motion layers. First, a number of seed regions are determined using correspondences in two frames, and the seed regions are expanded and outliers rejected by employing the graph cuts method integrated with a level set representation. Next, these initial regions are merged into several initial layers according to motion similarity. Third, the occlusion order constraints on multiple frames are explored, which guarantee that the occlusion area increases with the temporal order over a short period and effectively maintain segmentation consistency over multiple consecutive frames. Then the correct layer segmentation is obtained by using a graph cuts algorithm, and the occlusions between the overlapping layers are explicitly determined. Several experimental results are presented to show that our approach is effective and robust. Recovering the geometric transformations among images of a scene is a prerequisite step for image-based view synthesis. I have developed a wide baseline matching algorithm to identify the correspondences between two uncalibrated images, and to further determine the geometric relationship between images, such as epipolar geometry or a projective transformation. In our approach, a set of salient features, edge-corners, is detected to provide robust and consistent matching primitives. Then, based on the Singular Value Decomposition (SVD) of an affine matrix, we effectively quantize the search space into two independent subspaces for the rotation angle and scaling factor, and then we use a two-stage affine matching algorithm to obtain robust matches between these two frames. (Abstract shortened by UMI.)

Proceedings ArticleDOI
07 Jan 2004
TL;DR: A practical framework for creating and visualizing interactive 3-D media using a system of uncalibrated projector-cameras; it is shown that adapting the rendering order of the correspondences with respect to the projector's coordinate system ensures correct visibility for the synthesized views.
Abstract: This paper presents a practical framework for creating and visualizing interactive 3-D media using a system of uncalibrated projector-cameras. The proposed solution uses light patterns that temporally encode the projector’s coordinate system to solve the traditionally challenging multiframe correspondence problem by straightforward decoding instead of computational multiframe optimization. Two sets of coded light patterns (black/white stripes and colored 2x2 blocks, both of varying spatial resolutions) are presented and compared. The resulting correspondences are directly used as a compelling form of interactive 3-D media through the described techniques, including three-frame view synthesis, multiframe view synthesis using multiple three-frame groupings, and even single-camera view interpolation. It is shown that adapting the rendering order of the correspondences with respect to the projector’s coordinate system ensures the correct visibility for the synthesized views. Experimental results demonstrate that the framework works well for various real-world scenes, even including those with multiple objects and textured surfaces. The framework, along with the resulting correspondences, also has implications for many other computer vision and image processing applications, especially those that require multiframe correspondences.
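
Decoding temporally coded stripe patterns into projector coordinates is direct bit accumulation; a sketch for plain binary stripes (the paper's actual pattern sets, including the colored 2x2 blocks, decode differently in detail):

```python
import numpy as np

def decode_stripes(captured, threshold=127):
    """Turn a stack of stripe-pattern photographs into per-pixel
    projector column indices.

    captured : list of (H, W) grayscale images, one per projected
               pattern, most significant bit first.
    """
    code = np.zeros(captured[0].shape, dtype=np.int32)
    for img in captured:
        bit = (img > threshold).astype(np.int32)   # classify stripe as 0/1
        code = (code << 1) | bit                   # append this pattern's bit
    return code                # dense camera-to-projector correspondences
```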

Proceedings ArticleDOI
25 Jul 2004
TL;DR: Experimental results with natural stereo pairs show that the proposed algorithms provide a good disparity map and produce high-quality intermediate views.
Abstract: An efficient algorithm addressing robust disparity estimation for intermediate view synthesis is proposed. In the proposed method, a new adaptive-size window approach based on region information is introduced to stereo matching in order to overcome the problems of fixed-size windows. A dynamic programming (DP) technique is used to find optimized disparity values. The reliability of disparity estimation is then measured with a criterion based on uniqueness and smoothness constraints. In occluded areas and at image points with unreliable disparity assignments, a region-based interpolation strategy is applied to compensate the disparity values. After projecting the left-to-right and right-to-left disparities onto the intermediate image, an arbitrary intermediate view is synthesized. Experimental results with natural stereo pairs show that the proposed algorithm provides a good disparity map and produces high-quality intermediate views.
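
The DP stage can be condensed to cost aggregation along each scanline followed by a winner-take-all pick; a compact stand-in for the paper's optimization (without its adaptive windows or occlusion reasoning):

```python
import numpy as np

def scanline_disparity(left_row, right_row, max_disp, penalty=10.0):
    """Disparity for one rectified scanline pair via DP-style aggregation.

    left_row, right_row : (W,) float intensity rows
    """
    W = left_row.shape[0]
    cost = np.full((W, max_disp + 1), 1e9)
    for d in range(max_disp + 1):                 # matching cost |L(x)-R(x-d)|
        cost[d:, d] = np.abs(left_row[d:] - right_row[:W - d])
    acc = cost.copy()
    for x in range(1, W):                         # smoothness-penalized pass
        jump = acc[x - 1].min() + penalty
        acc[x] += np.minimum(acc[x - 1], jump)
    return acc.argmin(axis=1)                     # winner-take-all disparities
```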

Journal ArticleDOI
TL;DR: A novel method for arbitrary view synthesis that allows viewers to virtually fly through real soccer scenes, together with an application for fly-through soccer match observation.
Abstract: This paper presents a novel method for arbitrary view synthesis that allows viewers to virtually fly through real soccer scenes. Multiple cameras situated around a stadium capture the action, and images shot from arbitrary viewpoints are generated by viewpoint interpolation, using projective geometry between neighboring cameras. The scenes are segmented according to their geometric properties. Dense correspondence matching between real views is performed automatically by applying projective geometry to each region. Superimposing the intermediate view images synthesized in every region completes the virtual views of the entire soccer scene. Camera calibration requirements are reduced, and correspondence matching requires no manual operation, allowing the proposed method to be applied to dynamic events in a large space. In addition to the view synthesis technique, we introduce an application for fly-through soccer match observation. This technology will lead to the creation of new media that can be applied to a variety of entertainment, from concerts to sporting events.

Proceedings ArticleDOI
15 Oct 2004
TL;DR: A realtime system that uses a linear array of cameras to perform light-field style rendering; the simplicity and robustness of light field rendering, combined with the natural restriction of a limited view volume in video teleconferencing, allow photo-realistic views to be synthesized per user request at interactive rates.
Abstract: We present a system and techniques for synthesizing views for three-dimensional video teleconferencing. Instead of performing complex 3D scene acquisition, we decided to trade storage/hardware for computation, i.e., using more cameras. While it is expensive to directly capture a scene from all possible viewpoints, we observed that the participants' viewpoints usually remain at a constant height (eye level) during video teleconferencing. Therefore we can restrict the possible viewpoints to lie within a virtual plane without sacrificing much of the realism. Doing so significantly reduces the number of cameras required. We demonstrate a realtime system that uses a linear array of cameras to perform light-field style rendering. The simplicity and robustness of light field rendering, combined with the natural restriction of a limited view volume in video teleconferencing, allow us to synthesize photo-realistic views per user request at interactive rates.
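
Restricting viewpoints to the camera line makes the rendering step very simple; a crude stand-in for per-ray light-field resampling (function and parameter names are assumptions) just blends the two cameras bracketing the requested eye position:

```python
import numpy as np

def render_eye_level_view(images, cam_x, eye_x):
    """Blend the two cameras of a linear array nearest to eye_x.

    images : list of (H, W, 3) images from cameras at positions cam_x
    cam_x : sorted 1-D array of camera x-coordinates along the eye line
    eye_x : requested viewpoint on that line
    """
    i = int(np.clip(np.searchsorted(cam_x, eye_x) - 1, 0, len(cam_x) - 2))
    t = float(np.clip((eye_x - cam_x[i]) / (cam_x[i + 1] - cam_x[i]), 0.0, 1.0))
    return ((1 - t) * images[i].astype(np.float32)
            + t * images[i + 1].astype(np.float32))
```

A real light-field renderer blends per ray rather than per image, but with densely spaced cameras and a constrained view volume the two-camera blend is a reasonable approximation.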

Proceedings ArticleDOI
28 Sep 2004
TL;DR: It is shown that a novel interpolation scheme, which the authors refer to as the iEBI (indexed function example-based interpolation) mechanism, can tackle the difficulty and acquire high-quality novel views even from only a few example views.
Abstract: Synthesizing images of a real scene at novel viewpoints other than those of some given images is important for image-based rendering, hybrid reality, and other applications that involve real scenes. Novel view synthesis has been dealt with via explicit 3D reconstruction, image transfer, or plenoptic modeling. In this paper we present an interpolation-based solution that avoids the need for explicit 3D reconstruction. The key difficulty in the interpolation approach is that, as the viewpoint changes, not only could a scene feature's intensity profile in the image change, but its image position could also shift. We show that a novel interpolation scheme, which we refer to as the iEBI (indexed function example-based interpolation) mechanism, can tackle this difficulty and acquire quality novel views even from only a few example views. Experimental results on benchmark image data are shown to illustrate the solution's performance.

Book ChapterDOI
11 May 2004
TL;DR: A novel approach to view interpolation from image sequences based on probabilistic depth carving that builds a multivalued representation of depth for novel views consisting of likelihoods of depth samples corresponding to either opaque or free space points.
Abstract: We describe a novel approach to view interpolation from image sequences based on probabilistic depth carving. This builds a multivalued representation of depth for novel views consisting of likelihoods of depth samples corresponding to either opaque or free space points. The likelihoods are obtained from iterative probabilistic combination of local disparity estimates about a subset of reference frames. This avoids the difficult problem of correspondence matching across distant views and leads to an explicit representation of occlusion. Novel views are generated by combining pixel values from the reference frames based on estimates of surface points within the likelihood representation. Efficient implementation is achieved using a multiresolution framework. Results of experiments on real image sequences show that the technique is effective.

Book ChapterDOI
14 May 2004
TL;DR: An object-based interpolation algorithm is developed to synthesize arbitrary intermediate views in a stereoscopic videoconference system with viewpoint adaptation; experimental results show that the proposed method can obtain high-quality intermediate views.
Abstract: A procedure is described for a stereoscopic videoconference system with viewpoint adaptation. The core of such a system is to synthesize intermediate views from stereoscopic videoconference images with a rather large baseline. The foreground object is first segmented using intensity and disparity information. For this purpose, the region growing technique is used. The reliability of disparity estimation is then measured with a criterion based on uniqueness and smoothness constraints. In occluded areas and at image points with unreliable disparity assignments, a region-based interpolation strategy is applied to compensate the disparity values. Finally, an object-based interpolation algorithm is developed to synthesize arbitrary intermediate views. Experimental results with natural stereoscopic image pairs show that the proposed method can obtain high-quality intermediate views.

Proceedings ArticleDOI
06 Sep 2004
TL;DR: In this paper, a facial view synthesis technique based on explicit shape and reflectance information extracted from a single image is presented, which combines an image based reflectance estimation process with a novel method of interpolating between needle-maps recovered using shape from shading.
Abstract: We present a facial view synthesis technique based on explicit shape and reflectance information extracted from a single image. The technique combines an image based reflectance estimation process with a novel method of interpolating between needle-maps recovered using shape from shading. This allows images of a face to be synthesised under novel lighting, pose and skin reflectance given only one example image. We exploit facial symmetry by reflecting the needle-map of a rotated face to yield the needle-map of the face rotated in the opposite direction. This provides two needle-maps between which interpolation can be performed.
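
Re-rendering from a needle map and exploiting facial symmetry are both short operations; a sketch assuming Lambertian shading, which is a simplification of the paper's reflectance model:

```python
import numpy as np

def relight(normals, albedo, light):
    """Shade a face from its needle map under a new light direction.

    normals : (H, W, 3) unit normals from shape-from-shading
    albedo : (H, W) reflectance estimate; light : (3,) unit light vector
    """
    shading = np.clip(normals @ light, 0.0, None)   # Lambertian n . l
    return albedo * shading

def mirror_needle_map(normals):
    """Facial symmetry: flip the needle map left-right and negate the
    x-component, yielding the face rotated in the opposite direction."""
    flipped = normals[:, ::-1].copy()
    flipped[..., 0] *= -1.0
    return flipped
```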

Journal Article
TL;DR: A new bio-inspired approach to stereo image matching based on an artificial epidemic process: a set of distributed rules that propagate like an artificial epidemic over the images.
Abstract: We present a new bio-inspired approach applied to a problem of stereo image matching. This approach is based on an artificial epidemic process, which we call the infection algorithm. The problem at hand is a basic one in computer vision for 3D scene reconstruction. It has many complex aspects and is known to be extremely difficult. The aim is to match the contents of two images in order to obtain 3D information which allows the generation of simulated projections from a viewpoint that is different from those of the initial photographs. This process is known as view synthesis. The algorithm we propose exploits the image contents in order to produce only the necessary 3D depth information, while saving computational time. It is based on a set of distributed rules that propagate like an artificial epidemic over the images. Experiments on a pair of real images are presented, and realistic reprojected images have been generated.

Proceedings ArticleDOI
26 Oct 2004
TL;DR: The purpose is not a "physically correct" but a "subjectively acceptable" view synthesis at video rate, in order to provide convincing images using MPEG-4 object video coding.
Abstract: We propose a simple view synthesis method based on object coding and simple modeling of disparity space. In the proposed method, a scene from a binocular stereo-vision camera is divided into objects and planes. Our purpose is not a "physically correct" but a "subjectively acceptable" view synthesis at video rate, in order to provide convincing images using MPEG-4 object video coding. The proposed method uses interpolation/extrapolation to generate images from arbitrary viewpoints. It also conceals void areas behind objects by using surrounding texture information.

Proceedings ArticleDOI
27 Jun 2004
TL;DR: This work presents a novel method for synthesizing a novel view from two sets of differently focused images taken by a sparse camera array for a scene of two approximately constant depths; the method can effectively create a dense array of pin-hole cameras, performing better than the traditional method of using a sparse array of cameras.
Abstract: This work presents a novel method for synthesizing a novel view from two sets of differently focused images taken by a sparse camera array, for a scene of two approximately constant depths. The proposed method consists of two steps. The first step is a view interpolation to reconstruct an all-focused dense light field of the scene. The second step is to synthesize a novel view by a light-field rendering technique from the reconstructed dense light field. The view interpolation can be achieved simply by linear filters that are designed to convert defocus effects to parallax effects, without estimating the depth map of the scene. The proposed method can effectively create a dense array of pin-hole cameras (i.e., all-focused images), so that the final novel view is better than that of the traditional method using a sparse array of cameras. Experimental results on real images from four aligned cameras are shown.

Proceedings ArticleDOI
01 Jan 2004
TL;DR: The technique combines an image based reflectance estimation process with a novel method of interpolating between needle-maps recovered using shape from shading to allow images of a face to be synthesised under novel lighting, pose and skin reflectance given only one example image.
Abstract: We present a facial view synthesis technique based on explicit shape and reflectance information extracted from a single image. The technique combines an image based reflectance estimation process with a novel method of interpolating between needle-maps recovered using shape from shading. This allows images of a face to be synthesised under novel lighting, pose and skin reflectance given only one example image.