Author
Wexler
Bio: Wexler is an academic researcher. The author has contributed to research in topics: Iterative reconstruction & Image-based modeling and rendering. The author has an h-index of 1 and has co-authored 1 publication, receiving 194 citations.
Papers
01 Jan 2003
TL;DR: The paper's second contribution is to constrain the generated views to lie in the space of images whose texture statistics are those of the input images; this amounts to an image-based prior on the reconstruction that regularizes the solution, yielding realistic synthetic views.
Abstract: Given a set of images acquired from known viewpoints, we describe a method for synthesizing the image which would be seen from a new viewpoint. In contrast to existing techniques, which explicitly reconstruct the 3D geometry of the scene, we transform the problem to the reconstruction of colour rather than depth. This retains the benefits of geometric constraints, but projects out the ambiguities in depth estimation which occur in textureless regions. On the other hand, regularization is still needed in order to generate high-quality images. The paper's second contribution is to constrain the generated views to lie in the space of images whose texture statistics are those of the input images. This amounts to an image-based prior on the reconstruction which regularizes the solution, yielding realistic synthetic views. Examples are given of new view generation for cameras interpolated between the acquisition viewpoints - which enables synthetic steadicam stabilization of a sequence with a high level of realism.
195 citations
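One hedged way to write the objective described in the abstract above as a single formula; the notation and the exact form of both factors are assumptions made for illustration, not reproduced from the paper.

```latex
% V = synthesized view, I_1..I_n = input images,
% W_i(V) = i-th overlapping patch of V, T(I_1..I_n) = set of patches of the inputs.
\[
  \hat{V} \;=\; \arg\max_{V}\;
     \underbrace{p(I_1,\dots,I_n \mid V)}_{\text{photoconsistency}}\;
     \underbrace{p_{\mathrm{tex}}(V)}_{\text{image-based prior}},
  \qquad
  -\log p_{\mathrm{tex}}(V) \;\propto\;
     \sum_{i}\,\min_{T \in \mathcal{T}(I_1,\dots,I_n)}
     \bigl\lVert W_i(V) - T \bigr\rVert^{2}.
\]
```

The first factor plays the role of the colour-reconstruction (photoconsistency) term; the second keeps every patch of the synthesized view close to some patch of the inputs, which is what regularizes the textureless regions the abstract mentions.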
Cited by
27 Jul 2009
TL;DR: This paper presents interactive image editing tools using a new randomized algorithm for quickly finding approximate nearest-neighbor matches between image patches, and proposes additional intuitive constraints on the synthesis process that offer the user a level of control unavailable in previous methods.
Abstract: This paper presents interactive image editing tools using a new randomized algorithm for quickly finding approximate nearest-neighbor matches between image patches. Previous research in graphics and vision has leveraged such nearest-neighbor searches to provide a variety of high-level digital image editing tools. However, the cost of computing a field of such matches for an entire image has eluded previous efforts to provide interactive performance. Our algorithm offers substantial performance improvements over the previous state of the art (20-100x), enabling its use in interactive editing tools. The key insights driving the algorithm are that some good patch matches can be found via random sampling, and that natural coherence in the imagery allows us to propagate such matches quickly to surrounding areas. We offer theoretical analysis of the convergence properties of the algorithm, as well as empirical and practical evidence for its high quality and performance. This one simple algorithm forms the basis for a variety of tools -- image retargeting, completion and reshuffling -- that can be used together in the context of a high-level image editing application. Finally, we propose additional intuitive constraints on the synthesis process that offer the user a level of control unavailable in previous methods.
2,888 citations
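The key insights named in the abstract above -- random initialization, propagation of good matches to neighbours, and a shrinking random search -- fit in a few dozen lines. The sketch below is a minimal, unoptimized Python rendition for grayscale images; the function names, the SSD patch distance, and the fixed iteration count are illustrative assumptions, and it makes no attempt at the interactive performance the paper reports.

```python
import numpy as np

def _ssd(A, B, ay, ax, by, bx, p):
    """Sum of squared differences between the p x p patches whose
    top-left corners are (ay, ax) in A and (by, bx) in B."""
    d = A[ay:ay + p, ax:ax + p] - B[by:by + p, bx:bx + p]
    return float(np.sum(d * d))

def patchmatch(A, B, p=7, iters=5, seed=0):
    """Approximate nearest-neighbour field (NNF) mapping each p x p patch
    of A to a similar patch of B via random init, propagation and random search."""
    A = A.astype(np.float64)
    B = B.astype(np.float64)
    ha, wa = A.shape[0] - p + 1, A.shape[1] - p + 1   # valid patch origins in A
    hb, wb = B.shape[0] - p + 1, B.shape[1] - p + 1   # valid patch origins in B
    rng = np.random.default_rng(seed)

    # 1) random initialization of the NNF: nnf[y, x] = (by, bx)
    nnf = np.stack([rng.integers(0, hb, (ha, wa)),
                    rng.integers(0, wb, (ha, wa))], axis=-1)
    cost = np.empty((ha, wa))
    for y in range(ha):
        for x in range(wa):
            cost[y, x] = _ssd(A, B, y, x, nnf[y, x, 0], nnf[y, x, 1], p)

    def try_offset(y, x, by, bx):
        """Adopt candidate match (by, bx) for patch (y, x) if it is cheaper."""
        by, bx = int(np.clip(by, 0, hb - 1)), int(np.clip(bx, 0, wb - 1))
        c = _ssd(A, B, y, x, by, bx, p)
        if c < cost[y, x]:
            cost[y, x] = c
            nnf[y, x] = (by, bx)

    for it in range(iters):
        # alternate scan order on odd iterations
        ys = range(ha) if it % 2 == 0 else range(ha - 1, -1, -1)
        xs = range(wa) if it % 2 == 0 else range(wa - 1, -1, -1)
        step = 1 if it % 2 == 0 else -1
        for y in ys:
            for x in xs:
                # 2) propagation: adopt a neighbour's match, shifted by one pixel
                if 0 <= y - step < ha:
                    by, bx = nnf[y - step, x]
                    try_offset(y, x, by + step, bx)
                if 0 <= x - step < wa:
                    by, bx = nnf[y, x - step]
                    try_offset(y, x, by, bx + step)
                # 3) random search in an exponentially shrinking window
                r = max(hb, wb)
                while r >= 1:
                    by, bx = nnf[y, x]
                    try_offset(y, x,
                               by + rng.integers(-r, r + 1),
                               bx + rng.integers(-r, r + 1))
                    r //= 2
    return nnf
```

Calling `patchmatch(A, B)` returns, for each patch origin in A, the (row, col) of an approximately nearest patch in B; this field is the building block the retargeting, completion and reshuffling tools mentioned above are built on.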
01 Aug 2004
TL;DR: This paper shows how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms, and develops a novel temporal two-layer compressed representation that handles matting.
Abstract: The ability to interactively control viewpoint while watching a video is an exciting application of image-based rendering. The goal of our work is to render dynamic scenes with interactive viewpoint control using a relatively small number of video cameras. In this paper, we show how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms. Once these video streams have been processed, we can synthesize any intermediate view between cameras at any time, with the potential for space-time manipulation. In our approach, we first use a novel color segmentation-based stereo algorithm to generate high-quality photoconsistent correspondences across all camera views. Mattes for areas near depth discontinuities are then automatically extracted to reduce artifacts during view synthesis. Finally, a novel temporal two-layer compressed representation that handles matting is developed for rendering at interactive rates.
1,677 citations
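As a hedged illustration of the rendering stage described in the abstract above, the sketch below composites a boundary layer (with its alpha matte, extracted near depth discontinuities) over a main layer, then blends the two nearest cameras' renderings of the novel view. The function names, the assumption that both cameras have already been warped to the novel viewpoint, and the linear blending weight are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def composite_two_layer(main_rgb, boundary_rgb, boundary_alpha):
    """Composite the thin boundary layer over the main layer using its
    alpha matte (images H x W x 3, matte H x W, values in [0, 1])."""
    a = boundary_alpha[..., None]
    return boundary_rgb * a + main_rgb * (1.0 - a)

def blend_nearest_views(left_render, right_render, t):
    """Blend the two nearest cameras' warped renderings of the novel view;
    t in [0, 1] is the normalized position of the viewpoint between them."""
    return (1.0 - t) * left_render + t * right_render
```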
20 Jun 2005
TL;DR: A framework for learning generic, expressive image priors that capture the statistics of natural scenes and can be used for a variety of machine vision tasks, with field potentials modeled using a Products-of-Experts framework.
Abstract: We develop a framework for learning generic, expressive image priors that capture the statistics of natural scenes and can be used for a variety of machine vision tasks. The approach extends traditional Markov random field (MRF) models by learning potential functions over extended pixel neighborhoods. Field potentials are modeled using a Products-of-Experts framework that exploits nonlinear functions of many linear filter responses. In contrast to previous MRF approaches, all parameters, including the linear filters themselves, are learned from training data. We demonstrate the capabilities of this Field of Experts model with two example applications, image denoising and image inpainting, which are implemented using a simple, approximate inference scheme. While the model is trained on a generic image database and is not tuned toward a specific application, we obtain results that compete with and even outperform specialized techniques.
1,167 citations
[...]
TL;DR: The approach provides a practical method for learning high-order Markov random field models whose potential functions extend over large pixel neighborhoods and are modeled as non-linear functions of many linear filter responses.
Abstract: We develop a framework for learning generic, expressive image priors that capture the statistics of natural scenes and can be used for a variety of machine vision tasks. The approach provides a practical method for learning high-order Markov random field (MRF) models with potential functions that extend over large pixel neighborhoods. These clique potentials are modeled using the Product-of-Experts framework that uses non-linear functions of many linear filter responses. In contrast to previous MRF approaches, all parameters, including the linear filters themselves, are learned from training data. We demonstrate the capabilities of this Field-of-Experts model with two example applications, image denoising and image inpainting, which are implemented using a simple, approximate inference scheme. While the model is trained on a generic image database and is not tuned toward a specific application, we obtain results that compete with specialized techniques.
848 citations
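Both abstracts above describe the same Field-of-Experts prior; the sketch below states it as a formula. The Student-t expert shape and the symbols are recalled from the published model and should be read as an assumption here, not as a quotation from these abstracts.

```latex
% x = image, x_(k) = pixels of the k-th overlapping clique (patch),
% J_i = learned linear filters, alpha_i = learned expert parameters,
% Z(Theta) = normalizing constant.
\[
  p(\mathbf{x}) \;=\; \frac{1}{Z(\Theta)}
     \prod_{k}\;\prod_{i=1}^{N}
     \phi\!\bigl(\mathbf{J}_i^{\top}\mathbf{x}_{(k)};\,\alpha_i\bigr),
  \qquad
  \phi(y;\alpha) \;=\; \Bigl(1 + \tfrac{1}{2}\,y^{2}\Bigr)^{-\alpha}.
\]
```

Both the filters J_i and the parameters alpha_i are learned from data, which is the contrast with earlier MRF models that the abstracts emphasize.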
Posted Content
TL;DR: Empirical evaluation demonstrates the effectiveness of the unsupervised learning framework: monocular depth estimation performs comparably with supervised methods that use either ground-truth pose or depth for training, and pose estimation performs favorably compared to established SLAM systems under comparable input settings.
Abstract: We present an unsupervised learning framework for the task of monocular depth and camera motion estimation from unstructured video sequences. We achieve this by simultaneously training depth and camera pose estimation networks using the task of view synthesis as the supervisory signal. The networks are thus coupled via the view synthesis objective during training, but can be applied independently at test time. Empirical evaluation on the KITTI dataset demonstrates the effectiveness of our approach: 1) monocular depth performing comparably with supervised methods that use either ground-truth pose or depth for training, and 2) pose estimation performing favorably compared to established SLAM systems under comparable input settings.
717 citations
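A sketch of the kind of view-synthesis supervision described in the abstract above, written in the notation commonly used for this family of methods; the exact photometric penalty and the symbols are assumptions, not taken from the listing.

```latex
% I_t = target frame, I_s = nearby source frames, p_t = a pixel in the target view,
% D_t = predicted depth, T_{t->s} = predicted relative camera pose, K = intrinsics,
% \hat{I}_s = I_s inverse-warped to the target view by sampling at p_s.
\[
  \mathcal{L}_{\mathrm{vs}} \;=\;
     \sum_{s}\sum_{p_t}\bigl\lvert I_t(p_t) - \hat{I}_s(p_t)\bigr\rvert,
  \qquad
  p_s \;\sim\; K\,\hat{T}_{t\rightarrow s}\,\hat{D}_t(p_t)\,K^{-1}\,p_t.
\]
```

Minimizing this loss trains the depth and pose networks jointly, even though each can be run on its own at test time, as the abstract notes.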