scispace - formally typeset
Search or ask a question
Author

Matthew T. Uyttendaele

Other affiliations: Facebook
Bio: Matthew T. Uyttendaele is an academic researcher from Microsoft. The author has contributed to research in topics: Pixel & Rendering (computer graphics). The author has an hindex of 39, co-authored 84 publications receiving 8812 citations. Previous affiliations of Matthew T. Uyttendaele include Facebook.


Papers
More filters
Journal ArticleDOI
01 Aug 2004
TL;DR: This paper shows how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms, and develops a novel temporal two-layer compressed representation that handles matting.
Abstract: The ability to interactively control viewpoint while watching a video is an exciting application of image-based rendering. The goal of our work is to render dynamic scenes with interactive viewpoint control using a relatively small number of video cameras. In this paper, we show how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms. Once these video streams have been processed, we can synthesize any intermediate view between cameras at any time, with the potential for space-time manipulation.In our approach, we first use a novel color segmentation-based stereo algorithm to generate high-quality photoconsistent correspondences across all camera views. Mattes for areas near depth discontinuities are then automatically extracted to reduce artifacts during view synthesis. Finally, a novel temporal two-layer compressed representation that handles matting is developed for rendering at interactive rates.

1,677 citations

Proceedings ArticleDOI
29 Jul 2007
TL;DR: It is demonstrated that in cases, such as those above, the available high resolution input image may be leveraged as a prior in the context of a joint bilateral upsampling procedure to produce a better high resolution solution.
Abstract: Image analysis and enhancement tasks such as tone mapping, colorization, stereo depth, and photomontage, often require computing a solution (e.g., for exposure, chromaticity, disparity, labels) over the pixel grid. Computational and memory costs often require that a smaller solution be run over a downsampled image. Although general purpose upsampling methods can be used to interpolate the low resolution solution to the full resolution, these methods generally assume a smoothness prior for the interpolation. We demonstrate that in cases, such as those above, the available high resolution input image may be leveraged as a prior in the context of a joint bilateral upsampling procedure to produce a better high resolution solution. We show results for each of the applications above and compare them to traditional upsampling methods.

1,185 citations

Journal ArticleDOI
01 Dec 2008
TL;DR: The results show that augmenting photographs with already available 3D models of the world supports a wide variety of new ways for us to experience and interact with the authors' everyday snapshots.
Abstract: In this paper, we introduce a novel system for browsing, enhancing, and manipulating casual outdoor photographs by combining them with already existing georeferenced digital terrain and urban models. A simple interactive registration process is used to align a photograph with such a model. Once the photograph and the model have been registered, an abundance of information, such as depth, texture, and GIS data, becomes immediately available to our system. This information, in turn, enables a variety of operations, ranging from dehazing and relighting the photograph, to novel view synthesis, and overlaying with geographic information. We describe the implementation of a number of these applications and discuss possible extensions. Our results show that augmenting photographs with already available 3D models of the world supports a wide variety of new ways for us to experience and interact with our everyday snapshots.

745 citations

Proceedings ArticleDOI
01 Jul 2003
TL;DR: This paper describes the approach to generate high dynamic range (HDR) video from an image sequence of a dynamic scene captured while rapidly varying the exposure of each frame, and how to compensate for scene and camera movement when creating an HDR still from a series of bracketed still photographs.
Abstract: Typical video footage captured using an off-the-shelf camcorder suffers from limited dynamic range. This paper describes our approach to generate high dynamic range (HDR) video from an image sequence of a dynamic scene captured while rapidly varying the exposure of each frame. Our approach consists of three parts: automatic exposure control during capture, HDR stitching across neighboring frames, and tonemapping for viewing. HDR stitching requires accurately registering neighboring frames and choosing appropriate pixels for computing the radiance map. We show examples for a variety of dynamic scenes. We also show how we can compensate for scene and camera movement when creating an HDR still from a series of bracketed still photographs.

641 citations

Proceedings ArticleDOI
01 Dec 2001
TL;DR: A method for dealing with objects that move between different views of a dynamic scene and a method for continuously adjusting exposure across multiple images in order to eliminate visible shifts in brightness or hue.
Abstract: As panoramic photography becomes increasingly popular, there is a greater need for high-quality software to automatically create panoramic images. Existing algorithms either produce a rough "stitch" that cannot deal with common artifacts, or require user input. This paper presents methods for dealing with two artifacts that often occur in practice. Our first contribution is a method for dealing with objects that move between different views of a dynamic scene. If such moving objects are left in, they will appear blurry and "ghosted". Treating such regions as nodes in a graph, we use a vertex cover algorithm to selectively remove all but one instance of each object. Our second contribution is a method for continuously adjusting exposure across multiple images in order to eliminate visible shifts in brightness or hue. We compute exposure corrections on a block-by block basis, then smoothly interpolate the parameters using a spline to get spatially continuous exposure adjustment. Our enhancements, combined with previously published techniques for automatic image stitching, result in a high-quality automated stitcher that exhibits far fewer artifacts than existing software.

384 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: The guided filter is a novel explicit image filter derived from a local linear model that can be used as an edge-preserving smoothing operator like the popular bilateral filter, but it has better behaviors near edges.
Abstract: In this paper, we propose a novel explicit image filter called guided filter. Derived from a local linear model, the guided filter computes the filtering output by considering the content of a guidance image, which can be the input image itself or another different image. The guided filter can be used as an edge-preserving smoothing operator like the popular bilateral filter [1], but it has better behaviors near edges. The guided filter is also a more generic concept beyond smoothing: It can transfer the structures of the guidance image to the filtering output, enabling new filtering applications like dehazing and guided feathering. Moreover, the guided filter naturally has a fast and nonapproximate linear time algorithm, regardless of the kernel size and the intensity range. Currently, it is one of the fastest edge-preserving filters. Experiments show that the guided filter is both effective and efficient in a great variety of computer vision and computer graphics applications, including edge-aware smoothing, detail enhancement, HDR compression, image matting/feathering, dehazing, joint upsampling, etc.

4,730 citations

Book
30 Sep 2010
TL;DR: Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images and takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene.
Abstract: Humans perceive the three-dimensional structure of the world with apparent ease. However, despite all of the recent advances in computer vision research, the dream of having a computer interpret an image at the same level as a two-year old remains elusive. Why is computer vision such a challenging problem and what is the current state of the art? Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos. More than just a source of recipes, this exceptionally authoritative and comprehensive textbook/reference also takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene. These problems are also analyzed using statistical models and solved using rigorous engineering techniques Topics and features: structured to support active curricula and project-oriented courses, with tips in the Introduction for using the book in a variety of customized courses; presents exercises at the end of each chapter with a heavy emphasis on testing algorithms and containing numerous suggestions for small mid-term projects; provides additional material and more detailed mathematical topics in the Appendices, which cover linear algebra, numerical techniques, and Bayesian estimation theory; suggests additional reading at the end of each chapter, including the latest research in each sub-field, in addition to a full Bibliography at the end of the book; supplies supplementary course material for students at the associated website, http://szeliski.org/Book/. Suitable for an upper-level undergraduate or graduate-level course in computer science or engineering, this textbook focuses on basic techniques that work under real-world conditions and encourages students to push their creative boundaries. Its design and exposition also make it eminently suitable as a unique reference to the fundamental techniques and current research literature in computer vision.

4,146 citations

Journal ArticleDOI
TL;DR: A simple but effective image prior - dark channel prior to remove haze from a single input image is proposed, based on a key observation - most local patches in haze-free outdoor images contain some pixels which have very low intensities in at least one color channel.
Abstract: In this paper, we propose a simple but effective image prior-dark channel prior to remove haze from a single input image. The dark channel prior is a kind of statistics of outdoor haze-free images. It is based on a key observation-most local patches in outdoor haze-free images contain some pixels whose intensity is very low in at least one color channel. Using this prior with the haze imaging model, we can directly estimate the thickness of the haze and recover a high-quality haze-free image. Results on a variety of hazy images demonstrate the power of the proposed prior. Moreover, a high-quality depth map can also be obtained as a byproduct of haze removal.

3,668 citations

Journal ArticleDOI
01 Jul 2006
TL;DR: This work presents a system for interactively browsing and exploring large unstructured collections of photographs of a scene using a novel 3D interface that consists of an image-based modeling front end that automatically computes the viewpoint of each photograph and a sparse 3D model of the scene and image to model correspondences.
Abstract: We present a system for interactively browsing and exploring large unstructured collections of photographs of a scene using a novel 3D interface. Our system consists of an image-based modeling front end that automatically computes the viewpoint of each photograph as well as a sparse 3D model of the scene and image to model correspondences. Our photo explorer uses image-based rendering techniques to smoothly transition between photographs, while also enabling full 3D navigation and exploration of the set of images and world geometry, along with auxiliary information such as overhead maps. Our system also makes it easy to construct photo tours of scenic or historic locations, and to annotate image details, which are automatically transferred to other relevant images. We demonstrate our system on several large personal photo collections as well as images gathered from Internet photo sharing sites.

3,398 citations

Journal ArticleDOI
TL;DR: This paper describes the Semi-Global Matching (SGM) stereo method, which uses a pixelwise, Mutual Information based matching cost for compensating radiometric differences of input images and demonstrates a tolerance against a wide range of radiometric transformations.
Abstract: This paper describes the semiglobal matching (SGM) stereo method. It uses a pixelwise, mutual information (Ml)-based matching cost for compensating radiometric differences of input images. Pixelwise matching is supported by a smoothness constraint that is usually expressed as a global cost function. SGM performs a fast approximation by pathwise optimizations from all directions. The discussion also addresses occlusion detection, subpixel refinement, and multibaseline matching. Additionally, postprocessing steps for removing outliers, recovering from specific problems of structured environments, and the interpolation of gaps are presented. Finally, strategies for processing almost arbitrarily large images and fusion of disparity images using orthographic projection are proposed. A comparison on standard stereo images shows that SGM is among the currently top-ranked algorithms and is best, if subpixel accuracy is considered. The complexity is linear to the number of pixels and disparity range, which results in a runtime of just 1-2 seconds on typical test images. An in depth evaluation of the Ml-based matching cost demonstrates a tolerance against a wide range of radiometric transformations. Finally, examples of reconstructions from huge aerial frame and pushbroom images demonstrate that the presented ideas are working well on practical problems.

3,302 citations