Author

Tomoo Mitsunaga

Other affiliations: Columbia University
Bio: Tomoo Mitsunaga is an academic researcher from Sony Broadcast & Professional Research Laboratories. The author has contributed to research in topics: Pixel & Image processing. The author has an h-index of 27 and has co-authored 132 publications receiving 4455 citations. Previous affiliations of Tomoo Mitsunaga include Columbia University.


Papers
Proceedings ArticleDOI
23 Jun 1999
TL;DR: A simple algorithm is described that computes the radiometric response function of an imaging system, from images of an arbitrary scene taken using different exposures, to fuse the multiple images into a single high dynamic range radiance image.
Abstract: A simple algorithm is described that computes the radiometric response function of an imaging system, from images of an arbitrary scene taken using different exposures. The exposure is varied by changing either the aperture setting or the shutter speed. The algorithm does not require precise estimates of the exposures used. Rough estimates of the ratios of the exposures (e.g. F-number settings on an inexpensive lens) are sufficient for accurate recovery of the response function as well as the actual exposure ratios. The computed response function is used to fuse the multiple images into a single high dynamic range radiance image. Robustness is tested using a variety of scenes and cameras as well as noisy synthetic images generated using 100 randomly selected response curves. Automatic rejection of image areas that have large vignetting effects or temporal scene variations makes the algorithm applicable to not just photographic but also video cameras.

837 citations
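
The fusion step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the response function has already been recovered and, purely for illustration, treats it as linear; the function names, the hat-shaped weighting, and the 8-bit range are assumptions made here, not details taken from the paper.

```python
def inverse_response(z, max_value=255):
    """Hypothetical inverse response: assumes a linear sensor (illustrative only)."""
    return z / max_value


def hat_weight(z, max_value=255):
    """Trust mid-range values most; give zero weight to fully dark or clipped values."""
    mid = max_value / 2
    return 1.0 - abs(z - mid) / mid


def fuse_exposures(pixel_values, exposures, max_value=255):
    """Fuse measurements of one pixel, taken at different exposures,
    into a single relative radiance estimate (weighted average)."""
    num = 0.0
    den = 0.0
    for z, e in zip(pixel_values, exposures):
        w = hat_weight(z, max_value)
        # Dividing by the exposure brings every measurement to a common scale.
        num += w * inverse_response(z, max_value) / e
        den += w
    if den == 0.0:
        return None  # every measurement was fully dark or saturated
    return num / den
```

For example, `fuse_exposures([51, 102, 204], [1, 2, 4])` combines three consistent measurements of the same pixel, while a clipped value of 255 contributes nothing to the estimate.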

Journal ArticleDOI
TL;DR: A comprehensive optimization method to arrive at the spatial and spectral layout of the color filter array of a GAP camera is presented and a novel algorithm for reconstructing the under-sampled channels of the image while minimizing aliasing artifacts is developed.
Abstract: We propose the concept of a generalized assorted pixel (GAP) camera, which enables the user to capture a single image of a scene and, after the fact, control the tradeoff between spatial resolution, dynamic range and spectral detail. The GAP camera uses a complex array (or mosaic) of color filters. A major problem with using such an array is that the captured image is severely under-sampled for at least some of the filter types. This leads to reconstructed images with strong aliasing. We make four contributions in this paper: 1) We present a comprehensive optimization method to arrive at the spatial and spectral layout of the color filter array of a GAP camera. 2) We develop a novel algorithm for reconstructing the under-sampled channels of the image while minimizing aliasing artifacts. 3) We demonstrate how the user can capture a single image and then control the tradeoff of spatial resolution to generate a variety of images, including monochrome, high dynamic range (HDR) monochrome, RGB, HDR RGB, and multispectral images. 4) Finally, the performance of our GAP camera has been verified using extensive simulations that use multispectral images of real world scenes. A large database of these multispectral images has been made available at http://www1.cs.columbia.edu/CAVE/projects/gap_camera/ for use by the research community.

833 citations

Proceedings ArticleDOI
15 Jun 2000
TL;DR: In this article, an optical mask with spatially varying transmittance is placed adjacent to a conventional image detector array to simultaneously sample the spatial and exposure dimensions of image irradiance, and the captured image is then mapped to a high dynamic range image using an efficient image reconstruction algorithm.
Abstract: While real scenes produce a wide range of brightness variations, vision systems use low dynamic range image detectors that typically provide 8 bits of brightness data at each pixel. The resulting low quality images greatly limit what vision can accomplish today. This paper proposes a very simple method for significantly enhancing the dynamic range of virtually any imaging system. The basic principle is to simultaneously sample the spatial and exposure dimensions of image irradiance. One of several ways to achieve this is by placing an optical mask adjacent to a conventional image detector array. The mask has a pattern with spatially varying transmittance, thereby giving adjacent pixels on the detector different exposures to the scene. The captured image is mapped to a high dynamic range image using an efficient image reconstruction algorithm. The end result is an imaging system that can measure a very wide range of scene radiance and produce a substantially larger number of brightness levels, with a slight reduction in spatial resolution. We conclude with several examples of high dynamic range images computed using spatially varying pixel exposures.

691 citations

Proceedings ArticleDOI
06 Nov 2011
TL;DR: It is shown that the proposed techniques for sampling, representing and reconstructing the space-time volume can effectively reconstruct a video from a single image maintaining high spatial resolution.
Abstract: Cameras face a fundamental tradeoff between spatial and temporal resolution: digital still cameras can capture images with high spatial resolution, but most high-speed video cameras suffer from low spatial resolution. It is hard to overcome this tradeoff without incurring a significant increase in hardware costs. In this paper, we propose techniques for sampling, representing and reconstructing the space-time volume in order to overcome this tradeoff. Our approach has two important distinctions compared to previous works: (1) we achieve sparse representation of videos by learning an over-complete dictionary on video patches, and (2) we adhere to practical constraints on the sampling scheme that are imposed by the architectures of present image sensor devices. Consequently, our sampling scheme can be implemented on image sensors by making a straightforward modification to the control unit. To demonstrate the power of our approach, we have implemented a prototype imaging system with per-pixel coded exposure control using a liquid crystal on silicon (LCoS) device. Using both simulations and experiments on a wide range of scenes, we show that our method can effectively reconstruct a video from a single image while maintaining high spatial resolution.

260 citations

Patent
26 May 2000
TL;DR: In this article, a variable-transmittance mask is used to generate a spatially varying light attenuation pattern across the image sensor; the sensed image is normalized with respect to that pattern, and the normalized data can be interpolated to account for image sensor pixels that are either under- or over-exposed, enhancing the dynamic range.
Abstract: Apparatus and methods are provided for obtaining high dynamic range images using a low dynamic range image sensor. The scene is exposed to the image sensor in a spatially varying manner. A variable-transmittance mask, which is interposed between the scene and the image sensor, imposes a spatially varying attenuation on the scene light incident on the image sensor. The mask includes light transmitting cells whose transmittance is controlled by application of suitable control signals. The mask is configured to generate a spatially varying light attenuation pattern across the image sensor. The image frame sensed by the image sensor is normalized with respect to the spatially varying light attenuation pattern. The normalized image data can be interpolated to account for image sensor pixels that are either under or over exposed to enhance the dynamic range of the image sensor.

193 citations
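
The normalize-then-interpolate steps described in the abstract above can be sketched on a single row of pixels. This is a simplified illustration under stated assumptions: the function name, the validity thresholds `low` and `high`, and the plain averaging of the nearest valid neighbours are all hypothetical choices, not the patented method.

```python
def reconstruct_row(raw, transmittance, low=5, high=250):
    """Normalize a row of raw pixel values by the mask pattern, then fill
    under- or over-exposed pixels from the nearest valid neighbours."""
    # Step 1: divide well-exposed pixels by their mask transmittance;
    # mark under- or over-exposed pixels as missing (None).
    normalized = []
    for value, t in zip(raw, transmittance):
        if low <= value <= high:
            normalized.append(value / t)
        else:
            normalized.append(None)

    # Step 2: fill each missing pixel with the average of the nearest
    # valid pixel on each side (a deliberately simple interpolation).
    result = list(normalized)
    for i, value in enumerate(normalized):
        if value is not None:
            continue
        left = next((normalized[j] for j in range(i - 1, -1, -1)
                     if normalized[j] is not None), None)
        right = next((normalized[j] for j in range(i + 1, len(normalized))
                      if normalized[j] is not None), None)
        neighbours = [v for v in (left, right) if v is not None]
        result[i] = sum(neighbours) / len(neighbours) if neighbours else None
    return result
```

With an alternating mask such as `[1.0, 0.5, 1.0, 0.5]`, pixels that saturate behind the fully transmitting cells are recovered from their attenuated neighbours, which is why the approach trades a little spatial resolution for dynamic range.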


Cited by
Book
30 Sep 2010
TL;DR: Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images and takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene.
Abstract: Humans perceive the three-dimensional structure of the world with apparent ease. However, despite all of the recent advances in computer vision research, the dream of having a computer interpret an image at the same level as a two-year old remains elusive. Why is computer vision such a challenging problem and what is the current state of the art? Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos. More than just a source of recipes, this exceptionally authoritative and comprehensive textbook/reference also takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene. These problems are also analyzed using statistical models and solved using rigorous engineering techniques. Topics and features: structured to support active curricula and project-oriented courses, with tips in the Introduction for using the book in a variety of customized courses; presents exercises at the end of each chapter with a heavy emphasis on testing algorithms and containing numerous suggestions for small mid-term projects; provides additional material and more detailed mathematical topics in the Appendices, which cover linear algebra, numerical techniques, and Bayesian estimation theory; suggests additional reading at the end of each chapter, including the latest research in each sub-field, in addition to a full Bibliography at the end of the book; supplies supplementary course material for students at the associated website, http://szeliski.org/Book/.
Suitable for an upper-level undergraduate or graduate-level course in computer science or engineering, this textbook focuses on basic techniques that work under real-world conditions and encourages students to push their creative boundaries. Its design and exposition also make it eminently suitable as a unique reference to the fundamental techniques and current research literature in computer vision.

4,146 citations

Proceedings Article
01 Jan 1989
TL;DR: A scheme is developed for classifying the types of motion perceived by a humanlike robot and equations, theorems, concepts, clues, etc., relating the objects, their positions, and their motion to their images on the focal plane are presented.
Abstract: A scheme is developed for classifying the types of motion perceived by a humanlike robot. It is assumed that the robot receives visual images of the scene using a perspective system model. Equations, theorems, concepts, clues, etc., relating the objects, their positions, and their motion to their images on the focal plane are presented.

2,000 citations

Journal ArticleDOI
TL;DR: A new technique for the display of high-dynamic-range images, which reduces the contrast while preserving detail, is presented, based on a two-scale decomposition of the image into a base layer.
Abstract: We present a new technique for the display of high-dynamic-range images, which reduces the contrast while preserving detail. It is based on a two-scale decomposition of the image into a base layer,...

1,715 citations

Proceedings ArticleDOI
01 Jul 2002
TL;DR: A new technique for the display of high-dynamic-range images, which reduces the contrast while preserving detail, is presented, based on a two-scale decomposition of the image into a base layer, encoding large-scale variations, and a detail layer.
Abstract: We present a new technique for the display of high-dynamic-range images, which reduces the contrast while preserving detail. It is based on a two-scale decomposition of the image into a base layer, encoding large-scale variations, and a detail layer. Only the base layer has its contrast reduced, thereby preserving detail. The base layer is obtained using an edge-preserving filter called the bilateral filter. This is a non-linear filter, where the weight of each pixel is computed using a Gaussian in the spatial domain multiplied by an influence function in the intensity domain that decreases the weight of pixels with large intensity differences. We express bilateral filtering in the framework of robust statistics and show how it relates to anisotropic diffusion. We then accelerate bilateral filtering by using a piecewise-linear approximation in the intensity domain and appropriate subsampling. This results in a speed-up of two orders of magnitude. The method is fast and requires no parameter setting.

1,612 citations
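
The two-scale idea in the abstract above can be sketched in one dimension. This is a minimal illustration, not the paper's accelerated implementation: it assumes a Gaussian for the intensity-domain influence function, works on log10 intensities, and uses hypothetical parameter values (`sigma_space`, `sigma_range`, `radius`, `compression`).

```python
import math


def bilateral_filter_1d(signal, sigma_space=2.0, sigma_range=0.4, radius=4):
    """Edge-preserving smoothing: each weight is a spatial Gaussian times an
    influence function (here also a Gaussian) of the intensity difference."""
    output = []
    for i, center in enumerate(signal):
        total = 0.0
        norm = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w_space = math.exp(-((i - j) ** 2) / (2 * sigma_space ** 2))
            w_range = math.exp(-((signal[j] - center) ** 2) / (2 * sigma_range ** 2))
            total += w_space * w_range * signal[j]
            norm += w_space * w_range
        output.append(total / norm)
    return output


def tone_map(intensities, compression=0.5):
    """Reduce contrast of the base layer only; the detail layer is kept as is."""
    log_i = [math.log10(v) for v in intensities]
    base = bilateral_filter_1d(log_i)                 # large-scale variations
    detail = [l - b for l, b in zip(log_i, base)]     # what the base layer misses
    return [10 ** (compression * b + d) for b, d in zip(base, detail)]
```

Because the range weight suppresses pixels across a strong edge, the base layer follows the edge instead of blurring it, so compressing the base layer lowers overall contrast without introducing halos around the edge.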

Proceedings ArticleDOI
01 Jul 2002
TL;DR: The results demonstrate that the method is capable of drastic dynamic range compression, while preserving fine details and avoiding common artifacts, such as halos, gradient reversals, or loss of local contrast.
Abstract: We present a new method for rendering high dynamic range images on conventional displays. Our method is conceptually simple, computationally efficient, robust, and easy to use. We manipulate the gradient field of the luminance image by attenuating the magnitudes of large gradients. A new, low dynamic range image is then obtained by solving a Poisson equation on the modified gradient field. Our results demonstrate that the method is capable of drastic dynamic range compression, while preserving fine details and avoiding common artifacts, such as halos, gradient reversals, or loss of local contrast. The method is also able to significantly enhance ordinary images by bringing out detail in dark regions.

1,441 citations
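
The gradient attenuation described in the abstract above can be sketched in one dimension, where solving the Poisson equation reduces to integrating the modified gradients. The power-law scale factor and the parameters `alpha` and `beta` are assumptions chosen here for illustration; the abstract itself does not specify the attenuation function.

```python
import math


def compress_gradients_1d(luminance, alpha=0.1, beta=0.8):
    """Attenuate large log-luminance gradients, then re-integrate (1D sketch)."""
    log_l = [math.log(v) for v in luminance]
    gradients = [b - a for a, b in zip(log_l, log_l[1:])]

    attenuated = []
    for g in gradients:
        mag = abs(g)
        # Illustrative scale factor (mag / alpha) ** (beta - 1): below 1 for
        # gradients larger than alpha, slightly above 1 for smaller ones.
        scale = (mag / alpha) ** (beta - 1.0) if mag > 0 else 1.0
        attenuated.append(g * scale)

    # In 1D, recovering the signal from its gradient field is a running sum.
    out = [log_l[0]]
    for g in attenuated:
        out.append(out[-1] + g)
    return [math.exp(v) for v in out]
```

The scale factor is always positive, so the sign of each gradient is preserved and the sketch cannot introduce gradient reversals; a two-dimensional version would need an actual Poisson solver, since an arbitrary modified gradient field is generally not integrable.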