Journal ArticleDOI

High performance imaging using large camera arrays

ACM Transactions on Graphics, 01 Jul 2005, Vol. 24, Iss. 3, pp. 765-776
TL;DR: A unique array of 100 custom video cameras is described, and experiences using this array in a range of imaging applications are summarized.
Abstract: The advent of inexpensive digital image sensors and the ability to create photographs that combine information from a number of sensed images are changing the way we think about photography. In this paper, we describe a unique array of 100 custom video cameras that we have built, and we summarize our experiences using this array in a range of imaging applications. Our goal was to explore the capabilities of a system that would be inexpensive to produce in the future. With this in mind, we used simple cameras, lenses, and mountings, and we assumed that processing large numbers of images would eventually be easy and cheap. The applications we have explored include approximating a conventional single center of projection video camera with high performance along one or more axes, such as resolution, dynamic range, frame rate, and/or large aperture, and using multiple cameras to approximate a video camera with a large synthetic aperture. This permits us to capture a video light field, to which we can apply spatiotemporal view interpolation algorithms in order to digitally simulate time dilation and camera motion. It also permits us to create video sequences using custom non-uniform synthetic apertures.
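The abstract's synthetic-aperture idea reduces, in its simplest planar form, to shifting each camera's image in proportion to its position on the array and averaging, so that objects on a chosen focal plane align and stay sharp while everything else blurs. A minimal NumPy sketch; the array geometry, function name, and shift scaling are illustrative assumptions, not the paper's actual pipeline:

```python
import numpy as np
from scipy.ndimage import shift as nd_shift

def synthetic_aperture(images, offsets, focal_disparity):
    """Shift-and-add refocusing for a planar camera array.

    images: list of HxW grayscale frames, one per camera.
    offsets: (dx, dy) position of each camera on the array plane.
    focal_disparity: pixels of image shift per unit of camera offset;
    varying it moves the synthetic focal plane through the scene.
    """
    acc = np.zeros(images[0].shape, dtype=np.float64)
    for img, (dx, dy) in zip(images, offsets):
        # Pixels on the chosen focal plane align after this shift and
        # average to a sharp image; off-plane points spread out, which
        # mimics the shallow depth of field of one large aperture.
        acc += nd_shift(img.astype(np.float64),
                        (dy * focal_disparity, dx * focal_disparity))
    return acc / len(images)
```

Sweeping focal_disparity over a range of values yields a focal stack; weighting or masking cameras before the sum corresponds to the custom non-uniform synthetic apertures mentioned at the end of the abstract.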
Citations
01 Jan 2005
TL;DR: The plenoptic camera described in this paper inserts a microlens array between the sensor and the main lens; each microlens measures not just the total amount of light deposited at that location, but how much light arrives along each ray.
Abstract: This paper presents a camera that samples the 4D light field on its sensor in a single photographic exposure. This is achieved by inserting a microlens array between the sensor and main lens, creating a plenoptic camera. Each microlens measures not just the total amount of light deposited at that location, but how much light arrives along each ray. By re-sorting the measured rays of light to where they would have terminated in slightly different, synthetic cameras, we can compute sharp photographs focused at different depths. We show that a linear increase in the resolution of images under each microlens results in a linear increase in the sharpness of the refocused photographs. This property allows us to extend the depth of field of the camera without reducing the aperture, enabling shorter exposures and lower image noise. Especially in the macrophotography regime, we demonstrate that we can also compute synthetic photographs from a range of different viewpoints. These capabilities argue for a different strategy in designing photographic imaging systems. To the photographer, the plenoptic camera operates exactly like an ordinary hand-held camera. We have used our prototype to take hundreds of light field photographs, and we present examples of portraits, high-speed action and macro close-ups.
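The refocusing the abstract describes (re-sorting measured rays into synthetic cameras) is commonly implemented as a shift-and-add over sub-aperture images. A hedged sketch, assuming the light field is stored as a 4D array of sub-aperture images and using the conventional alpha depth parameter; both conventions are assumptions, not this paper's code:

```python
import numpy as np
from scipy.ndimage import shift as nd_shift

def refocus(lightfield, alpha):
    """Synthetic refocusing of a 4D light field by shift-and-add.

    lightfield: array of shape (U, V, S, T); each (u, v) entry is one
    sub-aperture image, i.e. the scene seen through one part of the lens.
    alpha: ratio of the new focal depth to the captured one; alpha = 1
    reproduces the original plane of focus.
    """
    U, V, S, T = lightfield.shape
    cu, cv = (U - 1) / 2.0, (V - 1) / 2.0
    out = np.zeros((S, T), dtype=np.float64)
    shift_amount = 1.0 - 1.0 / alpha
    for u in range(U):
        for v in range(V):
            # Each sub-aperture image is shifted in proportion to its
            # distance from the aperture center, then all are averaged.
            out += nd_shift(lightfield[u, v].astype(np.float64),
                            ((u - cu) * shift_amount,
                             (v - cv) * shift_amount))
    return out / (U * V)
```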

2,252 citations


Additional excerpts

  • ...A different approach to capturing light fields in a single exposure is an array of cameras [Wilburn et al. 2005]....

    [...]

Journal ArticleDOI
Marc Levoy, Ren Ng, Andrew Adams, Matthew J. Footer, Mark Horowitz
01 Jul 2006
TL;DR: The light field microscope (LFM) described by the authors captures light fields of specimens with a microlens array and applies 3D deconvolution to synthetic focal stacks to produce cross-sections, which can then be visualized using volume rendering.
Abstract: By inserting a microlens array into the optical train of a conventional microscope, one can capture light fields of biological specimens in a single photograph. Although diffraction places a limit on the product of spatial and angular resolution in these light fields, we can nevertheless produce useful perspective views and focal stacks from them. Since microscopes are inherently orthographic devices, perspective views represent a new way to look at microscopic specimens. The ability to create focal stacks from a single photograph allows moving or light-sensitive specimens to be recorded. Applying 3D deconvolution to these focal stacks, we can produce a set of cross sections, which can be visualized using volume rendering. In this paper, we demonstrate a prototype light field microscope (LFM), analyze its optical performance, and show perspective views, focal stacks, and reconstructed volumes for a variety of biological specimens. We also show that synthetic focusing followed by 3D deconvolution is equivalent to applying limited-angle tomography directly to the 4D light field.
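The abstract's pipeline, synthetic focusing followed by 3D deconvolution, can be sketched with any standard deconvolution algorithm; Richardson-Lucy is one common choice. The choice of algorithm and the assumption of a known, shift-invariant PSF are ours, not necessarily the paper's:

```python
import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy_3d(focal_stack, psf, iterations=10):
    """Toy 3D deconvolution of a synthetically focused stack.

    focal_stack: (Z, Y, X) volume of refocused slices.
    psf: (z, y, x) point spread function, assumed known and shift-
    invariant (a simplification; real microscope PSFs vary with depth).
    """
    estimate = np.full(focal_stack.shape, float(focal_stack.mean()))
    psf_flipped = psf[::-1, ::-1, ::-1]
    for _ in range(iterations):
        # Classic multiplicative update: compare the blurred estimate
        # with the data, then back-project the correction ratio.
        blurred = fftconvolve(estimate, psf, mode='same')
        ratio = focal_stack / np.maximum(blurred, 1e-12)
        estimate *= fftconvolve(ratio, psf_flipped, mode='same')
    return estimate
```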

818 citations

Journal ArticleDOI
31 Jan 2011
TL;DR: An overview of the algorithmic design used to extend H.264/MPEG-4 AVC towards MVC is provided, along with a summary of the coding performance achieved by MVC for both stereo and multiview video.
Abstract: Significant improvements in video compression capability have been demonstrated with the introduction of the H.264/MPEG-4 advanced video coding (AVC) standard. Since developing this standard, the Joint Video Team of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) has also standardized an extension of that technology that is referred to as multiview video coding (MVC). MVC provides a compact representation for multiple views of a video scene, such as multiple synchronized video cameras. Stereo-paired video for 3-D viewing is an important special case of MVC. The standard enables inter-view prediction to improve compression capability, as well as supporting ordinary temporal and spatial prediction. It also supports backward compatibility with existing legacy systems by structuring the MVC bitstream to include a compatible “base view.” Each other view is encoded at the same picture resolution as the base view. In recognition of its high-quality encoding capability and support for backward compatibility, the stereo high profile of the MVC extension was selected by the Blu-Ray Disc Association as the coding format for 3-D video with high-definition resolution. This paper provides an overview of the algorithmic design used for extending H.264/MPEG-4 AVC towards MVC. The basic approach of MVC for enabling inter-view prediction and view scalability in the context of H.264/MPEG-4 AVC is reviewed. Related supplemental enhancement information (SEI) metadata is also described. Various “frame compatible” approaches for support of stereo-view video as an alternative to MVC are also discussed. A summary of the coding performance achieved by MVC for both stereo- and multiview video is also provided. Future directions and challenges related to 3-D video are also briefly discussed.
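To make inter-view prediction concrete: a block in a dependent view is predicted from the reconstructed base view, so only disparity vectors and a residual need coding. A toy sketch with exhaustive horizontal block matching; real MVC reuses the H.264/AVC motion-compensation machinery, and the block size, search range, and function names here are illustrative assumptions:

```python
import numpy as np

def interview_predict(base, target, block=16, search=24):
    """Disparity-compensated prediction of one view from another.

    base, target: (H, W) grayscale frames from two rectified views.
    Returns the block-wise prediction of `target` from `base` and the
    residual a coder would actually have to transmit.
    """
    H, W = target.shape
    pred = np.zeros_like(target)
    for y in range(0, H - block + 1, block):
        for x in range(0, W - block + 1, block):
            tgt = target[y:y + block, x:x + block].astype(np.int32)
            best_ref, best_sad = None, None
            # Rectified views: search along the horizontal epipolar line.
            for d in range(-search, search + 1):
                xs = x + d
                if xs < 0 or xs + block > W:
                    continue
                ref = base[y:y + block, xs:xs + block].astype(np.int32)
                sad = int(np.abs(tgt - ref).sum())
                if best_sad is None or sad < best_sad:
                    best_sad, best_ref = sad, ref
            pred[y:y + block, x:x + block] = best_ref
    residual = target.astype(np.int32) - pred.astype(np.int32)
    return pred, residual
```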

683 citations

Proceedings ArticleDOI
29 Jul 2007
TL;DR: A novel design is presented that reconstructs the 4D light field from a 2D camera image without the additional refractive elements required by previous light field cameras.
Abstract: We describe a theoretical framework for reversibly modulating 4D light fields using an attenuating mask in the optical path of a lens based camera. Based on this framework, we present a novel design to reconstruct the 4D light field from a 2D camera image without any additional refractive elements as required by previous light field cameras. The patterned mask attenuates light rays inside the camera instead of bending them, and the attenuation recoverably encodes the rays on the 2D sensor. Our mask-equipped camera focuses just as a traditional camera to capture conventional 2D photos at full sensor resolution, but the raw pixel values also hold a modulated 4D light field. The light field can be recovered by rearranging the tiles of the 2D Fourier transform of sensor values into 4D planes, and computing the inverse Fourier transform. In addition, one can also recover the full resolution image information for the in-focus parts of the scene. We also show how a broadband mask placed at the lens enables us to compute refocused images at full sensor resolution for layered Lambertian scenes. This partial encoding of 4D ray-space data enables editing of image contents by depth, yet does not require computational recovery of the complete 4D light field.
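The recovery recipe stated in the abstract (rearrange tiles of the sensor image's 2D Fourier transform into 4D planes, then compute the inverse transform) can be sketched directly. The ideal cosine mask, exact tiling, and 4D array layout assumed below are simplifications, not the paper's implementation:

```python
import numpy as np

def lightfield_from_mask_image(sensor, n_ang):
    """Demultiplex a mask-modulated sensor image into a 4D light field.

    sensor: (H, W) raw image with H and W divisible by n_ang.
    n_ang: angular resolution per axis; assumes the mask's spectral
    replicas tile the Fourier plane exactly.
    Returns an (n_ang, n_ang, H//n_ang, W//n_ang) light field estimate.
    """
    H, W = sensor.shape
    sy, sx = H // n_ang, W // n_ang
    spectrum = np.fft.fftshift(np.fft.fft2(sensor))
    # Cut the centered spectrum into a grid of contiguous tiles; tile
    # (i, j) holds one angular slice of the 4D light field spectrum.
    tiles = spectrum.reshape(n_ang, sy, n_ang, sx).transpose(0, 2, 1, 3)
    # Undo the centering on all four axes and invert the 4D transform.
    return np.real(np.fft.ifftn(np.fft.ifftshift(tiles)))
```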

660 citations

Journal ArticleDOI
Marc Levoy
TL;DR: A survey of the theory and practice of light field imaging emphasizes the devices researchers in computer graphics and computer vision have built to capture light fields photographically and the techniques they have developed to compute novel images from them.
Abstract: A survey of the theory and practice of light field imaging emphasizes the devices researchers in computer graphics and computer vision have built to capture light fields photographically, and the techniques they have developed to compute novel images from them.

615 citations

References
Proceedings ArticleDOI
01 Aug 1996
TL;DR: This paper describes a sampled representation for light fields that allows for both efficient creation and display of inward and outward looking views, and describes a compression system that is able to compress the light fields generated by more than a factor of 100:1 with very little loss of fidelity.
Abstract: A number of techniques have been proposed for flying through scenes by redisplaying previously rendered or digitized views. Techniques have also been proposed for interpolating between views by warping input images, using depth information or correspondences between multiple images. In this paper, we describe a simple and robust method for generating new views from arbitrary camera positions without depth information or feature matching, simply by combining and resampling the available images. The key to this technique lies in interpreting the input images as 2D slices of a 4D function: the light field. This function completely characterizes the flow of light through unobstructed space in a static scene with fixed illumination. We describe a sampled representation for light fields that allows for both efficient creation and display of inward and outward looking views. We have created light fields from large arrays of both rendered and digitized images. The latter are acquired using a video camera mounted on a computer-controlled gantry. Once a light field has been created, new views may be constructed in real time by extracting slices in appropriate directions. Since the success of the method depends on having a high sample rate, we describe a compression system that is able to compress the light fields we have generated by more than a factor of 100:1 with very little loss of fidelity. We also address the issues of antialiasing during creation, and resampling during slice extraction. CR Categories: I.3.2 [Computer Graphics]: Picture/Image Generation — Digitizing and scanning, Viewing algorithms; I.4.2 [Computer Graphics]: Compression — Approximate methods. Additional keywords: image-based rendering, light field, holographic stereogram, vector quantization, epipolar analysis.
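The core rendering operation, extracting a 2D slice of the 4D light field for a new camera position, can be sketched in a few lines. This toy version interpolates only across the (u, v) camera plane, a deliberate simplification of the paper's full 4D resampling:

```python
import numpy as np

def render_view(lightfield, u, v):
    """Extract a new view from a two-plane light field L(u, v, s, t).

    lightfield: (U, V, S, T) array of radiance samples.
    (u, v): continuous camera position on the uv plane, in sample units.
    Blends the four nearest captured views bilinearly.
    """
    U, V = lightfield.shape[:2]
    u0, v0 = int(np.floor(u)), int(np.floor(v))
    u1, v1 = min(u0 + 1, U - 1), min(v0 + 1, V - 1)
    fu, fv = u - u0, v - v0
    return ((1 - fu) * (1 - fv) * lightfield[u0, v0]
            + (1 - fu) * fv * lightfield[u0, v1]
            + fu * (1 - fv) * lightfield[u1, v0]
            + fu * fv * lightfield[u1, v1])
```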

4,426 citations


"High performance imaging using larg..." refers background in this paper

  • ...The earliest systems for capturing scenes from multiple perspectives used a single translating camera [Levoy and Hanrahan 1996] and were limited to static scenes....

    [...]

  • ...or dynamic scene [Levoy and Hanrahan 1996; Gortler et al. 1996; Rander et al. 1997; Matusik et al. 2000]....

    [...]

  • ...Shifting the aligned images varies the focal depth for the system [Levoy and Hanrahan 1996; Isaksen et al. 2000; Vaish et al. 2004]....

    [...]

Proceedings ArticleDOI
01 Aug 1996
TL;DR: A new method for capturing the complete appearance of both synthetic and real world objects and scenes, representing this information, and then using this representation to render images of the object from new camera positions.
Abstract: This paper discusses a new method for capturing the complete appearance of both synthetic and real world objects and scenes, representing this information, and then using this representation to render images of the object from new camera positions. Unlike the shape capture process traditionally used in computer vision and the rendering process traditionally used in computer graphics, our approach does not rely on geometric representations. Instead we sample and reconstruct a 4D function, which we call a Lumigraph. The Lumigraph is a subset of the complete plenoptic function that describes the flow of light at all positions in all directions. With the Lumigraph, new images of the object can be generated very quickly, independent of the geometric or illumination complexity of the scene or object. The paper discusses a complete working system including the capture of samples, the construction of the Lumigraph, and the subsequent rendering of images from this new representation.
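The two-plane parameterization underlying the Lumigraph (and the light field above) maps each ray to its intersections with two parallel planes. A small sketch, with the plane depths as illustrative assumptions:

```python
import numpy as np

def ray_to_uvst(origin, direction, z_uv=0.0, z_st=1.0):
    """Map a ray to two-plane (u, v, s, t) coordinates.

    origin, direction: 3-vectors describing the ray; z_uv and z_st are
    the depths of the two parameterization planes (illustrative values).
    """
    o = np.asarray(origin, dtype=np.float64)
    d = np.asarray(direction, dtype=np.float64)
    if abs(d[2]) < 1e-12:
        raise ValueError("ray is parallel to the parameterization planes")
    # Intersect the ray with each plane; keep the in-plane coordinates.
    u, v = (o + ((z_uv - o[2]) / d[2]) * d)[:2]
    s, t = (o + ((z_st - o[2]) / d[2]) * d)[:2]
    return u, v, s, t
```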

2,986 citations

Proceedings ArticleDOI
03 Aug 1997
TL;DR: This work discusses how this work is applicable in many areas of computer graphics involving digitized photographs, including image-based modeling, image compositing, and image processing, and demonstrates a few applications of having high dynamic range radiance maps.
Abstract: We present a method of recovering high dynamic range radiance maps from photographs taken with conventional imaging equipment. In our method, multiple photographs of the scene are taken with different amounts of exposure. Our algorithm uses these differently exposed photographs to recover the response function of the imaging process, up to factor of scale, using the assumption of reciprocity. With the known response function, the algorithm can fuse the multiple photographs into a single, high dynamic range radiance map whose pixel values are proportional to the true radiance values in the scene. We demonstrate our method on images acquired with both photochemical and digital imaging processes. We discuss how this work is applicable in many areas of computer graphics involving digitized photographs, including image-based modeling, image compositing, and image processing. Lastly, we demonstrate a few applications of having high dynamic range radiance maps, such as synthesizing realistic motion blur and simulating the response of the human visual system.
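The fusion step of the method can be sketched once a response function is in hand. For brevity this toy assumes a linear response; recovering the response function g from the photographs themselves is the paper's central contribution and is not reproduced here:

```python
import numpy as np

def merge_hdr(exposures, times):
    """Fuse differently exposed photos into a radiance map.

    exposures: list of (H, W) images with values in [0, 255].
    times: matching exposure durations in seconds.
    Assumes a linear response, so ln E = ln Z - ln dt up to scale.
    """
    num = np.zeros(exposures[0].shape, dtype=np.float64)
    den = np.zeros_like(num)
    for img, dt in zip(exposures, times):
        z = img.astype(np.float64)
        # Hat weighting trusts mid-range pixels over nearly clipped ones.
        w = np.minimum(z, 255.0 - z) + 1e-6
        num += w * (np.log(np.maximum(z, 1.0)) - np.log(dt))
        den += w
    return np.exp(num / den)
```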

2,967 citations

Patent
05 Mar 1975
TL;DR: In this patent, a mosaic of selectively transmissive filters is superposed in registration with a solid-state imaging array having a broad range of light sensitivity, with the distribution of filter types in the mosaic following the luminance-dominant patterns described in the abstract.
Abstract: A sensing array for color imaging includes individual luminance- and chrominance-sensitive elements that are so intermixed that each type of element (i.e., according to sensitivity characteristics) occurs in a repeated pattern with luminance elements dominating the array. Preferably, luminance elements occur at every other element position to provide a relatively high frequency sampling pattern which is uniform in two perpendicular directions (e.g., horizontal and vertical). The chrominance patterns are interlaid therewith and fill the remaining element positions to provide relatively lower frequencies of sampling. In a presently preferred implementation, a mosaic of selectively transmissive filters is superposed in registration with a solid state imaging array having a broad range of light sensitivity, the distribution of filter types in the mosaic being in accordance with the above-described patterns.
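A toy rendition of the sampling pattern the abstract describes, green (luminance) at every other site with red and blue interlaid, plus a naive bilinear reconstruction. The RGGB phase and the normalized-convolution demosaic are illustrative choices, not the patent's text:

```python
import numpy as np
from scipy.ndimage import convolve

def bayer_mosaic(rgb):
    """Sample an RGB image through a Bayer-style mosaic (RGGB phase):
    green at every other site, red and blue interlaid in the rest."""
    H, W, _ = rgb.shape
    mosaic = np.zeros((H, W), dtype=np.float64)
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue
    return mosaic

def demosaic_bilinear(mosaic):
    """Reconstruct RGB from the mosaic by normalized convolution,
    i.e. averaging each channel's known neighbors at every site."""
    H, W = mosaic.shape
    masks = np.zeros((H, W, 3), dtype=np.float64)
    masks[0::2, 0::2, 0] = 1.0
    masks[0::2, 1::2, 1] = 1.0
    masks[1::2, 0::2, 1] = 1.0
    masks[1::2, 1::2, 2] = 1.0
    kernel = np.array([[1.0, 2.0, 1.0],
                       [2.0, 4.0, 2.0],
                       [1.0, 2.0, 1.0]]) / 4.0
    out = np.zeros((H, W, 3), dtype=np.float64)
    for c in range(3):
        vals = convolve(mosaic * masks[..., c], kernel, mode='mirror')
        wgts = convolve(masks[..., c], kernel, mode='mirror')
        out[..., c] = vals / np.maximum(wgts, 1e-12)
    return out
```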

2,153 citations

Journal ArticleDOI
01 Aug 2004
TL;DR: This paper shows how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms, and develops a novel temporal two-layer compressed representation that handles matting.
Abstract: The ability to interactively control viewpoint while watching a video is an exciting application of image-based rendering. The goal of our work is to render dynamic scenes with interactive viewpoint control using a relatively small number of video cameras. In this paper, we show how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms. Once these video streams have been processed, we can synthesize any intermediate view between cameras at any time, with the potential for space-time manipulation.In our approach, we first use a novel color segmentation-based stereo algorithm to generate high-quality photoconsistent correspondences across all camera views. Mattes for areas near depth discontinuities are then automatically extracted to reduce artifacts during view synthesis. Finally, a novel temporal two-layer compressed representation that handles matting is developed for rendering at interactive rates.
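The final rendering step of such systems, synthesizing an intermediate view by warping both neighbors and cross-fading, can be sketched as follows. The paper's actual contributions (segmentation-based stereo, matting, the two-layer representation) are omitted, and the forward-splatting scheme here is an assumption:

```python
import numpy as np

def interpolate_view(left, right, disp_left, disp_right, t):
    """Synthesize a view at fraction t of the baseline between two views.

    left, right: (H, W) images; disp_left, disp_right: per-pixel
    horizontal disparities for each view (x_right = x_left - d).
    Forward-splats both views into the virtual camera and cross-fades.
    """
    H, W = left.shape
    acc = np.zeros((H, W), dtype=np.float64)
    wgt = np.zeros((H, W), dtype=np.float64)
    ys, xs = np.mgrid[0:H, 0:W]
    for img, disp, weight, frac in ((left, disp_left, 1.0 - t, -t),
                                    (right, disp_right, t, 1.0 - t)):
        # Move each pixel to its column in the virtual view and splat it.
        xt = np.clip(np.round(xs + frac * disp).astype(int), 0, W - 1)
        np.add.at(acc, (ys, xt), weight * img)
        np.add.at(wgt, (ys, xt), weight)
    return acc / np.maximum(wgt, 1e-12)
```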

1,677 citations


"High performance imaging using larg..." refers methods in this paper

  • ...For example, segmentation-based stereo methods have recently been proven very useful for spatial view interpolation [Zitnick et al. 2004] and analysis of structure and motion in dynamic scenes [Tao et al....

    [...]