
Showing papers in "electronic imaging in 2004"


Proceedings ArticleDOI
TL;DR: Details of a system that allows for an evolutionary introduction of depth perception into the existing 2D digital TV framework are presented, along with a comparison against the classical approach of "stereoscopic" video.
Abstract: This paper presents details of a system that allows for an evolutionary introduction of depth perception into the existing 2D digital TV framework. The work is part of the European Information Society Technologies (IST) project “Advanced Three-Dimensional Television System Technologies” (ATTEST), an activity in which industries, research centers and universities have joined forces to design a backwards-compatible, flexible and modular broadcast 3D-TV system. At the very heart of the described new concept is the generation and distribution of a novel data representation format, which consists of monoscopic color video and associated per-pixel depth information. From these data, one or more “virtual” views of a real-world scene can be synthesized in real-time at the receiver side (i.e., a 3D-TV set-top box) by means of so-called depth-image-based rendering (DIBR) techniques. This publication provides: (1) a detailed description of the fundamentals of this new approach to 3D-TV; (2) a comparison with the classical approach of “stereoscopic” video; (3) a short introduction to DIBR techniques in general; (4) the development of a specific DIBR algorithm that can be used for the efficient generation of high-quality “virtual” stereoscopic views; (5) a number of implementation details that are specific to the current state of the development; (6) research on the backwards-compatible compression and transmission of 3D imagery using state-of-the-art MPEG (Moving Pictures Expert Group) tools.
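The per-pixel warping at the heart of DIBR can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the ATTEST algorithm itself): each pixel of a scanline is shifted by a disparity derived from its depth under a pinhole-camera model, nearer pixels win occlusion conflicts via a depth buffer, and holes are filled naively from the background. All parameter values are illustrative assumptions.

```python
def synthesize_view(row, depth, focal_px, baseline_m):
    """Warp one scanline into a 'virtual' view: disparity = f*b/Z (pinhole model)."""
    out = [None] * len(row)
    zbuf = [float("inf")] * len(row)
    for x, (color, z) in enumerate(zip(row, depth)):
        d = int(round(focal_px * baseline_m / z))   # disparity in pixels
        xv = x + d
        if 0 <= xv < len(out) and z < zbuf[xv]:     # nearer pixels occlude farther ones
            out[xv], zbuf[xv] = color, z
    last = row[0]                                   # naive hole filling:
    for i, c in enumerate(out):                     # propagate the last filled color
        if c is None:
            out[i] = last
        else:
            last = c
    return out

row   = [10, 20, 30, 40, 50, 60]                    # gray values of one scanline
depth = [5.0, 5.0, 1.0, 1.0, 5.0, 5.0]              # depth in meters; middle object is near
print(synthesize_view(row, depth, focal_px=2.0, baseline_m=0.5))  # → [10, 20, 20, 30, 40, 60]
```

Real DIBR systems use far more careful hole filling (e.g. depth-map pre-smoothing), which this sketch omits.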

1,560 citations


Proceedings ArticleDOI
TL;DR: Mean interpupillary distance (IPD) is an important and oft-quoted measure in stereoscopic work; however, there is startlingly little agreement on what it should be.
Abstract: Mean interpupillary distance (IPD) is an important and oft-quoted measure in stereoscopic work. However, there is startlingly little agreement on what it should be. Mean IPD has been quoted in the stereoscopic literature as being anything from 58 mm to 70 mm. It is known to vary with respect to age, gender and race. Furthermore, the stereoscopic industry requires information on not just mean IPD, but also its variance and its extrema, because our products need to be able to cope with all possible users, including those with the smallest and largest IPDs. This paper brings together those statistics on IPD which are available. The key results are that mean adult IPD is around 63 mm, the vast majority of adults have IPDs in the range 50-75 mm, the wider range of 45-80 mm is likely to include (almost) all adults, and the minimum IPD for children (down to five years old) is around 40 mm.
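The quoted ranges can be sanity-checked under a normal model of adult IPD. The mean (63 mm) comes from the abstract; the standard deviation used below (3.8 mm) is purely an illustrative assumption, not a figure from the paper.

```python
import math

def fraction_within(lo, hi, mean, sd):
    """P(lo <= X <= hi) for X ~ Normal(mean, sd), via the error function."""
    cdf = lambda x: 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))
    return cdf(hi) - cdf(lo)

mean_ipd, sd_ipd = 63.0, 3.8   # mean from the paper; sd is an assumption
print(f"50-75 mm covers {fraction_within(50, 75, mean_ipd, sd_ipd):.4%} of adults")
print(f"45-80 mm covers {fraction_within(45, 80, mean_ipd, sd_ipd):.6%} of adults")
```

Under this assumed spread, 50-75 mm indeed covers the "vast majority" and 45-80 mm covers almost everyone, consistent with the abstract's claims.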

454 citations


Proceedings ArticleDOI
TL;DR: A novel and fully automatic technique to estimate depth information from a single input image, based on a new image classification technique able to classify digital images as indoor, outdoor with geometric elements or outdoor without geometric elements.
Abstract: This paper presents a novel and fully automatic technique to estimate depth information from a single input image. The proposed method is based on a new image classification technique able to classify digital images (also in Bayer pattern format) as indoor, outdoor with geometric elements, or outdoor without geometric elements. Using the information collected in the classification step, a suitable depth map is estimated. The proposed technique is fully unsupervised and is able to generate a depth map from a single view of the scene, requiring low computational resources.

163 citations


Proceedings ArticleDOI
TL;DR: Analysis of dynamic-range and signal-to-noise-ratio (SNR) for high fidelity, high-dynamic-range (HDR) image sensor architectures is presented and examples of SNR in the extended DR and implementation and power consumption issues for each scheme are presented.
Abstract: Analysis of dynamic-range (DR) and signal-to-noise-ratio (SNR) for high fidelity, high-dynamic-range (HDR) image sensor architectures is presented. Four architectures are considered: (i) time-to-saturation, (ii) multiple-capture, (iii) asynchronous self-reset with multiple capture, and (iv) synchronous self-reset with residue readout. The analysis takes into account circuit nonidealities such as quantization noise and the effects of limited pixel area on building block and reference signal performance and accuracy. Examples that demonstrate the behavior of SNR in the extended DR and implementation and power consumption issues for each scheme are presented.
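The basic quantities being analyzed can be illustrated with a few formulas: sensor dynamic range is DR = 20·log10(i_max/i_min), multiple-capture extends i_max by the ratio of the longest to shortest exposure, and SNR combines shot noise (variance equal to the signal, in electrons) with read noise. The numbers below are illustrative assumptions, not values from the paper's analysis.

```python
import math

def dynamic_range_db(signal_max_e, noise_floor_e):
    return 20.0 * math.log10(signal_max_e / noise_floor_e)

def snr_db(signal_e, read_noise_e):
    """SNR with shot noise (variance = signal) plus read noise, in electrons."""
    noise = math.sqrt(signal_e + read_noise_e ** 2)
    return 20.0 * math.log10(signal_e / noise)

well, read = 30000.0, 20.0                  # assumed full-well capacity and read noise
base_dr = dynamic_range_db(well, read)
# k extra captures at exposures T, T/2, ..., T/2^k extend the largest
# representable photocurrent by a factor 2^k:
extended_dr = base_dr + 20.0 * math.log10(2 ** 4)
print(f"single capture: {base_dr:.1f} dB; with 4 extra captures: {extended_dr:.1f} dB")
print(f"peak SNR: {snr_db(well, read):.1f} dB")
```

This also shows why the paper's analysis matters: extending DR at the high end is "free" in these formulas, but quantization noise and limited pixel area (which the paper models) degrade the real achievable SNR.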

116 citations


Proceedings ArticleDOI
TL;DR: This paper considers the application of the linear observation model to multi-frame super-resolution restoration under conditions of non-affine image registration and spatially varying PSF and illustrates the application using a Bayesian inference framework to solve the ill-posed restoration inverse problem.
Abstract: Multi-frame super-resolution restoration refers to techniques for still-image and video restoration which utilize multiple observed images of an underlying scene to achieve the restoration of super-resolved imagery. An observation model which relates the measured data to the unknowns to be estimated is formulated to account for the registration of the multiple observations to a fixed reference frame as well as for spatial and temporal degradations resulting from characteristics of the optical system, sensor system and scene motion. Linear observation models, in which the observation process is described by a linear transformation, have been widely adopted. In this paper we consider the application of the linear observation model to multi-frame super-resolution restoration under conditions of non-affine image registration and spatially varying PSF. Reviewing earlier results, we show how these conditions relate to the technique of image warping from the computer graphics literature and how these ideas may be applied to multi-frame restoration. We illustrate the application of these methods to multi-frame super-resolution restoration using a Bayesian inference framework to solve the ill-posed restoration inverse problem.
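The linear observation model y = Wx can be made concrete with a tiny 1-D instance: each low-resolution observation is a blurred, downsampled view of the high-resolution signal. The kernel and downsampling factor below are illustrative assumptions; the paper's contribution concerns the harder case where W encodes non-affine warps and a spatially varying PSF.

```python
def observe(x, kernel, factor):
    """Apply a linear observation: blur x with `kernel` (odd length), then
    downsample by `factor`. This is one row-block of the matrix W in y = Wx."""
    r = len(kernel) // 2
    blurred = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - r, 0), len(x) - 1)  # replicate borders
            acc += w * x[j]
        blurred.append(acc)
    return blurred[::factor]

x = [0, 0, 4, 8, 4, 0, 0, 0]                        # high-resolution signal
y = observe(x, kernel=[0.25, 0.5, 0.25], factor=2)  # one low-resolution observation
print(y)                                            # → [0.0, 4.0, 4.0, 0.0]
```

Super-resolution then inverts this process: given several such y (from differently registered observations), estimate x, e.g. with the Bayesian framework the paper uses.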

114 citations


Proceedings ArticleDOI
TL;DR: A novel algorithm for stereoscopic image capture is presented and an implementation for the widely used ray-tracing package POV-Ray is described that allows a defined region of interest in scene depth to have an improved perceived depth representation compared to other regions of the scene.
Abstract: The usable perceived depth range of a stereoscopic 3D display is limited by human factors considerations to a defined range around the screen plane. There is therefore a need in stereoscopic image creation to map depth from the scene to a target display without exceeding these limits. Recent image capture methods provide precise control over this depth mapping but map a single range of scene depth as a whole and are unable to give preferential stereoscopic representation to a particular region of interest in the scene. A new approach to stereoscopic image creation is described that allows a defined region of interest in scene depth to have an improved perceived depth representation compared to other regions of the scene. For example in a game this may be the region of depth around a game character, or in a scientific visualization the region around a particular feature of interest. To realize this approach we present a novel algorithm for stereoscopic image capture and describe an implementation for the widely used ray-tracing package POV-Ray. Results demonstrate how this approach provides content creators with improved control over perceived depth representation in stereoscopic images.
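The idea of giving a region of interest a larger share of the display's limited depth budget can be sketched as a piecewise-linear depth mapping. This is a hedged stand-in for the paper's method: the breakpoints, the `roi_share` parameter, and the linear segments are illustrative assumptions, not the published algorithm.

```python
def make_depth_mapping(scene, roi, display, roi_share):
    """scene/roi/display are (near, far) ranges; roi_share in (0, 1) is the
    fraction of the display depth budget spent on the region of interest."""
    s0, s1 = scene
    r0, r1 = roi
    d0, d1 = display
    budget = d1 - d0
    roi_out = budget * roi_share                  # depth budget given to the ROI
    rest_out = budget - roi_out                   # split over the remaining scene
    pre = (r0 - s0) / ((s1 - s0) - (r1 - r0))     # fraction of non-ROI depth before ROI

    def f(z):
        if z <= r0:   # before the ROI: compressed
            return d0 + rest_out * pre * (z - s0) / (r0 - s0)
        if z <= r1:   # inside the ROI: expanded (steeper slope)
            return d0 + rest_out * pre + roi_out * (z - r0) / (r1 - r0)
        return d0 + rest_out * pre + roi_out + rest_out * (1 - pre) * (z - r1) / (s1 - r1)
    return f

f = make_depth_mapping(scene=(0.0, 10.0), roi=(4.0, 6.0), display=(0.0, 1.0), roi_share=0.6)
print([round(f(z), 2) for z in (0, 4, 5, 6, 10)])   # → [0.0, 0.2, 0.5, 0.8, 1.0]
```

Here the ROI occupies 20% of the scene depth but receives 60% of the perceived-depth budget (slope 0.3 per unit inside the ROI versus 0.05 outside), which is the qualitative effect the paper targets.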

104 citations


Proceedings ArticleDOI
TL;DR: In this paper, an artificial compound eye manufactured by micro-optics technology is demonstrated, which consists of a microlens array on a thin silica substrate with a pinhole array in a metal layer on the backside.
Abstract: An artificial apposition compound eye manufactured by micro-optics technology is demonstrated. The overall thickness of the imaging system is only 320 μm, the field of view is 21° on the diagonal, and the F/# is 2.6. The monolithic device consists of a microlens array on a thin silica substrate with a pinhole array in a metal layer on the backside. The image formation can be explained by the moiré effect or static sampling. The master structures for the microlens arrays are manufactured by lithographic patterning of photoresist and a subsequent reflow process. These master structures are replicated by moulding into UV-curing polymer. The pitch of the pinholes differs from the lens array pitch to give each channel an individual viewing angle. The required precision is guaranteed by using a lithographic process for the assembly as well. Thus, the accuracy problems of earlier attempts to build similar systems from discrete components have been overcome. Imaging systems with different pinhole sizes, numbers of channels and separations of the channels' viewing directions are realized and tested. A method to generate non-transparent walls between the optical channels to prevent crosstalk is proposed. Theoretical limitations of resolution and sensitivity are discussed. Imaged test patterns are presented and measurements of the angular sensitivity function are compared to calculations using commercial ray-tracing software. The resolution achievable with the fabricated artificial compound eye is analyzed.
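The pitch-difference trick can be worked through numerically: for channel k, the pinhole sits k·(p_lens − p_pinhole) off its lens axis, so that channel's viewing direction is atan(offset / f). The pitches and focal distance below are illustrative assumptions, not the fabricated device's parameters.

```python
import math

def viewing_angle_deg(k, pitch_lens_um, pitch_pinhole_um, focal_um):
    """Viewing direction of channel k when the pinhole array pitch differs
    from the lens array pitch (the moiré / static-sampling arrangement)."""
    offset = k * (pitch_lens_um - pitch_pinhole_um)  # pinhole offset from lens axis
    return math.degrees(math.atan2(offset, focal_um))

# e.g. an assumed 10 um pitch difference and 320 um lens-to-pinhole distance
# (the paper quotes 320 um only as the total system thickness):
angles = [round(viewing_angle_deg(k, 90.0, 80.0, 320.0), 2) for k in range(4)]
print(angles)   # → [0.0, 1.79, 3.58, 5.36]
```

Each channel thus samples a slightly different direction, and the ensemble of pinhole signals forms the image.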

104 citations


Proceedings ArticleDOI
TL;DR: The DepthCube 3D Volumetric Display is a solid-state, rear-projection, volumetric display that consists of two main components: a high-speed video projector and a multiplanar optical element composed of an air-spaced stack of liquid crystal scattering shutters.

Abstract: The DepthCube 3D Volumetric Display is a solid-state, rear-projection, volumetric display that consists of two main components: a high-speed video projector, and a multiplanar optical element composed of an air-spaced stack of liquid crystal scattering shutters. The high-speed video projector projects a sequence of slices of the 3D image into the multiplanar optical element, where each slice is halted at the proper depth. Proprietary multiplanar anti-aliasing algorithms smooth the appearance of the resultant stack of image slices to produce a continuous-appearing, truly three-dimensional image. The resultant 3D image is of exceptional quality and provides all the 3D vision cues found in viewing real objects.

95 citations


Proceedings ArticleDOI
TL;DR: A virtual digital camera sensor is presented that simulates a real (physical) image-capturing sensor, modeling the entire process of photon sensing and charge generation in the sensor device.
Abstract: We present a virtual digital camera sensor, whose aim is to simulate a real (physical) image capturing sensor. To accomplish this task, the virtual sensor operates in two steps. First, it accepts a physical description of a given scene and simulates the entire process of photon sensing and charge generation in the sensor device. This process is affected by noise, mostly photon noise. Second, it adds to the image the noise that results from the electronic circuitry. We present a model for the different sources of noise relative to each sensor-based image formation step, and use measurements of real digital camera images to validate the model.
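The two-step noise model described here can be sketched per pixel: photon (shot) noise is Poisson-distributed around the expected photon count, and the electronic circuitry adds further noise, approximated below as Gaussian. The parameter values and the Gaussian read-noise approximation are illustrative assumptions, not the paper's validated model.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's simple Poisson sampler (adequate for moderate lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def sense_pixel(mean_photons, read_noise_sigma, full_well, rng):
    electrons = poisson(mean_photons, rng)          # step 1: photon (shot) noise
    electrons += rng.gauss(0.0, read_noise_sigma)   # step 2: electronic-circuit noise
    return min(max(electrons, 0.0), full_well)      # clip to the well capacity

rng = random.Random(0)
pixels = [sense_pixel(100.0, 5.0, 1000.0, rng) for _ in range(2000)]
print(round(sum(pixels) / len(pixels), 1))          # close to the 100 e- mean
```

Averaging many simulated pixels recovers the expected signal, while the per-pixel variance (signal + read-noise²) is what a real sensor characterization would measure.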

92 citations


Proceedings ArticleDOI
TL;DR: This work compared the Spatial Frequency Response (SFR) of image sensors that use the Bayer color filter pattern and Foveon X3 technology for color image capture and applied the SFR method using a red/blue edge.
Abstract: We compared the Spatial Frequency Response (SFR) of image sensors that use the Bayer color filter pattern and Foveon X3 technology for color image capture. Sensors for both consumer and professional cameras were tested. The results show that the SFR for Foveon X3 sensors is up to 2.4x better. In addition to the standard SFR method, we also applied the SFR method using a red/blue edge. In this case, the X3 SFR was 3-5x higher than that for Bayer filter pattern devices.

89 citations


Proceedings ArticleDOI
TL;DR: The results show that stereoscopic images are comfortable to view for an angular parallax of up to about 60 minutes and that visual comfort is maintained if discontinuous temporal changes in parallax are kept to an angle of 60 minutes or less.
Abstract: The problems associated with watching stereoscopic HDTV have been classified into three groups, one of which is how natural/unnatural stereoscopic pictures look to viewers. It is known that the shooting and viewing conditions affect the depth of a stereoscopic image, and this depth distortion is a major factor influencing the viewer's stereoscopic perception. The second group concerns visual comfort/discomfort. Visual discomfort is caused by the difficulty of fusing left and right images because of excessive binocular parallax and its temporal changes. We have studied how visual comfort is affected by the range of parallax distribution and temporal parallax changes. The results show that stereoscopic images are comfortable to view for an angular parallax of up to about 60 minutes and that visual comfort is maintained if discontinuous temporal changes in parallax are kept to an angle of 60 minutes or less. The third group concerns visual fatigue that a viewer experiences after viewing a stereoscopic HDTV program, which is thought to be mainly caused by the mismatch between the eyes' convergence and accommodation. We confirmed that, after observing stereoscopic images for about an hour, the fusion range diminishes and the viewer's visual functions deteriorate as a result.
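The 60-arcminute comfort limit can be applied in practice by converting a screen disparity and viewing distance into angular parallax. The conversion below and the example viewing distance are illustrative assumptions, not figures from the paper.

```python
import math

def angular_parallax_arcmin(disparity_mm, viewing_distance_mm):
    """Angular parallax subtended by a screen disparity at a given distance."""
    return math.degrees(math.atan2(disparity_mm, viewing_distance_mm)) * 60.0

def comfortable(disparity_mm, viewing_distance_mm, limit_arcmin=60.0):
    """Check the ~60 arcmin comfort limit reported in the abstract."""
    return abs(angular_parallax_arcmin(disparity_mm, viewing_distance_mm)) <= limit_arcmin

d = 3000.0   # an assumed 3 m HDTV viewing distance
print(comfortable(50.0, d), comfortable(60.0, d))   # → True False
```

At 3 m, a 50 mm screen disparity subtends about 57 arcmin (comfortable), while 60 mm subtends about 69 arcmin and exceeds the limit.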

Proceedings ArticleDOI
TL;DR: An advanced wavelet transform (aDWT) method that incorporates principal component analysis (PCA) and morphological processing into a regular DWT fusion algorithm is presented, and was found to perform the best when tested on the four input image types.
Abstract: There are numerous applications for image fusion, some of which include medical imaging, remote sensing, nighttime operations and multi-spectral imaging. In general, the discrete wavelet transform (DWT) and various pyramids (such as Laplacian, ratio, contrast, gradient and morphological pyramids) are the most common and effective methods. For quantitative evaluation of the quality of fused imagery, the root mean square error (RMSE) is the most suitable measure of quality if there is a “ground truth” image available; otherwise, the entropy, spatial frequency or image quality index of the input images and the fused images can be calculated and compared. Here, after analyzing the pyramids’ performance with the four measures mentioned, an advanced wavelet transform (aDWT) method that incorporates principal component analysis (PCA) and morphological processing into a regular DWT fusion algorithm is presented. Specifically, at each scale of the wavelet-transformed images, a principal vector was derived from the two input images and then applied to the two images’ approximation coefficients (i.e., they were fused by using the principal eigenvector). For the detail coefficients (i.e., three quarters of the coefficients), the larger absolute values were chosen and subjected to a neighborhood morphological processing procedure which served to verify the selected pixels by using a “filling” and “cleaning” operation (this operation filled or removed isolated pixels in a 3-by-3 local region). The fusion performance of the aDWT method proposed here was compared with six other common methods and, based on the four quantitative measures, was found to perform the best when tested on the four input image types. Since the different image sources used here varied with respect to intensity, contrast, noise, and intrinsic characteristics, the aDWT is a promising image fusion procedure for inhomogeneous imagery.
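The two fusion rules named in the abstract can be sketched on coefficient lists: approximation coefficients are combined with weights taken from the principal eigenvector of their 2x2 covariance matrix, while detail coefficients take the larger absolute value. This is a hedged sketch; the full aDWT also applies the morphological filling/cleaning verification, which is omitted here.

```python
import math

def pca_weights(a, b):
    """Weights from the principal eigenvector of the 2x2 covariance of a and b."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    caa = sum((x - ma) ** 2 for x in a) / n
    cbb = sum((y - mb) ** 2 for y in b) / n
    cab = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n
    # leading eigenvalue of [[caa, cab], [cab, cbb]]
    lam = 0.5 * (caa + cbb + math.sqrt((caa - cbb) ** 2 + 4.0 * cab ** 2))
    v = (cab, lam - caa) if cab else (1.0, 0.0)   # eigenvector (degenerate case crude)
    s = v[0] + v[1]
    return v[0] / s, v[1] / s                     # normalize to sum to 1

def fuse(approx_a, approx_b, detail_a, detail_b):
    wa, wb = pca_weights(approx_a, approx_b)
    approx = [wa * x + wb * y for x, y in zip(approx_a, approx_b)]
    detail = [x if abs(x) >= abs(y) else y for x, y in zip(detail_a, detail_b)]
    return approx, detail

print(pca_weights([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))  # higher-energy input weighted more
```

Note how the eigenvector rule gives the input with larger variance (more "signal energy") a proportionally larger weight, while the choose-max rule preserves the stronger edges in the detail bands.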

Proceedings ArticleDOI
TL;DR: This work achieved a further parallelization in the x-direction of the complete depth-scan of confocal topography measurements by realizing a chromatic confocal line sensor using a line focus and a spectrometer.
Abstract: The chromatic confocal approach enables the parallelization of the complete depth-scan of confocal topography measurements. Therefore, mechanical movement can be reduced or completely avoided and the measurement times shortened. Chromatic confocal point sensors are already commercially available but they need lateral scanning in x- and y-direction in order to measure surface topographies. We achieved a further parallelization in the x-direction by realizing a chromatic confocal line sensor using a line focus and a spectrometer. In a second setup, we realized an area-measuring chromatic confocal microscope, which is capable of one-shot measurements without any mechanical scanning. The depth resolution of this setup can be improved by measuring at a small number of different heights. Additional information about the color distribution of the object is gained.

Proceedings ArticleDOI
TL;DR: This work proposes a novel digitization method that uses a movie captured by a hand-held camera to realize easy and high quality digitization of documents and photographs.
Abstract: Digitization of documents and photographs from paper is very important for digital archiving and personal data transmission over the Internet. Although many people wish to digitize paper documents easily, heavy and large image scanners are currently required to obtain high-quality digitization. To realize easy and high-quality digitization of documents and photographs, we propose a novel digitization method that uses a movie captured by a hand-held camera. In our method, first, 6-DOF (degree-of-freedom) position and posture parameters of the mobile camera are estimated in each frame by tracking image features automatically. Next, re-appearing feature points in the image sequence are detected and stitched to minimize accumulated estimation errors. Finally, all the images are merged into a high-resolution mosaic image using the optimized parameters. Experiments have successfully demonstrated the feasibility of the proposed method. Our prototype system can acquire initial estimates of extrinsic camera parameters in real time while capturing images.

Proceedings ArticleDOI
TL;DR: Advances described include novel Fourier mode variants of diffraction specific algorithms and parallel binarisation techniques for design of the CGH patterns; computer architectures for effective implementation of these algorithms for interactive CGH calculation; the latest developments in the Active Tiling spatial light modulator technology and novel replay optics arrangements including folded mirror geometries, viewer tracking alternatives and new horizontal parallax configurations.
Abstract: This paper will give an overview of some recent developments in electroholography for applications in interactive 3D visualisation. Arguably the ultimate technology for this task, it is the only approach with the potential to deliver full-depth-cue 3D images at resolutions beyond what can be perceived by the human eye. Despite significant advances by many researchers, the high pixel counts required by the computer generated hologram (CGH) patterns in these systems remain daunting - in practice, systems able to calculate and display reconfigurable CGH having pixel counts of more than one billion may be required for 300 mm width, 3D images. Advances described include novel Fourier-mode variants of diffraction-specific algorithms and parallel binarisation techniques for design of the CGH patterns; computer architectures for effective implementation of these algorithms for interactive CGH calculation; the latest developments in the Active Tiling spatial light modulator technology; and novel replay optics arrangements including folded mirror geometries, viewer tracking alternatives and new horizontal-parallax configurations. Throughout, the emphasis is on optimisation towards implementation as an interactive electroholography system having practical utility. Some recent results from demonstrations of aspects of the technology will be shown. These include monochrome and colour, static and dynamic, horizontal parallax only (HPO) and full parallax 3D images, generated from true CGH systems with up to 24 billion pixels.

Proceedings ArticleDOI
TL;DR: This paper proposes a solution based on creating digital images with specific properties, called copy-detection patterns (CDPs), which are printed on arbitrary documents, packages, etc., and from which a detector can take a decision on the authenticity of the document.
Abstract: Technologies for making high-quality copies of documents are getting more available, cheaper, and more efficient. As a result, the counterfeiting business engenders huge losses, ranging from 5% to 8% of worldwide sales of brand products, and endangers the reputation and value of the brands themselves. Moreover, the growth of the Internet drives the business of counterfeited documents (fake IDs, university diplomas, checks, and so on), which can be bought easily and anonymously from hundreds of companies on the Web. The incredible progress of digital imaging equipment has put in question the very possibility of verifying the authenticity of documents: how can we discern genuine documents from seemingly “perfect” copies? This paper proposes a solution based on creating digital images with specific properties, called copy-detection patterns (CDPs), which are printed on arbitrary documents, packages, etc. CDPs make optimal use of an "information loss principle": every time an image is printed or scanned, some information is lost about the original digital image. That principle applies even for the highest-quality scanning, digital imaging, printing or photocopying equipment today, and will likely remain true for tomorrow. By measuring the amount of information contained in a scanned CDP, the CDP detector can take a decision on the authenticity of the document.
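The information-loss principle can be illustrated with a toy simulation: every print/scan pass degrades the pattern, so a counterfeit (which undergoes at least one extra pass) correlates less with the secret digital original than a genuine print does. The Gaussian degradation model, noise level, and pattern size below are illustrative assumptions, not the paper's detector.

```python
import random

def print_scan(pattern, noise_sigma, rng):
    """Toy model of one print+scan pass: additive Gaussian degradation."""
    return [p + rng.gauss(0.0, noise_sigma) for p in pattern]

def correlation(a, b):
    """Pearson correlation, used as a crude measure of retained information."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

rng = random.Random(1)
original = [rng.random() for _ in range(5000)]     # the secret digital CDP
genuine = print_scan(original, 0.15, rng)          # one print+scan pass
counterfeit = print_scan(genuine, 0.15, rng)       # a copy suffers an extra pass
print(correlation(original, genuine) > correlation(original, counterfeit))  # → True
```

A real detector would threshold such a similarity score; the key point the toy reproduces is that the extra reproduction pass measurably lowers it.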

Proceedings ArticleDOI
TL;DR: It is objectively established that the O3DHD (even in its VQ version) outperforms all other SDs, while the 2D/3D CSS-based descriptor exhibits a highly discriminant behavior, outperforming the remaining 3D and 2D/3D approaches.

Abstract: This paper proposes a comparative study of 3D and 2D/3D shape descriptors (SDs) for indexing and retrieval of 3D mesh models. Seven state-of-the-art SDs are considered and compared, among which five are 3D (Optimized 3D Hough Descriptor - O3DHD, Extended Gaussian Images - EGIs, cord length and spherical angle histograms, random triangle histograms, MPEG-7 3D shape spectrum descriptor - 3DSSD), and two are 2D/3D, based on the MPEG-7 2D SDs (Contour Scale Space - CSS, and Angular Radial Transform - ART). A low-complexity vector-quantized (VQ) O3DHD is also proposed and considered for this comparison. Experiments were carried out on the categorized MPEG-7 3D test database. By computing the Bull's Eye Score (BES) and First Tier (FT) criteria, it is objectively established that the O3DHD outperforms all other SDs (BES = 81%, or 79% in its VQ version). The 2D/3D CSS-based descriptor exhibits a highly discriminant behavior (BES = 74%), outperforming the remaining 3D and 2D/3D approaches. Applied to the industrial framework of the RNRT SEMANTIC-3D project, the O3DHD demonstrated its relevance together with its scalability and robustness properties.

Proceedings ArticleDOI
TL;DR: The paper describes both the general hardware architecture and the software concept of the new high-speed controller solution, which enables advanced applications of DMD technology in metrology, testing and beyond.

Abstract: The paper presents a current development in the field of high-speed spatial light modulators. The Digital Micromirror Device (DMD) developed and produced by Texas Instruments Inc. (TI) stimulated new approaches in photonics. Recently, TI introduced the Discovery general purpose chipset to support new business areas in addition to the mainstream application of DMD technology in digital projection. ViALUX developed the ALP parallel interface controller board as a Discovery 1100 accessory for high-speed micromirror operation. ALP (Accessory Light Modulator Package) has been designed for use in optical metrology but is widely open for numerous applications. It allows for rapid launch into new DMD applications and can be integrated instantly into existing systems or may initiate new developments. The paper describes both the general hardware architecture and the software concept of the new high-speed controller solution. Binary and gray-value patterns of variable bit-depth can be pre-loaded to on-board SDRAM via USB and transferred to the DMD at high speed (up to 6900 XGA frames per second). Three examples illustrate how the approach enables advanced applications of DMD technology in metrology, testing and beyond.

Proceedings ArticleDOI
TL;DR: SiPix's Microcup® roll-to-roll manufacturing processes using ITO/PET films has shown outstanding environmental stability and excellent physico-mechanical properties such as scratch, impact and flexure resistances even in high temperature and humidity conditions.
Abstract: Plastic passive matrix (PM) and active matrix (AM) electronic paper displays (EPDs) have been prepared by SiPix's Microcup® roll-to-roll manufacturing processes using ITO/PET films. The Microcup® displays have shown outstanding environmental stability and excellent physico-mechanical properties such as scratch, impact and flexure resistances even in high temperature and humidity conditions. A PMEPD recently prepared on inexpensive row-and-column patterned ITO/PET films has shown a contrast ratio of >10 at 15 V. More than 8 levels of grayscale with outstanding bistability have been demonstrated by either pulse width or pulse count modulation. No noticeable degradation of the mid-tone images has been observed even after the power was turned off for more than 5 days. Moreover, the electro-optical responses, particularly the threshold voltage and gamma of the PMEPDs, remain essentially the same within a wide range (20-60°C) of operation temperature.

Proceedings ArticleDOI
TL;DR: The experimental results for this imaging system demonstrate the superiority of the optimal un sharp mask compared to a conventional unsharp mask with fixed strength.
Abstract: We consider the problem of restoring a noisy, blurred image using an adaptive unsharp mask filter. Starting with a set of very high quality images, we use models for both the blur and the noise to generate a set of degraded images. With these image pairs, we optimally train the strength parameter of the unsharp mask to smooth flat areas of the image and to sharpen areas with detail. We characterize the blur and the noise for a specific hybrid analog/digital imaging system in which the original image is captured on film with a low-cost analog camera. A silver-halide print is made from this negative, and this is scanned to obtain a digital image. Our experimental results for this imaging system demonstrate the superiority of our optimal unsharp mask compared to a conventional unsharp mask with fixed strength.
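The "smooth flat areas, sharpen detailed areas" behavior can be sketched in 1-D: the unsharp-mask strength is switched by local variance. Note the paper *trains* the strength from degraded/original image pairs; the hand-tuned variance rule and all thresholds below are illustrative stand-ins.

```python
def adaptive_unsharp(signal, k_flat=0.0, k_detail=1.5, var_thresh=4.0):
    """1-D adaptive unsharp mask: out = x + k*(x - local_mean), with the
    strength k chosen per sample from the local variance."""
    out = []
    for i in range(len(signal)):
        lo, hi = max(i - 1, 0), min(i + 2, len(signal))
        window = signal[lo:hi]
        mean = sum(window) / len(window)
        var = sum((v - mean) ** 2 for v in window) / len(window)
        k = k_detail if var > var_thresh else k_flat    # adapt the strength
        out.append(signal[i] + k * (signal[i] - mean))  # classic unsharp step
    return out

flat_plus_edge = [10, 10, 10, 10, 50, 50, 50, 50]
print([round(v, 1) for v in adaptive_unsharp(flat_plus_edge)])
# → [10.0, 10.0, 10.0, -10.0, 70.0, 50.0, 50.0, 50.0]
```

The flat regions pass through untouched (where a fixed-strength mask would amplify noise), while the edge is overshot on both sides, i.e. sharpened.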

Proceedings ArticleDOI
TL;DR: A real-time object segmentation method for MPEG encoded video that exploits the macro-block structure of the encoded video to decrease the spatial resolution of the processed data, which exponentially reduces the computational load.
Abstract: We propose a real-time object segmentation method for MPEG encoded video. Computational superiority is the main advantage of compressed-domain processing. We exploit the macro-block structure of the encoded video to decrease the spatial resolution of the processed data, which exponentially reduces the computational load. Further reduction is achieved by temporal grouping of the intra-coded and estimated frames into a single feature layer. In addition to the computational advantage, compressed-domain video possesses important features attractive for object analysis. Texture characteristics are provided by the DCT coefficients. Motion information is readily available without incurring the cost of estimating a motion field. To achieve segmentation, the DCT coefficients for I-frames and block motion vectors for P-frames are combined and a frequency-temporal data structure is constructed. Starting from the blocks where the ac-coefficient energy and local inter-block dc-coefficient variance are small, the homogeneous volumes are enlarged by evaluating the distance of candidate vectors to the volume characteristics. Affine motion models are fit to the volumes. Finally, a hierarchical clustering stage iteratively merges the most similar parts to generate an object partition tree as an output.

Proceedings ArticleDOI
TL;DR: The results show that the adaptive K-nearest neighbor filter outperforms the non-adaptive one, as well as some other state-of-the-art spatio-temporal filters such as the 3D alpha-trimmed mean and the rational filter by Ramponi, from both a PSNR and visual quality point of view.

Abstract: Non-linear techniques for denoising images and video are known to be superior to linear ones. In addition, video denoising using spatio-temporal information is considered to be more efficient than the use of just temporal information in the presence of fast motion and low noise. Earlier, we introduced a 3-D extension of the K-nearest neighbor filter and investigated its properties. In this paper we propose a new, motion- and detail-adaptive filter, which solves some of the potential drawbacks of the non-adaptive version: motion-caused artifacts and the loss of fine details and texture. We also introduce a novel noise level estimation technique for automatic tuning of the noise-level dependent parameters. The results show that the adaptive K-nearest neighbor filter outperforms the non-adaptive one, as well as some other state-of-the-art spatio-temporal filters such as the 3D alpha-trimmed mean and the rational filter by Ramponi, from both a PSNR and visual quality point of view.

Proceedings ArticleDOI
TL;DR: This algorithm is initially discussed as a rigorous procedure to obtain fields behind a tilted planar surface by using the rotational transformation of wave fields, and results in the silhouette approximation for reduction of computation time.
Abstract: A novel algorithm is presented for hidden-surface removal in digitally synthetic holograms. The algorithm is able to work with full-parallax holograms and remove obstructed fields in the object wave emitted from three-dimensional (3-D) surface objects. This algorithm is initially discussed as a rigorous procedure to obtain fields behind a tilted planar surface by using the rotational transformation of wave fields, and finally results in the silhouette approximation for reduction of computation time. Reconstructions of holograms created by using the algorithm are demonstrated.

Proceedings ArticleDOI
TL;DR: A full-color stereoscopic display that varies the focus of objects at different distances in a displayed scene to match vergence and stereoscopic retinal disparity demands, better approximating natural vision is described.
Abstract: We describe a full-color stereoscopic display that varies the focus of objects at different distances in a displayed scene to match vergence and stereoscopic retinal disparity demands, better approximating natural vision. In natural vision, the oculomotor processes of accommodation (eye focus) and vergence (angle between lines of sight of two eyes) are reflexively linked such that a change in one drives a matching change in the other. Conventional stereoscopic displays require viewers to decouple these processes, and accommodate at a fixed distance while dynamically varying vergence to view objects at different stereoscopic distances. This decoupling generates eye fatigue and compromises image quality when viewing such displays. In contrast, our display overcomes this cue conflict by using a deformable membrane mirror to dynamically vary the focus of luminance-modulated RGB laser beams before they are raster-scanned and projected directly onto the retina. The display has a large focal range (closer than the viewer's near point to infinity) and presents high-resolution (1280x480) full-color images at 60 Hz. A viewer of our display can shift accommodation naturally from foreground to background of a stereo image, thereby bringing objects at different distances into and out of focus. Design considerations and human factors data are discussed.

Proceedings ArticleDOI
TL;DR: In this article, the authors demonstrate the construction and operation of a prototype compound eye sensor which currently consists of up to 20 eyelets, each of which forms an image of approximately 150 pixels in diameter on a single CMOS image sensor.
Abstract: Compound eyes are a highly successful natural solution to the issue of wide field of view and high update rate for vision systems. Applications for an electronic implementation of a compound eye sensor include high-speed object tracking and depth perception. In this paper we demonstrate the construction and operation of a prototype compound eye sensor which currently consists of up to 20 eyelets, each of which forms an image of approximately 150 pixels in diameter on a single CMOS image sensor. Post-fabrication calibration of such a sensor is discussed in detail with reference to experimental measurements of accuracy and repeatability.

Proceedings ArticleDOI
Laurent Najman1, Michel Couprie1
TL;DR: This paper proposes a simple-to-implement quasi-linear algorithm for computing the component tree on symmetric graphs, based on Tarjan's union-find principle; the fastest previously published algorithms have been proved to run in O(n ln(n)) complexity.
Abstract: The level sets of a map are the sets of points with level above a given threshold. The connected components of the level sets, thanks to the inclusion relation, can be organized in a tree structure that is called the component tree. This tree, under several variations, has been used in numerous applications, among which image filtering and segmentation. Various algorithms have been proposed in the literature for computing the component tree; the fastest ones have been proved to run in O(n ln(n)) complexity. In this paper, we propose a simple-to-implement quasi-linear algorithm for computing the component tree on symmetric graphs, based on Tarjan's union-find principle.
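A minimal sketch of the union-find idea (our illustration, not the authors' exact algorithm, and omitting their final canonization pass): pixels are processed in decreasing order of level; each new pixel starts a fresh component, and already-processed neighbors are merged under it, adding a parent link in the tree whenever two distinct nodes meet.

```python
def component_tree(levels, neighbors):
    """Build a (max-)component tree with Tarjan-style union-find.
    `levels` maps pixel index -> value; `neighbors(p)` yields adjacent pixels.
    Returns (tree_parent, node_level); node ids are created in processing
    order. Nodes whose parent has an equal level are redundant and would be
    fused by a canonization pass, omitted here for brevity."""
    n = len(levels)
    uf = list(range(n))                      # union-find forest over pixels

    def find(x):
        root = x
        while uf[root] != root:
            root = uf[root]
        while uf[x] != root:                 # path compression
            uf[x], x = root, uf[x]
        return root

    tree_parent, node_level = {}, {}
    root_node = {}                           # union-find root -> tree node
    processed = set()
    for p in sorted(range(n), key=lambda i: -levels[i]):
        node = len(node_level)               # fresh tree node for this pixel
        node_level[node] = levels[p]
        root_node[p] = node
        processed.add(p)
        for q in neighbors(p):
            if q in processed:
                rp, rq = find(p), find(q)
                if rp != rq:
                    child = root_node[rq]
                    if child != node:        # link the older component below
                        tree_parent[child] = node
                    uf[rq] = rp              # merge the pixel components
                    root_node[rp] = node
    return tree_parent, node_level
```

On a 1-D map [3, 1, 2] this produces a root node at level 1 (the whole domain) with two children: the peak {0} at level 3 and the peak {2} at level 2.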

Proceedings ArticleDOI
TL;DR: A fast and robust hybrid method of super-resolution and demosaicing, based on a maximum a posteriori (MAP) estimation technique by minimizing a multi-term cost function is proposed.
Abstract: In the last two decades, two related categories of problems have been studied independently in the image restoration literature: super-resolution and demosaicing. A closer look at these problems reveals the relation between them, and as conventional color digital cameras suffer from both low-spatial resolution and color filtering, it is reasonable to address them in a unified context. In this paper, we propose a fast and robust hybrid method of super-resolution and demosaicing, based on a maximum a posteriori (MAP) estimation technique by minimizing a multi-term cost function. The L1 norm is used for measuring the difference between the projected estimate of the high-resolution image and each low-resolution image, removing outliers in the data and errors due to possibly inaccurate motion estimation. Bilateral regularization is used for regularizing the luminance component, resulting in sharp edges and forcing interpolation along the edges and not across them. Simultaneously, Tikhonov regularization is used to smooth the chrominance component. Finally, an additional regularization term is used to force similar edge orientation in different color channels. We show that the minimization of the total cost function is relatively easy and fast. Experimental results on synthetic and real data sets confirm the effectiveness of our method.
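The multi-term cost described above can be sketched as follows (a simplified, single-channel illustration under assumed parameter names, not the authors' full luminance/chrominance formulation): an L1 data-fidelity term summed over all low-resolution observations, plus a bilateral-TV-style regularizer.

```python
import numpy as np

def map_cost(X, lowres_frames, downsample, lam=0.01, P=2, alpha=0.6):
    """Single-channel sketch of a multi-term MAP super-resolution cost:
    L1 data fidelity (robust to outliers) plus a bilateral-TV-style prior
    that sums absolute differences between X and its shifts, geometrically
    weighted by shift distance. `downsample` models the camera's
    blur/decimation; motion is assumed already compensated."""
    data = sum(np.abs(downsample(X) - y).sum() for y in lowres_frames)
    prior = 0.0
    for l in range(-P, P + 1):
        for m in range(-P, P + 1):
            if l or m:
                shifted = np.roll(np.roll(X, l, axis=0), m, axis=1)
                prior += alpha ** (abs(l) + abs(m)) * np.abs(X - shifted).sum()
    return data + lam * prior
```

Minimizing this cost (e.g. by steepest descent) yields the sharp-edge-preserving behavior the abstract describes: the L1 norm discounts outliers, while the weighted shift differences penalize variation along every direction.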

Proceedings ArticleDOI
TL;DR: In this paper, the spectral response curves of several different display types and several different brands of anaglyph glasses were measured using a spectroradiometer or spectrophotometer.
Abstract: Anaglyphic 3D images are an easy way of displaying stereoscopic 3D images on a wide range of display types, e.g. CRT, LCD, print, etc. While the anaglyphic 3D image method is cheap and accessible, its use requires a compromise in stereoscopic image quality. A common problem with anaglyphic 3D images is ghosting. Ghosting (or crosstalk) is the leaking of an image to one eye, when it is intended exclusively for the other eye. Ghosting degrades the ability of the observer to fuse the stereoscopic image and hence the quality of the 3D image is reduced. Ghosting is present in various levels with most stereoscopic displays, however it is often particularly evident with anaglyphic 3D images. This paper describes a project whose aim was to characterize the presence of ghosting in anaglyphic 3D images due to spectral issues. The spectral response curves of several different display types and several different brands of anaglyph glasses were measured using a spectroradiometer or spectrophotometer. A mathematical model was then developed to predict the amount of crosstalk in anaglyphic 3D images when different combinations of displays and glasses are used, and therefore predict the best type of anaglyph glasses for use with a particular display type. © (2004) SPIE--The International Society for Optical Engineering.
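A simplified version of such a crosstalk model (our sketch, not the paper's exact formulation; it omits luminous-efficiency weighting and display gamma) integrates each display channel's emission spectrum against each filter's transmission spectrum:

```python
import numpy as np

def anaglyph_crosstalk(emit_left, emit_right, trans_left, trans_right):
    """Estimate ghosting for a display/glasses pairing. Each argument is a
    spectrum sampled at the same uniform wavelength grid: emit_* are the
    display channels driving each eye's image, trans_* the filter
    transmissions. Returns the fraction of unintended light reaching each
    eye, relative to the intended light."""
    leak_into_left = (emit_right * trans_left).sum() / (emit_left * trans_left).sum()
    leak_into_right = (emit_left * trans_right).sum() / (emit_right * trans_right).sum()
    return leak_into_left, leak_into_right
```

Comparing these ratios across measured display and glasses spectra is what lets the model rank glasses for a given display: the pairing with the smallest leakage fractions exhibits the least ghosting.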

Proceedings ArticleDOI
TL;DR: This paper proposes using commodity computer graphics hardware for fast image interpolation using the full search block matching algorithm to illustrate the problems and limitations of using graphics hardware in this way.
Abstract: Motion estimation and compensation is the key to high quality video coding. Block matching motion estimation is used in most video codecs, including MPEG-2, MPEG-4, H.263 and H.26L. Motion estimation is also a key component in the digital restoration of archived video and for post-production and special effects in the movie industry. Sub-pixel accurate motion vectors can improve the quality of the vector field and lead to more efficient video coding. However sub-pixel accuracy requires interpolation of the image data. Image interpolation is a key requirement of many image processing algorithms. Often interpolation can be a bottleneck in these applications, especially in motion estimation due to the large number of pixels involved. In this paper we propose using commodity computer graphics hardware for fast image interpolation. We use the full search block matching algorithm to illustrate the problems and limitations of using graphics hardware in this way.
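As a reference point, the full-search block matching the paper maps onto graphics hardware can be written on the CPU as follows (a minimal integer-pel sketch with hypothetical names; the paper's GPU version additionally interpolates for sub-pixel accuracy):

```python
import numpy as np

def full_search(ref, cur, bx, by, bsize, radius):
    """Exhaustive SAD search: find the displacement (dx, dy) within
    +/- radius whose block in the reference frame best matches the block
    of the current frame at (bx, by). Returns ((dx, dy), best_sad)."""
    block = cur[by:by + bsize, bx:bx + bsize].astype(np.int64)
    best_sad, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = by + dy, bx + dx
            # skip candidates that fall outside the reference frame
            if 0 <= y <= ref.shape[0] - bsize and 0 <= x <= ref.shape[1] - bsize:
                cand = ref[y:y + bsize, x:x + bsize].astype(np.int64)
                sad = np.abs(cand - block).sum()
                if best_sad is None or sad < best_sad:
                    best_sad, best_mv = sad, (dx, dy)
    return best_mv, best_sad
```

The (2·radius + 1)² candidate evaluations per block, each touching bsize² pixels, are exactly the data-parallel workload that makes this algorithm attractive to offload to graphics hardware.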

Proceedings ArticleDOI
TL;DR: It is found that illuminant-dependent characterization produced the best results, sharpened camera RGB and native camera RGB were next best, XYZ and CAM02 were often not far behind, and balancing in the Rec. 709 primaries was significantly worse.
Abstract: Six different methods for white-balancing digital images were compared in terms of their ability to produce white-balanced colors close to those viewed under a specific viewing illuminant. The six methods were: native camera RGB, XYZ, CAM02, ITU Rec BT.709 RGB, sharpened camera RGB, and illuminant-dependent. 4096 different sets of camera sensitivities were synthesized; 170 objects were evaluated under a canonical viewing illuminant (D65) and six additional taking illuminants (A, D50, D75, F2, F7, and F11). Each white balancing method was exercised in turn, and the mean and 90th percentile ΔE*ab were determined. We found that illuminant-dependent characterization produced the best results, sharpened camera RGB and native camera RGB were next best, XYZ and CAM02 were often not far behind, and balancing in the Rec. 709 primaries was significantly worse. We recommend that, whenever the illuminant is identified, the illuminant-dependent technique be employed because of its superior performance.
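Most of the compared methods are diagonal (von Kries) scalings applied in different color bases; a minimal sketch (the matrix values below are hypothetical, chosen only for illustration) of balancing in an arbitrary basis M:

```python
import numpy as np

def diagonal_white_balance(rgb, taking_white, target_white, M=None):
    """Von Kries white balance in the basis given by 3x3 matrix M:
    transform into the basis, scale each channel so the taking
    illuminant's white maps to the target (viewing) white, transform
    back. M=None (identity) gives native-camera-RGB balancing; a
    band-sharpening matrix gives 'sharpened RGB' balancing; a
    camera-to-XYZ matrix gives XYZ balancing. `rgb` is an (N, 3) array
    of row vectors."""
    if M is None:
        M = np.eye(3)
    D = np.diag((M @ target_white) / (M @ taking_white))
    T = np.linalg.inv(M) @ D @ M
    return rgb @ T.T
```

By construction T maps the taking illuminant's white exactly onto the target white regardless of M; the six methods differ only in how non-neutral colors are carried along, which is what the ΔE*ab comparison measures.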