Author

Wa James Tam

Bio: Wa James Tam is an academic researcher from University of York. The author has contributed to research in topics: Image quality & Stereoscopy. The author has an h-index of 26 and has co-authored 72 publications receiving 3477 citations.


Papers
Journal ArticleDOI
TL;DR: Results are presented to show that the proposed system provides an improvement in image quality of stereoscopic virtual views while maintaining reasonably good depth quality.
Abstract: A depth-image-based rendering system for generating stereoscopic images is proposed. One important aspect of the proposed system is that the depth maps are pre-processed using an asymmetric filter to smoothen the sharp changes in depth at object boundaries. In addition to ameliorating the effects of blocky artifacts and other distortions contained in the depth maps, the smoothing reduces or completely removes newly exposed (disocclusion) areas where potential artifacts can arise from image warping which is needed to generate images from new viewpoints. The asymmetric nature of the filter reduces the amount of geometric distortion that might be perceived otherwise. We present some results to show that the proposed system provides an improvement in image quality of stereoscopic virtual views while maintaining reasonably good depth quality.

562 citations
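
The pre-processing step described in the abstract above can be illustrated with a minimal sketch. It assumes a separable Gaussian filter whose strength differs between the horizontal and vertical directions; the function names, the sigma values, and the choice of stronger vertical smoothing are illustrative assumptions, not details taken from the abstract.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel with a radius of about 3*sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_1d(values, kernel):
    """Convolve a 1-D sequence with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out

def asymmetric_smooth(depth, sigma_h, sigma_v):
    """Smooth a depth map (list of rows) with different horizontal and vertical strengths."""
    kernel_h = gaussian_kernel(sigma_h)
    kernel_v = gaussian_kernel(sigma_v)
    # Horizontal pass over each row.
    rows = [smooth_1d(row, kernel_h) for row in depth]
    # Vertical pass over each column, then transpose back to row order.
    cols = [smooth_1d(list(col), kernel_v) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

# Hypothetical depth map: a sharp vertical edge between background (0) and foreground (255).
depth_map = [[0] * 8 + [255] * 8 for _ in range(16)]
smoothed = asymmetric_smooth(depth_map, sigma_h=2.0, sigma_v=6.0)
```

After smoothing, the abrupt 0-to-255 jump at the object boundary becomes a gradual ramp, which is the property the abstract credits with reducing newly exposed areas during image warping.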

Journal ArticleDOI
TL;DR: A concise overview of the main topics relevant to comfort in viewing stereoscopic television is presented, along with a survey of the key factors influencing visual comfort.
Abstract: Among the key topics of discussion and research on three-dimensional television (3D-TV), visual comfort is certainly one of the most critical. This is because it is well known that some viewers experience visual discomfort when looking at stereoscopic displays. It is important to properly address the issue of visual comfort to avoid possible delays in the deployment of 3D-TV. Here we present a concise overview of the main topics relevant to comfort in viewing stereoscopic television and survey the key factors influencing visual comfort. Potential end users of 3D-TV, content creators, program providers, broadcasters, display manufacturers and researchers will find this overview useful.

309 citations

Patent
25 Jul 2006
TL;DR: Depth maps are generated from monoscopic source images and asymmetrically smoothed to a near-saturation level. Each depth map contains depth values focused on edges of local regions in the source image, where each edge is defined by a predetermined image parameter whose estimated value exceeds a predefined threshold.
Abstract: Depth maps are generated from monoscopic source images and asymmetrically smoothed to a near-saturation level. Each depth map contains depth values focused on edges of local regions in the source image. Each edge is defined by a predetermined image parameter having an estimated value exceeding a predefined threshold. The depth values are based on the corresponding estimated values of the image parameter. The depth map is used to process the source image by a depth image based rendering algorithm to create at least one deviated image, which forms with the source image a set of monoscopic images. At least one stereoscopic image pair is selected from such a set for use in generating different viewpoints for multiview and stereoscopic purposes, including still and moving images.

294 citations

Patent
23 Jul 2009
TL;DR: In this article, a method and a graphical user interface for modifying a depth map for a digital monoscopic color image is presented, which includes interactively selecting a region of the depth map based on color of a target region in the color image, and modifying depth values in the thereby selected region using a depth modification rule.
Abstract: The invention relates to a method and a graphical user interface for modifying a depth map for a digital monoscopic color image. The method includes interactively selecting a region of the depth map based on color of a target region in the color image, and modifying depth values in the thereby selected region of the depth map using a depth modification rule. The color-based pixel selection rules for the depth map and the depth modification rule selected based on one color image from a video sequence may be saved and applied to automatically modify depth maps of other color images from the same sequence.

235 citations

Journal ArticleDOI
TL;DR: It was found that spatial filtering of one channel of a stereo video-sequence may be an effective means of reducing the transmission bandwidth: the overall sensation of depth was unaffected by low-pass filtering, while ratings of quality and of sharpness were strongly weighted towards the eye with the greater spatial resolution.
Abstract: We explored the response of the human visual system to mixed-resolution stereo video-sequences, in which one eye view was spatially or temporally low-pass filtered. It was expected that the perceived quality, depth, and sharpness would be relatively unaffected by low-pass filtering, compared to the case where both eyes viewed a filtered image. Subjects viewed two 10-second stereo video-sequences, in which the right-eye frames were filtered vertically (V) and horizontally (H) at 1/2 H, 1/2 V, 1/4 H, 1/4 V, 1/2 H 1/2 V, 1/2 H 1/4 V, 1/4 H 1/2 V, and 1/4 H 1/4 V resolution. Temporal filtering was implemented for a subset of these conditions at 1/2 temporal resolution, or with drop-and-repeat frames. Subjects rated the overall quality, sharpness, and overall sensation of depth. It was found that spatial filtering produced acceptable results: the overall sensation of depth was unaffected by low-pass filtering, while ratings of quality and of sharpness were strongly weighted towards the eye with the greater spatial resolution. By comparison, temporal filtering produced unacceptable results: field averaging and drop-and-repeat frame conditions yielded images with poor quality and sharpness, even though perceived depth was relatively unaffected. We conclude that spatial filtering of one channel of a stereo video-sequence may be an effective means of reducing the transmission bandwidth.

217 citations


Cited by
Journal ArticleDOI
TL;DR: The importance of various causes and aspects of visual discomfort is clarified and three-dimensional artifacts resulting from insufficient depth information in the incoming data signal yielding spatial and temporal inconsistencies are believed to be the most pertinent.
Abstract: Visual discomfort has been the subject of considerable research in relation to stereoscopic and autostereoscopic displays. In this paper, the importance of various causes and aspects of visual discomfort is clarified. When disparity values do not surpass a limit of 1°, which still provides sufficient range to allow satisfactory depth perception in stereoscopic television, classical determinants such as excessive binocular parallax and accommodation-vergence conflict appear to be of minor importance. Visual discomfort, however, may still occur within this limit and we believe the following factors to be the most pertinent in contributing to this: (1) temporally changing demand of accommodation-vergence linkage, e.g., by fast motion in depth; (2) three-dimensional artifacts resulting from insufficient depth information in the incoming data signal yielding spatial and temporal inconsistencies; and (3) unnatural blur. In order to adequately characterize and understand visual discomfort, multiple types of measurements, both objective and subjective, are required. © 2009 Society for Imaging Science and Technology. DOI: 10.2352/J.ImagingSci.Technol.2009.53.3.030201

990 citations

Journal ArticleDOI
TL;DR: This tutorial focuses on the sense of touch within the context of a fully active human observer and describes an extensive body of research on “what” and “where” channels, the former dealing with haptic perception of objects, surfaces, and their properties, and the latter with perception of spatial layout on the skin and in external space relative to the perceiver.
Abstract: This tutorial focuses on the sense of touch within the context of a fully active human observer. It is intended for graduate students and researchers outside the discipline who seek an introduction to the rapidly evolving field of human haptics. The tutorial begins with a review of peripheral sensory receptors in skin, muscles, tendons, and joints. We then describe an extensive body of research on “what” and “where” channels, the former dealing with haptic perception of objects, surfaces, and their properties, and the latter with perception of spatial layout on the skin and in external space relative to the perceiver. We conclude with a brief discussion of other significant issues in the field, including vision-touch interactions, affective touch, neural plasticity, and applications.

822 citations

Journal ArticleDOI
TL;DR: The results suggest that the algorithm, which selects regions by a nonlinear integration of low-level visual cues mimicking processing in primate occipital and posterior parietal cortex, is generally useful for improving compression ratios of unconstrained video.
Abstract: We evaluate the applicability of a biologically-motivated algorithm to select visually-salient regions of interest in video streams for multiply-foveated video compression. Regions are selected based on a nonlinear integration of low-level visual cues, mimicking processing in primate occipital and posterior parietal cortex. A dynamic foveation filter then blurs every frame, increasingly with distance from salient locations. Sixty-three variants of the algorithm (varying number and shape of virtual foveas, maximum blur, and saliency competition) are evaluated against an outdoor video scene, using MPEG-1 and constant-quality MPEG-4 (DivX) encoding. Additional compression ratios of 1.1 to 8.5 are achieved by foveation. Two variants of the algorithm are validated against eye fixations recorded from four to six human observers on a heterogeneous collection of 50 video clips (over 45 000 frames in total). Significantly higher overlap than expected by chance is found between human and algorithmic foveations. With both variants, foveated clips are, on average, approximately half the size of unfoveated clips, for both MPEG-1 and MPEG-4. These results suggest a general-purpose usefulness of the algorithm in improving compression ratios of unconstrained video.

796 citations
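
The foveation filter mentioned in the abstract above, which blurs each frame more strongly with distance from salient locations, can be sketched roughly as follows. The linear falloff, the box blur, and the names `blur_radius_map`, `foveate`, `max_radius`, and `falloff` are all assumptions made for illustration; the abstract does not specify the actual filter or its parameters.

```python
import math

def blur_radius_map(width, height, foveas, max_radius, falloff):
    """Per-pixel blur radius that grows linearly with distance to the nearest fovea."""
    radii = []
    for y in range(height):
        row = []
        for x in range(width):
            d = min(math.hypot(x - fx, y - fy) for fx, fy in foveas)
            row.append(min(max_radius, int(d / falloff)))
        radii.append(row)
    return radii

def foveate(frame, foveas, max_radius=3, falloff=4.0):
    """Box-blur a grayscale frame (list of rows) with a position-dependent radius."""
    height, width = len(frame), len(frame[0])
    radii = blur_radius_map(width, height, foveas, max_radius, falloff)
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            r = radii[y][x]
            # Clamp the averaging window to the frame borders.
            y0, y1 = max(0, y - r), min(height - 1, y + r)
            x0, x1 = max(0, x - r), min(width - 1, x + r)
            window = [frame[j][i]
                      for j in range(y0, y1 + 1)
                      for i in range(x0, x1 + 1)]
            row.append(sum(window) / len(window))
        out.append(row)
    return out

# Hypothetical 16x16 checkerboard frame with one virtual fovea at the top-left corner (x, y).
frame = [[255 if (x + y) % 2 else 0 for x in range(16)] for y in range(16)]
foveated = foveate(frame, foveas=[(0, 0)])
```

Pixels near the fovea keep their original values, while distant pixels are averaged over a larger window and lose fine detail, which is what makes the foveated frame cheaper to encode.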

Journal ArticleDOI
TL;DR: This target article presents an information processing model for the control of saccadic eye movements, with some close parallels to established physiological processes in the oculomotor system; the model accounts for a number of well-established phenomena in target-elicited saccades.
Abstract: During active vision, the eyes continually scan the visual environment using saccadic scanning movements. This target article presents an information processing model for the control of these movements, with some close parallels to established physiological processes in the oculomotor system. Two separate pathways are concerned with the spatial and the temporal programming of the movement. In the spatial pathway there is spatially distributed coding and the saccade target is selected from a "salience map." Both pathways descend through a hierarchy of levels, the lower ones operating automatically. Visual onsets have automatic access to the eye control system via the lower levels. Various centres in each pathway are interconnected via reciprocal inhibition. The model accounts for a number of well-established phenomena in target-elicited saccades: the gap effect, express saccades, the remote distractor effect, and the global effect. High-level control of the pathways in tasks such as visual search and reading is discussed; it operates through spatial selection and search selection, which generally combine in an automated way. The model is examined in relation to data from patients with unilateral neglect.

739 citations

Journal ArticleDOI
31 Jan 2011
TL;DR: An overview of the algorithmic design used for extending H.264/MPEG-4 AVC towards MVC is provided and a summary of the coding performance achieved by MVC for both stereo- and multiview video is provided.
Abstract: Significant improvements in video compression capability have been demonstrated with the introduction of the H.264/MPEG-4 advanced video coding (AVC) standard. Since developing this standard, the Joint Video Team of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) has also standardized an extension of that technology that is referred to as multiview video coding (MVC). MVC provides a compact representation for multiple views of a video scene, such as multiple synchronized video cameras. Stereo-paired video for 3-D viewing is an important special case of MVC. The standard enables inter-view prediction to improve compression capability, as well as supporting ordinary temporal and spatial prediction. It also supports backward compatibility with existing legacy systems by structuring the MVC bitstream to include a compatible “base view.” Each other view is encoded at the same picture resolution as the base view. In recognition of its high-quality encoding capability and support for backward compatibility, the stereo high profile of the MVC extension was selected by the Blu-Ray Disc Association as the coding format for 3-D video with high-definition resolution. This paper provides an overview of the algorithmic design used for extending H.264/MPEG-4 AVC towards MVC. The basic approach of MVC for enabling inter-view prediction and view scalability in the context of H.264/MPEG-4 AVC is reviewed. Related supplemental enhancement information (SEI) metadata is also described. Various “frame compatible” approaches for support of stereo-view video as an alternative to MVC are also discussed. A summary of the coding performance achieved by MVC for both stereo- and multiview video is also provided. Future directions and challenges related to 3-D video are also briefly discussed.

683 citations