Author

Shing-Chow Chan

Bio: Shing-Chow Chan is an academic researcher from the University of Hong Kong. The author has contributed to research in the topics of Rendering (computer graphics) and Image-based modeling and rendering, has an h-index of 25, and has co-authored 128 publications receiving 3,333 citations.


Papers
Proceedings ArticleDOI
01 Jul 2000
TL;DR: From a spectral analysis of light field signals and using the sampling theorem, the analytical functions to determine the minimum sampling rate for light field rendering are derived, and this approach bridges the gap between image-based rendering and traditional geometry-based rendering.
Abstract: This paper studies the problem of plenoptic sampling in image-based rendering (IBR). From a spectral analysis of light field signals and using the sampling theorem, we mathematically derive the analytical functions to determine the minimum sampling rate for light field rendering. The spectral support of a light field signal is bounded by the minimum and maximum depths only, no matter how complicated the spectral support might be because of depth variations in the scene. The minimum sampling rate for light field rendering is obtained by compacting the replicas of the spectral support of the sampled light field within the smallest interval. Given the minimum and maximum depths, a reconstruction filter with an optimal and constant depth can be designed to achieve anti-aliased light field rendering. Plenoptic sampling goes beyond the minimum number of images needed for anti-aliased light field rendering. More significantly, it utilizes the scene depth information to determine the minimum sampling curve in the joint image and geometry space. The minimum sampling curve quantitatively describes the relationship among three key elements in IBR systems: scene complexity (geometrical and textural information), the number of image samples, and the output resolution. Therefore, plenoptic sampling bridges the gap between image-based rendering and traditional geometry-based rendering. Experimental results demonstrate the effectiveness of our approach.

793 citations
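
A hedged sketch of the headline result, in one common two-plane parameterization (camera coordinate t, image coordinate v, focal length f), with B_v the highest image frequency retained (set by output resolution and scene texture). The exact constants depend on the parameterization; the form below is illustrative only:

```latex
% Illustrative form only; exact constants depend on the parameterization.
\Delta t_{\max} \;\approx\; \frac{2\pi}{f\,B_v\!\left(\frac{1}{z_{\min}}-\frac{1}{z_{\max}}\right)},
\qquad
\frac{1}{z_c} \;=\; \frac{1}{2}\!\left(\frac{1}{z_{\min}}+\frac{1}{z_{\max}}\right)
```

Packing the sheared spectral replicas as tightly as the depth bounds allow gives the maximum camera spacing, and rendering at the constant depth z_c centers the reconstruction filter on the spectral wedge.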

Journal ArticleDOI
TL;DR: The techniques for image-based rendering (IBR), which renders novel views directly from input images rather than from known 3-D scene geometry, are surveyed, and the issues in trading off the use of images and geometry are explored by revisiting plenoptic-sampling analysis and the notions of view dependency and geometric proxies.
Abstract: We survey the techniques for image-based rendering (IBR) and for compressing image-based representations. Unlike traditional three-dimensional (3-D) computer graphics, in which 3-D geometry of the scene is known, IBR techniques render novel views directly from input images. IBR techniques can be classified into three categories according to how much geometric information is used: rendering without geometry, rendering with implicit geometry (i.e., correspondence), and rendering with explicit geometry (either approximate or accurate). We discuss the characteristics of these categories and their representative techniques. IBR techniques demonstrate a surprisingly diverse range in their extent of use of images and geometry in representing 3-D scenes. We explore the issues in trading off the use of images and geometry by revisiting plenoptic-sampling analysis and the notions of view dependency and geometric proxies. Finally, we highlight compression techniques specifically designed for image-based representations. Such compression techniques are important in making IBR techniques practical.

310 citations

Journal ArticleDOI
TL;DR: A novel distance metric, the superpixel earth mover's distance (SP-EMD), is proposed to measure the dissimilarity between hand gestures; it is not only robust to distortion and articulation but also invariant to scaling, translation, and rotation with proper preprocessing.
Abstract: This paper presents a new superpixel-based hand gesture recognition system based on a novel superpixel earth mover's distance metric, together with a Kinect depth camera. The depth and skeleton information from Kinect are effectively utilized to produce markerless hand extraction. The hand shapes, corresponding textures, and depths are represented in the form of superpixels, which effectively retain the overall shapes and color of the gestures to be recognized. Based on this representation, a novel distance metric, the superpixel earth mover's distance (SP-EMD), is proposed to measure the dissimilarity between hand gestures. This measurement is not only robust to distortion and articulation, but also invariant to scaling, translation, and rotation with proper preprocessing. The effectiveness of the proposed distance metric and recognition algorithm is illustrated by extensive experiments with our own gesture dataset as well as two other public datasets. Simulation results show that the proposed system is able to achieve high mean accuracy and fast recognition speed. Its superiority is further demonstrated by comparisons with other conventional techniques and two real-life applications.

271 citations
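
The earth mover's distance at the core of SP-EMD can be posed as a small linear program over flows between two superpixel signatures. The sketch below is a simplified illustration: gestures are reduced to superpixel centroids with area weights and a plain Euclidean ground distance, whereas the paper's metric also folds in texture and depth terms.

```python
import numpy as np
from scipy.optimize import linprog

def emd(wa, wb, cost):
    """Earth mover's distance between two weighted signatures via a small LP.

    wa, wb : nonnegative weights (assumed normalized to equal total mass);
    cost   : (len(wa), len(wb)) ground-distance matrix between signature points.
    """
    na, nb = len(wa), len(wb)
    c = cost.ravel()                      # minimize sum_ij f_ij * cost_ij
    A_eq, b_eq = [], []
    for i in range(na):                   # each source weight fully shipped out
        row = np.zeros((na, nb)); row[i, :] = 1
        A_eq.append(row.ravel()); b_eq.append(wa[i])
    for j in range(nb):                   # each sink weight fully received
        col = np.zeros((na, nb)); col[:, j] = 1
        A_eq.append(col.ravel()); b_eq.append(wb[j])
    res = linprog(c, A_eq=np.asarray(A_eq), b_eq=np.asarray(b_eq),
                  bounds=(0, None), method="highs")
    return res.fun

# Toy usage: two 3-superpixel "gestures" described by centroid positions only.
pa = np.array([[0., 0.], [1., 0.], [0., 1.]])
pb = np.array([[0.1, 0.], [1., 0.2], [0., 1.1]])
cost = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
print(emd(np.full(3, 1/3), np.full(3, 1/3), cost))
```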

Journal ArticleDOI
TL;DR: Simulation results show that the transversal RLM and the H-PEF-LSL algorithms have better performance than the conventional RLS and other RLS-like robust adaptive algorithms tested when the desired and input signals are corrupted by impulsive noise.
Abstract: This paper studies the problem of robust adaptive filtering in impulsive noise environments using a recursive least M-estimate (RLM) algorithm. The RLM algorithm minimizes a robust M-estimator-based cost function instead of the conventional mean square error (MSE) function. Previous work has shown that the RLM algorithm offers improved robustness to impulses over the conventional recursive least squares (RLS) algorithm. In this paper, the mean and mean square convergence behaviors of the RLM algorithm under the contaminated Gaussian impulsive noise model are analyzed. A lattice structure-based fast RLM algorithm, called the Huber Prior Error Feedback-Least Squares Lattice (H-PEF-LSL) algorithm, is derived. Part of the H-PEF-LSL algorithm was presented in ICASSP 2001. It has O(N) arithmetic complexity, where N is the length of the adaptive filter, and can be viewed as a fast implementation of the RLM algorithm based on the modified Huber M-estimate function and the conventional PEF-LSL adaptive filtering algorithm. Simulation results show that the transversal RLM and H-PEF-LSL algorithms have better performance than the conventional RLS and other RLS-like robust adaptive algorithms tested when the desired and input signals are corrupted by impulsive noise. Furthermore, the theoretical and simulation results on the convergence behaviors agree very well with each other.

183 citations
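
As a hedged sketch of the idea (not the paper's exact recursion), an RLM-style update can be obtained from exponentially weighted RLS by scaling the gain with a Huber weight on the a priori error, so that impulsive samples barely move the filter. The threshold xi below is fixed for brevity; the paper estimates the error scale robustly and adapts it.

```python
import numpy as np

def rlm(x, d, N, lam=0.99, xi=2.576, delta=100.0):
    """RLM-style robust RLS sketch (x: input, d: desired, N: filter length).

    The gain is scaled by a Huber weight q(e) so impulsive errors barely
    update the filter; xi is an assumed fixed threshold here."""
    w = np.zeros(N)
    P = delta * np.eye(N)                 # inverse-correlation estimate
    for n in range(N, len(d)):
        u = x[n - N + 1:n + 1][::-1]      # regressor, newest sample first
        e = d[n] - w @ u                  # a priori error
        q = 1.0 if abs(e) <= xi else xi / abs(e)   # Huber weight
        Pu = P @ u
        k = q * Pu / (lam + q * (u @ Pu)) # robustly weighted gain
        w = w + k * e
        P = (P - np.outer(k, Pu)) / lam
    return w
```

For |e| <= xi this reduces exactly to standard exponentially weighted RLS; a large impulse shrinks the effective gain by xi/|e|, which is the source of the robustness.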

Journal ArticleDOI
TL;DR: A class of subspace-based methods for direction-of-arrival (DOA) estimation and tracking in the case of uniform linear arrays (ULAs) with mutual coupling is proposed, offering high flexibility and effectiveness.
Abstract: A class of subspace-based methods for direction-of-arrival (DOA) estimation and tracking in the case of uniform linear arrays (ULAs) with mutual coupling is proposed. By treating the angularly independent mutual coupling as angularly dependent complex array gains, the middle subarray is found to have the same complex array gains. Using this property, a new way of parameterizing the steering vector is proposed, and the corresponding method for joint estimation of the DOAs and the mutual coupling matrix (MCM) using the whole array data is derived based on the subspace principle. Simulation results show that the proposed algorithm has better performance than the conventional subarray-based method, especially for weak signals. Furthermore, to achieve low computational complexity for online and time-varying DOA estimation, three subspace tracking algorithms with different arithmetic complexities and tracking abilities are developed. More precisely, by introducing a better estimate of the subspace into the conventional tracking algorithms, two modified methods, namely modified projection approximation subspace tracking (PAST) (MPAST) and modified orthonormal PAST (MOPAST), are developed for slowly changing subspaces, whereas a Kalman filter with a variable number of measurements (KFVM) is introduced for rapidly changing subspaces. Simulation results demonstrate that these algorithms offer high flexibility and effectiveness for tracking DOAs in the presence of mutual coupling.

167 citations
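
For orientation, a minimal sketch of the signal model this class of methods addresses: an M-element ULA whose steering vectors are distorted by a banded symmetric Toeplitz mutual coupling matrix, scanned with a MUSIC-style pseudo-spectrum. The coupling model is a common ULA assumption; the paper's joint DOA/MCM estimator and its MPAST/MOPAST/KFVM trackers are not reproduced here.

```python
import numpy as np

def coupled_music(R, M, c, angles_deg, n_src, d_over_lambda=0.5):
    """MUSIC pseudo-spectrum for an M-element ULA whose steering vectors are
    perturbed by a banded symmetric Toeplitz coupling matrix built from the
    coupling coefficients c = [c0, c1, ...] (a common ULA coupling model)."""
    vals, vecs = np.linalg.eigh(R)            # covariance eigendecomposition
    En = vecs[:, :M - n_src]                  # noise subspace (smallest eigenvalues)
    C = np.zeros((M, M), dtype=complex)       # mutual coupling matrix (MCM)
    for k, ck in enumerate(c):
        C += ck * np.eye(M, k=k)
        if k:
            C += ck * np.eye(M, k=-k)
    p = []
    for th in np.deg2rad(angles_deg):
        a = np.exp(2j * np.pi * d_over_lambda * np.arange(M) * np.sin(th))
        ac = C @ a                            # coupled steering vector
        p.append(1.0 / np.real(ac.conj() @ En @ En.conj().T @ ac))
    return np.asarray(p)                      # peaks indicate candidate DOAs
```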


Cited by
Book
30 Sep 2010
TL;DR: Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images and takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene.
Abstract: Humans perceive the three-dimensional structure of the world with apparent ease. However, despite all of the recent advances in computer vision research, the dream of having a computer interpret an image at the same level as a two-year old remains elusive. Why is computer vision such a challenging problem and what is the current state of the art? Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos. More than just a source of recipes, this exceptionally authoritative and comprehensive textbook/reference also takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene. These problems are also analyzed using statistical models and solved using rigorous engineering techniques. Topics and features: structured to support active curricula and project-oriented courses, with tips in the Introduction for using the book in a variety of customized courses; presents exercises at the end of each chapter with a heavy emphasis on testing algorithms and containing numerous suggestions for small mid-term projects; provides additional material and more detailed mathematical topics in the Appendices, which cover linear algebra, numerical techniques, and Bayesian estimation theory; suggests additional reading at the end of each chapter, including the latest research in each sub-field, in addition to a full Bibliography at the end of the book; supplies supplementary course material for students at the associated website, http://szeliski.org/Book/. Suitable for an upper-level undergraduate or graduate-level course in computer science or engineering, this textbook focuses on basic techniques that work under real-world conditions and encourages students to push their creative boundaries. Its design and exposition also make it eminently suitable as a unique reference to the fundamental techniques and current research literature in computer vision.

4,146 citations

Book ChapterDOI
11 Dec 2012

1,704 citations

Journal ArticleDOI
01 Jul 2005
TL;DR: A unique array of 100 custom video cameras is described, and experiences using this array in a range of imaging applications are summarized.
Abstract: The advent of inexpensive digital image sensors and the ability to create photographs that combine information from a number of sensed images are changing the way we think about photography. In this paper, we describe a unique array of 100 custom video cameras that we have built, and we summarize our experiences using this array in a range of imaging applications. Our goal was to explore the capabilities of a system that would be inexpensive to produce in the future. With this in mind, we used simple cameras, lenses, and mountings, and we assumed that processing large numbers of images would eventually be easy and cheap. The applications we have explored include approximating a conventional single center of projection video camera with high performance along one or more axes, such as resolution, dynamic range, frame rate, and/or large aperture, and using multiple cameras to approximate a video camera with a large synthetic aperture. This permits us to capture a video light field, to which we can apply spatiotemporal view interpolation algorithms in order to digitally simulate time dilation and camera motion. It also permits us to create video sequences using custom non-uniform synthetic apertures.

1,285 citations
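
The large-synthetic-aperture idea the paper describes can be sketched as plane-sweep refocusing: shift each camera's image by its parallax for a chosen focal plane and average, so objects on that plane align and stay sharp while the rest blurs. The sketch below assumes rectified pinhole cameras on a common plane, a fronto-parallel focal plane, and integer pixel shifts; the actual system's calibration and color processing are not shown.

```python
import numpy as np

def refocus(images, cam_xy, depth, focal_px):
    """Synthetic-aperture refocusing sketch: average images after shifting
    each by its parallax at `depth` (rectified pinhole cameras assumed;
    cam_xy are camera positions in the array plane, focal_px in pixels)."""
    acc = np.zeros(images[0].shape, dtype=np.float64)
    for img, (cx, cy) in zip(images, cam_xy):
        dx = int(round(focal_px * cx / depth))   # horizontal parallax (pixels)
        dy = int(round(focal_px * cy / depth))   # vertical parallax (pixels)
        acc += np.roll(img, shift=(dy, dx), axis=(0, 1))
    return acc / len(images)                     # focal-plane objects stay sharp
```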

Journal ArticleDOI
TL;DR: A complete system to build visual models from camera images is presented; besides traditional geometry- and image-based approaches, a combined approach with view-dependent geometry and texture is described, and, as an application, fusion of real and virtual scenes is also shown.
Abstract: In this paper, a complete system to build visual models from camera images is presented. The system can deal with uncalibrated image sequences acquired with a hand-held camera. Based on tracked or matched features, the relations between multiple views are computed. From this, both the structure of the scene and the motion of the camera are retrieved. The ambiguity of the reconstruction is restricted from projective to metric through self-calibration. A flexible multi-view stereo matching scheme is used to obtain a dense estimation of the surface geometry. From the computed data, different types of visual models are constructed. Besides the traditional geometry- and image-based approaches, a combined approach with view-dependent geometry and texture is presented. As an application, fusion of real and virtual scenes is also shown.

1,029 citations
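
As a hedged illustration of the first stage of such a pipeline, the fragment below matches features between two uncalibrated views and robustly estimates the fundamental matrix that encodes their relative geometry (file names are hypothetical; the paper's self-calibration, dense multi-view stereo, and model construction stages are not shown):

```python
import cv2
import numpy as np

# Two-view fragment of an uncalibrated modeling pipeline (hypothetical
# file names): detect and match features, then robustly estimate the
# fundamental matrix relating the views with RANSAC.
img1 = cv2.imread("view1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.png", cv2.IMREAD_GRAYSCALE)
orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
F, inliers = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.999)
print("fundamental matrix:\n", F)
```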

Proceedings ArticleDOI
01 Aug 2001
TL;DR: An image-based rendering approach is described that generalizes many current image-based rendering algorithms, including light field rendering and view-dependent texture mapping, and that allows for lumigraph-style rendering from a set of input cameras in arbitrary configurations.
Abstract: We describe an image based rendering approach that generalizes many current image based rendering algorithms, including light field rendering and view-dependent texture mapping. In particular, it allows for lumigraph-style rendering from a set of input cameras in arbitrary configurations (i.e., not restricted to a plane or to any specific manifold). In the case of regular and planar input camera positions, our algorithm reduces to a typical lumigraph approach. When presented with fewer cameras and good approximate geometry, our algorithm behaves like view-dependent texture mapping. The algorithm achieves this flexibility because it is designed to meet a set of specific goals that we describe. We demonstrate this flexibility with a variety of examples.

984 citations
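
At the heart of the approach is a per-ray camera blending field. The sketch below shows one simplified, angle-only version of the idea: penalize each camera by how far its view of the surface point deviates from the desired ray, keep the k best, and let the weights fall smoothly to zero at the (k+1)-th camera. The paper's actual penalty also accounts for resolution and field-of-view mismatch.

```python
import numpy as np

def angular_blend(ray_dir, cam_dirs, k=4):
    """Unstructured-lumigraph-style blending sketch: angle-based penalties,
    k-nearest selection, and a falloff reaching zero at the (k+1)-th camera
    so weights vary smoothly as the viewpoint moves.

    ray_dir  : unit vector of the desired ray at the surface point;
    cam_dirs : (n, 3) unit vectors toward the same point from each camera
               (n > k assumed)."""
    ang = np.arccos(np.clip(cam_dirs @ ray_dir, -1.0, 1.0))  # penalty per camera
    order = np.argsort(ang)[:k + 1]
    thresh = ang[order[-1]]                                  # (k+1)-th penalty
    w = 1.0 - ang[order[:k]] / max(thresh, 1e-9)             # falls to 0 at thresh
    return order[:k], w / w.sum()                            # camera ids, weights
```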