Author

King Ngi Ngan

Bio: King Ngi Ngan is an academic researcher at the University of Electronic Science and Technology of China whose work centers on image segmentation. He has an h-index of 43 and has co-authored 423 publications receiving 9,249 citations. His previous affiliations include the National University of Singapore and the University of Western Australia.


Papers
Journal ArticleDOI
TL;DR: It is explained how the face-segmentation results can be used to improve the perceptual quality of a videophone sequence encoded by the H.261-compliant coder.
Abstract: This paper addresses our proposed method to automatically segment out a person's face from a given image that consists of a head-and-shoulders view of the person and a complex background scene. The method involves a fast, reliable, and effective algorithm that exploits the spatial distribution characteristics of human skin color. A universal skin-color map is derived and used on the chrominance component of the input image to detect pixels with skin-color appearance. Then, based on the spatial distribution of the detected skin-color pixels and their corresponding luminance values, the algorithm employs a set of novel regularization processes to reinforce regions of skin-color pixels that are more likely to belong to the facial regions and eliminate those that are not. The performance of the face-segmentation algorithm is illustrated by some simulation results carried out on various head-and-shoulders test images. The use of face segmentation for video coding in applications such as videotelephony is then presented. We explain how the face-segmentation results can be used to improve the perceptual quality of a videophone sequence encoded by the H.261-compliant coder.
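The core detection step described above, flagging pixels whose chrominance falls inside a skin-color region, can be sketched as follows. This is a minimal illustration; the CbCr ranges below are common illustrative values, not the paper's exact universal skin-color map.

```python
import numpy as np

def skin_color_mask(cb, cr, cb_range=(77, 127), cr_range=(133, 173)):
    """Flag pixels whose chrominance falls inside an illustrative
    skin-color box in the CbCr plane (the ranges are assumptions,
    not the paper's exact universal map)."""
    cb = np.asarray(cb)
    cr = np.asarray(cr)
    return ((cb >= cb_range[0]) & (cb <= cb_range[1]) &
            (cr >= cr_range[0]) & (cr <= cr_range[1]))

# A 2x2 chrominance patch: only the top-left pixel is skin-like.
cb = np.array([[100, 50], [200, 90]])
cr = np.array([[150, 150], [150, 50]])
mask = skin_color_mask(cb, cr)
```

The regularization stages in the paper then operate on this binary mask, using spatial connectivity and luminance to keep facial regions and discard false detections.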

797 citations

Journal ArticleDOI
TL;DR: Proposes a generic model for unsupervised extraction of a viewer's attention objects from color images by integrating computational visual attention mechanisms with attention-object growing techniques, describing the resulting MRF as a Gibbs random field with an energy function.
Abstract: This paper proposes a generic model for unsupervised extraction of viewer's attention objects from color images. Without the full semantic understanding of image content, the model formulates the attention objects as a Markov random field (MRF) by integrating computational visual attention mechanisms with attention object growing techniques. Furthermore, we describe the MRF by a Gibbs random field with an energy function. The minimization of the energy function provides a practical way to obtain attention objects. Experimental results on 880 real images and user subjective evaluations by 16 subjects demonstrate the effectiveness of the proposed approach.
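Minimizing a Gibbs energy over a label field, as in the model above, can be illustrated with iterated conditional modes (ICM) on a toy binary field. The energy below (per-pixel data cost plus a Potts-style smoothness penalty) is a generic stand-in for the paper's attention-object energy, not its exact form.

```python
import numpy as np

def icm_binary(data_cost, beta=1.0, iters=5):
    """Minimize E = sum_p data_cost[p, label_p] + beta * (# of
    disagreeing 4-neighbors) by iterated conditional modes.
    data_cost has shape (H, W, 2). An illustrative stand-in for
    the paper's energy function, not its exact form."""
    labels = np.argmin(data_cost, axis=2)
    H, W = labels.shape
    for _ in range(iters):
        for y in range(H):
            for x in range(W):
                best, best_e = labels[y, x], np.inf
                for lab in (0, 1):
                    e = data_cost[y, x, lab]
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < H and 0 <= nx < W and labels[ny, nx] != lab:
                            e += beta
                    if e < best_e:
                        best, best_e = lab, e
                labels[y, x] = best
    return labels

data_cost = np.ones((3, 3, 2))
data_cost[..., 1] = 0.0           # label 1 is cheap everywhere...
data_cost[1, 1] = (0.0, 0.5)      # ...except the centre slightly prefers 0
labels = icm_binary(data_cost)    # the smoothness term flips the centre to 1
```

ICM is only one practical minimizer; the point is that the minimum of the energy yields the object/background labeling directly.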

408 citations

Journal ArticleDOI
TL;DR: Introduces a method to detect co-saliency from an image pair that may have some objects in common, employing a normalized single-pair SimRank algorithm to compute node similarity.
Abstract: In this paper, we introduce a method to detect co-saliency from an image pair that may have some objects in common. The co-saliency is modeled as a linear combination of the single-image saliency map (SISM) and the multi-image saliency map (MISM). The first term is designed to describe the local attention, which is computed by using three saliency detection techniques available in literature. To compute the MISM, a co-multilayer graph is constructed by dividing the image pair into a spatial pyramid representation. Each node in the graph is described by two types of visual descriptors, which are extracted from a representation of some aspects of local appearance, e.g., color and texture properties. In order to evaluate the similarity between two nodes, we employ a normalized single-pair SimRank algorithm to compute the similarity score. Experimental evaluation on a number of image pairs demonstrates the good performance of the proposed method on the co-saliency detection task.
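The combination step stated above is a straightforward linear blend of the two maps. A minimal sketch, where the equal weighting is an illustrative assumption rather than the paper's tuned value:

```python
import numpy as np

def co_saliency(sism, mism, weight=0.5):
    """Combine the single-image saliency map (SISM) and the
    multi-image saliency map (MISM) linearly, as in the paper's
    model; the 0.5 weight is an illustrative assumption."""
    return weight * np.asarray(sism) + (1.0 - weight) * np.asarray(mism)

sism = np.array([0.8, 0.2])   # locally salient region
mism = np.array([0.4, 0.6])   # region shared across the pair
cosal = co_saliency(sism, mism)
```

The harder part of the method is computing the MISM itself, via the co-multilayer graph and normalized single-pair SimRank; the blend above only fuses the two cues.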

322 citations

Journal ArticleDOI
TL;DR: A new automatic video sequence segmentation algorithm that extracts moving objects from the sequence using an object tracker that matches a two-dimensional binary model of the object against subsequent frames using the Hausdorff distance.
Abstract: The new video coding standard MPEG-4 is enabling content-based functionalities. It takes advantage of a prior decomposition of sequences into video object planes (VOPs) so that each VOP represents one moving object. A comprehensive review summarizes some of the most important motion segmentation and VOP generation techniques that have been proposed. Then, a new automatic video sequence segmentation algorithm that extracts moving objects is presented. The core of this algorithm is an object tracker that matches a two-dimensional (2-D) binary model of the object against subsequent frames using the Hausdorff distance. The best match found indicates the translation the object has undergone, and the model is updated every frame to accommodate rotation and changes in shape. The initial model is derived automatically, and a new model update method based on the concept of moving connected components allows for comparatively large changes in shape. The proposed algorithm is improved by a filtering technique that removes stationary background. Finally, the binary model sequence guides the extraction of the VOPs from the sequence. Experimental results demonstrate the performance of our algorithm.
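The matching criterion at the heart of the tracker above is the Hausdorff distance between point sets. A minimal sketch of the symmetric form (the paper uses a partial/rank-based variant for robustness to outliers):

```python
import numpy as np

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two 2-D point sets: the
    largest distance from any point in one set to its nearest point
    in the other. A minimal sketch; the paper's tracker uses a
    partial variant that is more robust to outliers."""
    a = np.asarray(a, float)
    b = np.asarray(b, float)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

model = [(0, 0), (1, 0)]
shifted = [(0, 1), (1, 1)]    # the model translated by one pixel
dist = hausdorff(model, shifted)
```

In the tracker, this distance is evaluated over candidate translations of the binary model; the translation minimizing it gives the object's motion between frames.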

293 citations

Journal ArticleDOI
TL;DR: A DCT-based JND model for monochrome pictures is proposed that incorporates the spatial contrast sensitivity function (CSF), the luminance adaptation effect, and the contrast masking effect based on block classification, and is shown to be consistent with the human visual system.
Abstract: In image and video processing field, an effective compression algorithm should remove not only the statistical redundancy information but also the perceptually insignificant component from the pictures. Just-noticeable distortion (JND) profile is an efficient model to represent those perceptual redundancies. Human eyes are usually not sensitive to the distortion below the JND threshold. In this paper, a DCT based JND model for monochrome pictures is proposed. This model incorporates the spatial contrast sensitivity function (CSF), the luminance adaptation effect, and the contrast masking effect based on block classification. Gamma correction is also considered to compensate the original luminance adaptation effect which gives more accurate results. In order to extend the proposed JND profile to video images, the temporal modulation factor is included by incorporating the temporal CSF and the eye movement compensation. Moreover, a psychophysical experiment was designed to parameterize the proposed model. Experimental results show that the proposed model is consistent with the human visual system (HVS). Compared with the other JND profiles, the proposed model can tolerate more distortion and has much better perceptual quality. This model can be easily applied in many related areas, such as compression, watermarking, error protection, perceptual distortion metric, and so on.
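The way a JND profile is typically exploited can be sketched as a thresholding of DCT coefficients: distortion below the JND threshold is assumed invisible, so sub-threshold coefficients can be suppressed without perceptual loss. The uniform threshold below is an illustrative stand-in for the paper's CSF-, luminance-, and masking-derived per-coefficient thresholds.

```python
import numpy as np

def suppress_below_jnd(coeffs, jnd):
    """Zero out DCT coefficients whose magnitude is below the JND
    threshold, on the assumption that sub-threshold distortion is
    invisible. A uniform `jnd` is an illustrative stand-in for the
    paper's per-frequency thresholds."""
    coeffs = np.asarray(coeffs, float)
    return np.where(np.abs(coeffs) < jnd, 0.0, coeffs)

block = np.array([[60.0, 3.0],
                  [-5.0, 1.5]])
out = suppress_below_jnd(block, jnd=4.0)   # 3.0 and 1.5 are dropped
```

In the paper's applications (compression, watermarking, perceptual metrics), the per-coefficient thresholds would come from the full model rather than a single constant.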

257 citations


Cited by
Journal ArticleDOI
08 Dec 2001-BMJ
TL;DR: There is, I think, something ethereal about i, the square root of minus one: an odd beast, an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Christopher M. Bishop1
01 Jan 2006
TL;DR: Probability distributions and linear models for regression and classification are presented, along with a discussion of combining models in the context of machine learning.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
TL;DR: In this article, the authors categorize and evaluate face detection algorithms and discuss relevant issues such as data collection, evaluation metrics and benchmarking, and conclude with several promising directions for future research.
Abstract: Images containing faces are essential to intelligent vision-based human-computer interaction, and research efforts in face processing include face recognition, face tracking, pose estimation and expression recognition. However, many reported methods assume that the faces in an image or an image sequence have been identified and localized. To build fully automated systems that analyze the information contained in face images, robust and efficient face detection algorithms are required. Given a single image, the goal of face detection is to identify all image regions which contain a face, regardless of its 3D position, orientation and lighting conditions. Such a problem is challenging because faces are non-rigid and have a high degree of variability in size, shape, color and texture. Numerous techniques have been developed to detect faces in a single image, and the purpose of this paper is to categorize and evaluate these algorithms. We also discuss relevant issues such as data collection, evaluation metrics and benchmarking. After analyzing these algorithms and identifying their limitations, we conclude with several promising directions for future research.

3,894 citations

Proceedings ArticleDOI
20 Jun 2009
TL;DR: Introduces a salient-region detection method that outputs full-resolution saliency maps with well-defined boundaries of salient objects, and outperforms five state-of-the-art algorithms on both the ground-truth evaluation and the segmentation task, achieving higher precision and better recall.
Abstract: Detection of visually salient image regions is useful for applications like object segmentation, adaptive compression, and object recognition. In this paper, we introduce a method for salient region detection that outputs full resolution saliency maps with well-defined boundaries of salient objects. These boundaries are preserved by retaining substantially more frequency content from the original image than other existing techniques. Our method exploits features of color and luminance, is simple to implement, and is computationally efficient. We compare our algorithm to five state-of-the-art salient region detection methods with a frequency domain analysis, ground truth, and a salient object segmentation application. Our method outperforms the five algorithms both on the ground-truth evaluation and on the segmentation task by achieving both higher precision and better recall.
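The frequency-retaining idea above reduces, per pixel, to measuring how far a slightly smoothed pixel value lies from the image's mean. A grayscale sketch under simplifying assumptions (the paper operates on Lab color with a Gaussian blur; here a single channel and a 3x3 box mean stand in):

```python
import numpy as np

def frequency_tuned_saliency(img):
    """Per-pixel saliency as |image mean - smoothed pixel value|.
    Grayscale sketch of the paper's Lab-space method: a 3x3 box
    mean stands in for the Gaussian blur, and a single channel
    stands in for the Lab color distance."""
    img = np.asarray(img, float)
    pad = np.pad(img, 1, mode='edge')
    # 3x3 box mean as a cheap smoothing stand-in
    blur = sum(pad[i:i + img.shape[0], j:j + img.shape[1]]
               for i in range(3) for j in range(3)) / 9.0
    return np.abs(img.mean() - blur)

img = np.zeros((5, 5))
img[2, 2] = 9.0               # one bright "salient" pixel
sal = frequency_tuned_saliency(img)
```

Because the map is computed at full resolution, thresholding it yields object masks with boundaries that track the original image, which is what enables the segmentation results reported above.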

3,723 citations