Showing papers by "Stan Z. Li" published in 2009


Proceedings ArticleDOI
20 Jun 2009
TL;DR: This paper presents a subspace learning framework named Coupled Spectral Regression (CSR) to solve the challenging problem of coupling the two types of face images and matching between them, and shows that the proposed CSR method significantly outperforms the existing methods.
Abstract: Face recognition algorithms need to deal with variable lighting conditions. Near infrared (NIR) image based face recognition technology has been proposed to effectively overcome this difficulty. However, it requires that the enrolled face images be captured as NIR images, whereas many applications require visual (VIS) images for enrollment templates. To take advantage of NIR face images for illumination-invariant face recognition and allow the use of VIS face images for enrollment, we encounter a new face image pattern recognition problem, that is, heterogeneous face matching between NIR and VIS faces. In this paper, we present a subspace learning framework named Coupled Spectral Regression (CSR) to solve this challenging problem of coupling the two types of face images and matching between them. CSR first models the properties of different types of data separately and then learns two associated projections to project heterogeneous data (e.g. VIS and NIR) respectively into a discriminative common subspace in which classification is finally performed. Compared to other existing methods, CSR is computationally efficient, benefiting from the efficiency of spectral regression, and has better generalization performance. Experimental results on a VIS-NIR face database show that the proposed CSR method significantly outperforms the existing methods.

182 citations


Book ChapterDOI
Shengcai Liao, Dong Yi, Zhen Lei, Rui Qin, Stan Z. Li
04 Jun 2009
TL;DR: MB-LBP, an extension of the LBP operator, is applied to encode the local image structures in the transformed domain, and the most discriminant local features are then learned for recognition in heterogeneous face images.
Abstract: Heterogeneous face images come from different lighting conditions or different imaging devices, such as visible light (VIS) and near infrared (NIR) based. Because heterogeneous face images can have different skin spectra-optical properties, direct appearance based matching is no longer appropriate for solving the problem. Hence we need to find facial features common in heterogeneous images. For this, first we use Difference-of-Gaussian filtering to obtain a normalized appearance for all heterogeneous faces. We then apply MB-LBP, an extension of the LBP operator, to encode the local image structures in the transformed domain, and further learn the most discriminant local features for recognition. Experiments show that the proposed method significantly outperforms existing ones in matching between VIS and NIR face images.

143 citations
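
The MB-LBP encoding step mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the general multi-block idea (comparing block averages instead of single pixels), not the authors' implementation: the function names, the clockwise bit order and the `>=` comparison are assumptions made for this sketch, and the Difference-of-Gaussian filtering and feature learning stages are left out.

```python
def block_mean(img, top, left, size):
    """Average intensity of the size x size block with top-left corner (top, left)."""
    total = 0.0
    for r in range(top, top + size):
        for c in range(left, left + size):
            total += img[r][c]
    return total / (size * size)


def mb_lbp_code(img, top, left, size):
    """8-bit MB-LBP code for the 3x3 grid of size x size blocks starting at (top, left).

    Each of the 8 outer blocks contributes one bit: 1 if its mean intensity is
    >= the mean of the centre block, else 0. The bit order (clockwise from the
    top-left block, most significant bit first) is a convention of this sketch.
    """
    means = [[block_mean(img, top + i * size, left + j * size, size)
              for j in range(3)] for i in range(3)]
    centre = means[1][1]
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for i, j in ring:
        code = (code << 1) | (1 if means[i][j] >= centre else 0)
    return code
```

Because only the ordering of block means matters, the code is unchanged when a constant is added to every pixel, which is one reason such codes are attractive for images taken under different conditions.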


Proceedings ArticleDOI
20 Jun 2009
TL;DR: A novel method for synthesizing VIS images from NIR images based on learning the mappings between images of different spectra is proposed, which reduces the inter-spectral differences significantly, thus allowing effective matching between faces taken under different imaging conditions.
Abstract: This paper deals with a new problem in face recognition research, in which the enrollment and query face samples are captured under different lighting conditions. In our case, the enrollment samples are visual light (VIS) images, whereas the query samples are taken under near infrared (NIR) condition. It is very difficult to directly match the face samples captured under these two lighting conditions due to their different visual appearances. In this paper, we propose a novel method for synthesizing VIS images from NIR images based on learning the mappings between images of different spectra (i.e., NIR and VIS). In our approach, we reduce the inter-spectral differences significantly, thus allowing effective matching between faces taken under different imaging conditions. Face recognition experiments clearly show the efficacy of the proposed approach.

137 citations


Proceedings ArticleDOI
20 Jun 2009
TL;DR: The apparatuses, environments and procedure of the data collection are described, and baseline performances of the standard PCA and LDA methods on the database are presented.
Abstract: A face database, composed of visual (VIS), near infrared (NIR) and three-dimensional (3D) face images, is collected. Called the HFB face database, it is released now to promote research and development of heterogeneous face biometrics (HFB). This release of version 1 contains a total of 992 images from 100 subjects; there are 4 VIS, 4 NIR, and 1 or 2 3D face images per subject. In this paper, we describe the apparatuses, environments and procedure of the data collection and present baseline performances of the standard PCA and LDA methods on the database.

127 citations


Proceedings ArticleDOI
20 Jun 2009
TL;DR: In this framework, the detected moving objects are first classified as pedestrians or vehicles via a co-trained classifier which takes advantage of the multiview information of objects, so that motion patterns can be learned automatically and separately for pedestrians and vehicles.
Abstract: Activity analysis is a basic task in video surveillance and has become an active research area. However, due to the diversity of moving object categories and their motion patterns, developing robust semantic scene models for activity analysis remains a challenging problem in traffic scenarios. This paper proposes a novel framework to learn semantic scene models. In this framework, the detected moving objects are first classified as pedestrians or vehicles via a co-trained classifier which takes advantage of the multiview information of objects. As a result, the framework can automatically learn motion patterns separately for pedestrians and vehicles. Then, a graph is proposed to learn and cluster the motion patterns. To this end, each trajectory is parameterized and the image is cut into multiple blocks which are taken as the nodes in the graph. Based on the parameters of trajectories, the primary motion patterns in each node (block) are extracted via a Gaussian mixture model (GMM), and supplied to this graph. The graph cut algorithm is finally employed to group the motion patterns together, and trajectories are clustered to learn semantic scene models. Experimental results and applications to real world scenes show the validity of our proposed method.

119 citations


Journal ArticleDOI
TL;DR: A novel hierarchical selecting scheme embedded in linear discriminant analysis (LDA) and AdaBoost learning is proposed to select the most effective and most robust features and to construct a strong classifier for face recognition systems.

108 citations


Book
06 Aug 2009
TL;DR: This comprehensive and innovative handbook covers aspects of biometrics from the perspective of recognizing individuals at a distance, in motion, and under a surveillance scenario.
Abstract: This comprehensive and innovative handbook covers aspects of biometrics from the perspective of recognizing individuals at a distance, in motion, and under a surveillance scenario. Features: Starts with a thorough introductory chapter; Provides topics from a range of different perspectives offered by an international collection of leading researchers in the field; Contains selected expanded contributions from the 5th IAPR International Summer School for Advanced Studies on Biometrics for Secure Authentication; Investigates issues of iris recognition, gait recognition, and touchless fingerprint recognition, as well as various aspects of face recognition; Discusses multibiometric systems, and machine learning techniques; Examines biometrics ethics and policy; Presents international standards in biometrics, including those under preparation. This state-of-the-art volume is designed to help form and inform professionals, young researchers, and graduate students in advanced biometric technologies.

96 citations



Book ChapterDOI
04 Jun 2009
TL;DR: A novel approach for human gait recognition that inherently combines appearance and motion is presented and a new coding of multiresolution uniform Local Binary Patterns is proposed and used in the construction of spatiotemporal LBP histograms.
Abstract: We present a novel approach for human gait recognition that inherently combines appearance and motion. Dynamic texture descriptors, Local Binary Patterns from Three Orthogonal Planes (LBP-TOP), are used to describe human gait in a spatiotemporal way. We also propose a new coding of multiresolution uniform Local Binary Patterns and use it in the construction of spatiotemporal LBP histograms. We show the suitability of the representation for gait recognition and test our method on a popular CMU MoBo dataset. We then compare our result to the state of the art methods.

58 citations
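
To make the term "uniform Local Binary Patterns" in the abstract above concrete, here is a small sketch of the standard uniformity test and the resulting histogram layout. It shows only the conventional definition (at most two 0/1 transitions in the circular bit string); it is not the new multiresolution coding proposed in the paper, and the function names are hypothetical.

```python
def transitions(code, bits=8):
    """Number of 0/1 changes when the bit string is read circularly."""
    count = 0
    for i in range(bits):
        a = (code >> i) & 1
        b = (code >> ((i + 1) % bits)) & 1
        count += 1 if a != b else 0
    return count


def is_uniform(code, bits=8):
    """A pattern is called uniform if it has at most two circular transitions."""
    return transitions(code, bits) <= 2


def uniform_histogram(codes, bits=8):
    """Histogram with one bin per uniform pattern and one shared bin (the last)
    for all non-uniform patterns."""
    uniform = [c for c in range(1 << bits) if is_uniform(c, bits)]
    index = {c: i for i, c in enumerate(uniform)}
    hist = [0] * (len(uniform) + 1)
    for c in codes:
        hist[index.get(c, len(uniform))] += 1
    return hist
```

In an LBP-TOP style descriptor, one such histogram would be computed per orthogonal plane (XY, XT, YT) and the three concatenated; that step is omitted here.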


Book ChapterDOI
04 Jun 2009
TL;DR: An enhanced BioHash algorithm is developed by imposing an NXOR mask onto the input to the subsequent error correcting code (ECC) and it enables reliable binding of face biometric features and the biometric key.
Abstract: Biometric encryption is the basis for biometric template protection and information security. While existing methods are based on the iris or fingerprint modality, face has so far been considered not reliable enough to meet the requirement for error correcting ability. In this paper, we present a novel biometric key binding method based on near infrared (NIR) face biometric. An enhanced BioHash algorithm is developed by imposing an NXOR mask onto the input to the subsequent error correcting code (ECC). This way, when combined with ECC and NIR face features, it enables reliable binding of face biometric features and the biometric key. Its ability for template protection and information cryptography is guaranteed by the theory of encryption. The security level of the NIR face recognition system is thereby improved. Experimental results show that the security benefit is gained at the cost of a 1-2% drop in the recognition performance.

56 citations


Book ChapterDOI
04 Jun 2009
TL;DR: This paper presents a new method, called face analogy, in the analysis-by-synthesis framework, for heterogeneous face mapping, that is, transforming face images from one type to another, and thereby performing heterogeneous face matching.
Abstract: Face images captured in different spectral bands, e.g., in visual (VIS) and near infrared (NIR), are said to be heterogeneous. Although a person's face looks different in heterogeneous images, it should be classified as being from the same individual. In this paper, we present a new method, called face analogy, in the analysis-by-synthesis framework, for heterogeneous face mapping, that is, transforming face images from one type to another, and thereby performing heterogeneous face matching. Experiments show promising results.

Book ChapterDOI
04 Jun 2009
TL;DR: Assessment of camera focus is done based on discrete cosine transform (DCT), and several algorithms for face image quality assessment are presented, validated by experiments.
Abstract: Face recognition performance can be significantly influenced by face image quality. The approved ISO/IEC standard 19794-5 has specified recommendations for face photo taking for E-passport and related applications. Standardization of face image quality, ISO/IEC 29794-5, is in progress. Bad illumination, facial pose and defocus are among the main reasons that disqualify a face image sample. This paper presents several algorithms for face image quality assessment. Illumination conditions and facial pose are evaluated in terms of facial symmetry, and implemented based on Gabor wavelet features. Assessment of camera focus is done based on the discrete cosine transform (DCT). These methods are validated by experiments.
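
As a rough illustration of how a DCT can indicate camera focus, the sketch below measures what share of a block's non-DC energy lies at higher frequencies; blurred content tends to concentrate energy in the lowest frequencies. The abstract does not specify the paper's actual measure, so `focus_score`, its `cutoff` parameter and the energy-ratio definition are assumptions made for this example only.

```python
import math


def dct2(block):
    """Naive, unnormalised 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)
    cos = [[math.cos(math.pi * (2 * x + 1) * u / (2 * n)) for x in range(n)]
           for u in range(n)]
    return [[sum(block[x][y] * cos[u][x] * cos[v][y]
                 for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]


def focus_score(block, cutoff=2):
    """Share of AC energy at frequency indices with u + v >= cutoff.

    Hypothetical sharpness measure: values near 1 mean most detail is
    high-frequency, values near 0 mean the block is smooth.
    """
    coeffs = dct2(block)
    n = len(block)
    high = 0.0
    total = 0.0
    for u in range(n):
        for v in range(n):
            if u == 0 and v == 0:
                continue  # skip the DC term, which only reflects mean brightness
            energy = coeffs[u][v] ** 2
            total += energy
            if u + v >= cutoff:
                high += energy
    return high / total if total > 0 else 0.0
```

On an 8x8 checkerboard the score is much higher than on a smooth intensity ramp, which matches the intuition that sharp images carry more high-frequency energy. A practical system would average such a score over many blocks of the face region.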

Book ChapterDOI
Dong Yi, Shengcai Liao, Zhen Lei, Jitao Sang, Stan Z. Li
04 Jun 2009
TL;DR: The proposed approach, without knowing statistical characteristics of the subjects or data, significantly outperforms the compared methods, with ten-fold higher verification rates at a FAR of 0.1%.
Abstract: The latest multi-biometric grand challenge (MBGC 2008) sets up a new experiment in which near infrared (NIR) face videos containing partial faces are used as a probe set and the visual (VIS) images of full faces are used as the target set. This is challenging for two reasons: (1) it has to deal with partially occluded faces in the NIR videos, and (2) the matching is between heterogeneous NIR and VIS faces. Partial face matching is also a problem often confronted in many video based face biometric applications. In this paper, we propose a novel approach for solving this challenging problem. For partial face matching, we propose a local patch based method to deal with partial face data. For heterogeneous face matching, we propose the philosophy of enhancing common features in heterogeneous images while reducing differences. This is realized by using edge-enhancing filters, which at the same time are also beneficial for partial face matching. The approach requires neither learning procedures nor training data. Experiments are performed using the MBGC portal challenge data, comparing with several known state-of-the-art methods. Extensive results show that the proposed approach, without knowing statistical characteristics of the subjects or data, significantly outperforms the compared methods, with ten-fold higher verification rates at a FAR of 0.1%.
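
The metric quoted above, verification rate at a fixed false accept rate (FAR), can be computed from two lists of match scores. The sketch below is a generic illustration rather than the MBGC evaluation protocol: it assumes higher scores mean greater similarity, accepts a pair when its score is strictly above the threshold, and the function name is hypothetical.

```python
def verification_rate_at_far(genuine, impostor, far):
    """Fraction of genuine pairs accepted at the most lenient threshold whose
    false accept rate on the impostor pairs does not exceed `far`.

    genuine  -- scores of same-person comparisons
    impostor -- scores of different-person comparisons
    far      -- target false accept rate, e.g. 0.001 for 0.1%
    """
    ranked = sorted(impostor, reverse=True)
    allowed = int(far * len(ranked))  # impostor pairs that may be wrongly accepted
    # Place the threshold at the (allowed + 1)-th highest impostor score, so
    # that at most `allowed` impostor scores lie strictly above it.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    accepted = sum(1 for s in genuine if s > threshold)
    return accepted / len(genuine)
```

A FAR of 0.1% only becomes meaningful with at least a thousand impostor comparisons; with fewer, `allowed` is zero and the threshold is simply the highest impostor score.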

Book ChapterDOI
01 Jan 2009
TL;DR: The notion of remote biometrics or biometrics at a distance is today of paramount importance to provide a secure means for user-friendly identification and surveillance.
Abstract: The notion of remote biometrics or biometrics at a distance is today of paramount importance to provide a secure means for user-friendly identification and surveillance.

Book ChapterDOI
01 Jan 2009
TL;DR: This chapter analyzes issues in FRAD system design, which are not addressed in near-distance face recognition, and presents effective solutions for making FRAD systems for practical deployments.
Abstract: Face recognition at a distance (FRAD) is one of the most challenging forms of face recognition applications. In this chapter, we analyze issues in FRAD system design, which are not addressed in near-distance face recognition, and present effective solutions for making FRAD systems for practical deployments. Evaluation of FRAD systems is discussed.

Journal ArticleDOI
TL;DR: P-LDA is developed, in which perturbation random vectors are introduced to learn the effect of the difference between the class empirical mean and its expectation in Fisher criterion, and which outperforms the popular Fisher's LDA-based algorithms in the undersampled case.

Book ChapterDOI
04 Jun 2009
TL;DR: A Bayesian method based on Markov Random Fields modeling for face recognition provides a new perspective for modeling the face recognition problem and demonstrates promising results.
Abstract: In this paper, a Bayesian method for face recognition is proposed based on Markov Random Fields (MRF) modeling. Constraints on image features as well as contextual relationships between them are explored and encoded into a cost function derived based on a statistical model of MRF. Gabor wavelet coefficients are used as the base features, and relationships between Gabor features at different pixel locations are used to provide higher order contextual constraints. The posterior probability of matching configuration is derived based on MRF modeling. Local search and discriminate analysis are used to evaluate local matches, and a contextual constraint is applied to evaluate mutual matches between local matches. The proposed MRF method provides a new perspective for modeling the face recognition problem. Experiments demonstrate promising results.

Book ChapterDOI
04 Jun 2009
TL;DR: Experiments show the effectiveness of the proposed method for partial face alignment based on scale invariant feature transform, especially when PCA subspace, shape and temporal constraint are utilized.
Abstract: Face recognition with partial face images is an important problem in face biometrics. The necessity can arise in less constrained environments such as in surveillance video, or portal video as provided in Multiple Biometrics Grand Challenge (MBGC). Face alignment with partial face images is a key step toward this challenging problem. In this paper, we present a method for partial face alignment based on scale invariant feature transform (SIFT). We first train a reference model using holistic faces, in which the anchor points and their corresponding descriptor subspaces are learned from initial SIFT keypoints and the relationships between the anchor points are also derived. In the alignment stage, correspondences between the learned holistic face model and an input partial face image are established by matching keypoints of the partial face to the anchor points of the learned face model. Furthermore, a shape constraint is used to eliminate outlier correspondences and a temporal constraint is explored to find more inliers. Alignment is finally accomplished by solving for a similarity transform. Experiments on the MBGC near infrared video sequences show the effectiveness of the proposed method, especially when the PCA subspace, shape and temporal constraints are utilized.
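
The final alignment step, solving a similarity transform from point correspondences, has a compact closed-form least-squares solution when 2-D points are written as complex numbers. The sketch below shows that generic solution under the assumption that outliers have already been removed; it is not taken from the paper, and the function names are hypothetical.

```python
import cmath


def fit_similarity(src, dst):
    """Least-squares similarity transform mapping src points onto dst points.

    Points are (x, y) tuples. With points as complex numbers the model is
    dst ~ a * src + b, where abs(a) is the scale, the phase of a is the
    rotation angle and b is the translation. Reflections are not modelled.
    """
    p = [complex(x, y) for x, y in src]
    q = [complex(x, y) for x, y in dst]
    p_mean = sum(p) / len(p)
    q_mean = sum(q) / len(q)
    num = sum((pi - p_mean).conjugate() * (qi - q_mean) for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    a = num / den
    b = q_mean - a * p_mean
    return a, b


def scale_and_angle(a):
    """Split the complex coefficient into (scale, rotation angle in radians)."""
    return abs(a), cmath.phase(a)
```

At least two distinct source points are needed, otherwise the denominator is zero. With noisy keypoint matches, the fit is usually wrapped in an outlier rejection scheme, which is the role the shape and temporal constraints play in the abstract.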

Proceedings ArticleDOI
20 Jun 2009
TL;DR: This work designs an adaptive object classification framework which automatically adjusts to different scenes and proposes a Bayesian classifier based method to detect and remove outliers, to cope with the potential generalization disaster resulting from the use of high-confidence but incorrectly classified training samples.
Abstract: Surveillance systems involving hundreds of cameras have become very popular. Due to the various positions and orientations of cameras, object appearance changes dramatically across scenes. Traditional appearance based object classification methods tend to fail in these situations. We approach the problem by designing an adaptive object classification framework which automatically adjusts to different scenes. First, a baseline object classifier is applied to a specific scene, generating training samples with extracted scene-specific features (such as object position). Based on that, a bilateral weighted LDA is trained under the guidance of sample confidence. Moreover, we propose a Bayesian classifier based method to detect and remove outliers, to cope with the potential generalization disaster resulting from the use of high-confidence but incorrectly classified training samples. To validate these ideas, we implement the framework in an intelligent surveillance system. Experimental results demonstrate the effectiveness of this adaptive object classification framework.

Book ChapterDOI
Zhen Lei, Shengcai Liao, Dong Yi, Rui Qin, Stan Z. Li
04 Jun 2009
TL;DR: The proposed LSR-LDA method improves the recognition accuracy over conventional LDA by using the LSR step, and the final projection matrix is obtained by multiplying the pre-transformation matrix and the projective directions of LDA.
Abstract: Linear discriminant analysis (LDA) is a popular method in pattern recognition and is equivalent to the Bayesian method when the samples of different classes follow Gaussian distributions with the same covariance matrix. However, in the real world, the distribution of data is usually far more complex, and the assumption of Gaussian densities with the same covariance is seldom met, which greatly affects the performance of LDA. In this paper, we propose an effective and efficient two-step LDA, called LSR-LDA, to alleviate the effect of irregular distributions and improve the result of LDA. First, the samples are normalized so that the variances of variables in each class are consistent, and a pre-transformation matrix from the original data to the normalized data is learned using least squares regression (LSR); second, conventional LDA is conducted on the normalized data to find the most discriminant projective directions. The final projection matrix is obtained by multiplying the pre-transformation matrix and the projective directions of LDA. Experimental results on the FERET and FRGC ver 2.0 face databases show that the proposed LSR-LDA method improves the recognition accuracy over conventional LDA by using the LSR step.


Proceedings ArticleDOI
20 Jun 2009
TL;DR: Stochastic gradient mean-shift (SG-MS) is presented along with its approximation performance analysis and applied to the speedup of Gaussian blurring mean-shift (GBMS) clustering, showing that SG-GBMS significantly outperforms Naive-GBMS in running time.
Abstract: As a well known fixed-point iteration algorithm for kernel density mode-seeking, mean-shift has attracted wide attention in the pattern recognition field. To date, the mean-shift algorithm is typically implemented in a batch way with the entire data set known at once. In this paper, based on the stochastic gradient optimization technique, we present the stochastic gradient mean-shift (SG-MS) along with its approximation performance analysis. We apply SG-MS to the speedup of Gaussian blurring mean-shift (GBMS) clustering. Experiments on toy problems and image segmentation show that, while the clustering accuracy is comparable between SG-GBMS and Naive-GBMS, the former significantly outperforms the latter in running time.
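
For readers unfamiliar with the fixed-point iteration the abstract refers to, here is a minimal sketch of ordinary batch mean-shift with a Gaussian kernel on 1-D data. It is the standard algorithm that uses the whole data set at every step, not the stochastic gradient variant (SG-MS) the paper proposes; the function name and default tolerances are assumptions for this example.

```python
import math


def mean_shift_mode(x, data, bandwidth, tol=1e-8, max_iter=500):
    """Move x to a nearby mode of the Gaussian kernel density estimate of data.

    Each step replaces x by the kernel-weighted mean of all data points,
    which is the classic mean-shift fixed-point update.
    """
    for _ in range(max_iter):
        weights = [math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data]
        new_x = sum(w * d for w, d in zip(weights, data)) / sum(weights)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x
```

Starting points that converge to the same mode are assigned to the same cluster. Every update touches every data point, which is the batch cost that a stochastic variant aims to reduce.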

Book ChapterDOI
01 Jan 2009
TL;DR: A probabilistic distribution function has two essential elements: the form of the function and the involved parameters; the focus here is on estimating the involved parameters.
Abstract: A probabilistic distribution function has two essential elements: the form of the function and the involved parameters. For example, the joint distribution of an MRF is characterized by a Gibbs function with a set of clique potential parameters, and the noise by a zero-mean Gaussian distribution parameterized by a variance. A probability model is incomplete if the involved parameters are not all specified, even if the functional form of the distribution is known. While formulating the forms of objective functions such as the posterior distribution has long been a subject of research in vision, estimating the involved parameters has a much shorter history. Generally, it is performed by optimizing a statistical criterion, e.g. using existing techniques such as maximum likelihood, coding, pseudo-likelihood, expectation-maximization, Bayes

Proceedings ArticleDOI
01 Sep 2009
TL;DR: Experimental results show that the proposed GRF-NMF algorithm significantly outperforms other NMF related algorithms in sparsity, smoothness, and locality of the learned components.
Abstract: In this paper, we present a Gibbs Random Field (GRF) modeling based Nonnegative Matrix Factorization (NMF) algorithm, called GRF-NMF. We propose to treat the component matrix of NMF as a Gibbs random field. Since each component presents a localized object part, as usually expected, we propose an energy function with the prior knowledge of smoothness and locality. This way of directly modeling the structure of components makes the algorithm able to learn sparse, smooth, and localized object parts. Furthermore, we find that at each update iteration, the constrained term can be processed conveniently via local filtering on components. Finally we give a well established convergence proof for the derived algorithm. Experimental results on both synthesized and real image databases show that the proposed GRF-NMF algorithm significantly outperforms other NMF related algorithms in sparsity, smoothness, and locality of the learned components.

Proceedings ArticleDOI
01 Sep 2009
TL;DR: Following traditional linear discriminant analysis (LDA), the between-class and within-class scatter are redefined with a correlation metric, and an efficient Stepwise Correlation metric based Discriminant Analysis (SCDA) method is proposed to derive a sub-optimal discriminant subspace for classification with correlation similarity.
Abstract: Face recognition is a great challenge in practice. Subspace learning is one of the dominant approaches and has achieved great success in the face recognition area. In subspace learning, many studies have found that correlation similarity (e.g. cosine distance) usually achieves better classification results than L2 distance with a nearest neighbor (NN) classifier in Euclidean space. However, most traditional methods optimize an objective function based on L2 distance, which is inconsistent with the classification rule. It is reasonable to expect better results from optimizing the objective function with the correlation metric directly. In this paper, following traditional linear discriminant analysis (LDA), we redefine the between-class and within-class scatter with a correlation metric and propose an efficient Stepwise Correlation metric based Discriminant Analysis (SCDA) method to derive a sub-optimal discriminant subspace for classification with correlation similarity. Moreover, we propose a novel weighted fusion mechanism to learn the optimal combination of multi-probe images to be classified. Extensive experiments on the PIE and extended Yale-B databases validate the effectiveness of SCDA and the learning based weighted image fusion method.
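
The mismatch the abstract describes, between an L2-based objective and a correlation-based classification rule, is easy to see on a toy example where the two nearest-neighbor rules disagree. The sketch below is illustrative only; the gallery vectors, labels and function names are made up for this example and have nothing to do with the paper's data.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_by_l2(probe, gallery):
    """Label of the gallery vector with the smallest L2 distance to the probe."""
    return min(gallery, key=lambda label: l2_distance(probe, gallery[label]))


def nearest_by_cosine(probe, gallery):
    """Label of the gallery vector with the largest cosine similarity to the probe."""
    return max(gallery, key=lambda label: cosine_similarity(probe, gallery[label]))


# Toy gallery: "A" points in the same direction as the probe but is far away
# in magnitude; "B" is close in magnitude but points elsewhere.
gallery = {"A": [10.0, 0.0], "B": [1.0, 1.0]}
probe = [2.0, 0.2]
```

Here `nearest_by_l2(probe, gallery)` picks "B" while `nearest_by_cosine(probe, gallery)` picks "A", because cosine similarity ignores vector length. A subspace trained to separate classes under one measure is therefore not guaranteed to separate them under the other.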


Book ChapterDOI
01 Jan 2009
TL;DR: This chapter presents MAP-MRF formulations for solving high level vision tasks, such as object matching and recognition and pose estimation, which fall into categories LP3 and LP4.
Abstract: High level vision tasks, such as object matching and recognition and pose estimation, are performed on features extracted from images. The arrangements of such features are usually irregular and hence the problems fall into categories LP3 and LP4. In this chapter, we present MAP-MRF formulations for solving these problems.

Book ChapterDOI
01 Jan 2009
TL;DR: The minimal solution is usually defined as the global one, or one of them when there are multiple global minima; finding a global minimum is non-trivial when the energy function contains many local minima.
Abstract: The minimal solution is usually defined as the global one or one of them when there are multiple global minima. Finding a global minimum is non-trivial if the energy function contains many local minima. Whereas methods for local minimization are quite mature, with commercial software on the market, the study of global minimization is still young. There are no efficient algorithms which are guaranteed to find globally minimal solutions as there are for local minimization.

01 Jan 2009
TL;DR: P-LDA is developed, in which perturbation random vectors are introduced to learn the effect of the difference between the class empirical mean and its expectation in Fisher criterion, and which outperforms the popular Fisher's LDA-based algorithms in the undersampled case.
Abstract: Fisher's linear discriminant analysis (LDA) is popular for dimension reduction and extraction of discriminant features in many pattern recognition applications, especially biometric learning. In deriving the Fisher's LDA formulation, there is an assumption that the class empirical mean is equal to its expectation. However, this assumption may not be valid in practice. In this paper, from the “perturbation” perspective, we develop a new algorithm, called perturbation LDA (P-LDA), in which perturbation random vectors are introduced to learn the effect of the difference between the class empirical mean and its expectation in Fisher criterion. This perturbation learning in Fisher criterion would yield new forms of within-class and between-class covariance matrices integrated with some perturbation factors. Moreover, a method is proposed for estimation of the covariance matrices of perturbation random vectors for practical implementation. The proposed P-LDA is evaluated on both synthetic data sets and real face image data sets. Experimental results show that P-LDA outperforms the popular Fisher's LDA-based algorithms in the undersampled case.
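
For reference, the classical Fisher criterion that the abstract starts from can be written as below. This is the standard textbook form, given in notation chosen for this note rather than the paper's; it shows where the class empirical means enter, which is the quantity P-LDA treats as a perturbed estimate of the true class expectation. The perturbation terms themselves are not reproduced here.

```latex
% Illustrative notation (not taken from the paper):
% C_k is the set of samples in class k, N_k = |C_k|, N = \sum_k N_k.
\begin{align}
  \hat{m}_k &= \frac{1}{N_k} \sum_{x \in C_k} x,
  &
  \hat{m} &= \frac{1}{N} \sum_{k} \sum_{x \in C_k} x, \\
  S_w &= \sum_{k} \sum_{x \in C_k} (x - \hat{m}_k)(x - \hat{m}_k)^{\top},
  &
  S_b &= \sum_{k} N_k \, (\hat{m}_k - \hat{m})(\hat{m}_k - \hat{m})^{\top}, \\
  J(w) &= \frac{w^{\top} S_b \, w}{w^{\top} S_w \, w}.
\end{align}
```

Both scatter matrices are built from the empirical means, so with few samples per class any gap between an empirical mean and its expectation feeds directly into the criterion. That is the undersampled setting in which the abstract reports gains.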