Showing papers by "Shiguang Shan" published in 2008


Journal ArticleDOI
01 Jan 2008
TL;DR: The evaluation protocol based on the CAS-PEAL-R1 database is discussed, and the performance of four algorithms is presented as a baseline to do the following: elementarily assess the difficulty of the database for face recognition algorithms; provide reference evaluation results for researchers using the database; and identify the strengths and weaknesses of the commonly used algorithms.
Abstract: In this paper, we describe the acquisition and contents of a large-scale Chinese face database: the CAS-PEAL face database. The goals of creating the CAS-PEAL face database include the following: 1) providing face recognition researchers worldwide with different sources of variations, particularly pose, expression, accessories, and lighting (PEAL), and exhaustive ground-truth information in one uniform database; 2) advancing the state-of-the-art face recognition technologies aiming at practical applications by using off-the-shelf imaging equipment and by designing normal face variations in the database; and 3) providing a large-scale face database of Mongolian. Currently, the CAS-PEAL face database contains 99 594 images of 1040 individuals (595 males and 445 females). A total of nine cameras are mounted horizontally on an arc arm to simultaneously capture images across different poses. Each subject is asked to look straight ahead, up, and down to obtain 27 images in three shots. Five facial expressions, six accessories, and 15 lighting changes are also included in the database. A selected subset of the database (CAS-PEAL-R1, containing 30 863 images of the 1040 subjects) is available to other researchers now. We discuss the evaluation protocol based on the CAS-PEAL-R1 database and present the performance of four algorithms as a baseline to do the following: 1) elementarily assess the difficulty of the database for face recognition algorithms; 2) provide reference evaluation results for researchers using the database; and 3) identify the strengths and weaknesses of the commonly used algorithms.

971 citations


Proceedings ArticleDOI
23 Jun 2008
TL;DR: The proposed MMD method outperforms the competing methods on the task of Face Recognition based on Image Set, and a novel manifold learning approach is proposed, which expresses a manifold by a collection of local linear models, each depicted by a subspace.
Abstract: In this paper, we address the problem of classifying image sets, each of which contains images belonging to the same class but covering large variations in, for instance, viewpoint and illumination. We innovatively formulate the problem as the computation of Manifold-Manifold Distance (MMD), i.e., calculating the distance between nonlinear manifolds each representing one image set. To compute MMD, we also propose a novel manifold learning approach, which expresses a manifold by a collection of local linear models, each depicted by a subspace. MMD is then converted to integrating the distances between pairs of subspaces, one from each of the involved manifolds. The proposed MMD method is evaluated on the task of Face Recognition based on Image Set (FRIS). In FRIS, each known subject is enrolled with a set of facial images and modeled as a gallery manifold, while a testing subject is modeled as a probe manifold, which is then matched against all the gallery manifolds by MMD. Identification is achieved by seeking the minimum MMD. Experimental results on two public face databases, Honda/UCSD and CMU MoBo, demonstrate that the proposed MMD method outperforms the competing methods.

443 citations


Proceedings ArticleDOI
23 Jun 2008
TL;DR: A novel type of feature for fast and accurate face detection, called the Locally Assembled Binary (LAB) Haar feature, is described; it is inspired by the success of the Haar feature and Local Binary Pattern for face detection, but it is far beyond a simple combination.
Abstract: In this paper, we describe a novel type of feature for fast and accurate face detection. The feature is called Locally Assembled Binary (LAB) Haar feature. LAB feature is basically inspired by the success of Haar feature and Local Binary Pattern (LBP) for face detection, but it is far beyond a simple combination. In our method, Haar features are modified to keep only the ordinal relationship (named binary Haar feature) rather than the difference between the accumulated intensities. Several neighboring binary Haar features are then assembled to capture their co-occurrence, with an idea similar to LBP. We show that the feature is more efficient than Haar feature and LBP both in discriminating power and computational cost. Furthermore, a novel efficient detection method called feature-centric cascade is proposed to build an efficient detector, which is developed from the feature-centric method. Experimental results on the CMU+MIT frontal face test set and CMU profile test set show that the proposed method can achieve very good results and amazing detection speed.

116 citations
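
The idea in the abstract above (keep only the ordinal relation between rectangle sums, then assemble neighbouring binary comparisons into an LBP-style code) can be illustrated with a short Python sketch. It is not the authors' implementation: the 3x3 layout of equal rectangles, the clockwise bit order, the `>=` comparison, and the names `integral_image`, `rect_sum`, and `lab_code` are all assumptions made for illustration.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y) and size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def lab_code(ii, x, y, w, h):
    """8-bit code for a 3x3 grid of w x h rectangles with top-left (x, y).

    Each bit records only the ordinal relation (neighbour sum >= centre sum),
    not the magnitude of the difference. Layout and bit order are assumed.
    """
    centre = rect_sum(ii, x + w, y + h, w, h)
    # Neighbour cells of the 3x3 grid, clockwise from the top-left cell.
    offsets = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
    code = 0
    for bit, (cx, cy) in enumerate(offsets):
        neighbour = rect_sum(ii, x + cx * w, y + cy * h, w, h)
        if neighbour >= centre:
            code |= 1 << bit
    return code
```

Because every rectangle sum costs four table lookups, one code needs nine rectangle sums regardless of rectangle size, which is the usual reason for pairing Haar-like features with an integral image.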


Proceedings ArticleDOI
01 Sep 2008
TL;DR: The experimental results in this paper show that designing feature set for age estimation under the guidance of hierarchical face model is a promising method and a flexible framework as well.
Abstract: A key point in automatic age estimation is to design a feature set essential to age perception. To achieve this goal, this paper builds up a hierarchical graphical face model for faces appearing at low, middle and high resolution respectively. Along the hierarchy, a face image is decomposed into detailed parts from coarse to fine. Then four types of features are extracted from this graph representation guided by the priors of the aging process embedded in the graphical model: topology, geometry, photometry and configuration. For age estimation, this paper follows the popular regression formulation for mapping feature vectors to their age labels. The effectiveness of the presented feature set is justified by testing results on two datasets using different kinds of regression methods. The experimental results in this paper show that designing a feature set for age estimation under the guidance of a hierarchical face model is a promising method and a flexible framework as well.

92 citations


Proceedings ArticleDOI
23 Jun 2008
TL;DR: This paper proposes the Weber local descriptor (WLD), a simple yet very powerful and robust local descriptor; a single classifier trained on WLD features obtains performance comparable to state-of-the-art methods on the MIT+CMU frontal face test set, the AR face dataset and the CMU profile test set.
Abstract: Inspired by Weber's law, this paper proposes a simple, yet very powerful and robust local descriptor, Weber local descriptor (WLD). It is based on the fact that human perception of a pattern depends on not only the change of a stimulus (such as sound, lighting, etc.) but also the original intensity of the stimulus. Specifically, WLD consists of two components: its differential excitation and orientation. A differential excitation is a function of the ratio between two terms: one is the relative intensity differences of its neighbors against a current pixel; the other is the intensity of the current pixel. An orientation is the gradient orientation of the current pixel. For a given image, we use the differential excitation and the orientation components to construct a concatenated WLD histogram feature. Experimental results on Brodatz textures show that WLD impressively outperforms the other classical descriptors (e.g., Gabor). In particular, experimental results on face detection show a promising performance. Although we train only one classifier based on WLD features, the classifier obtains a comparable performance to state-of-the-art methods on MIT+CMU frontal face test set, AR face dataset and CMU profile test set.

50 citations
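
The two WLD components described in the abstract above can be sketched per pixel as follows. This is a minimal illustration under stated assumptions, not the paper's code: the abstract only says the differential excitation is "a function of the ratio", so the use of arctan as that function, the 8-neighbourhood, the central-difference gradient, the `eps` guard against division by zero, and the name `wld_components` are all choices made here. The histogram construction step is not covered.

```python
import math


def wld_components(img, y, x, eps=1e-6):
    """Return (differential_excitation, orientation) at interior pixel (y, x)."""
    centre = float(img[y][x])
    # Sum of intensity differences between the 8 neighbours and the centre.
    diff_sum = 0.0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:
                diff_sum += img[y + dy][x + dx] - centre
    # Ratio of summed differences to centre intensity, squashed by arctan
    # (the squashing function is an assumption of this sketch).
    excitation = math.atan(diff_sum / (centre + eps))
    # Gradient orientation from central differences (assumed operator).
    gy = img[y + 1][x] - img[y - 1][x]
    gx = img[y][x + 1] - img[y][x - 1]
    orientation = math.atan2(gy, gx)
    return excitation, orientation
```

Dividing by the centre intensity is what makes the response relative: the same absolute change in the neighbourhood yields a smaller excitation on a bright pixel than on a dark one, which mirrors the Weber's-law motivation in the abstract.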


Proceedings ArticleDOI
01 Dec 2008
TL;DR: This paper proposes volume based local Gabor binary patterns (V-LGBP) for face representation and recognition, which convert the Gabor transformed images into multiple index maps, and redefines their uniform patterns via statistical analysis.
Abstract: In this paper, we propose volume based local Gabor binary patterns (V-LGBP) for face representation and recognition. In our method, the Gabor feature set of each gray image is regarded as a three dimensional "volume", where the first two dimensions are spatial domain and the third dimension is the Gabor filter index. Then, the neighborhood order relationship in the "volume" is encoded by local binary patterns (LBP), which converts the Gabor transformed images into multiple index maps. Finally, the spatial histograms of all the V-LGBP index maps are concatenated together to represent the facial appearances. In addition, in order to reflect the uniform appearances of V-LGBP, its uniform patterns are redefined via statistical analysis. Extensive experiments on FERET dataset validate the effectiveness of our approach.

36 citations


Journal ArticleDOI
01 Dec 2008
TL;DR: It is argued and revealed that the asymmetry in the intensities of each row of the face image is closely relevant to the yaw rotation of the head and, at the same time, evidently insensitive to the identity of the input face.
Abstract: This paper proposes a novel method to estimate the head yaw rotations based on the asymmetry of 2-D facial appearance. In traditional appearance-based pose estimation methods, features are typically extracted holistically by subspace analysis such as principal component analysis, linear discriminant analysis (LDA), etc., which are not designed to directly model the pose variations. In this paper, we argue and reveal that the asymmetry in the intensities of each row of the face image is closely relevant to the yaw rotation of the head and, at the same time, evidently insensitive to the identity of the input face. Specifically, to extract the asymmetry information, 1-D Gabor filters and Fourier transform are exploited. LDA is further applied to the asymmetry features to enhance the discrimination ability. By using the simple nearest centroid classifier, experimental results on two multipose databases show that the proposed features outperform other features. In particular, the generalization of the proposed asymmetry features is verified by the impressive performance when the training and the testing data sets are heterogeneous.

32 citations


Proceedings ArticleDOI
23 Jun 2008
TL;DR: A novel framework called unified principal component analysis (UPCA) is proposed, from which the traditional PCA and its 2D counterparts can be deduced as special cases, and a novel concept, named generalized covariance matrix (GCM), is introduced.
Abstract: Recently, 2DPCA and its variants have attracted much attention in the face recognition area. In this paper, some efforts are made to discover the underlying fundamentals of these methods, and a novel framework called unified principal component analysis (UPCA) is proposed. First, we introduce a novel concept, named generalized covariance matrix (GCM), which is naturally derived from the traditional covariance matrix (CM). Each element of GCM is a generalized covariance of two random vectors rather than two scalar variables in CM. Based on GCM, the UPCA framework is proposed, from which the traditional PCA and its 2D counterparts can be deduced as special cases. Furthermore, under the UPCA framework, we not only revisit the existing 2D PCA methods and their limitations, but also propose two new methods: the grid-sampling method (GridPCA) and the intra-group correlation reduction method. Extensive experimental results on the FERET face database support the theoretical analysis and validate the feasibility of the proposed methods.

26 citations


Proceedings ArticleDOI
01 Sep 2008
TL;DR: A novel homomorphic wavelet filtering based illumination transfer technique is proposed to replace the dominant lighting of one face image (source face image) with that of another face image (reference face image); it is quite effective for both illumination transfer and illumination-insensitive face recognition.
Abstract: In this paper, we propose a novel homomorphic wavelet filtering based illumination transfer technique to change the dominant lighting of one face image (source face image) to that of another face image (reference face image). Specifically, in the proposed method, based on the "reflectance-illumination" imaging model, we first obtain an approximate estimate of the illumination component of the face image through a wavelet-based Multiresolution Analysis (MRA) in the logarithm domain of the input image. Then, a homomorphic filtering procedure is applied to improve the accuracy of the illumination component estimation. Finally, the source face image is re-lighted by substituting the estimated illumination component of the reference image for that of the source image. The proposed method is entirely an image processing based method without any 3D geometry modeling steps, so it is simple but effective. The method is also applied easily to illumination invariant face recognition by transferring a standard illumination to all the face images. Experimental results show that our method is quite effective for both illumination transfer and illumination-insensitive face recognition.

20 citations
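
The substitution step in the abstract above follows from the reflectance-illumination model: if an image is the product of reflectance and illumination, the two become additive in the logarithm domain, so the illumination of a reference image can be swapped in by subtraction and addition. The Python sketch below illustrates only that step. It replaces the wavelet-based multiresolution analysis and homomorphic filtering of the paper with a plain box blur as the low-frequency illumination estimate, which is a deliberate simplification; the function names, the `radius` parameter, the `+ 1.0` offset that keeps the logarithm finite, and the requirement that both images have the same size are assumptions of this sketch.

```python
import math


def box_blur(img, radius):
    """Mean filter whose window is truncated at the image borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def transfer_illumination(source, reference, radius=2):
    """Relight `source` with the low-frequency illumination of `reference`."""
    # Move to the log domain, where reflectance and illumination add.
    log_src = [[math.log(v + 1.0) for v in row] for row in source]
    log_ref = [[math.log(v + 1.0) for v in row] for row in reference]
    # Crude log-illumination estimates (stand-in for the wavelet-based step).
    illum_src = box_blur(log_src, radius)
    illum_ref = box_blur(log_ref, radius)
    h, w = len(source), len(source[0])
    # Remove the source illumination, add the reference illumination, invert.
    return [
        [math.exp(log_src[y][x] - illum_src[y][x] + illum_ref[y][x]) - 1.0
         for x in range(w)]
        for y in range(h)
    ]
```

Two sanity checks follow from the construction: transferring an image's illumination onto itself returns the image unchanged, and relighting a uniform image with a uniform reference yields the reference intensity.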


Proceedings ArticleDOI
01 Dec 2008
TL;DR: A robust hierarchical background subtraction technique which takes the spatial relations of neighboring pixels in a local region into account to detect objects in difficult conditions is proposed.
Abstract: We propose a robust hierarchical background subtraction technique which takes the spatial relations of neighboring pixels in a local region into account to detect objects in difficult conditions. Our algorithm combines a per-pixel with a per-region background model in a hierarchical manner, which accentuates the advantages of each. This is a natural combination because the two models have complementary strengths. The per-pixel background model is achieved by a mixture of Gaussians model (GMM) with RGB features. Although it precisely describes background change in high resolution, it suffers from sensitivity to quick variations in dynamic environments. To tolerate these quick variations, we further develop a novel GMM based per-region background model, which is updated by the cluster centers obtained from a k-means clustering of the pixels' RGB features in the region. Numerical and qualitative experimental results on challenging videos demonstrate the robustness of the proposed method.

15 citations


Proceedings ArticleDOI
01 Dec 2008
TL;DR: A new method is proposed which predicts high-resolution images and the corresponding features simultaneously and does not require super-resolution as an explicit preprocessing step, and explores a constrained hallucination that considers the local consistency in the image grid.
Abstract: In facial image analysis, image resolution is an important factor which has great influence on the performance of face recognition systems. For the low-resolution face recognition problem, traditional methods usually carry out super-resolution first before passing the super-resolved image to a face recognition system. In this paper, we propose a new method which predicts high-resolution images and the corresponding features simultaneously. More specifically, we propose "feature hallucination" to project low-resolution facial images into an expected feature space. As a result, the proposed method does not require super-resolution as an explicit preprocessing step. In addition, we explore a constrained hallucination that considers the local consistency in the image grid. In our method, we use the index of local visual primitives as features and a block-based histogram distance to measure the similarity for face recognition. Experimental results on the FERET face database verify that the proposed method can improve both visual quality and recognition rate for low-resolution facial images.

Proceedings ArticleDOI
Annan Li, Shiguang Shan, Xilin Chen, Xiujuan Chai, Wen Gao
01 Sep 2008
TL;DR: By learning the projection onto hidden subspaces based on maximum correlation criteria and optimizing the linear transform between the hidden spaces, 3D facial shape is inferred from the intensity image.
Abstract: This paper presents a method for recovering 3D facial shape from a single image via learning the relationship between the 2D intensity images and the 3D facial shapes. With a coupled training set, the intensity images and their corresponding facial shapes make up two vector spaces respectively. But only the correlated components in both spaces are useful for inference, so there must be embedded hidden subspaces in each space which preserve the inter-space correlation information. Thus by learning the projection onto hidden subspaces based on maximum correlation criteria and optimizing the linear transform between the hidden spaces, 3D facial shape is inferred from the intensity image. The effectiveness of the method is demonstrated on both synthesized and real world data.

Proceedings ArticleDOI
01 Sep 2008
TL;DR: In CDDA, the perceptional distance between two classes is exploited to weight the outer product in the between-class scatter computation, to concentrate more on the classes difficult to separate.
Abstract: Traditional discriminant analysis treats all the involved classes equally in the computation of the between-class scatter matrix. However, we find that for many vision tasks, the classes to be processed are not equal in perception, i.e. a distance metric can be defined between the classes. Typical examples include head pose classification and age estimation. Aiming at this category of classification problem, this paper proposes a novel discriminant analysis method, called Class Distance based Discriminant Analysis (CDDA). In CDDA, the perceptional distance between two classes is exploited to weight the outer product in the between-class scatter computation, to concentrate more on the classes difficult to separate. Another novelty of CDDA is that, to preserve the within-class local structure of multimodal labeled data, the within-class scatter is re-defined by complementing the similarity of the sample pairs in the nearby classes. The method is then applied to head pose classification and age estimation problems, and experimental results demonstrate the effectiveness of CDDA.

Proceedings ArticleDOI
23 Jun 2008
TL;DR: This paper proposes a new computational paradigm named optimal discriminatory projection pursuit (ODPP), which is totally different from the traditional LDA and its variants, and shows that the new "projection pursuit" paradigm not only does not suffer from the limitations of the traditional LDA but also inherits good generalizability from the boundary attribute of candidate projections.
Abstract: Linear discriminant analysis (LDA) might be the most widely used linear feature extraction method in pattern recognition. Based on an analysis of several limitations of traditional LDA, this paper makes an effort to propose a new computational paradigm named optimal discriminatory projection pursuit (ODPP), which is totally different from the traditional LDA and its variants. Only two simple steps are involved in the proposed ODPP: one is the construction of the candidate projection set; the other is the optimal discriminatory projection pursuit. For the former step, candidate projections are generated as the difference vectors between nearest between-class boundary samples with redundancy well-controlled, while the latter is efficiently achieved by classifiability-based AdaBoost learning from the large candidate projection set. We show that the new "projection pursuit" paradigm not only does not suffer from the limitations of the traditional LDA but also inherits good generalizability from the boundary attribute of candidate projections. Extensive experimental comparisons with LDA and its variants on synthetic and real data sets show that the proposed method consistently has better performance.

Proceedings ArticleDOI
05 Dec 2008
TL;DR: A novel background subtraction algorithm is proposed which takes both texture and motion information into account and combines the texture pattern-based and motion pattern-based background models.
Abstract: In this paper, we propose a novel background subtraction algorithm, which takes both texture and motion information into account. Texture information is represented by local binary pattern (LBP), which is tolerant of illumination changes and is computationally simple. Assuming that there is significant structure in the correlations between observations across time, we propose a novel operator to extract motion information. Then, each pixel is modeled as a group of texture pattern histograms and motion pattern histograms respectively. Finally, we combine the texture pattern-based and motion pattern-based background models. Experimental results on challenging videos demonstrate the robustness and effectiveness of the proposed method.

Proceedings ArticleDOI
01 Dec 2008
TL;DR: A deep investigation on the non-parametric density estimation finds that minimizing/maximizing the distances between each data sample and its nearby similar/dissimilar samples is equivalent to minimizing an upper bound of the Bayesian error rate.
Abstract: In this paper, we propose a non-parametric discriminant analysis method (no assumption on the distributions of classes), called Parzen discriminant analysis (PDA). Through a deep investigation on the non-parametric density estimation, we find that minimizing/maximizing the distances between each data sample and its nearby similar/dissimilar samples is equivalent to minimizing an upper bound of the Bayesian error rate. Based on this theoretical analysis, we define our criterion as maximizing the average local dissimilarity scatter with respect to a fixed average local similarity scatter. All local scatters are calculated in fixed size local regions, resembling the idea of Parzen estimation. Experiments in UCI machine learning database show that our method impressively outperforms other related neighbor based non-parametric methods.