Author

Shiguang Shan

Bio: Shiguang Shan is an academic researcher from the Chinese Academy of Sciences. The author has contributed to research in topics: Facial recognition system & Face (geometry). The author has an h-index of 76 and has co-authored 475 publications receiving 23,566 citations. Previous affiliations of Shiguang Shan include University of Maryland, College Park & Media Research Center.


Papers
Proceedings Article
17 Oct 2005
TL;DR: A novel non-statistics-based face representation approach, local Gabor binary pattern histogram sequence (LGBPHS), in which no training procedure is needed to construct the face model, so the generalizability problem is naturally avoided.
Abstract: For years, researchers in the face recognition area have represented and recognized faces using subspace discriminant analysis or statistical learning. Nevertheless, these approaches always suffer from the generalizability problem. This paper proposes a novel non-statistics-based face representation approach, local Gabor binary pattern histogram sequence (LGBPHS), in which no training procedure is needed to construct the face model, so the generalizability problem is naturally avoided. In this approach, a face image is modeled as a "histogram sequence" by concatenating the histograms of all the local regions of all the local Gabor magnitude binary pattern maps. For recognition, histogram intersection is used to measure the similarity of different LGBPHSs, and the nearest-neighbor rule is used for final classification. Additionally, we propose assigning a different weight to each histogram piece when comparing two LGBPHSs. Our experimental results on the AR and FERET face databases show the validity of the proposed approach, especially for partially occluded face images; more impressively, we have achieved the best result on the FERET face database.
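The matching step described in the abstract above (histogram intersection, optional per-piece weights, nearest-neighbor classification) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy histograms, and the gallery layout are all hypothetical, and the Gabor filtering and binary pattern extraction that produce the histograms are not shown.

```python
from typing import Dict, List, Optional, Sequence

Histogram = Sequence[float]


def histogram_intersection(h1: Histogram, h2: Histogram) -> float:
    """Sum of bin-wise minima of two histograms with the same number of bins."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def weighted_similarity(
    seq_a: Sequence[Histogram],
    seq_b: Sequence[Histogram],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Similarity of two histogram sequences: weighted sum of per-piece intersections."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same number of pieces")
    if weights is None:
        # Unweighted case: every histogram piece counts equally.
        weights = [1.0] * len(seq_a)
    return sum(
        w * histogram_intersection(p, q) for w, p, q in zip(weights, seq_a, seq_b)
    )


def nearest_neighbor(
    probe: Sequence[Histogram],
    gallery: Dict[str, List[Histogram]],
    weights: Optional[Sequence[float]] = None,
) -> str:
    """Return the gallery identity whose histogram sequence is most similar to the probe."""
    return max(
        gallery, key=lambda name: weighted_similarity(probe, gallery[name], weights)
    )


# Toy data (hypothetical): two identities, two histogram pieces of three bins each.
gallery = {
    "alice": [[4, 0, 1], [2, 2, 1]],
    "bob": [[0, 3, 2], [1, 1, 3]],
}
probe = [[3, 1, 1], [2, 1, 2]]
best_match = nearest_neighbor(probe, gallery)
```

Because the similarity is a sum over pieces, down-weighting a piece (for example, one covering an occluded region) reduces its influence without changing the rest of the comparison.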

1,093 citations

Journal Article
TL;DR: Experimental results on the Brodatz and KTH-TIPS2-a texture databases show that WLD impressively outperforms the other widely used descriptors (e.g., Gabor and SIFT), and experimental results on human face detection also show a promising performance comparable to the best known results on the MIT+CMU frontal face test set, the AR face data set, and the CMU profile test set.
Abstract: Inspired by Weber's Law, this paper proposes a simple, yet very powerful and robust local descriptor, called the Weber Local Descriptor (WLD). It is based on the fact that human perception of a pattern depends not only on the change of a stimulus (such as sound or lighting) but also on the original intensity of the stimulus. Specifically, WLD consists of two components: differential excitation and orientation. The differential excitation component is a function of the ratio between two terms: one is the relative intensity differences of a current pixel against its neighbors; the other is the intensity of the current pixel. The orientation component is the gradient orientation of the current pixel. For a given image, we use the two components to construct a concatenated WLD histogram. Experimental results on the Brodatz and KTH-TIPS2-a texture databases show that WLD impressively outperforms the other widely used descriptors (e.g., Gabor and SIFT). In addition, experimental results on human face detection also show a promising performance comparable to the best known results on the MIT+CMU frontal face test set, the AR face data set, and the CMU profile test set.
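As a rough illustration of the two per-pixel components named in the abstract above, the sketch below computes them for a single 3x3 patch. Several details are assumptions not stated in the abstract: the arctangent applied to the ratio, the central-difference gradient used for orientation, the small `eps` guard against a zero center intensity, and the function name `wld_components`. The histogram construction step is not shown.

```python
import math
from typing import Sequence, Tuple


def wld_components(
    patch: Sequence[Sequence[float]], eps: float = 1e-9
) -> Tuple[float, float]:
    """Return (differential excitation, orientation) for the center of a 3x3 patch."""
    center = patch[1][1]
    neighbors = [
        patch[r][c] for r in range(3) for c in range(3) if (r, c) != (1, 1)
    ]
    # Sum of intensity differences between each neighbor and the center pixel.
    diff_sum = sum(x - center for x in neighbors)
    # Ratio of that sum to the center intensity; eps avoids division by zero (assumption).
    denominator = center if center != 0 else eps
    # The arctangent bounds the ratio to (-pi/2, pi/2); this choice is an assumption here.
    excitation = math.atan(diff_sum / denominator)
    # Gradient orientation from central differences (assumed form of the gradient).
    gx = patch[1][2] - patch[1][0]  # right minus left
    gy = patch[2][1] - patch[0][1]  # below minus above
    orientation = math.atan2(gy, gx)
    return excitation, orientation


# A flat patch has no excitation; a patch brighter around the center has a positive one.
flat = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
bright_ring = [[20, 20, 20], [20, 10, 20], [20, 20, 20]]
vertical_ramp = [[1, 1, 1], [5, 5, 5], [9, 9, 9]]
```

Dividing by the center intensity is what ties the descriptor to Weber's Law: the same absolute difference counts for less when the pixel itself is bright.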

1,007 citations

Journal Article
01 Jan 2008
TL;DR: The evaluation protocol based on the CAS-PEAL-R1 database is discussed and the performance of four algorithms is presented as a baseline to do the following: elementarily assess the difficulty of the database for face recognition algorithms; provide reference evaluation results for researchers using the database; and identify the strengths and weaknesses of the commonly used algorithms.
Abstract: In this paper, we describe the acquisition and contents of a large-scale Chinese face database: the CAS-PEAL face database. The goals of creating the CAS-PEAL face database include the following: 1) providing face recognition researchers worldwide with different sources of variations, particularly pose, expression, accessories, and lighting (PEAL), and exhaustive ground-truth information in one uniform database; 2) advancing the state-of-the-art face recognition technologies aiming at practical applications by using off-the-shelf imaging equipment and by designing normal face variations in the database; and 3) providing a large-scale face database of Mongolian. Currently, the CAS-PEAL face database contains 99 594 images of 1040 individuals (595 males and 445 females). A total of nine cameras are mounted horizontally on an arc arm to simultaneously capture images across different poses. Each subject is asked to look straight ahead, up, and down to obtain 27 images in three shots. Five facial expressions, six accessories, and 15 lighting changes are also included in the database. A selected subset of the database (CAS-PEAL-R1, containing 30 863 images of the 1040 subjects) is available to other researchers now. We discuss the evaluation protocol based on the CAS-PEAL-R1 database and present the performance of four algorithms as a baseline to do the following: 1) elementarily assess the difficulty of the database for face recognition algorithms; 2) provide reference evaluation results for researchers using the database; and 3) identify the strengths and weaknesses of the commonly used algorithms.

971 citations

Proceedings Article
27 Jun 2016
TL;DR: A novel Deep Supervised Hashing method learns compact similarity-preserving binary codes for the huge body of image data; extensive experiments show the promising performance of the method compared with the state of the art.
Abstract: In this paper, we present a new hashing method to learn compact binary codes for highly efficient image retrieval on large-scale datasets. While the complex image appearance variations still pose a great challenge to reliable retrieval, in light of the recent progress of Convolutional Neural Networks (CNNs) in learning robust image representation on various vision tasks, this paper proposes a novel Deep Supervised Hashing (DSH) method to learn compact similarity-preserving binary code for the huge body of image data. Specifically, we devise a CNN architecture that takes pairs of images (similar/dissimilar) as training inputs and encourages the output of each image to approximate discrete values (e.g. +1/-1). To this end, a loss function is elaborately designed to maximize the discriminability of the output space by encoding the supervised information from the input image pairs, and simultaneously imposing regularization on the real-valued outputs to approximate the desired discrete values. For image retrieval, incoming query images can be easily encoded by propagating them through the network and then quantizing the network outputs to a binary code representation. Extensive experiments on two large-scale datasets, CIFAR-10 and NUS-WIDE, show the promising performance of our method compared with the state of the art.
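To make the shape of such a loss concrete, here is a minimal sketch for a single pair of real-valued network outputs. It is one plausible reading of the abstract above, not necessarily the paper's exact formulation: the contrastive-style pairwise term, the L1 regularizer pulling each output toward +1/-1, the `margin` and `alpha` values, and both function names are assumptions. The CNN that produces the outputs is not shown.

```python
from typing import List, Sequence


def dsh_pair_loss(
    b1: Sequence[float],
    b2: Sequence[float],
    similar: bool,
    margin: float = 4.0,   # hypothetical margin for dissimilar pairs
    alpha: float = 0.01,   # hypothetical weight of the quantization regularizer
) -> float:
    """Pairwise loss: similarity term plus a regularizer toward +1/-1 outputs."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(b1, b2))
    if similar:
        # Similar pairs are pulled together.
        pair_term = 0.5 * sq_dist
    else:
        # Dissimilar pairs are pushed apart until they are at least `margin` away.
        pair_term = 0.5 * max(margin - sq_dist, 0.0)
    # Penalize outputs whose magnitude deviates from 1, so they are easy to binarize.
    reg = sum(abs(abs(x) - 1.0) for x in b1) + sum(abs(abs(y) - 1.0) for y in b2)
    return pair_term + alpha * reg


def quantize(outputs: Sequence[float]) -> List[int]:
    """Binarize real-valued outputs by sign, mapping zero to +1 by convention."""
    return [1 if x >= 0 else -1 for x in outputs]


# A query is encoded by quantizing its network outputs (toy values shown here).
query_code = quantize([0.7, -0.2, 0.0])
```

The regularizer is what lets retrieval use plain sign quantization: outputs trained to sit near +1 or -1 lose little information when binarized.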

699 citations

Journal Article
TL;DR: The proposed method is extended for attribute style manipulation in an unsupervised manner and outperforms the state-of-the-art on realistic attribute editing with other facial details well preserved.
Abstract: Facial attribute editing aims to manipulate single or multiple attributes on a given face image, i.e., to generate a new face image with desired attributes while preserving other details. Recently, the generative adversarial net (GAN) and encoder–decoder architecture are usually incorporated to handle this task with promising results. Based on the encoder–decoder architecture, facial attribute editing is achieved by decoding the latent representation of a given face conditioned on the desired attributes. Some existing methods attempt to establish an attribute-independent latent representation for further attribute editing. However, such an attribute-independent constraint on the latent representation is excessive because it restricts the capacity of the latent representation and may result in information loss, leading to over-smooth or distorted generation. Instead of imposing constraints on the latent representation, in this work, we propose to apply an attribute classification constraint to the generated image to just guarantee the correct change of desired attributes, i.e., to change what you want. Meanwhile, reconstruction learning is introduced to preserve attribute-excluding details, in other words, to only change what you want. Besides, adversarial learning is employed for visually realistic editing. These three components cooperate with each other, forming an effective framework for high-quality facial attribute editing, referred to as AttGAN. Furthermore, the proposed method is extended for attribute style manipulation in an unsupervised manner. Experiments on two wild datasets, CelebA and LFW, show that the proposed method outperforms the state of the art on realistic attribute editing with other facial details well preserved.

633 citations


Cited by
Proceedings Article
23 Jun 2014
TL;DR: This work revisits both the alignment step and the representation step by employing explicit 3D face modeling in order to apply a piecewise affine transformation, and derive a face representation from a nine-layer deep neural network.
Abstract: In modern face recognition, the conventional pipeline consists of four stages: detect => align => represent => classify. We revisit both the alignment step and the representation step by employing explicit 3D face modeling in order to apply a piecewise affine transformation, and derive a face representation from a nine-layer deep neural network. This deep network involves more than 120 million parameters using several locally connected layers without weight sharing, rather than the standard convolutional layers. Thus we trained it on the largest facial dataset to date, an identity-labeled dataset of four million facial images belonging to more than 4,000 identities. The learned representations, coupling the accurate model-based alignment with the large facial database, generalize remarkably well to faces in unconstrained environments, even with a simple classifier. Our method reaches an accuracy of 97.35% on the Labeled Faces in the Wild (LFW) dataset, reducing the error of the current state of the art by more than 27%, closely approaching human-level performance.

6,132 citations

Journal Article
TL;DR: This paper proposes a deep learning method for single image super-resolution (SR) that directly learns an end-to-end mapping between the low/high-resolution images.
Abstract: We propose a deep learning method for single image super-resolution (SR). Our method directly learns an end-to-end mapping between the low/high-resolution images. The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as the input and outputs the high-resolution one. We further show that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network. But unlike traditional methods that handle each component separately, our method jointly optimizes all layers. Our deep CNN has a lightweight structure, yet demonstrates state-of-the-art restoration quality, and achieves fast speed for practical on-line usage. We explore different network structures and parameter settings to achieve trade-offs between performance and speed. Moreover, we extend our network to cope with three color channels simultaneously, and show better overall reconstruction quality.

6,122 citations

01 Oct 2008
TL;DR: The database contains labeled face photographs spanning the range of conditions typically encountered in everyday life, and exhibits “natural” variability in factors such as pose, lighting, race, accessories, occlusions, and background.
Abstract: Most face databases have been created under controlled conditions to facilitate the study of specific parameters on the face recognition problem. These parameters include such variables as position, pose, lighting, background, camera quality, and gender. While there are many applications for face recognition technology in which one can control the parameters of image acquisition, there are also many applications in which the practitioner has little or no control over such parameters. This database, Labeled Faces in the Wild, is provided as an aid in studying the latter, unconstrained, recognition problem. The database contains labeled face photographs spanning the range of conditions typically encountered in everyday life. The database exhibits “natural” variability in factors such as pose, lighting, race, accessories, occlusions, and background. In addition to describing the details of the database, we provide specific experimental paradigms for which the database is suitable. This is done in an effort to make research performed with the database as consistent and comparable as possible. We provide baseline results, including results of a state of the art face recognition system combined with a face alignment system. To facilitate experimentation on the database, we provide several parallel databases, including an aligned version.

5,742 citations

Journal Article
TL;DR: This paper presents a novel and efficient facial image representation based on local binary pattern (LBP) texture features that is assessed in the face recognition problem under different challenges.
Abstract: This paper presents a novel and efficient facial image representation based on local binary pattern (LBP) texture features. The face image is divided into several regions from which the LBP feature distributions are extracted and concatenated into an enhanced feature vector to be used as a face descriptor. The performance of the proposed method is assessed in the face recognition problem under different challenges. Other applications and several extensions are also discussed.
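The region-wise descriptor described in the abstract above can be sketched as follows, assuming the basic 8-neighbor LBP operator on a 3x3 neighborhood, a regular grid of regions, and full 256-bin histograms. The function names, the bit ordering of the neighbors, and the grid size are illustrative choices, not details taken from the abstract.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[int]]

# Neighbor offsets in clockwise order starting at the top-left (ordering is a convention).
_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: Image, r: int, c: int) -> int:
    """8-bit LBP code: one bit per neighbor, set when the neighbor is >= the center."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(_OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_face_descriptor(img: Image, grid: Tuple[int, int] = (2, 2)) -> List[int]:
    """Concatenate per-region histograms of LBP codes into one feature vector."""
    rows, cols = len(img), len(img[0])
    # LBP codes exist only for interior pixels, which have all eight neighbors.
    codes = [
        [lbp_code(img, r, c) for c in range(1, cols - 1)] for r in range(1, rows - 1)
    ]
    h, w = len(codes), len(codes[0])
    grid_rows, grid_cols = grid
    descriptor: List[int] = []
    for i in range(grid_rows):
        for j in range(grid_cols):
            hist = [0] * 256
            for r in range(i * h // grid_rows, (i + 1) * h // grid_rows):
                for c in range(j * w // grid_cols, (j + 1) * w // grid_cols):
                    hist[codes[r][c]] += 1
            descriptor.extend(hist)
    return descriptor


# Toy 6x6 flat image (hypothetical): every interior pixel gets code 255.
flat_image = [[5] * 6 for _ in range(6)]
descriptor = lbp_face_descriptor(flat_image, grid=(2, 2))
```

Keeping one histogram per region, rather than one for the whole image, preserves coarse spatial layout: the position of a histogram within the concatenated vector encodes which part of the face it came from.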

5,563 citations