
Showing papers on "Facial recognition system published in 2012"


Proceedings ArticleDOI
16 Jun 2012
TL;DR: It is shown that tree-structured models are surprisingly effective at capturing global elastic deformation, while being easy to optimize unlike dense graph structures, in real-world, cluttered images.
Abstract: We present a unified model for face detection, pose estimation, and landmark estimation in real-world, cluttered images. Our model is based on a mixture of trees with a shared pool of parts; we model every facial landmark as a part and use global mixtures to capture topological changes due to viewpoint. We show that tree-structured models are surprisingly effective at capturing global elastic deformation, while being easy to optimize unlike dense graph structures. We present extensive results on standard face benchmarks, as well as a new “in the wild” annotated dataset, that suggest our system advances the state-of-the-art, sometimes considerably, for all three tasks. Though our model is modestly trained with hundreds of faces, it compares favorably to commercial systems trained with billions of examples (such as Google Picasa and face.com).
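
The tree-structured inference that makes such models easy to optimize can be illustrated with a short max-sum dynamic program. The following is a minimal, hypothetical sketch (not the authors' code): each part has an appearance score per candidate location, each tree edge a deformation cost, and one leaves-to-root pass plus backtracking recovers the best part configuration.

```python
import numpy as np

def tree_max_sum(unary, edges, deform):
    """Max-sum inference on a tree of parts.

    unary:  dict part -> (L,) array of appearance scores per candidate location
    edges:  list of (child, parent) pairs, each child appearing before its parent edge
    deform: dict (child, parent) -> (L_child, L_parent) array of deformation costs
    Returns the best total score and the argmax location of each part.
    """
    backptr = {}
    score = {p: unary[p].astype(float).copy() for p in unary}

    for child, parent in edges:                               # leaves-to-root pass
        m = score[child][:, None] - deform[(child, parent)]   # (L_child, L_parent)
        backptr[(child, parent)] = m.argmax(axis=0)
        score[parent] = score[parent] + m.max(axis=0)

    root = edges[-1][1]
    best = {root: int(score[root].argmax())}
    for child, parent in reversed(edges):                     # backtracking
        best[child] = int(backptr[(child, parent)][best[parent]])
    return float(score[root].max()), best

# toy example: 3 parts, 4 candidate locations each, part 1 is the root
rng = np.random.default_rng(0)
unary = {p: rng.normal(size=4) for p in range(3)}
edges = [(0, 1), (2, 1)]
deform = {e: rng.random((4, 4)) for e in edges}
print(tree_max_sum(unary, edges, deform))
```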

2,340 citations


01 Jan 2012
TL;DR: The principles of the LBP method and implementation to perform face recognition are presented and high recognition rates are obtained, especially compared to other face recognition methods.
Abstract: This paper presents an efficient face recognition system, i.e., feature extraction and face matching, using the local binary patterns (LBP) method. It is a texture-based algorithm for face recognition which describes the texture and shape of digital images. The preprocessed facial image is first divided into small blocks, from which LBP histograms are formed and then concatenated into a single feature vector. This feature vector plays a vital role in efficient representation of the face and is used to measure similarities by calculating the distance between images. This paper presents the principles of the method and its implementation to perform face recognition. Experiments have been carried out on the Yale data set; high recognition rates are obtained, especially compared to other face recognition methods. A few extensions are also investigated and implemented successfully to further improve the performance of the method.
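
A minimal sketch of the pipeline the abstract describes: compute a 3x3 LBP code per pixel, build per-block histograms, concatenate them into one feature vector, and compare faces with the chi-square distance. The function names and the 7x7 grid are illustrative choices, not the paper's exact settings.

```python
import numpy as np

def lbp_image(gray):
    """Basic 3x3 LBP code per interior pixel (no interpolation, no uniform patterns)."""
    c = gray[1:-1, 1:-1]
    neighbors = [gray[0:-2, 0:-2], gray[0:-2, 1:-1], gray[0:-2, 2:],
                 gray[1:-1, 2:],   gray[2:,   2:],   gray[2:,   1:-1],
                 gray[2:,   0:-2], gray[1:-1, 0:-2]]
    codes = np.zeros_like(c, dtype=np.uint8)
    for bit, n in enumerate(neighbors):      # one bit per neighbor comparison
        codes |= ((n >= c).astype(np.uint8) << bit)
    return codes

def lbp_descriptor(gray, grid=(7, 7)):
    """Concatenate per-block LBP histograms into one feature vector."""
    codes = lbp_image(gray)
    h, w = codes.shape
    feats = []
    for by in range(grid[0]):
        for bx in range(grid[1]):
            block = codes[by*h//grid[0]:(by+1)*h//grid[0],
                          bx*w//grid[1]:(bx+1)*w//grid[1]]
            hist, _ = np.histogram(block, bins=256, range=(0, 256))
            feats.append(hist / max(hist.sum(), 1))   # normalize each block
    return np.concatenate(feats)

def chi_square(a, b, eps=1e-10):
    """Chi-square distance, the usual choice for comparing LBP histograms."""
    return 0.5 * np.sum((a - b) ** 2 / (a + b + eps))
```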

717 citations


Journal ArticleDOI
TL;DR: This work proposes a conceptually simple face recognition system that achieves a high degree of robustness and stability to illumination variation, image misalignment, and partial occlusion, and demonstrates how to capture a set of training images with enough illumination variation that they span test images taken under uncontrolled illumination.
Abstract: Many classic and contemporary face recognition algorithms work well on public data sets, but degrade sharply when they are used in a real recognition system. This is mostly due to the difficulty of simultaneously handling variations in illumination, image misalignment, and occlusion in the test image. We consider a scenario where the training images are well controlled and test images are only loosely controlled. We propose a conceptually simple face recognition system that achieves a high degree of robustness and stability to illumination variation, image misalignment, and partial occlusion. The system uses tools from sparse representation to align a test face image to a set of frontal training images. The region of attraction of our alignment algorithm is computed empirically for public face data sets such as Multi-PIE. We demonstrate how to capture a set of training images with enough illumination variation that they span test images taken under uncontrolled illumination. In order to evaluate how our algorithms work under practical testing conditions, we have implemented a complete face recognition system, including a projector-based training acquisition system. Our system can efficiently and effectively recognize faces under a variety of realistic conditions, using only frontal images under the proposed illuminations as training.

669 citations


Journal ArticleDOI
TL;DR: Experimental results on the AR and FERET databases show that ESRC has better generalization ability than SRC for undersampled face recognition under variable expressions, illuminations, disguises, and ages.
Abstract: Sparse Representation-Based Classification (SRC) is a face recognition breakthrough of recent years which has successfully addressed the recognition problem when sufficient training images of each gallery subject are available. In this paper, we extend SRC to applications where there are very few training images, or even a single one, per subject. Assuming that the intraclass variations of one subject can be approximated by a sparse linear combination of those of other subjects, the Extended Sparse Representation-Based Classifier (ESRC) applies an auxiliary intraclass variant dictionary to represent the possible variation between the training and testing images. The dictionary atoms typically represent intraclass sample differences computed from either the gallery faces themselves or generic faces outside the gallery. Experimental results on the AR and FERET databases show that ESRC has better generalization ability than SRC for undersampled face recognition under variable expressions, illuminations, disguises, and ages. The superior results of ESRC suggest that if the dictionary is properly constructed, SRC algorithms can generalize well to the large-scale face recognition problem, even with a single training image per class.
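
A hypothetical sketch of the ESRC decision rule, using scikit-learn's Lasso as a stand-in l1 solver: the query is coded over the gallery plus an auxiliary intraclass-variation dictionary, and the class whose gallery atoms, together with the shared variation component, best reconstruct the query wins. The alpha value and the way the variation atoms are built are assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

def esrc_classify(y, gallery, labels, variations, alpha=0.01):
    """gallery: (d, n) one or few columns per subject; labels: (n,) subject ids;
    variations: (d, m) intraclass-difference atoms (e.g. generic faces minus
    their class means). Hypothetical sketch, not the authors' implementation."""
    D = np.hstack([gallery, variations])              # extended dictionary
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    coder.fit(D, y)
    x = coder.coef_
    x_g, x_v = x[:gallery.shape[1]], x[gallery.shape[1]:]
    shared = variations @ x_v                         # variation shared by all classes
    best, best_r = None, np.inf
    for c in np.unique(labels):
        mask = labels == c
        r = np.linalg.norm(y - gallery[:, mask] @ x_g[mask] - shared)
        if r < best_r:
            best, best_r = c, r
    return best
```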

577 citations


Journal ArticleDOI
TL;DR: Two large facial-expression databases depicting challenging real-world conditions were constructed using a semi-automatic approach via a recommender system based on subtitles.
Abstract: Two large facial-expression databases depicting challenging real-world conditions were constructed using a semi-automatic approach via a recommender system based on subtitles.

552 citations


Book ChapterDOI
07 Oct 2012
TL;DR: This paper revisits the classical Bayesian face recognition method by Baback Moghaddam et al. and proposes a new joint formulation that leads to an EM-like model learning at training time and an efficient, closed-form computation at test time.
Abstract: In this paper, we revisit the classical Bayesian face recognition method by Baback Moghaddam et al. and propose a new joint formulation. The classical Bayesian method models the appearance difference between two faces. We observe that this "difference" formulation may reduce the separability between classes. Instead, we model two faces jointly with an appropriate prior on the face representation. Our joint formulation leads to an EM-like model learning at training time and an efficient, closed-form computation at test time. In extensive experimental evaluations, our method is superior to the classical Bayesian face method and many other supervised approaches. Our method achieved 92.4% test accuracy on the challenging Labeled Faces in the Wild (LFW) dataset. Compared with the current best commercial system, we reduced the error rate by 10%.
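
The joint formulation can be made concrete by scoring a pair of (zero-mean) features under the two hypotheses, whose joint Gaussians differ only in the cross-covariance block. The paper derives an efficient closed form for this ratio; the sketch below simply evaluates the two densities directly and is an illustration under assumed inputs, not the authors' implementation.

```python
import numpy as np
from scipy.stats import multivariate_normal

def joint_bayesian_ratio(x1, x2, S_mu, S_eps):
    """Log-likelihood ratio log P([x1;x2] | same id) - log P([x1;x2] | different).
    S_mu: between-identity covariance; S_eps: within-identity covariance.
    Features are assumed centered (zero mean). Under both hypotheses the pair is
    jointly Gaussian; only the cross-covariance block differs (S_mu vs 0)."""
    d = len(x1)
    T = S_mu + S_eps
    same = np.block([[T, S_mu], [S_mu, T]])
    diff = np.block([[T, np.zeros((d, d))], [np.zeros((d, d)), T]])
    z = np.concatenate([x1, x2])
    return (multivariate_normal(mean=np.zeros(2 * d), cov=same).logpdf(z)
            - multivariate_normal(mean=np.zeros(2 * d), cov=diff).logpdf(z))
```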

487 citations


Journal ArticleDOI
TL;DR: This paper studies a feature-based method for representing faces that uses geometric relationships among facial features such as the mouth, nose, and eyes, together with Principal Component Analysis followed by a feed-forward neural network (PCA-NN).
Abstract: In modern society, face recognition has gained much attention in the field of network multimedia access. After the 9/11 tragedy, the need in India for technologies for the identification, detection, and recognition of suspects has increased. One of the most common biometric recognition techniques is face recognition, since the face is the most convenient means people use to identify each other. In this paper we study a method for representing faces based on features that use the geometric relationships among facial features such as the mouth, nose, and eyes. Feature-based face representation is done by independently matching templates of three facial regions, i.e., the eyes, mouth, and nose. Principal Component Analysis, also called the eigenfaces method, is an appearance-based technique widely used for dimensionality reduction that has recorded strong performance in face recognition. Here we study PCA followed by a feed-forward neural network, together called PCA-NN.
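
A minimal PCA-NN sketch along the lines the abstract describes, using scikit-learn: eigenface projection followed by a small feed-forward network. The component and layer sizes are made-up choices, not the paper's settings.

```python
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier

# X is an (n_samples, n_pixels) matrix of flattened face images, y the identities.
pca_nn = make_pipeline(
    PCA(n_components=50, whiten=True),                   # eigenface projection
    MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000),
)
# pca_nn.fit(X_train, y_train)
# predictions = pca_nn.predict(X_test)
```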

485 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed relationship-learning method outperforms existing face super-resolution methods in terms of both image quality and recognition accuracy.
Abstract: This paper addresses the very low resolution (VLR) problem in face recognition in which the resolution of the face image to be recognized is lower than 16 × 16. With the increasing demand of surveillance camera-based applications, the VLR problem happens in many face application systems. Existing face recognition algorithms are not able to give satisfactory performance on the VLR face image. While face super-resolution (SR) methods can be employed to enhance the resolution of the images, the existing learning-based face SR methods do not perform well on such a VLR face image. To overcome this problem, this paper proposes a novel approach to learn the relationship between the high-resolution image space and the VLR image space for face SR. Based on this new approach, two constraints, namely, new data and discriminative constraints, are designed for good visuality and face recognition applications under the VLR problem, respectively. Experimental results show that the proposed SR algorithm based on relationship learning outperforms the existing algorithms in public face databases.

467 citations


Proceedings ArticleDOI
01 Jul 2012
TL;DR: The preliminary simulation results show that optimal task partitioning algorithms significantly affect response time with heterogeneous latencies and server compute powers, and high-powered cloudlets are technically feasible and indeed help reduce overall processing time when face recognition applications run on mobile devices using the cloud as the backend servers.
Abstract: Face recognition applications for airport security and surveillance can benefit from the collaborative coupling of mobile and cloud computing as both become widely available today. This paper discusses our work on the design and implementation of face recognition applications using our mobile-cloudlet-cloud architecture, named MOCHA, and its initial performance results. The challenge lies in how to partition tasks from mobile devices to the cloud and distribute the compute load among cloud servers (cloudlets) to minimize the response time, given diverse communication latencies and server compute powers. Our preliminary simulation results show that optimal task partitioning algorithms significantly affect response time under heterogeneous latencies and compute powers. Motivated by these results, we design, implement, and validate the basic functionalities of MOCHA as a proof-of-concept, and develop algorithms that minimize the overall response time for face recognition. Our experimental results demonstrate that high-powered cloudlets are technically feasible and indeed help reduce overall processing time when face recognition applications run on mobile devices using the cloud as the backend.
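
The partitioning decision can be caricatured with a back-of-the-envelope response-time model: for each candidate server, response time is round-trip latency plus transfer time plus remote compute time, and the task goes wherever that sum is smallest. All numbers below are invented for illustration, not MOCHA's measurements.

```python
def response_time(work_flops, data_bits, server):
    """Latency + transfer time + compute time for one offloading option."""
    return (server["rtt_s"]
            + data_bits / server["bandwidth_bps"]
            + work_flops / server["flops"])

servers = {
    "local":    {"rtt_s": 0.0,  "bandwidth_bps": float("inf"), "flops": 1e9},
    "cloudlet": {"rtt_s": 0.01, "bandwidth_bps": 50e6,         "flops": 50e9},
    "cloud":    {"rtt_s": 0.10, "bandwidth_bps": 10e6,         "flops": 500e9},
}
task = {"work_flops": 5e9, "data_bits": 8e6}   # one face-matching request
times = {s: response_time(task["work_flops"], task["data_bits"], cfg)
         for s, cfg in servers.items()}
print(min(times, key=times.get), times)        # the nearby cloudlet wins here
```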

437 citations


Journal ArticleDOI
TL;DR: It is shown that an alternative to dynamic face matcher selection is to train face recognition algorithms on datasets that are evenly distributed across demographics, as this approach offers consistently high accuracy across all cohorts.
Abstract: This paper studies the influence of demographics on the performance of face recognition algorithms. The recognition accuracies of six different face recognition algorithms (three commercial, two nontrainable, and one trainable) are computed on a large scale gallery that is partitioned so that each partition consists entirely of specific demographic cohorts. Eight total cohorts are isolated based on gender (male and female), race/ethnicity (Black, White, and Hispanic), and age group (18-30, 30-50, and 50-70 years old). Experimental results demonstrate that both commercial and the nontrainable algorithms consistently have lower matching accuracies on the same cohorts (females, Blacks, and age group 18-30) than the remaining cohorts within their demographic. Additional experiments investigate the impact of the demographic distribution in the training set on the performance of a trainable face recognition algorithm. We show that the matching accuracy for race/ethnicity and age cohorts can be improved by training exclusively on that specific cohort. Operationally, this leads to a scenario, called dynamic face matcher selection, where multiple face recognition algorithms (each trained on different demographic cohorts) are available for a biometric system operator to select based on the demographic information extracted from a probe image. This procedure should lead to improved face recognition accuracy in many intelligence and law enforcement face recognition scenarios. Finally, we show that an alternative to dynamic face matcher selection is to train face recognition algorithms on datasets that are evenly distributed across demographics, as this approach offers consistently high accuracy across all cohorts.
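
The paper's suggested alternative, training on demographically balanced data, amounts to sampling the training set so every cohort is equally represented. A hypothetical sketch:

```python
import numpy as np

def balanced_subsample(cohorts, rng=np.random.default_rng(0)):
    """Index a training set so every demographic cohort contributes equally,
    capped at the size of the smallest cohort."""
    groups = {c: np.flatnonzero(cohorts == c) for c in np.unique(cohorts)}
    n = min(len(idx) for idx in groups.values())
    return np.concatenate([rng.choice(idx, size=n, replace=False)
                           for idx in groups.values()])

# toy usage: an unbalanced cohort label array
cohorts = np.array(["A"] * 100 + ["B"] * 40 + ["C"] * 10)
idx = balanced_subsample(cohorts)
print({c: int((cohorts[idx] == c).sum()) for c in "ABC"})   # 10 of each
```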

426 citations


Proceedings ArticleDOI
16 Jun 2012
TL;DR: It is shown that a recognition system using only representations obtained from deep learning can achieve comparable accuracy with a system using a combination of hand-crafted image descriptors, and empirically show that learning weights not only is necessary for obtaining good multilayer representations, but also provides robustness to the choice of the network architecture parameters.
Abstract: Most modern face recognition systems rely on a feature representation given by a hand-crafted image descriptor, such as Local Binary Patterns (LBP), and achieve improved performance by combining several such representations. In this paper, we propose deep learning as a natural source for obtaining additional, complementary representations. To learn features in high-resolution images, we make use of convolutional deep belief networks. Moreover, to take advantage of global structure in an object class, we develop local convolutional restricted Boltzmann machines, a novel convolutional learning model that exploits the global structure by not assuming stationarity of features across the image, while maintaining scalability and robustness to small misalignments. We also present a novel application of deep learning to descriptors other than pixel intensity values, such as LBP. In addition, we compare the performance of networks trained using unsupervised learning against networks with random filters, and empirically show that learning weights is necessary not only for obtaining good multilayer representations but also for robustness to the choice of the network architecture parameters. Finally, we show that a recognition system using only representations obtained from deep learning can achieve accuracy comparable to that of a system using a combination of hand-crafted image descriptors. Moreover, by combining these representations, we achieve state-of-the-art results on a real-world face verification database.

Journal ArticleDOI
TL;DR: This paper explores the effectiveness of sparse representations obtained by learning an overcomplete basis (dictionary) in the context of action recognition in videos and presents the idea of a new local spatio-temporal feature that is distinctive, scale invariant, and fast to compute.
Abstract: This paper explores the effectiveness of sparse representations obtained by learning an overcomplete basis (dictionary) in the context of action recognition in videos. Although this work concentrates on recognizing human movements (physical actions as well as facial expressions), the proposed approach is fairly general and can be used to address other classification problems. In order to model human actions, three overcomplete dictionary learning frameworks are investigated. An overcomplete dictionary is constructed using a set of spatio-temporal descriptors (extracted from the video sequences) in such a way that each descriptor is represented by some linear combination of a small number of dictionary elements. This leads to a more compact and richer representation of the video sequences compared to existing methods that involve clustering and vector quantization. For each framework, a novel classification algorithm is proposed. Additionally, this work also presents the idea of a new local spatio-temporal feature that is distinctive, scale invariant, and fast to compute. The proposed approach repeatedly achieves state-of-the-art results on several public data sets containing various physical actions and facial expressions.
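
The core step, re-expressing each spatio-temporal descriptor as a sparse combination of learned dictionary atoms, can be sketched with scikit-learn's DictionaryLearning. The sizes and sparsity level below are arbitrary stand-ins for the paper's settings.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# X: (n_descriptors, descriptor_dim), e.g. descriptors pooled from training videos.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))

dl = DictionaryLearning(n_components=64,             # overcomplete: 64 > 32
                        transform_algorithm="omp",
                        transform_n_nonzero_coefs=5)  # ~5 active atoms per descriptor
codes = dl.fit(X).transform(X)                       # (200, 64) sparse codes
print((codes != 0).sum(axis=1).mean())               # average number of active atoms
```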

01 Jan 2012
TL;DR: In this article, the authors implemented a robust face recognition system via sparse representation and convex optimization, treating each test sample as a sparse linear combination of training samples and obtaining the sparse solution via L1-minimization.
Abstract: In this project, we implement a robust face recognition system via sparse representation and convex optimization. We treat each test sample as a sparse linear combination of training samples, and obtain the sparse solution via L1-minimization. We also explore group sparseness (L2-norm) as well as normal L1-norm regularization. We discuss the role of feature extraction and the robustness of the classification to occlusion or pixel corruption in a face recognition system. The experiments demonstrate that the choice of features is no longer critical once the sparseness is properly harnessed. We also verify that the proposed algorithm outperforms other methods.
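
A compact sketch of the SRC decision rule described above, with scikit-learn's Lasso standing in for exact L1-minimization: code the test sample over all training samples, then assign the class whose coefficients give the smallest reconstruction residual. The alpha value is an assumed choice.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(y, A, labels, alpha=0.01):
    """Sparse representation-based classification (SRC), minimal sketch.
    A: (d, n) matrix of n training faces as columns (ideally l2-normalized);
    labels: (n,) class label per column; y: (d,) test face."""
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    coder.fit(A, y)
    x = coder.coef_
    residuals = {}
    for c in np.unique(labels):
        xc = np.where(labels == c, x, 0.0)       # keep only class-c coefficients
        residuals[c] = np.linalg.norm(y - A @ xc)
    return min(residuals, key=residuals.get), residuals
```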

Proceedings ArticleDOI
16 Jun 2012
TL;DR: A two-stage multi-task sparse learning (MTSL) framework is proposed to efficiently locate the common and specific patches which are important to discriminate all the expressions and only a particular expression, respectively.
Abstract: In this paper, we present a new idea for analyzing facial expressions by exploring common and specific information among different expressions. Inspired by the observation that only a few facial parts are active in expression disclosure (e.g., around the mouth and eyes), we try to discover the common and specific patches which are important for discriminating all the expressions and only a particular expression, respectively. A two-stage multi-task sparse learning (MTSL) framework is proposed to efficiently locate those discriminative patches. In the first stage of MTSL, expression recognition tasks, each of which aims to find dominant patches for one expression, are combined to locate common patches. In the second stage, two related tasks, facial expression recognition and face verification, are coupled to learn specific facial patches for individual expressions. Extensive experiments validate the existence and significance of common and specific patches. Utilizing these learned patches, we achieve superior performance on expression recognition compared to the state of the art.

Journal ArticleDOI
01 Feb 2012
TL;DR: The proposed fully automatic method enables the detection of a much larger range of facial behavior by recognizing facial muscle actions [action units (AUs)] that compound expressions.
Abstract: Past work on automatic analysis of facial expressions has focused mostly on detecting prototypic expressions of basic emotions like happiness and anger. The method proposed here enables the detection of a much larger range of facial behavior by recognizing facial muscle actions [action units (AUs)] that compound expressions. AUs are agnostic, leaving the inference about conveyed intent to higher order decision making (e.g., emotion recognition). The proposed fully automatic method not only allows the recognition of 22 AUs but also explicitly models their temporal characteristics (i.e., sequences of temporal segments: neutral, onset, apex, and offset). To do so, it uses a facial point detector based on Gabor-feature-based boosted classifiers to automatically localize 20 facial fiducial points. These points are tracked through a sequence of images using a method called particle filtering with factorized likelihoods. To encode AUs and their temporal activation models based on the tracking data, it applies a combination of GentleBoost, support vector machines, and hidden Markov models. We attain an average AU recognition rate of 95.3% when tested on a benchmark set of deliberately displayed facial expressions and 72% when tested on spontaneous expressions.

Journal ArticleDOI
TL;DR: Two applications of the proposed multitask joint sparse representation model to combine the strength of multiple features and/or instances for recognition are investigated: fusing multiple kernel features for object categorization and robust face recognition in video with an ensemble of query images.
Abstract: We address the problem of visual classification with multiple features and/or multiple instances. Motivated by the recent success of multitask joint covariate selection, we formulate this problem as a multitask joint sparse representation model to combine the strength of multiple features and/or instances for recognition. A joint sparsity-inducing norm is utilized to enforce class-level joint sparsity patterns among the multiple representation vectors. The proposed model can be efficiently optimized by a proximal gradient method. Furthermore, we extend our method to the setup where features are described by kernel matrices. We then investigate two applications of our method to visual classification: 1) fusing multiple kernel features for object categorization and 2) robust face recognition in video with an ensemble of query images. Extensive experiments on challenging real-world data sets demonstrate that the proposed method is competitive with the state-of-the-art methods in the respective applications.
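
The joint sparsity the abstract mentions is an l2,1-regularized problem; scikit-learn's MultiTaskLasso implements exactly that norm and can serve as a small stand-in demonstration (the paper's proximal-gradient solver and kernel extension are not reproduced here). All sizes are invented.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

# Representation vectors for several instances (e.g. query frames of one face)
# are solved together, so they select the same training samples.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 40))         # 100-dim features, 40 training faces
Y = rng.normal(size=(100, 5))          # 5 query instances (e.g. video frames)

mtl = MultiTaskLasso(alpha=0.05, fit_intercept=False, max_iter=5000)
mtl.fit(A, Y)                          # coef_: (5, 40), row-sparse across tasks
active = np.flatnonzero(np.linalg.norm(mtl.coef_, axis=0))
print("training samples selected jointly by all instances:", active)
```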

Proceedings ArticleDOI
Xudong Cao1, Yichen Wei1, Fang Wen1, Jian Sun1
16 Jun 2012
TL;DR: This paper presents a very efficient, highly accurate, “Explicit Shape Regression” approach for face alignment that significantly outperforms the state-of-the-art in terms of both accuracy and efficiency.
Abstract: We present a very efficient, highly accurate, “Explicit Shape Regression” approach for face alignment. Unlike previous regression-based approaches, we directly learn a vectorial regression function to infer the whole facial shape (a set of facial landmarks) from the image and explicitly minimize the alignment errors over the training data. The inherent shape constraint is naturally encoded into the regressor in a cascaded learning framework and applied from coarse to fine during testing, without using a fixed parametric shape model as in most previous methods. To make the regression more effective and efficient, we design a two-level boosted regression, shape-indexed features, and a correlation-based feature selection method. This combination enables us to learn accurate models from large training data in a short time (20 minutes for 2,000 training images), and run regression extremely fast at test time (15 ms for an 87-landmark shape). Experiments on challenging data show that our approach significantly outperforms the state-of-the-art in terms of both accuracy and efficiency.
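
A stripped-down cascaded shape regression loop, to make the "learn a vectorial regressor, apply it coarse to fine" idea concrete. This toy version uses ridge regression on raw shape-indexed pixel intensities rather than the paper's two-level boosted regressor with correlation-based feature selection.

```python
import numpy as np
from sklearn.linear_model import Ridge

def shape_indexed_features(image, shape):
    """Toy stand-in: sample pixel intensities at the current landmark estimates.
    image: (H, W) grayscale array; shape: (L, 2) array of (x, y) landmarks."""
    h, w = image.shape
    ys = np.clip(shape[:, 1].astype(int), 0, h - 1)
    xs = np.clip(shape[:, 0].astype(int), 0, w - 1)
    return image[ys, xs]

def train_cascade(images, true_shapes, mean_shape, n_stages=5):
    """Each stage maps shape-indexed features to a shape increment."""
    stages, shapes = [], [mean_shape.copy() for _ in images]
    for _ in range(n_stages):
        F = np.stack([shape_indexed_features(im, s) for im, s in zip(images, shapes)])
        T = np.stack([(t - s).ravel() for t, s in zip(true_shapes, shapes)])
        reg = Ridge(alpha=1.0).fit(F, T)             # one linear stage of the cascade
        for i, im in enumerate(images):              # apply the stage, coarse to fine
            delta = reg.predict(shape_indexed_features(im, shapes[i])[None])[0]
            shapes[i] = shapes[i] + delta.reshape(-1, 2)
        stages.append(reg)
    return stages
```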

Journal ArticleDOI
01 Aug 2012
TL;DR: A meta-analysis of the first periodical challenge in automatic recognition of facial expressions, held during the IEEE conference on Face and Gesture Recognition 2011, details the challenge data, evaluation protocol, and the results attained in two subchallenges.
Abstract: Automatic facial expression recognition has been an active topic in computer science for over two decades, in particular facial action coding system action unit (AU) detection and classification of a number of discrete emotion states from facial expressive imagery. Standardization and comparability have received some attention; for instance, there exist a number of commonly used facial expression databases. However, lack of a commonly accepted evaluation protocol and, typically, lack of sufficient details needed to reproduce the reported individual results make it difficult to compare systems. This, in turn, hinders the progress of the field. A periodical challenge in facial expression recognition would allow such a comparison on a level playing field. It would provide an insight on how far the field has come and would allow researchers to identify new goals, challenges, and targets. This paper presents a meta-analysis of the first such challenge in automatic recognition of facial expressions, held during the IEEE conference on Face and Gesture Recognition 2011. It details the challenge data, evaluation protocol, and the results attained in two subchallenges: AU detection and classification of facial expression imagery in terms of a number of discrete emotion categories. We also summarize the lessons learned and reflect on the future of the field of facial expression recognition in general and on possible future challenges in particular.

Journal ArticleDOI
10 Jan 2012-PLOS ONE
TL;DR: A simple method is identified that generally works best for face and object recognition, and two that work well for recognizing textures, which are tested using a modern descriptor-based image recognition framework.
Abstract: In image recognition it is often assumed the method used to convert color images to grayscale has little impact on recognition performance. We compare thirteen different grayscale algorithms with four types of image descriptors and demonstrate that this assumption is wrong: not all color-to-grayscale algorithms work equally well, even when using descriptors that are robust to changes in illumination. These methods are tested using a modern descriptor-based image recognition framework, on face, object, and texture datasets, with relatively few training instances. We identify a simple method that generally works best for face and object recognition, and two that work well for recognizing textures.
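
A few of the color-to-grayscale candidates such a comparison covers, written in their common formulations (the paper's exact definitions may differ in details such as gamma handling); `rgb` is an (H, W, 3) array in [0, 1].

```python
import numpy as np

def gray_average(rgb):             # naive mean of the three channels
    return rgb.mean(axis=2)

def gray_luminance(rgb):           # Rec. 601 luma weights, a common default
    return rgb @ np.array([0.299, 0.587, 0.114])

def gray_lightness(rgb):           # midpoint of strongest and weakest channel
    return (rgb.max(axis=2) + rgb.min(axis=2)) / 2.0

def gray_gleam(rgb, gamma=1/2.2):  # mean of gamma-corrected channels
    return (rgb ** gamma).mean(axis=2)
```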

Book ChapterDOI
07 Oct 2012
TL;DR: The proposed DL-COPAR method, which can learn the most compact and most discriminative class-specific dictionaries for classification, achieves very promising performance in various applications, such as face recognition, handwritten digit recognition, scene classification and object recognition.
Abstract: Empirically, we find that, despite the class-specific features of the objects appearing in images, objects from different categories usually share some common patterns, which do not contribute to discriminating between them. Concentrating on this observation and under the general dictionary learning (DL) framework, we propose a novel method to explicitly learn a common pattern pool (the commonality) and class-specific dictionaries (the particularity) for classification. We call our method DL-COPAR; it can learn the most compact and most discriminative class-specific dictionaries for classification. The proposed DL-COPAR is extensively evaluated both on synthetic data and on benchmark image databases in comparison with existing DL-based classification methods. The experimental results demonstrate that DL-COPAR achieves very promising performance in various applications, such as face recognition, handwritten digit recognition, scene classification and object recognition.

Journal ArticleDOI
TL;DR: This paper considers each image as having been generated from several underlying causes, some of which are due to identity (latent identity variables, or LIVs), and develops a series of novel generative models which incorporate both within-individual and between-individual variation.
Abstract: Many face recognition algorithms use “distance-based” methods: Feature vectors are extracted from each face and distances in feature space are compared to determine matches. In this paper, we argue for a fundamentally different approach. We consider each image as having been generated from several underlying causes, some of which are due to identity (latent identity variables, or LIVs) and some of which are not. In recognition, we evaluate the probability that two faces have the same underlying identity cause. We make these ideas concrete by developing a series of novel generative models which incorporate both within-individual and between-individual variation. We consider both the linear case, where signal and noise are represented by a subspace, and the nonlinear case, where an arbitrary face manifold can be described and noise is position-dependent. We also develop a “tied” version of the algorithm that allows explicit comparison of faces across quite different viewing conditions. We demonstrate that our model produces results that are comparable to or better than the state of the art for both frontal face recognition and face recognition under varying pose.

Proceedings ArticleDOI
09 Jul 2012
TL;DR: It is shown, on this mobile phone database, that face and speaker recognition can be performed in a mobile environment and using score fusion can improve the performance by more than 25% in terms of error rates.
Abstract: This paper presents a novel fully automatic bi-modal (face and speaker) recognition system which runs in real-time on a mobile phone. The implemented system runs in real-time on a Nokia N900 and demonstrates the feasibility of performing both automatic face and speaker recognition on a mobile phone. We evaluate this recognition system on a novel publicly available mobile phone database and provide a well-defined evaluation protocol. This database was captured almost exclusively using mobile phones and aims to improve research into deploying biometric techniques on mobile devices. We show, on this mobile phone database, that face and speaker recognition can be performed in a mobile environment and that score fusion can improve performance by more than 25% in terms of error rates.
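
Score-level fusion of the two modalities can be as simple as normalizing each matcher's scores and taking a weighted sum. The paper does not prescribe this exact rule, so treat the sketch below, including the weight and bounds parameters, as a generic baseline.

```python
import numpy as np

def minmax_norm(scores, lo, hi):
    """Map raw matcher scores to [0, 1] using bounds estimated on development data."""
    return np.clip((scores - lo) / (hi - lo), 0.0, 1.0)

def fuse(face_score, speaker_score, w_face=0.5,
         face_bounds=(0.0, 1.0), spk_bounds=(0.0, 1.0)):
    """Weighted-sum score-level fusion of the face and speaker matchers."""
    f = minmax_norm(face_score, *face_bounds)
    s = minmax_norm(speaker_score, *spk_bounds)
    return w_face * f + (1.0 - w_face) * s
```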

Proceedings ArticleDOI
16 Jun 2012
TL;DR: This work proposes a novel low-rank matrix approximation algorithm with structural incoherence for robust face recognition that decomposes raw training data into a set of representative bases with corresponding sparse errors to better model the face images.
Abstract: We address the problem of robust face recognition, in which both training and test image data might be corrupted due to occlusion and disguise. From standard face recognition algorithms such as Eigenfaces to the recently proposed sparse representation-based classification (SRC) methods, most prior works did not consider possible contamination of data during training, and thus the associated performance might be degraded. Based on the recent success of low-rank matrix recovery, we propose a novel low-rank matrix approximation algorithm with structural incoherence for robust face recognition. Our method not only decomposes raw training data into a set of representative bases with corresponding sparse errors to better model the face images; it further advocates structural incoherence between the bases learned from different classes. These bases are encouraged to be as independent as possible through the regularization on structural incoherence. We show that this provides additional discriminating ability over the original low-rank models, leading to improved performance. Experimental results on public face databases verify the effectiveness and robustness of our method, which is also shown to outperform state-of-the-art SRC-based approaches.

Posted Content
TL;DR: This paper discusses how SRC works, and shows that the collaborative representation mechanism used in SRC is much more crucial to its success in face classification.
Abstract: By coding a query sample as a sparse linear combination of all training samples and then classifying it by evaluating which class leads to the minimal coding residual, sparse representation based classification (SRC) leads to interesting results for robust face recognition. It is widely believed that the l1-norm sparsity constraint on coding coefficients plays a key role in the success of SRC, while its use of all training samples to collaboratively represent the query sample is rather ignored. In this paper we discuss how SRC works, and show that the collaborative representation mechanism used in SRC is much more crucial to its success in face classification. SRC is a special case of collaborative representation based classification (CRC), which has various instantiations obtained by applying different norms to the coding residual and coding coefficients. More specifically, the l1 or l2 norm characterization of the coding residual is related to the robustness of CRC to outlier facial pixels, while the l1 or l2 norm characterization of the coding coefficients is related to the degree of discrimination of facial features. Extensive experiments were conducted to verify the face recognition accuracy and efficiency of CRC with different instantiations.
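
The l2-norm instantiation (CRC with regularized least squares) is attractive precisely because the coding vector has a closed form, avoiding an iterative l1 solver. A minimal sketch, with an assumed regularization weight:

```python
import numpy as np

def crc_rls_classify(y, X, labels, lam=0.001):
    """CRC with regularized least squares (the l2-norm instantiation).
    X: (d, n) training faces as columns; labels: (n,); y: (d,) query."""
    n = X.shape[1]
    P = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T)   # (X'X + lam*I)^{-1} X'
    a = P @ y                                             # closed-form coding vector
    best, best_r = None, np.inf
    for c in np.unique(labels):
        mask = labels == c
        # class residual normalized by coefficient energy, as in CRC-RLS
        r = (np.linalg.norm(y - X[:, mask] @ a[mask])
             / max(np.linalg.norm(a[mask]), 1e-12))
        if r < best_r:
            best, best_r = c, r
    return best
```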

Proceedings ArticleDOI
16 Jun 2012
TL;DR: A novel face parser is proposed, which recasts segmentation of face components as a cross-modality data transformation problem, i.e., transforming an image patch to a label map.
Abstract: This paper investigates how to parse (segment) facial components from face images which may be partially occluded. We propose a novel face parser, which recasts segmentation of face components as a cross-modality data transformation problem, i.e., transforming an image patch to a label map. Specifically, a face is represented hierarchically by parts, components, and pixel-wise labels. With this representation, our approach first detects faces at both the part and component levels, and then computes the pixel-wise label maps (Fig.1). Our part-based and component-based detectors are generatively trained with the deep belief network (DBN), and are discriminatively tuned by logistic regression. The segmentators transform the detected face components to label maps, which are obtained by learning a highly nonlinear mapping with the deep autoencoder. The proposed hierarchical face parsing is not only robust to partial occlusions but also provides richer information for face analysis and face synthesis compared with face keypoint detection and face alignment. The effectiveness of our algorithm is shown through several tasks on 2,239 images selected from three datasets (e.g., LFW [12], BioID [13] and CUFSF [29]).

Proceedings ArticleDOI
16 Jun 2012
TL;DR: A discriminative low-rank dictionary learning algorithm for sparse representation with objective function with sparse coefficients, class discrimination and rank minimization is proposed and optimized during dictionary learning.
Abstract: In this paper, we propose a discriminative low-rank dictionary learning algorithm for sparse representation. Sparse representation seeks the sparsest coefficients to represent the test signal as a linear combination of the bases in an over-complete dictionary. Motivated by low-rank matrix recovery and completion, we assume that data from the same pattern are linearly correlated; if we stack these data points as column vectors of a dictionary, then the dictionary should be approximately low-rank. An objective function combining sparse coefficients, class discrimination, and rank minimization is proposed and optimized during dictionary learning. We have applied the algorithm to face recognition. Numerous experiments showing improved performance over previous dictionary learning methods validate the effectiveness of the proposed algorithm.

Journal ArticleDOI
TL;DR: It is found that the magnitude of face-specific recognition accuracy correlated with the extent to which participants processed faces holistically, as indexed by the composite-face effect and the whole-part effect.
Abstract: Why do some people recognize faces easily and others frequently make mistakes in recognizing faces? Classic behavioral work has shown that faces are processed in a distinctive holistic manner that is unlike the processing of objects. In the study reported here, we investigated whether individual differences in holistic face processing have a significant influence on face recognition. We found that the magnitude of face-specific recognition accuracy correlated with the extent to which participants processed faces holistically, as indexed by the composite-face effect and the whole-part effect. This association is due to face-specific processing in particular, not to a more general aspect of cognitive processing, such as general intelligence or global attention. This finding provides constraints on computational models of face recognition and may elucidate mechanisms underlying cognitive disorders, such as prosopagnosia and autism, that are associated with deficits in face recognition.

Journal ArticleDOI
TL;DR: The authors present a novel approach based on analysing facial images for detecting whether there is a live person in front of the camera or a face print, using a set of low-level feature descriptors, a fast linear classification scheme and score-level fusion.
Abstract: Current face biometric systems are vulnerable to spoofing attacks. A spoofing attack occurs when a person tries to masquerade as someone else by falsifying data and thereby gaining illegitimate access. Inspired by image quality assessment, characterisation of printing artefacts and differences in light reflection, the authors propose to approach the problem of spoofing detection from a texture analysis point of view. Indeed, face prints usually contain printing quality defects that can be well detected using texture and local shape features. Hence, the authors present a novel approach based on analysing facial images for detecting whether there is a live person in front of the camera or a face print. The proposed approach analyses the texture and gradient structures of the facial images using a set of low-level feature descriptors, a fast linear classification scheme and score-level fusion. Compared to many previous works, the authors' proposed approach is robust and does not require user cooperation. In addition, the texture features that are used for spoofing detection can also be used for face recognition. This provides a unique feature space for coupling spoofing detection and face recognition. Extensive experimental analysis on three publicly available databases showed excellent results compared to existing works.

Proceedings ArticleDOI
01 Sep 2012
TL;DR: This paper proposes a novel face representation based on Local Quantized Patterns that gives state-of-the-art performance without requiring either a metric learning stage or a costly labelled training dataset.
Abstract: This paper proposes a novel face representation based on Local Quantized Patterns (LQP). LQP is a generalization of local pattern features that makes use of vector quantization and a lookup table to let local pattern features have many more pixels and/or quantization levels without sacrificing simplicity and computational efficiency. Our new LQP face representation not only outperforms other representations on challenging face datasets but performs equally well in the intensity space and the orientation space (obtained by applying gradient or Gabor filters), and hence is intrinsically robust to illumination variations. Extensive experiments on several challenging face recognition datasets (such as FERET and LFW) show that this representation gives state-of-the-art performance (improving on the earlier state-of-the-art by around 3%) without requiring either a metric learning stage or a costly labelled training dataset; the comparison of two faces is made by simply computing the cosine similarity between their LQP representations in a projected space.

Journal ArticleDOI
TL;DR: The efficiency of the algorithm is validated, and it is shown to be about 30 times faster than those based on Gabor filters; an additional technique is also proposed that makes the feature descriptor robust to rotation.
Abstract: A good feature descriptor should be discriminative, robust, and computationally inexpensive in terms of both time and storage requirements. In the domain of face recognition, these properties allow the system to quickly deliver accurate recognition results to the end user. Motivated by the recent feature descriptor called Patterns of Oriented Edge Magnitudes (POEM), which balances these three concerns, this paper aims at enhancing its performance with respect to all these criteria. To this end, we first optimize the parameters of POEM and then apply the whitened principal-component-analysis dimensionality reduction technique to obtain a more compact, robust, and discriminative descriptor. For face recognition, the efficiency of our algorithm is demonstrated by strong results on both constrained (Face Recognition Technology, FERET) and unconstrained (Labeled Faces in the Wild, LFW) data sets, together with its low complexity. Impressively, our algorithm is about 30 times faster than those based on Gabor filters. Furthermore, by proposing an additional technique that makes our descriptor robust to rotation, we validate its efficiency for the task of image matching.
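
The whitened-PCA step the abstract credits for a compact, discriminative descriptor is easy to sketch: fit PCA with whitening on training descriptors, then compare projected faces by cosine similarity. The dimensionality below is illustrative, not the paper's setting.

```python
import numpy as np
from sklearn.decomposition import PCA

def fit_wpca(train_descriptors, n_components=200):
    """train_descriptors: (n_samples, desc_dim) matrix of POEM-like descriptors."""
    return PCA(n_components=n_components, whiten=True).fit(train_descriptors)

def cosine_match(wpca, desc_a, desc_b):
    """Cosine similarity between two faces in the whitened projected space."""
    a = wpca.transform(desc_a[None])[0]
    b = wpca.transform(desc_b[None])[0]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```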