
Showing papers on "Facial recognition system published in 2009"


Journal ArticleDOI
TL;DR: This work considers the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise, and proposes a general classification algorithm for (image-based) object recognition based on a sparse representation computed by ℓ1-minimization.
Abstract: We consider the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise. We cast the recognition problem as one of classifying among multiple linear regression models and argue that new theory from sparse signal representation offers the key to addressing this problem. Based on a sparse representation computed by ℓ1-minimization, we propose a general classification algorithm for (image-based) object recognition. This new framework provides new insights into two crucial issues in face recognition: feature extraction and robustness to occlusion. For feature extraction, we show that if sparsity in the recognition problem is properly harnessed, the choice of features is no longer critical. What is critical, however, is whether the number of features is sufficiently large and whether the sparse representation is correctly computed. Unconventional features such as downsampled images and random projections perform just as well as conventional features such as eigenfaces and Laplacianfaces, as long as the dimension of the feature space surpasses a certain threshold predicted by the theory of sparse representation. This framework can handle errors due to occlusion and corruption uniformly by exploiting the fact that these errors are often sparse with respect to the standard (pixel) basis. The theory of sparse representation helps predict how much occlusion the recognition algorithm can handle and how to choose the training images to maximize robustness to occlusion. We conduct extensive experiments on publicly available databases to verify the efficacy of the proposed algorithm and corroborate the above claims.

9,658 citations
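The classification rule in the abstract — represent a test image as a sparse linear combination of all training images, then assign it to the class whose training images best reconstruct it — can be sketched as follows. This is an illustration only, using scikit-learn's Lasso as a stand-in for the paper's ℓ1-minimization step and random synthetic vectors in place of face images; all data here are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Hypothetical dictionary: 3 classes, 10 training "images" each, 50-dim
# feature vectors (standing in for downsampled face images).
n_classes, n_per_class, dim = 3, 10, 50
centroids = rng.normal(size=(n_classes, dim))
A = np.vstack([centroids[c] + 0.1 * rng.normal(size=(n_per_class, dim))
               for c in range(n_classes)])            # rows = training samples
labels = np.repeat(np.arange(n_classes), n_per_class)

def src_classify(y, A, labels, alpha=0.01):
    # Sparse representation: y ~ A.T @ x with x sparse (Lasso as an
    # l1-minimization surrogate), then classify by per-class residual.
    x = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000).fit(A.T, y).coef_
    residuals = [np.linalg.norm(y - A.T @ np.where(labels == c, x, 0.0))
                 for c in np.unique(labels)]
    return int(np.argmin(residuals))

y = centroids[1] + 0.1 * rng.normal(size=dim)         # probe from class 1
print(src_classify(y, A, labels))                     # → 1
```

The per-class residual step is what makes sparsity useful for recognition: only the coefficients on the correct class's training samples reconstruct the probe well.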


Proceedings ArticleDOI
02 Sep 2009
TL;DR: This paper publishes a generative 3D shape and texture model, the Basel Face Model (BFM), and demonstrates its application to several face recognition tasks, and publishes a set of detailed recognition and reconstruction results on standard databases to allow complete algorithm comparisons.
Abstract: Generative 3D face models are a powerful tool in computer vision. They provide pose and illumination invariance by modeling the space of 3D faces and the imaging process. The power of these models comes at the cost of an expensive and tedious construction process, which has led the community to focus on more easily constructed but less powerful models. With this paper we publish a generative 3D shape and texture model, the Basel Face Model (BFM), and demonstrate its application to several face recognition tasks. We improve on previous models by offering higher shape and texture accuracy due to a better scanning device and fewer correspondence artifacts due to an improved registration algorithm. The same 3D face model can be fit to 2D or 3D images acquired under different situations and with different sensors using an analysis by synthesis method. The resulting model parameters separate pose, lighting, imaging and identity parameters, which facilitates invariant face recognition across sensors and data sets by comparing only the identity parameters. We hope that the availability of this registered face model will spur research in generative models. Together with the model we publish a set of detailed recognition and reconstruction results on standard databases to allow complete algorithm comparisons.

1,265 citations


Proceedings ArticleDOI
29 Sep 2009
TL;DR: Two methods for learning robust distance measures are presented: a logistic discriminant approach which learns the metric from a set of labelled image pairs (LDML) and a nearest neighbour approach which computes the probability for two images to belong to the same class (MkNN).
Abstract: Face identification is the problem of determining whether two face images depict the same person or not. This is difficult due to variations in scale, pose, lighting, background, expression, hairstyle, and glasses. In this paper we present two methods for learning robust distance measures: (a) a logistic discriminant approach which learns the metric from a set of labelled image pairs (LDML) and (b) a nearest neighbour approach which computes the probability for two images to belong to the same class (MkNN). We evaluate our approaches on the Labeled Faces in the Wild data set, a large and very challenging data set of faces from Yahoo! News. The evaluation protocol for this data set defines a restricted setting, where a fixed set of positive and negative image pairs is given, as well as an unrestricted one, where faces are labelled by their identity. We are the first to present results for the unrestricted setting, and show that our methods benefit from this richer training data, much more so than the current state-of-the-art method. Our results of 79.3% and 87.5% correct for the restricted and unrestricted setting respectively, significantly improve over the current state-of-the-art result of 78.5%. Confidence scores obtained for face identification can be used for many applications e.g. clustering or recognition from a single training example. We show that our learned metrics also improve performance for these tasks.

913 citations
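The logistic discriminant idea (LDML) can be illustrated with a minimal sketch: model the probability that an image pair shows the same person as a sigmoid of a bias minus a learned Mahalanobis distance, and fit it by gradient ascent on labelled pairs. The sketch below simplifies the paper's full metric matrix to a diagonal one and uses synthetic pairs; it is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic labelled pairs: "same" pairs are small perturbations,
# "different" pairs are far apart.
dim, n_pairs = 5, 200
x1 = rng.normal(size=(n_pairs, dim))
same = rng.integers(0, 2, n_pairs)
x2 = x1 + np.where(same[:, None] == 1,
                   0.1 * rng.normal(size=(n_pairs, dim)),
                   2.0 * rng.normal(size=(n_pairs, dim)))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# LDML-style model: P(same) = sigmoid(b - d_M(x1, x2)), diagonal metric w.
w, b = np.ones(dim), 1.0
sq_diff = (x1 - x2) ** 2
for _ in range(500):                    # gradient ascent on log-likelihood
    p = sigmoid(b - sq_diff @ w)
    err = same - p                      # Bernoulli log-likelihood gradient
    w -= 0.01 * (err @ sq_diff) / n_pairs   # d(b - d)/dw = -diff^2
    b += 0.01 * err.mean()
    w = np.maximum(w, 0.0)              # keep the metric valid (non-negative)

acc = ((sigmoid(b - sq_diff @ w) > 0.5) == same).mean()
print(f"training pair accuracy: {acc:.2f}")
```

Thresholding the learned probability at 0.5 gives the same/different decision; the confidence score itself can feed clustering, as the abstract notes.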


Journal ArticleDOI
TL;DR: A novel face photo-sketch synthesis and recognition method using a multiscale Markov Random Fields (MRF) model that allows effective matching between the two in face sketch recognition.
Abstract: In this paper, we propose a novel face photo-sketch synthesis and recognition method using a multiscale Markov Random Fields (MRF) model. Our system has three components: 1) given a face photo, synthesizing a sketch drawing; 2) given a face sketch drawing, synthesizing a photo; and 3) searching for face photos in the database based on a query sketch drawn by an artist. It has useful applications for both digital entertainment and law enforcement. We assume that faces to be studied are in a frontal pose, with normal lighting and neutral expression, and have no occlusions. To synthesize sketch/photo images, the face region is divided into overlapping patches for learning. The size of the patches decides the scale of local face structures to be learned. From a training set which contains photo-sketch pairs, the joint photo-sketch model is learned at multiple scales using a multiscale MRF model. By transforming a face photo to a sketch (or transforming a sketch to a photo), the difference between photos and sketches is significantly reduced, thus allowing effective matching between the two in face sketch recognition. After the photo-sketch transformation, in principle, most existing face photo recognition approaches can be applied to face sketch recognition in a straightforward way. Extensive experiments are conducted on a face sketch database including 606 faces, which can be downloaded from our Web site (http://mmlab.ie.cuhk.edu.hk/facesketch.html).

753 citations
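A crude single-scale stand-in for the patch-based synthesis step: replace each test photo patch with the sketch patch whose paired training photo patch is nearest in pixel space. The real method learns a multiscale MRF with compatibility terms between overlapping patches; this sketch, with made-up data, ignores those terms entirely.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical paired training patches (flattened); the toy "sketch style"
# here is just an affine transform of the photo patch.
n_train, patch_px = 500, 64
photo_patches = rng.random((n_train, patch_px))
sketch_patches = 0.5 * photo_patches + 0.1

def synthesize_sketch(test_patches):
    # For each test photo patch, copy the sketch patch paired with the
    # nearest training photo patch (no MRF compatibility terms).
    idx = [np.argmin(np.linalg.norm(photo_patches - p, axis=1))
           for p in test_patches]
    return sketch_patches[idx]

test = rng.random((3, patch_px))
print(synthesize_sketch(test).shape)   # → (3, 64)
```

The MRF in the paper adds exactly what this sketch lacks: neighboring patches are chosen jointly so that overlapping regions agree.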


Journal ArticleDOI
TL;DR: A discussion outlining the incentive for using face recognition, the applications of this technology, and some of the difficulties plaguing current systems with regard to this task has been provided.
Abstract: Face recognition presents a challenging problem in the field of image analysis and computer vision, and as such has received a great deal of attention over the last few years because of its many applications in various domains. Face recognition techniques can be broadly divided into three categories based on the face data acquisition methodology: methods that operate on intensity images; those that deal with video sequences; and those that require other sensory data such as 3D information or infra-red imagery. In this paper, an overview of some of the well-known methods in each of these categories is provided and some of the benefits and drawbacks of the schemes mentioned therein are examined. Furthermore, a discussion outlining the incentive for using face recognition, the applications of this technology, and some of the difficulties plaguing current systems with regard to this task has also been provided. This paper also mentions some of the most recent algorithms developed for this purpose and attempts to give an idea of the state of the art of face recognition technology.

751 citations


Book
20 Apr 2009
TL;DR: This book and the accompanying website focus on template matching, a subset of object recognition techniques of wide applicability, which has proved to be particularly effective for face recognition applications.
Abstract: The detection and recognition of objects in images is a key research topic in the computer vision community. Within this area, face recognition and interpretation has attracted increasing attention owing to the possibility of unveiling human perception mechanisms, and for the development of practical biometric systems. This book and the accompanying website focus on template matching, a subset of object recognition techniques of wide applicability, which has proved to be particularly effective for face recognition applications. Using examples from face processing tasks throughout the book to illustrate more general object recognition approaches, Roberto Brunelli: examines the basics of digital image formation, highlighting points critical to the task of template matching; presents basic and advanced template matching techniques, targeting grey-level images, shapes and point sets; discusses recent pattern classification paradigms from a template matching perspective; illustrates the development of a real face recognition system; explores the use of advanced computer graphics techniques in the development of computer vision algorithms. Template Matching Techniques in Computer Vision is primarily aimed at practitioners working on the development of systems for effective object recognition such as biometrics, robot navigation, multimedia retrieval and landmark detection. It is also of interest to graduate students undertaking studies in these areas.

721 citations
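The core operation of template matching — the book's subject — can be sketched with normalized cross-correlation, a standard formulation (not taken from the book's own code):

```python
import numpy as np

def ncc_match(image, template):
    # Slide the template over the image and return the top-left position
    # with the highest normalized cross-correlation score.
    th, tw = template.shape
    t = (template - template.mean()) / (template.std() + 1e-12)
    best, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            w = image[r:r + th, c:c + tw]
            wn = (w - w.mean()) / (w.std() + 1e-12)
            score = (t * wn).mean()        # correlation in [-1, 1]
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

rng = np.random.default_rng(0)
img = rng.random((30, 30))
tpl = img[10:15, 12:17].copy()    # template cut from the image itself
print(ncc_match(img, tpl))        # → (10, 12)
```

The mean/std normalization is what makes the score insensitive to global brightness and contrast changes, one reason NCC remains a face-matching baseline.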


Book ChapterDOI
27 Jul 2009
TL;DR: This paper proposes for the first time a strongly privacy-enhanced face recognition system, which efficiently hides both the biometric data and the result from the server that performs the matching operation, by using techniques from secure multiparty computation.
Abstract: Face recognition is increasingly deployed as a means to unobtrusively verify the identity of people. The widespread use of biometrics raises important privacy concerns, in particular if the biometric matching process is performed at a central or untrusted server, and calls for the implementation of Privacy-Enhancing Technologies. In this paper we propose for the first time a strongly privacy-enhanced face recognition system, which efficiently hides both the biometric data and the result from the server that performs the matching operation, by using techniques from secure multiparty computation. We consider a scenario where one party provides a face image, while another party has access to a database of facial templates. Our protocol allows the two parties to jointly run the standard Eigenfaces recognition algorithm in such a way that the first party cannot learn from the execution of the protocol more than basic parameters of the database, while the second party does not learn the input image or the result of the recognition process. At the core of our protocol lies an efficient protocol for securely comparing two Paillier-encrypted numbers. We show through extensive experiments that the system can be run efficiently on conventional hardware.

546 citations
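The protocol builds on Paillier encryption, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts, so the server can accumulate Eigenfaces projections without seeing the underlying values. A toy sketch of that property with insecure, tiny parameters (not the paper's full protocol, which also needs a secure comparison subprotocol):

```python
import random
from math import gcd

# Toy Paillier keys (tiny primes -- for illustration only, NOT secure).
p, q = 293, 433
n, n2 = p * q, (p * q) ** 2
g = n + 1
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)

def L(x):
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)

def enc(m):
    r = random.randrange(2, n)
    while gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def dec(c):
    return (L(pow(c, lam, n2)) * mu) % n

# Additive homomorphism: the product of ciphertexts decrypts to the
# sum of the plaintexts, without decrypting either input.
a, b = 42, 100
print(dec((enc(a) * enc(b)) % n2))   # → 142
```

In the actual system, the two-party comparison of such encrypted scores is the expensive part this paper optimizes.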


Proceedings ArticleDOI
20 Jun 2009
TL;DR: This work investigates the biologically inspired features (BIF) for human age estimation from faces with significant improvements in age estimation accuracy over the state-of-the-art methods and proposes a new operator “STD” to encode the aging subtlety on faces.
Abstract: We investigate the biologically inspired features (BIF) for human age estimation from faces. As in previous bio-inspired models, a pyramid of Gabor filters is used at all positions of the input image for the S1 units. But unlike previous models, we find that the pre-learned prototypes for the S2 layer and then progressing to C2 cannot work well for age estimation. We also propose to use Gabor filters with smaller sizes and suggest to determine the number of bands and orientations in a problem-specific manner, rather than using a predefined number. More importantly, we propose a new operator “STD” to encode the aging subtlety on faces. Evaluated on the large database YGA with 8,000 face images and the publicly available FG-NET database, our approach achieves significant improvements in age estimation accuracy over the state-of-the-art methods. By applying our system to some Internet face images, we show the robustness of our method and the potential of cross-race age estimation, which has not been explored in previous studies.

530 citations
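A minimal sketch of the two ingredients named in the abstract: a Gabor filter bank for the S1-like units, and standard-deviation pooling across filter responses as one plausible reading of the proposed "STD" operator. The exact operator definition is not given in this summary, so treat the pooling step as an assumption.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(size, theta, sigma=2.0, lam=4.0):
    # Real part of a Gabor filter: isotropic Gaussian envelope times an
    # oriented cosine carrier at orientation theta.
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)

def std_pool(responses):
    # "STD"-style pooling: standard deviation across the filter responses
    # (our reading of the paper's operator; an assumption here).
    return np.std(np.stack(responses), axis=0)

rng = np.random.default_rng(0)
img = rng.random((16, 16))                        # stand-in face crop
responses = [convolve2d(img, gabor_kernel(7, t), mode='same')
             for t in np.linspace(0, np.pi, 4, endpoint=False)]
features = std_pool(responses)
print(features.shape)   # → (16, 16)
```

Compared with MAX pooling, pooling by standard deviation keeps information about how strongly the responses vary — the kind of subtle texture signal aging produces.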


Journal ArticleDOI
TL;DR: A critical survey of research on image-based face recognition across pose is provided; existing methods are classified into categories according to how they handle pose variations, and several promising directions for future research are suggested.

511 citations


Journal ArticleDOI
Taiping Zhang, Yuan Yan Tang, Bin Fang, Zhaowei Shang, Xiaoyu Liu
TL;DR: Theoretical analysis and experimental results validate that gradient faces is an illumination-insensitive measure, robust to varying illumination (including uncontrolled natural lighting) and also insensitive to image noise and object artifacts.
Abstract: In this correspondence, we propose a novel method, called gradient faces, to extract illumination-insensitive features for face recognition under varying lighting. Theoretical analysis shows that gradient faces is an illumination-insensitive measure, robust to varying illumination including uncontrolled natural lighting. In addition, gradient faces is derived from the image gradient domain, so it can discover the underlying inherent structure of face images, since the gradient domain explicitly considers the relationships between neighboring pixel points. Therefore, gradient faces has more discriminating power than illumination-insensitive measures extracted from the pixel domain. Recognition rates of 99.83% achieved on the PIE database of 68 subjects, 98.96% on Yale B of ten subjects, and 95.61% on an Outdoor database of 132 subjects under uncontrolled natural lighting conditions show that gradient faces is an effective method for face recognition under varying illumination. Furthermore, the experimental results on the Yale database validate that gradient faces is also insensitive to image noise and object artifacts (such as facial expressions).

406 citations
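The gradient-faces idea can be sketched directly: take the orientation of the image gradient, which is unchanged when illumination rescales the image, since both derivatives scale by the same factor. A minimal version, omitting the Gaussian smoothing the full method applies first:

```python
import numpy as np

def gradient_faces(img):
    # Orientation of the image gradient, arctan(I_y / I_x): invariant to
    # multiplicative illumination changes, since both derivatives scale
    # by the same factor and the ratio cancels it.
    gy, gx = np.gradient(img.astype(float))
    return np.arctan2(gy, gx)

rng = np.random.default_rng(0)
face = rng.random((8, 8))       # stand-in face image
bright = 3.0 * face             # global illumination change
print(np.allclose(gradient_faces(face), gradient_faces(bright)))   # → True
```

Real illumination varies spatially rather than globally, which is why the method works in the gradient domain where lighting is locally close to multiplicative.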


Proceedings ArticleDOI
14 Jun 2009
TL;DR: A learning method for deep architectures that takes advantage of sequential data, in particular the temporal coherence that naturally exists in unlabeled video recordings, and uses it to improve the performance on a supervised task of interest.
Abstract: This work proposes a learning method for deep architectures that takes advantage of sequential data, in particular from the temporal coherence that naturally exists in unlabeled video recordings. That is, two successive frames are likely to contain the same object or objects. This coherence is used as a supervisory signal over the unlabeled data, and is used to improve the performance on a supervised task of interest. We demonstrate the effectiveness of this method on some pose invariant object and face recognition tasks.
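The temporal-coherence supervisory signal can be written as a contrastive objective: pull embeddings of successive frames together, push embeddings of random frame pairs at least a margin apart. A minimal sketch with made-up embedding vectors, not the paper's exact loss:

```python
import numpy as np

def coherence_loss(z_t, z_t1, z_rand, margin=1.0):
    # Pull successive-frame embeddings together; push a random frame
    # pair at least `margin` apart (hinge penalty).
    pull = np.sum((z_t - z_t1) ** 2)
    push = max(0.0, margin - np.linalg.norm(z_t - z_rand)) ** 2
    return pull + push

z_t, z_t1 = np.array([0.1, 0.2]), np.array([0.12, 0.19])   # successive frames
z_r = np.array([2.0, -1.0])                                 # random frame
print(round(coherence_loss(z_t, z_t1, z_r), 6))             # → 0.0005
```

Minimizing this over video makes the embedding invariant to whatever changes between neighboring frames — typically pose and lighting, not identity.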

Journal ArticleDOI
TL;DR: The authors identify people with exceptionally good face recognition ability; these “super-recognizers” are about as good at face recognition and perception as developmental prosopagnosics are bad, showing that the range of face recognition ability is wider than previously acknowledged.
Abstract: We tested 4 people who claimed to have significantly better than ordinary face recognition ability. Exceptional ability was confirmed in each case. On two very different tests of face recognition, all 4 experimental subjects performed beyond the range of control subject performance. They also scored significantly better than average on a perceptual discrimination test with faces. This effect was larger with upright than with inverted faces, and the 4 subjects showed a larger “inversion effect” than did control subjects, who in turn showed a larger inversion effect than did developmental prosopagnosics. This result indicates an association between face recognition ability and the magnitude of the inversion effect. Overall, these “super-recognizers” are about as good at face recognition and perception as developmental prosopagnosics are bad. Our findings demonstrate the existence of people with exceptionally good face recognition ability and show that the range of face recognition and face perception ability is wider than has been previously acknowledged.

Book ChapterDOI
02 Dec 2009
TL;DR: A privacy-preserving face recognition scheme that substantially improves over previous work in terms of communication-and computation efficiency and has a substantially smaller online communication complexity.
Abstract: Automatic recognition of human faces is becoming increasingly popular in civilian and law enforcement applications that require reliable recognition of humans. However, the rapid improvement and widespread deployment of this technology raises strong concerns regarding the violation of individuals' privacy. A typical application scenario for privacy-preserving face recognition concerns a client who privately searches for a specific face image in the face image database of a server. In this paper we present a privacy-preserving face recognition scheme that substantially improves over previous work in terms of communication and computation efficiency: the most recent proposal of Erkin et al. (PETS'09) requires O(log M) rounds and computationally expensive operations on homomorphically encrypted data to recognize a face in a database of M faces. Our improved scheme requires only O(1) rounds and has a substantially smaller online communication complexity (by a factor of 15 for each database entry) and less computation complexity. Our solution is based on known cryptographic building blocks combining homomorphic encryption with garbled circuits. Our implementation results show the practicality of our scheme also for large databases (e.g., for M = 1000 we need less than 13 seconds and less than 4 MByte on-line communication on two 2.4GHz PCs connected via Gigabit Ethernet).

Journal ArticleDOI
TL;DR: A novel face recognition method is proposed that exploits both global and local discriminative features; the global features, extracted from low-frequency Fourier coefficients, encode holistic facial information such as facial contour.
Abstract: In the literature of psychophysics and neurophysiology, many studies have shown that both global and local features are crucial for face representation and recognition. This paper proposes a novel face recognition method which exploits both global and local discriminative features. In this method, global features are extracted from the whole face images by keeping the low-frequency coefficients of Fourier transform, which we believe encodes the holistic facial information, such as facial contour. For local feature extraction, Gabor wavelets are exploited considering their biological relevance. After that, Fisher's linear discriminant (FLD) is separately applied to the global Fourier features and each local patch of Gabor features. Thus, multiple FLD classifiers are obtained, each embodying different facial evidences for face recognition. Finally, all these classifiers are combined to form a hierarchical ensemble classifier. We evaluate the proposed method using two large-scale face databases: FERET and FRGC version 2.0. Experiments show that the results of our method are impressively better than the best known results with the same evaluation protocol.
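The global feature described above — keeping only the low-frequency Fourier coefficients of the whole face image — can be sketched as follows; the image size and the number of kept coefficients are arbitrary choices for illustration:

```python
import numpy as np

def global_fourier_features(img, keep=8):
    # Keep a keep x keep block of low-frequency 2D-FFT coefficients
    # around the DC term as a holistic descriptor of the face.
    F = np.fft.fftshift(np.fft.fft2(img))
    cy, cx = F.shape[0] // 2, F.shape[1] // 2
    low = F[cy - keep // 2:cy + keep // 2, cx - keep // 2:cx + keep // 2]
    return np.concatenate([low.real.ravel(), low.imag.ravel()])

rng = np.random.default_rng(0)
img = rng.random((32, 32))                  # stand-in face image
print(global_fourier_features(img).shape)   # → (128,)
```

In the paper this global descriptor is complemented by local Gabor features, with a separate FLD classifier per feature stream before fusion.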

Proceedings ArticleDOI
11 Apr 2009
TL;DR: A new liveness detection method for face recognition based on differences in optical flow fields generated by movements of two-dimensional planes and three-dimensional objects is proposed.
Abstract: Using a photograph to fool a face recognition algorithm is a common spoofing attack. In light of differences in optical flow fields generated by movements of two-dimensional planes and three-dimensional objects, we propose a new liveness detection method for face recognition. Under the assumption that the test region is a two-dimensional plane, we can obtain a reference field from the actual optical flow field data. Then the degree of difference between the two fields can be used to distinguish between a three-dimensional face and a two-dimensional photograph. Empirical study shows that the proposed approach is both feasible and effective.
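The planarity test at the heart of this method can be sketched by fitting an affine (planar) motion model to the observed flow and using the residual as a liveness score: a photo's flow fits the plane well, a 3D face's does not. A synthetic-data sketch, not the authors' exact reference-field construction:

```python
import numpy as np

def planarity_residual(points, flow):
    # Fit an affine motion model u = a*x + b*y + c (and likewise for v)
    # and return the mean squared residual; planar motion fits almost
    # exactly, 3D motion leaves a large residual.
    X = np.hstack([points, np.ones((len(points), 1))])   # [x, y, 1]
    resid = 0.0
    for k in range(2):                                   # u and v components
        coef, *_ = np.linalg.lstsq(X, flow[:, k], rcond=None)
        resid += np.mean((flow[:, k] - X @ coef) ** 2)
    return resid

rng = np.random.default_rng(0)
pts = rng.random((100, 2))                                      # pixel coords
photo_flow = pts @ np.array([[0.1, 0.0], [0.05, 0.1]]) + 0.2    # planar motion
face_flow = photo_flow + 0.3 * rng.normal(size=(100, 2))        # non-planar
print(planarity_residual(pts, photo_flow) <
      planarity_residual(pts, face_flow))                       # → True
```

Thresholding the residual then separates live faces from printed photographs.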

Journal ArticleDOI
TL;DR: Results suggest that human-level expression recognition accuracy in real-life illumination conditions is achievable with machine learning technology.
Abstract: Machine learning approaches have produced some of the highest reported performances for facial expression recognition. However, to date, nearly all automatic facial expression recognition research has focused on optimizing performance on a few databases that were collected under controlled lighting conditions on a relatively small number of subjects. This paper explores whether current machine learning methods can be used to develop an expression recognition system that operates reliably in more realistic conditions. We explore the necessary characteristics of the training data set, image registration, feature representation, and machine learning algorithms. A new database, GENKI, is presented which contains pictures, photographed by the subjects themselves, from thousands of different people in many different real-world imaging conditions. Results suggest that human-level expression recognition accuracy in real-life illumination conditions is achievable with machine learning technology. However, the data sets currently used in the automatic expression recognition literature to evaluate progress may be overly constrained and could potentially lead research into locally optimal algorithmic solutions.

Book ChapterDOI
23 Sep 2009
TL;DR: “Background samples”, that is, examples which do not belong to any of the classes being learned, may provide a significant performance boost to face recognition systems; the “Two-Shot Similarity” (TSS) score is defined and evaluated as an extension to the recently proposed “One-Shot Similarity” (OSS) measure.
Abstract: Evaluating the similarity of images and their descriptors by employing discriminative learners has proven itself to be an effective face recognition paradigm. In this paper we show how “background samples”, that is, examples which do not belong to any of the classes being learned, may provide a significant performance boost to such face recognition systems. In particular, we make the following contributions. First, we define and evaluate the “Two-Shot Similarity” (TSS) score as an extension to the recently proposed “One-Shot Similarity” (OSS) measure. Both these measures utilize background samples to facilitate better recognition rates. Second, we examine the ranking of images most similar to a query image and employ these as a descriptor for that image. Finally, we provide results underscoring the importance of proper face alignment in automatic face recognition systems. These contributions in concert allow us to obtain a success rate of 86.83% on the Labeled Faces in the Wild (LFW) benchmark, outperforming current state-of-the-art results.
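The One-Shot Similarity score that TSS extends can be sketched as follows: train a discriminative model with one vector as the sole positive example against the background set, score the other vector, and symmetrize by swapping roles. The original work uses LDA; logistic regression stands in here, and all data are synthetic:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def one_shot_similarity(xi, xj, background):
    # Train with one vector as the single positive against the background
    # samples, score the other vector, then average over both directions.
    def side(a, b):
        X = np.vstack([a[None, :], background])
        y = np.r_[1, np.zeros(len(background))]
        clf = LogisticRegression(max_iter=1000).fit(X, y)
        return clf.decision_function(b[None, :])[0]
    return 0.5 * (side(xi, xj) + side(xj, xi))

rng = np.random.default_rng(0)
background = rng.normal(size=(50, 10))      # "none of the above" examples
a = np.ones(10)                             # two descriptors of one person
b = np.ones(10) + 0.1 * rng.normal(size=10)
other = -np.ones(10)                        # a different person
print(one_shot_similarity(a, b, background) >
      one_shot_similarity(a, other, background))   # → True
```

The background set is what gives the score its discriminative bite: similarity is measured relative to what "not this person" looks like.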

Proceedings ArticleDOI
20 Jun 2009
TL;DR: The proposed MDA method seeks to learn an embedding space where manifolds with different class labels are better separated and local data compactness within each manifold is enhanced; it is evaluated on the tasks of object recognition with image sets, including face recognition and object categorization.
Abstract: This paper presents a novel discriminative learning method, called manifold discriminant analysis (MDA), to solve the problem of image set classification. By modeling each image set as a manifold, we formulate the problem as classification-oriented multi-manifolds learning. Aiming at maximizing the “manifold margin”, MDA seeks to learn an embedding space where manifolds with different class labels are better separated, and local data compactness within each manifold is enhanced. As a result, a new testing manifold can be more reliably classified in the learned embedding space. The proposed method is evaluated on the tasks of object recognition with image sets, including face recognition and object categorization. Comprehensive comparisons and extensive experiments demonstrate the effectiveness of our method.

Proceedings ArticleDOI
07 Nov 2009
TL;DR: Two new approaches are proposed: Volume-SIFT (VSIFT) and Partial-Descriptor-SIFT (PDSIFT) for face recognition based on the original SIFT algorithm; PDSIFT can achieve performance comparable to that of the most successful holistic approach, ERE, and significantly outperforms FLDA and NLDA.
Abstract: Scale Invariant Feature Transform (SIFT) has been shown to be a powerful technique for general object recognition/detection. In this paper, we propose two new approaches: Volume-SIFT (VSIFT) and Partial-Descriptor-SIFT (PDSIFT) for face recognition based on the original SIFT algorithm. We compare holistic approaches: Fisherface (FLDA), the null space approach (NLDA) and Eigenfeature Regularization and Extraction (ERE) with feature based approaches: SIFT and PDSIFT. Experiments on the ORL and AR databases show that the performance of PDSIFT is significantly better than the original SIFT approach. Moreover, PDSIFT can achieve performance comparable to that of the most successful holistic approach, ERE, and significantly outperforms FLDA and NLDA.

Proceedings ArticleDOI
20 Jun 2009
TL;DR: A character-specific multiple kernel classifier which is able to learn the features best able to discriminate between the characters is reported, demonstrating significantly increased coverage and performance with respect to previous methods on this material.
Abstract: We investigate the problem of automatically labelling faces of characters in TV or movie material with their names, using only weak supervision from automatically-aligned subtitle and script text. Our previous work (Everingham et al. [8]) demonstrated promising results on the task, but the coverage of the method (proportion of video labelled) and generalization was limited by a restriction to frontal faces and nearest neighbour classification. In this paper we build on that method, extending the coverage greatly by the detection and recognition of characters in profile views. In addition, we make the following contributions: (i) seamless tracking, integration and recognition of profile and frontal detections, and (ii) a character specific multiple kernel classifier which is able to learn the features best able to discriminate between the characters. We report results on seven episodes of the TV series "Buffy the Vampire Slayer", demonstrating significantly increased coverage and performance with respect to previous methods on this material.

Proceedings ArticleDOI
01 Dec 2009
TL;DR: A novel framework for searching for people in surveillance environments based on a parsing of human parts and their attributes, including facial hair, eyewear, clothing color, etc, which can be extracted using detectors learned from large amounts of training data is proposed.
Abstract: We propose a novel framework for searching for people in surveillance environments. Rather than relying on face recognition technology, which is known to be sensitive to typical surveillance conditions such as lighting changes, face pose variation, and low-resolution imagery, we approach the problem in a different way: we search for people based on a parsing of human parts and their attributes, including facial hair, eyewear, clothing color, etc. These attributes can be extracted using detectors learned from large amounts of training data. A complete system that implements our framework is presented. At the interface, the user can specify a set of personal characteristics, and the system then retrieves events that match the provided description. For example, a possible query is “show me the bald people who entered a given building last Saturday wearing a red shirt and sunglasses.” This capability is useful in several applications, such as finding suspects or missing people. To evaluate the performance of our approach, we present extensive experiments on a set of images collected from the Internet, on infrared imagery, and on two-and-a-half months of video from a real surveillance environment. We are not aware of any similar surveillance system capable of automatically finding people in video based on their fine-grained body parts and attributes.

Proceedings ArticleDOI
01 Jan 2009
TL;DR: This paper presents a system utilizing identity and pose information to improve facial image pair-matching performance using multiple One-Shot scores, and shows how separating pose and identity may lead to better face recognition rates in unconstrained, “wild” facial images.
Abstract: The One-Shot Similarity (OSS) kernel [3, 4] has recently been introduced as a means of boosting the performance of face recognition systems. Given two vectors, their One-Shot Similarity score (Fig. 1) reflects the likelihood of each vector belonging to the same class as the other vector and not in a class defined by a fixed set of “negative” examples. In this paper we explore how the One-Shot Similarity may nevertheless benefit from the availability of such labels. (a) we present a system utilizing identity and pose information to improve facial image pair-matching performance using multiple One-Shot scores; (b) we show how separating pose and identity may lead to better face recognition rates in unconstrained, “wild” facial images; (c) we explore how far we can get using a single descriptor with different similarity tests as opposed to the popular multiple descriptor approaches; and (d) we demonstrate the benefit of learned metrics for improved One-Shot performance.

Journal ArticleDOI
TL;DR: A comparative analysis of the various approaches that have been proposed for problems such as age estimation, appearance prediction, and face verification is offered, along with insights into future research on this topic.
Abstract: Facial aging, a new dimension that has recently been added to the problem of face recognition, poses interesting theoretical and practical challenges to the research community. The problem, which originally generated interest in the psychophysics and human perception community, has recently found enhanced interest in the computer vision community. How do humans perceive age? What constitutes an age-invariant signature that can be derived from faces? How compactly can the facial growth event be described? How does facial aging impact recognition performance? In this paper, we give a thorough analysis of the problem of facial aging and further provide a complete account of the many interesting studies that have been performed on this topic from different fields. We offer a comparative analysis of various approaches that have been proposed for problems such as age estimation, appearance prediction, face verification, etc. and offer insights into future research on this topic.

Proceedings ArticleDOI
20 Jun 2009
TL;DR: It is shown that even modest optimization of the simple model introduced by Pinto et al. using modern multiple kernel learning (MKL) techniques once again yields “state-of-the-art” performance levels on a standard face recognition set (“labeled faces in the wild”).
Abstract: In recent years, large databases of natural images have become increasingly popular in the evaluation of face and object recognition algorithms. However, Pinto et al. previously illustrated an inherent danger in using such sets, showing that an extremely basic recognition system, built on a trivial feature set, was able to take advantage of low-level regularities in popular object and face recognition sets, performing on par with many state-of-the-art systems. Recently, several groups have raised the performance “bar” for these sets, using more advanced classification tools. However, it is difficult to know whether these improvements are due to progress towards solving the core computational problem, or are due to further improvements in the exploitation of low-level regularities. Here, we show that even modest optimization of the simple model introduced by Pinto et al. using modern multiple kernel learning (MKL) techniques once again yields “state-of-the-art” performance levels on a standard face recognition set (“labeled faces in the wild”). However, at the same time, even with the inclusion of MKL techniques, systems based on these simple features still fail on a synthetic face recognition test that includes more “realistic” view variation by design. These results underscore the importance of building test sets focused on capturing the central computational challenges of real-world face recognition.
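The kernel-combination idea behind MKL can be illustrated with a much simpler heuristic than the solvers the authors use: weighting each candidate Gram matrix by its alignment with the label outer-product matrix. This is only an illustrative stand-in for MKL, not the paper's method, and all names here are ours:

```python
import numpy as np

def mkl_weights(kernels, y):
    """Heuristic kernel weighting by kernel-target alignment.

    kernels: list of (n, n) Gram matrices; y: (n,) labels in {-1, +1}.
    Returns nonnegative weights that sum to one.
    """
    Y = np.outer(y, y)  # ideal "target" kernel
    w = np.array([np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))
                  for K in kernels])
    w = np.clip(w, 0.0, None)   # keep the combination PSD
    return w / w.sum()

def combined_kernel(kernels, w):
    """Convex combination of the base kernels."""
    return sum(wi * K for wi, K in zip(w, kernels))
```

Kernels that carry label-relevant structure receive larger weights, so an uninformative (e.g., purely low-level) kernel is down-weighted in the combination.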

Proceedings Article
07 Dec 2009
TL;DR: It is shown that with appropriate constraints, the generalization error of regularized distance metric learning could be independent from the dimensionality, making it suitable for handling high dimensional data.
Abstract: In this paper, we examine the generalization error of regularized distance metric learning. We show that with appropriate constraints, the generalization error of regularized distance metric learning could be independent from the dimensionality, making it suitable for handling high dimensional data. In addition, we present an efficient online learning algorithm for regularized distance metric learning. Our empirical studies with data classification and face recognition show that the proposed algorithm is (i) effective for distance metric learning when compared to the state-of-the-art methods, and (ii) efficient and robust for high dimensional data.
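A minimal sketch of online Mahalanobis metric learning in the spirit of this approach: a hinge loss on pairwise distances, Frobenius-norm regularization, and projection back onto the PSD cone after each update. The specific loss, step size, and update rule below are our simplifications, not the paper's exact algorithm:

```python
import numpy as np

def update(M, x, z, y, lr=0.05, reg=0.01, margin=1.0):
    """One online step on metric M for a pair (x, z).

    y = +1 if the pair is similar, -1 if dissimilar.  Similar pairs are
    pushed inside the margin, dissimilar pairs outside it.
    """
    d = x - z
    dist = float(d @ M @ d)                     # squared Mahalanobis distance
    if 1.0 - y * (margin - dist) > 0.0:         # hinge loss is active
        grad = y * np.outer(d, d) + reg * M     # loss gradient + regularizer
        M = M - lr * grad
        w, V = np.linalg.eigh(M)                # project onto the PSD cone
        M = (V * np.clip(w, 0.0, None)) @ V.T
    return M
```

Processing pairs one at a time keeps the per-step cost at one rank-one gradient plus an eigendecomposition, which is what makes the online setting attractive for high-dimensional data.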

Proceedings ArticleDOI
01 Sep 2009
TL;DR: This work shows how a Markov Random Field model for spatial continuity of the occlusion can be integrated into the computation of a sparse representation of the test image with respect to the training images and efficiently and reliably identifies the corrupted regions and excludes them from the sparse representation.
Abstract: Partially occluded faces are common in many applications of face recognition. While algorithms based on sparse representation have demonstrated promising results, they achieve their best performance on occlusions that are not spatially correlated (i.e., random pixel corruption). We show that such sparsity-based algorithms can be significantly improved by harnessing prior knowledge about the pixel error distribution. We show how a Markov Random Field model for spatial continuity of the occlusion can be integrated into the computation of a sparse representation of the test image with respect to the training images. Our algorithm efficiently and reliably identifies the corrupted regions and excludes them from the sparse representation. Extensive experiments on both laboratory and real-world datasets show that our algorithm tolerates much larger fractions and varieties of occlusion than current state-of-the-art algorithms.
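The alternation at the heart of this approach, coding the test sample on pixels currently believed unoccluded, then re-estimating the occlusion support from spatially smoothed residuals, can be caricatured in a few lines. Here least squares stands in for the ℓ1 sparse code and a 1-D moving average stands in for the MRF smoothness prior, so this is a sketch of the idea rather than the authors' algorithm:

```python
import numpy as np

def robust_code(y, A, iters=5, tau=1.0):
    """Alternate between fitting on the estimated clean support and
    re-estimating the occlusion support from smoothed residuals."""
    support = np.ones(len(y), dtype=bool)
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x, *_ = np.linalg.lstsq(A[support], y[support], rcond=None)
        r = np.abs(y - A @ x)
        # average residuals over neighboring pixels: a crude 1-D
        # stand-in for the MRF's spatial-continuity prior
        r_s = np.convolve(r, np.ones(5) / 5.0, mode="same")
        support = r_s <= tau * r_s.mean()
    return x, support
```

Because a contiguous occlusion produces a contiguous block of large residuals, the smoothing step lets isolated noisy pixels survive while whole occluded regions are excluded together, which is exactly the behavior the spatial prior is meant to encourage.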

Proceedings ArticleDOI
20 Jun 2009
TL;DR: It is shown that the proposed simple and practical face recognition system can efficiently and effectively recognize faces under a variety of realistic conditions, using only frontal images under the proposed illuminations as training.
Abstract: Most contemporary face recognition algorithms work well under laboratory conditions but degrade when tested in less-controlled environments. This is mostly due to the difficulty of simultaneously handling variations in illumination, alignment, pose, and occlusion. In this paper, we propose a simple and practical face recognition system that achieves a high degree of robustness and stability to all these variations. We demonstrate how to use tools from sparse representation to align a test face image with a set of frontal training images in the presence of significant registration error and occlusion. We thoroughly characterize the region of attraction for our alignment algorithm on public face datasets such as Multi-PIE. We further study how to obtain a sufficient set of training illuminations for linearly interpolating practical lighting conditions. We have implemented a complete face recognition system, including a projector-based training acquisition system, in order to evaluate how our algorithms work under practical testing conditions. We show that our system can efficiently and effectively recognize faces under a variety of realistic conditions, using only frontal images under the proposed illuminations as training.

Book ChapterDOI
04 Jun 2009
TL;DR: The goal of the Multiple Biometrics Grand Challenge (MBGC) is to improve the performance of face and iris recognition technology from biometric samples acquired under unconstrained conditions.
Abstract: The goal of the Multiple Biometrics Grand Challenge (MBGC) is to improve the performance of face and iris recognition technology from biometric samples acquired under unconstrained conditions. The MBGC is organized into three challenge problems. Each challenge problem relaxes the acquisition constraints in different directions. In the Portal Challenge Problem, the goal is to recognize people from near-infrared (NIR) and high definition (HD) video as they walk through a portal. Iris recognition can be performed from the NIR video and face recognition from the HD video. The availability of NIR and HD modalities allows for the development of fusion algorithms. The Still Face Challenge Problem has two primary goals. The first is to improve recognition performance from frontal and off angle still face images taken under uncontrolled indoor and outdoor lighting. The second is to improve recognition performance on still frontal face images that have been resized and compressed, as is required for electronic passports. In the Video Challenge Problem, the goal is to recognize people from video in unconstrained environments. The video is unconstrained in pose, illumination, and camera angle. All three challenge problems include a large data set, experiment descriptions, ground truth, and scoring code.

Journal ArticleDOI
01 Feb 2009
TL;DR: This paper focuses on affective face and body display, proposes a method to automatically detect their temporal segments or phases, explores whether the detection of the temporal phases can effectively support recognition of affective states, and recognizes affective states based on phase synchronization/alignment.
Abstract: Psychologists have long explored mechanisms with which humans recognize other humans' affective states from modalities, such as voice and face display. This exploration has led to the identification of the main mechanisms, including the important role played in the recognition process by the modalities' dynamics. Constrained by the human physiology, the temporal evolution of a modality appears to be well approximated by a sequence of temporal segments called onset, apex, and offset. Stemming from these findings, computer scientists, over the past 15 years, have proposed various methodologies to automate the recognition process. We note, however, two main limitations to date. The first is that much of the past research has focused on affect recognition from single modalities. The second is that even the few multimodal systems have not paid sufficient attention to the modalities' dynamics: The automatic determination of their temporal segments, their synchronization to the purpose of modality fusion, and their role in affect recognition are yet to be adequately explored. To address this issue, this paper focuses on affective face and body display, proposes a method to automatically detect their temporal segments or phases, explores whether the detection of the temporal phases can effectively support recognition of affective states, and recognizes affective states based on phase synchronization/alignment. The experimental results obtained show the following: 1) affective face and body displays are simultaneous but not strictly synchronous; 2) explicit detection of the temporal phases can improve the accuracy of affect recognition; 3) recognition from fused face and body modalities performs better than that from the face or the body modality alone; and 4) synchronized feature-level fusion achieves better performance than decision-level fusion.