
Showing papers on "Facial recognition system published in 2008"


01 Oct 2008
TL;DR: The database contains labeled face photographs spanning the range of conditions typically encountered in everyday life, and exhibits “natural” variability in factors such as pose, lighting, race, accessories, occlusions, and background.
Abstract: Most face databases have been created under controlled conditions to facilitate the study of specific parameters on the face recognition problem. These parameters include such variables as position, pose, lighting, background, camera quality, and gender. While there are many applications for face recognition technology in which one can control the parameters of image acquisition, there are also many applications in which the practitioner has little or no control over such parameters. This database, Labeled Faces in the Wild, is provided as an aid in studying the latter, unconstrained, recognition problem. The database contains labeled face photographs spanning the range of conditions typically encountered in everyday life. The database exhibits “natural” variability in factors such as pose, lighting, race, accessories, occlusions, and background. In addition to describing the details of the database, we provide specific experimental paradigms for which the database is suitable. This is done in an effort to make research performed with the database as consistent and comparable as possible. We provide baseline results, including results of a state of the art face recognition system combined with a face alignment system. To facilitate experimentation on the database, we provide several parallel databases, including an aligned version.

5,742 citations


Proceedings Article
01 Sep 2008
TL;DR: The CMU Multi-PIE database contains 337 subjects, imaged under 15 viewpoints and 19 illumination conditions in up to four recording sessions; it was collected to address the shortcomings of the original PIE database, which had a limited number of subjects, a single recording session, and only a few expressions.
Abstract: A close relationship exists between the advancement of face recognition algorithms and the availability of face databases varying factors that affect facial appearance in a controlled manner. The CMU PIE database has been very influential in advancing research in face recognition across pose and illumination. Despite its success the PIE database has several shortcomings: a limited number of subjects, a single recording session and only few expressions captured. To address these issues we collected the CMU Multi-PIE database. It contains 337 subjects, imaged under 15 view points and 19 illumination conditions in up to four recording sessions. In this paper we introduce the database and describe the recording procedure. We furthermore present results from baseline experiments using PCA and LDA classifiers to highlight similarities and differences between PIE and Multi-PIE.

1,181 citations


Journal ArticleDOI
01 Jan 2008
TL;DR: The evaluation protocol based on the CAS-PEAL-R1 database is discussed and the performance of four algorithms is presented as a baseline to do the following: elementarily assess the difficulty of the database for face recognition algorithms; provide reference evaluation results for researchers using the database; and identify the strengths and weaknesses of the commonly used algorithms.
Abstract: In this paper, we describe the acquisition and contents of a large-scale Chinese face database: the CAS-PEAL face database. The goals of creating the CAS-PEAL face database include the following: 1) providing the worldwide researchers of face recognition with different sources of variations, particularly pose, expression, accessories, and lighting (PEAL), and exhaustive ground-truth information in one uniform database; 2) advancing the state-of-the-art face recognition technologies aiming at practical applications by using off-the-shelf imaging equipment and by designing normal face variations in the database; and 3) providing a large-scale face database of Mongolians. Currently, the CAS-PEAL face database contains 99 594 images of 1040 individuals (595 males and 445 females). A total of nine cameras are mounted horizontally on an arc arm to simultaneously capture images across different poses. Each subject is asked to look straight ahead, up, and down to obtain 27 images in three shots. Five facial expressions, six accessories, and 15 lighting changes are also included in the database. A selected subset of the database (CAS-PEAL-R1, containing 30 863 images of the 1040 subjects) is available to other researchers now. We discuss the evaluation protocol based on the CAS-PEAL-R1 database and present the performance of four algorithms as a baseline to do the following: 1) elementarily assess the difficulty of the database for face recognition algorithms; 2) provide reference evaluation results for researchers using the database; and 3) identify the strengths and weaknesses of the commonly used algorithms.

971 citations


Journal ArticleDOI
TL;DR: It is shown that even without a fully optimized design, an MPCA-based gait recognition module achieves highly competitive performance and compares favorably to the state-of-the-art gait recognizers.
Abstract: This paper introduces a multilinear principal component analysis (MPCA) framework for tensor object feature extraction. Objects of interest in many computer vision and pattern recognition applications, such as 2D/3D images and video sequences are naturally described as tensors or multilinear arrays. The proposed framework performs feature extraction by determining a multilinear projection that captures most of the original tensorial input variation. The solution is iterative in nature and it proceeds by decomposing the original problem to a series of multiple projection subproblems. As part of this work, methods for subspace dimensionality determination are proposed and analyzed. It is shown that the MPCA framework discussed in this work supplants existing heterogeneous solutions such as the classical principal component analysis (PCA) and its 2D variant (2D PCA). Finally, a tensor object recognition system is proposed with the introduction of a discriminative tensor feature selection mechanism and a novel classification strategy, and applied to the problem of gait recognition. Results presented here indicate MPCA's utility as a feature extraction tool. It is shown that even without a fully optimized design, an MPCA-based gait recognition module achieves highly competitive performance and compares favorably to the state-of-the-art gait recognizers.
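
As a concrete illustration of the alternating scheme the abstract describes, here is a minimal numpy sketch of MPCA. The function names, the random orthonormal initialization, and the fixed iteration count are illustrative choices, not the paper's; subspace dimensionality determination (which the paper also addresses) is simply taken as given via `ranks`.

```python
import numpy as np

def mode_project(X, U, axis):
    """Contract tensor axis `axis` of X with a projection matrix U (I x P)."""
    Y = np.tensordot(X, U, axes=(axis, 0))
    return np.moveaxis(Y, -1, axis)

def mpca(samples, ranks, n_iter=5):
    """samples: (N, I1, ..., IM) tensor objects; ranks: output dim per mode.
    Returns per-mode projection matrices U_m of shape (I_m, P_m)."""
    X = samples - samples.mean(axis=0)                  # center the samples
    M = X.ndim - 1                                      # number of tensor modes
    rng = np.random.default_rng(0)
    U = [np.linalg.qr(rng.standard_normal((X.shape[m + 1], ranks[m])))[0]
         for m in range(M)]
    for _ in range(n_iter):                             # alternate over modes
        for n in range(M):
            Y = X
            for m in range(M):                          # project all modes but n
                if m != n:
                    Y = mode_project(Y, U[m], m + 1)
            Yn = np.moveaxis(Y, n + 1, 0).reshape(Y.shape[n + 1], -1)
            _, vecs = np.linalg.eigh(Yn @ Yn.T)         # mode-n scatter
            U[n] = vecs[:, ::-1][:, :ranks[n]]          # keep top eigenvectors
    return U

# Feature extraction for a centered sample: apply mode_project along each
# mode with the learned U_m, then flatten the small core tensor.
```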

856 citations


Book ChapterDOI
23 Dec 2008
TL;DR: A new 3D face database that includes a rich set of expressions, systematic variation of poses and different types of occlusions is presented, which can be a very valuable resource for development and evaluation of algorithms on face recognition under adverse conditions and facial expression analysis as well as for facial expression synthesis.
Abstract: A new 3D face database that includes a rich set of expressions, systematic variation of poses and different types of occlusions is presented in this paper. This database is unique from three aspects: i) the facial expressions are composed of judiciously selected subset of Action Units as well as the six basic emotions, and many actors/actresses are incorporated to obtain more realistic expression data; ii) a rich set of head pose variations are available; and iii) different types of face occlusions are included. Hence, this new database can be a very valuable resource for development and evaluation of algorithms on face recognition under adverse conditions and facial expression analysis as well as for facial expression synthesis.

819 citations


Journal ArticleDOI
TL;DR: The age manifold learning scheme for extracting face aging features is introduced and a locally adjusted robust regressor for learning and prediction of human ages is designed, which improves the age estimation accuracy significantly over all previous methods.
Abstract: Estimating human age automatically via facial image analysis has lots of potential real-world applications, such as human computer interaction and multimedia communication. However, it is still a challenging problem for the existing computer vision systems to automatically and effectively estimate human ages. The aging process is determined not only by the person's genes, but also by many external factors, such as health, living style, living location, and weather conditions. Males and females may also age differently. The current age estimation performance is still not good enough for practical use and more effort has to be put into this research direction. In this paper, we introduce the age manifold learning scheme for extracting face aging features and design a locally adjusted robust regressor for learning and prediction of human ages. The novel approach improves the age estimation accuracy significantly over all previous methods. The merit of the proposed approaches for image-based age estimation is shown by extensive experiments on a large internal age database and the publicly available FG-NET database.

661 citations


Proceedings ArticleDOI
Lijun Yin1, Xiaochen Chen1, Yi Sun1, T. Worm1, Michael Reale1 
01 Sep 2008
TL;DR: This paper presents a newly created high-resolution 3D dynamic facial expression database, which is made available to the scientific research community and has been validated through the authors' facial expression recognition experiment using an HMM based 3D spatio-temporal facial descriptor.
Abstract: Face information processing relies on the quality of the data resource. From the data modality point of view, a face database can be 2D or 3D, and static or dynamic. From the task point of view, the data can be used for research of computer based automatic face recognition, face expression recognition, face detection, or cognitive and psychological investigation. With the advancement of 3D imaging technologies, 3D dynamic facial sequences (called 4D data) have been used for face information analysis. In this paper, we focus on the modality of 3D dynamic data for the task of facial expression recognition. We present a newly created high-resolution 3D dynamic facial expression database, which is made available to the scientific research community. The database contains 606 3D facial expression sequences captured from 101 subjects of various ethnic backgrounds. The database has been validated through our facial expression recognition experiment using an HMM based 3D spatio-temporal facial descriptor. It is expected that such a database shall be used to facilitate the facial expression analysis from a static 3D space to a dynamic 3D space, with a goal of scrutinizing facial behavior at a higher level of detail in a real 3D spatio-temporal domain.

537 citations


Proceedings ArticleDOI
23 Jun 2008
TL;DR: This work addresses the problem of tracking and recognizing faces in real-world, noisy videos using a tracker that adaptively builds a target model reflecting changes in appearance typical of a video setting, and introduces visual constraints using a combination of generative and discriminative models in a particle filtering framework.
Abstract: We address the problem of tracking and recognizing faces in real-world, noisy videos. We track faces using a tracker that adaptively builds a target model reflecting changes in appearance, typical of a video setting. However, adaptive appearance trackers often suffer from drift, a gradual adaptation of the tracker to non-targets. To alleviate this problem, our tracker introduces visual constraints using a combination of generative and discriminative models in a particle filtering framework. The generative term conforms the particles to the space of generic face poses while the discriminative one ensures rejection of poorly aligned targets. This leads to a tracker that significantly improves robustness against abrupt appearance changes and occlusions, critical for the subsequent recognition phase. Identity of the tracked subject is established by fusing pose-discriminant and person-discriminant features over the duration of a video sequence. This leads to a robust video-based face recognizer with state-of-the-art recognition performance. We test the quality of tracking and face recognition on real-world noisy videos from YouTube as well as the standard Honda/UCSD database. Our approach produces successful face tracking results on over 80% of all videos without video or person-specific parameter tuning. The good tracking performance induces similarly high recognition rates: 100% on Honda/UCSD and over 70% on the YouTube set containing 35 celebrities in 1500 sequences.
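
The particle weighting that combines the two terms can be sketched compactly. Below, the generative term is a Gaussian reconstruction likelihood under a PCA face subspace and the discriminative term is a sigmoid over a pre-trained linear alignment classifier; these concrete model choices (and the names w_disc, b_disc, sigma) are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def particle_weights(patches, mean_face, basis, w_disc, b_disc, sigma=0.05):
    """patches: (P, d) flattened candidate crops, one per particle.
    Generative term: reconstruction likelihood under a PCA face subspace
    (basis: d x k, orthonormal columns). Discriminative term: sigmoid score
    of a linear classifier trained to reject poorly aligned crops."""
    centered = patches - mean_face
    recon = centered @ basis @ basis.T                 # project onto subspace
    err = np.sum((centered - recon) ** 2, axis=1)      # residual per particle
    gen = np.exp(-err / (2 * sigma ** 2))              # generative likelihood
    disc = 1.0 / (1.0 + np.exp(-(patches @ w_disc + b_disc)))
    w = gen * disc                                     # combine both constraints
    return w / w.sum()                                 # normalized weights
```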

493 citations


Proceedings ArticleDOI
23 Jun 2008
TL;DR: The proposed MMD method outperforms the competing methods on the task of Face Recognition based on Image Set, and a novel manifold learning approach is proposed, which expresses a manifold by a collection of local linear models, each depicted by a subspace.
Abstract: In this paper, we address the problem of classifying image sets, each of which contains images belonging to the same class but covering large variations in, for instance, viewpoint and illumination. We innovatively formulate the problem as the computation of Manifold-Manifold Distance (MMD), i.e., calculating the distance between nonlinear manifolds each representing one image set. To compute MMD, we also propose a novel manifold learning approach, which expresses a manifold by a collection of local linear models, each depicted by a subspace. MMD is then converted to integrating the distances between pairs of subspaces, one from each of the involved manifolds. The proposed MMD method is evaluated on the task of Face Recognition based on Image Set (FRIS). In FRIS, each known subject is enrolled with a set of facial images and modeled as a gallery manifold, while a testing subject is modeled as a probe manifold, which is then matched against all the gallery manifolds by MMD. Identification is achieved by seeking the minimum MMD. Experimental results on two public face databases, Honda/UCSD and CMU MoBo, demonstrate that the proposed MMD method outperforms the competing methods.
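
A toy rendering of the MMD computation: k-means clusters serve as the local linear models, each summarized by a PCA subspace, and subspaces are compared through principal angles. Taking the minimum over model pairs is a simplification of the paper's integrated pairwise distance, and the cluster count and subspace dimension are arbitrary illustrative values.

```python
import numpy as np
from sklearn.cluster import KMeans

def local_subspaces(X, n_models=3, dim=5):
    """Cluster an image set X (N, d) and fit a PCA basis to each cluster,
    approximating a manifold by a collection of local linear models."""
    labels = KMeans(n_clusters=n_models, n_init=5).fit_predict(X)
    models = []
    for k in range(n_models):
        Xk = X[labels == k]
        basis = np.linalg.svd(Xk - Xk.mean(axis=0),
                              full_matrices=False)[2][:dim].T   # d x dim
        models.append(basis)
    return models

def subspace_distance(U, V):
    """Projection-metric distance from the principal angles between bases."""
    cosines = np.clip(np.linalg.svd(U.T @ V, compute_uv=False), 0.0, 1.0)
    return np.sqrt(max(U.shape[1], V.shape[1]) - np.sum(cosines ** 2))

def manifold_manifold_distance(set1, set2):
    """Smallest subspace distance over all local-model pairs."""
    return min(subspace_distance(U, V)
               for U in local_subspaces(set1) for V in local_subspaces(set2))
```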

443 citations


Journal ArticleDOI
TL;DR: It is demonstrated that aging patterns can be effectively extracted by a discriminant subspace learning algorithm and visualized as distinct manifold structures through manifold analysis of face images.
Abstract: Recently, extensive studies on human faces in the human-computer interaction (HCI) field have revealed significant potential for designing automatic age estimation systems via face image analysis. The success of such research may yield many innovative HCI tools for human-centered multimedia communication. Due to the temporal nature of age progression, face images with aging features may display sequential patterns with low-dimensional distributions. In this paper, we demonstrate that such aging patterns can be effectively extracted by a discriminant subspace learning algorithm and visualized as distinct manifold structures. Through manifold analysis of face images, the dimensionality redundancy of the original image space can be significantly reduced with subspace learning. The low dimensionality in turn facilitates a multiple linear regression procedure, especially with a quadratic model function, to represent the manifold space embodying the discriminative property. This processing has been evaluated by extensive simulations and compared with state-of-the-art methods. Experimental results on a large aging database demonstrate the effectiveness and robustness of the proposed framework.
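
The quadratic regression step reduces to ordinary least squares on an augmented design matrix. A minimal sketch, assuming the low-dimensional manifold features Z have already been produced by the subspace learning stage (the paper's refinements are omitted):

```python
import numpy as np

def fit_quadratic_age_model(Z, ages):
    """Z: (N, k) manifold features; ages: (N,). Least-squares fit of
    age = b0 + b.z + c.(z*z), a quadratic multiple regression."""
    D = np.hstack([np.ones((len(Z), 1)), Z, Z ** 2])   # quadratic design matrix
    coef, *_ = np.linalg.lstsq(D, ages, rcond=None)
    return coef

def predict_age(Z, coef):
    D = np.hstack([np.ones((len(Z), 1)), Z, Z ** 2])
    return D @ coef
```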

429 citations


Journal ArticleDOI
TL;DR: Experiments comparing the proposed approach with some other popular subspace methods on the FERET, ORL, AR, and GT databases show that the method consistently outperforms others.
Abstract: This work proposes a subspace approach that regularizes and extracts eigenfeatures from the face image. The eigenspace of the within-class scatter matrix is decomposed into three subspaces: a reliable subspace spanned mainly by facial variation, an unstable subspace due to noise and the finite number of training samples, and a null subspace. Eigenfeatures are regularized differently in these three subspaces based on an eigenspectrum model to alleviate problems of instability, overfitting, or poor generalization. This also enables discriminant evaluation to be performed in the whole space. Feature extraction or dimensionality reduction occurs only at the final stage, after the discriminant assessment. These efforts facilitate a discriminative and stable low-dimensional feature representation of the face image. Experiments comparing the proposed approach with some other popular subspace methods on the FERET, ORL, AR, and GT databases show that our method consistently outperforms others.
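
A minimal sketch of the three-band idea, assuming the within-class scatter matrix Sw is given; a constant floor on the unreliable eigenvalues stands in for the paper's parametric eigenspectrum model, and the band boundary m_face is taken as an input:

```python
import numpy as np

def regularized_whitening(Sw, m_face, eps=1e-8):
    """Eigendecompose the within-class scatter Sw, keep the reliable 'face'
    band unchanged, replace the unstable and null bands with a constant
    floor (a simplification of the paper's eigenspectrum model), and
    return the regularized whitening transform."""
    vals, vecs = np.linalg.eigh(Sw)
    order = np.argsort(vals)[::-1]                    # sort descending
    vals, vecs = vals[order], vecs[:, order]
    reg = vals.copy()
    reg[m_face:] = max(vals[m_face], eps)             # flatten unreliable bands
    return vecs / np.sqrt(reg)                        # columns scaled by 1/sqrt
```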

Proceedings ArticleDOI
01 Dec 2008
TL;DR: Recognition of blurred faces using the recently introduced Local Phase Quantization (LPQ) operator is proposed, and results show that the LPQ descriptor is highly tolerant to blur yet still very descriptive, outperforming LBP with both blurred and sharp images.
Abstract: In this paper, recognition of blurred faces using the recently introduced Local Phase Quantization (LPQ) operator is proposed. LPQ is based on quantizing the Fourier transform phase in local neighborhoods. The phase can be shown to be a blur invariant property under certain commonly fulfilled conditions. In face image analysis, histograms of LPQ labels computed within local regions are used as a face descriptor similarly to the widely used Local Binary Pattern (LBP) methodology for face image description. The experimental results on CMU PIE and FRGC 1.0.4 datasets show that the LPQ descriptor is highly tolerant to blur but still very descriptive outperforming LBP both with blurred and sharp images.
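
A bare-bones LPQ sketch using a uniform window and the four standard low-frequency points, with scipy providing the separable complex convolutions; the decorrelation step discussed in the LPQ literature is omitted here:

```python
import numpy as np
from scipy.signal import convolve2d

def lpq_descriptor(img, win=7):
    """img: 2D float array. Compute local Fourier coefficients at four low
    frequencies, sign-quantize real and imaginary parts into 8-bit codes,
    and return the 256-bin histogram used as the face-region descriptor."""
    r = win // 2
    x = np.arange(-r, r + 1)
    w0 = np.ones_like(x, dtype=complex)               # zero frequency
    w1 = np.exp(-2j * np.pi * x / win)                # lowest nonzero frequency
    pairs = [(w0, w1), (w1, w0), (w1, w1), (w1, np.conj(w1))]
    comps = []
    for wy, wx in pairs:                              # the four frequency points
        F = convolve2d(convolve2d(img, wy[:, None], mode='valid'),
                       wx[None, :], mode='valid')
        comps += [F.real, F.imag]
    bits = [(c > 0).astype(int) for c in comps]       # sign quantization
    labels = sum(b << i for i, b in enumerate(bits))  # 8-bit LPQ labels
    return np.bincount(labels.ravel(), minlength=256) # histogram descriptor
```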

Journal ArticleDOI
TL;DR: An up-to-date review of major human face recognition research is provided, covering the most recent face recognition techniques and their applications, the face databases used to test them, and a summary of the research results.
Abstract: The task of face recognition has been actively researched in recent years. This paper provides an up-to-date review of major human face recognition research. We first present an overview of face recognition and its applications. Then, a literature review of the most recent face recognition techniques is presented. Description and limitations of face databases which are used to test the performance of these face recognition algorithms are given. A brief summary of the face recognition vendor test (FRVT) 2002, a large scale evaluation of automatic face recognition technology, and its conclusions are also given. Finally, we give a summary of the research results. Keywords—Combined classifiers, face recognition, graph matching, neural networks.

Journal ArticleDOI
TL;DR: A new method for human face recognition by utilizing Gabor-based region covariance matrices as face descriptors is presented, using both pixel locations and Gabor coefficients to form the covariance matrices.
Abstract: This paper presents a new method for human face recognition by utilizing Gabor-based region covariance matrices as face descriptors. Both pixel locations and Gabor coefficients are employed to form the covariance matrices. Experimental results demonstrate the advantages of this proposed method.
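
The descriptor and a standard way to compare two covariance matrices fit in a few lines. The per-pixel feature layout below (pixel coordinates plus Gabor magnitudes) is an assumption in the spirit of the abstract, and the generalized-eigenvalue (Förstner) metric is the distance commonly paired with region covariance descriptors rather than necessarily the paper's exact choice:

```python
import numpy as np
from scipy.linalg import eigh

def region_covariance(features):
    """features: (n_pixels, d) per-pixel rows such as [x, y, |g_1|, ..., |g_k|]
    (pixel coordinates plus Gabor magnitudes); returns the d x d descriptor."""
    return np.cov(features, rowvar=False)

def covariance_distance(C1, C2, eps=1e-8):
    """Dissimilarity via the generalized eigenvalues of (C1, C2), the
    Forstner metric commonly used with region covariance descriptors."""
    d = C1.shape[0]
    lam = eigh(C1 + eps * np.eye(d), C2 + eps * np.eye(d), eigvals_only=True)
    return np.sqrt(np.sum(np.log(lam) ** 2))
```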

Journal ArticleDOI
01 Aug 2008
TL;DR: This paper proposes algorithms for iris segmentation, quality enhancement, match score fusion, and indexing to improve both the accuracy and the speed of iris recognition.
Abstract: This paper proposes algorithms for iris segmentation, quality enhancement, match score fusion, and indexing to improve both the accuracy and the speed of iris recognition. A curve evolution approach is proposed to effectively segment a nonideal iris image using the modified Mumford-Shah functional. Different enhancement algorithms are concurrently applied on the segmented iris image to produce multiple enhanced versions of the iris image. A support-vector-machine-based learning algorithm selects locally enhanced regions from each globally enhanced image and combines these good-quality regions to create a single high-quality iris image. Two distinct features are extracted from the high-quality iris image. The global textural feature is extracted using the 1-D log polar Gabor transform, and the local topological feature is extracted using Euler numbers. An intelligent fusion algorithm combines the textural and topological matching scores to further improve the iris recognition performance and reduce the false rejection rate, whereas an indexing algorithm enables fast and accurate iris identification. The verification and identification performance of the proposed algorithms is validated and compared with other algorithms using the CASIA Version 3, ICE 2005, and UBIRIS iris databases.

Journal ArticleDOI
TL;DR: A multi-purpose image classifier that can be applied to a wide variety of image classification tasks without modifications or fine-tuning, and yet provide classification accuracy comparable to state-of-the-art task-specific image classifiers.

Proceedings ArticleDOI
23 Jun 2008
TL;DR: A joint representation and classification framework that achieves the dual goal of finding the most discriminative sparse overcomplete encoding and optimal classifier parameters, and considerably outperforms many recently proposed face recognition techniques when the number of training samples is small.
Abstract: We propose a joint representation and classification framework that achieves the dual goal of finding the most discriminative sparse overcomplete encoding and optimal classifier parameters. Formulating an optimization problem that combines the objective function of the classification with the representation error of both labeled and unlabeled data, constrained by sparsity, we propose an algorithm that alternates between solving for subsets of parameters, whilst preserving the sparsity. The method is then evaluated over two important classification problems in computer vision: object categorization of natural images using the Caltech 101 database and face recognition using the Extended Yale B face database. The results show that the proposed method is competitive against other recently proposed sparse overcomplete counterparts and considerably outperforms many recently proposed face recognition techniques when the number of training samples is small.
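
A deliberately loose caricature of the alternation, built from off-the-shelf scikit-learn pieces: sparse codes over a shared dictionary, a classifier refit on the codes, and a least-squares dictionary refit. The paper couples the representation and classification objectives much more tightly (and uses unlabeled data), so treat this only as a sketch of the overall loop:

```python
import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression

def joint_sparse_classifier(X, y, n_atoms=128, n_outer=5, alpha=0.1):
    """X: (N, d) samples, y: (N,) labels. Alternates sparse coding,
    classifier fitting, and dictionary updates while keeping codes sparse."""
    rng = np.random.default_rng(0)
    D = rng.standard_normal((X.shape[1], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    clf = LogisticRegression(max_iter=1000)
    for _ in range(n_outer):
        coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
        A = np.array([coder.fit(D, x).coef_ for x in X])  # sparse codes
        clf.fit(A, y)                                     # classifier step
        D = np.linalg.lstsq(A, X, rcond=None)[0].T        # dictionary refit
        D /= np.linalg.norm(D, axis=0) + 1e-12            # renormalize atoms
    return D, clf
```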

Proceedings ArticleDOI
23 Jun 2008
TL;DR: This work proposes a new procedure for recognition of low-resolution faces when there is a high-resolution training set available, and shows that recognition of faces as small as 6 × 6 pixels is considerably improved compared to matching using a super-resolution reconstruction followed by classification, and to matching with a low-resolution training set.
Abstract: Face recognition degrades when faces are of very low resolution since many details about the difference between one person and another can only be captured in images of sufficient resolution. In this work, we propose a new procedure for recognition of low-resolution faces, when there is a high-resolution training set available. Most previous super-resolution approaches are aimed at reconstruction, with recognition only as an after-thought. In contrast, in the proposed method, face features, as they would be extracted for a face recognition algorithm (e.g., eigenfaces, Fisher-faces, etc.), are included in a super-resolution method as prior information. This approach simultaneously provides measures of fit of the super-resolution result, from both reconstruction and recognition perspectives. This is different from the conventional paradigms of matching in a low-resolution domain, or, alternatively, applying a super-resolution algorithm to a low-resolution face and then classifying the super-resolution result. We show, for example, that recognition of faces as small as 6 × 6 pixels is considerably improved compared to matching using a super-resolution reconstruction followed by classification, and to matching with a low-resolution training set.
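
The core idea of including recognition features as prior information can be written as one regularized least-squares objective. In the sketch below, A (blur plus downsampling) and W (feature extractor, e.g., an eigenface projection) are assumed to be given as explicit matrices, and f_target is a gallery feature hypothesis; the normalized gradient descent is just one convenient minimizer:

```python
import numpy as np

def sr_with_feature_prior(y_lr, A, W, f_target, lam=0.1, step=0.5, n_iter=200):
    """Minimize ||A x - y||^2 + lam * ||W x - f||^2 over the high-res image x.
    The second term keeps the reconstruction close to a recognition-feature
    hypothesis, so fit is measured from both perspectives at once."""
    x = A.T @ y_lr                                      # crude initialization
    for _ in range(n_iter):
        grad = 2 * A.T @ (A @ x - y_lr) + 2 * lam * W.T @ (W @ x - f_target)
        x -= step * grad / (np.linalg.norm(grad) + 1e-12)  # normalized step
    return x
```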

Journal ArticleDOI
TL;DR: Experimental results demonstrate that using 28 small regions on the face allows for the highest level of 3D face recognition, and show the robustness of the algorithm by simulating large holes and artifacts in images.
Abstract: In this paper, we introduce a new system for 3D face recognition based on the fusion of results from a committee of regions that have been independently matched. Experimental results demonstrate that using 28 small regions on the face allows for the highest level of 3D face recognition. Score-based fusion is performed on the individual region match scores and experimental results show that the Borda count and consensus voting methods yield higher performance than the standard sum, product, and min fusion rules. In addition, results are reported that demonstrate the robustness of our algorithm by simulating large holes and artifacts in images. To our knowledge, no other work has been published that uses a large number of 3D face regions for high-performance face matching. Rank one recognition rates of 97.2% and verification rates of 93.2% at a 0.1% false accept rate are reported and compared to other methods published on the Face Recognition Grand Challenge v2 data set.
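
Of the fusion rules compared, the Borda count is simple to make concrete. A short sketch, assuming a matrix of per-region match scores where higher means a better match:

```python
import numpy as np

def borda_fusion(region_scores):
    """region_scores: (n_regions, n_gallery) match scores, higher = better.
    Each region ranks the gallery; Borda count awards (n_gallery - 1 - rank)
    points per region, and the identity with the most points wins."""
    n_regions, n_gallery = region_scores.shape
    ranks = np.argsort(np.argsort(-region_scores, axis=1), axis=1)  # 0 = best
    points = (n_gallery - 1 - ranks).sum(axis=0)
    return np.argmax(points)                          # fused gallery index
```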

Journal ArticleDOI
TL;DR: A novel keypoint detection technique is proposed which can repeatably identify keypoints at locations where shape variation is high in 3D faces, and a unique 3D coordinate basis can be defined locally at each keypoint, facilitating the extraction of highly descriptive pose-invariant features.
Abstract: Holistic face recognition algorithms are sensitive to expressions, illumination, pose, occlusions and makeup. On the other hand, feature-based algorithms are robust to such variations. In this paper, we present a feature-based algorithm for the recognition of textured 3D faces. A novel keypoint detection technique is proposed which can repeatably identify keypoints at locations where shape variation is high in 3D faces. Moreover, a unique 3D coordinate basis can be defined locally at each keypoint, facilitating the extraction of highly descriptive pose-invariant features. A 3D feature is extracted by fitting a surface to the neighborhood of a keypoint and sampling it on a uniform grid. Features from a probe and gallery face are projected to the PCA subspace and matched. The set of matching features is used to construct two graphs. The similarity between two faces is measured as the similarity between their graphs. In the 2D domain, we employed the SIFT features and performed fusion of the 2D and 3D features at the feature and score levels. The proposed algorithm achieved a 96.1% identification rate and a 98.6% verification rate on the complete FRGC v2 data set.

Proceedings ArticleDOI
23 Jun 2008
TL;DR: It is demonstrated that the simple method of enhancing face recognition with social network context substantially increases recognition performance beyond that of a baseline face recognition system.
Abstract: Most personal photos that are shared online are embedded in some form of social network, and these social networks are a potent source of contextual information that can be leveraged for automatic image understanding. In this paper, we investigate the utility of social network context for the task of automatic face recognition in personal photographs. We combine face recognition scores with social context in a conditional random field (CRF) model and apply this model to label faces in photos from the popular online social network Facebook, which is now the top photo-sharing site on the Web with billions of photos in total. We demonstrate that our simple method of enhancing face recognition with social network context substantially increases recognition performance beyond that of a baseline face recognition system.
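
A toy version of the idea: per-face recognition log-scores act as unary terms and a log co-occurrence matrix stands in for the CRF's pairwise social-context potentials, with exhaustive search for the best joint labeling. A real CRF would use learned potentials and efficient inference; this brute-force version is only workable for a handful of faces per photo:

```python
import numpy as np
from itertools import product

def label_photo(face_scores, cooccurrence):
    """face_scores: (F, P) per-face recognition log-scores over P candidate
    people; cooccurrence: (P, P) log-frequencies of pairs appearing together.
    Returns the jointly best label tuple by exhaustive enumeration (P**F)."""
    F, P = face_scores.shape
    best, best_val = None, -np.inf
    for labels in product(range(P), repeat=F):
        val = sum(face_scores[i, l] for i, l in enumerate(labels))
        val += sum(cooccurrence[labels[i], labels[j]]
                   for i in range(F) for j in range(i + 1, F))
        if val > best_val:
            best, best_val = labels, val
    return best
```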

Journal ArticleDOI
TL;DR: A new face recognition algorithm based on the well-known EBGM, with Gabor features replaced by HOG descriptors, is presented; it shows better performance than other face recognition approaches on publicly available databases.

Journal ArticleDOI
TL;DR: Experimental results on Yale and CMU PIE face databases convince us that the proposed method provides a better representation of the class information and obtains much higher recognition accuracies.

Proceedings ArticleDOI
01 Sep 2008
TL;DR: An expression-invariant method for face recognition by fitting an identity/expression separated 3D Morphable Model to shape data that greatly improves recognition and retrieval rates in the uncooperative setting, while achieving recognition rates on par with the best algorithms in the Face Recognition Vendor Test.
Abstract: We describe an expression-invariant method for face recognition by fitting an identity/expression separated 3D Morphable Model to shape data. The expression model greatly improves recognition and retrieval rates in the uncooperative setting, while achieving recognition rates on par with the best recognition algorithms in the Face Recognition Vendor Test. The fitting is performed with a robust nonrigid ICP algorithm. It is able to perform face recognition in a fully automated scenario and on noisy data. The system was evaluated on two datasets, one with a high noise level and strong expressions, and the standard UND range scan database, showing that while expression invariance increases recognition and retrieval performance for the expression dataset, it does not decrease performance on the neutral dataset. The high recognition rates are achieved even with a purely shape based method, without taking image data into account.

Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed deformation modeling scheme increases the 3D face matching accuracy in comparison to matching with 3D neutral models by 7 and 10 percentage points, respectively, on a subset of the FRGC v2.0 3D benchmark and the MSU multiview 3D face database with expression variations.
Abstract: Face recognition based on 3D surface matching is promising for overcoming some of the limitations of current 2D image-based face recognition systems. The 3D shape is generally invariant to the pose and lighting changes, but not invariant to the nonrigid facial movement such as expressions. Collecting and storing multiple templates to account for various expressions for each subject in a large database is not practical. We propose a facial surface modeling and matching scheme to match 2.5D facial scans in the presence of both nonrigid deformations and pose changes (multiview) to a stored 3D face model with neutral expression. A hierarchical geodesic-based resampling approach is applied to extract landmarks for modeling facial surface deformations. We are able to synthesize the deformation learned from a small group of subjects (control group) onto a 3D neutral model (not in the control group), resulting in a deformed template. A user-specific (3D) deformable model is built for each subject in the gallery with respect to the control group by combining the templates with synthesized deformations. By fitting this generative deformable model to a test scan, the proposed approach is able to handle expressions and pose changes simultaneously. A fully automatic and prototypic deformable model based 3D face matching system has been developed. Experimental results demonstrate that the proposed deformation modeling scheme increases the 3D face matching accuracy in comparison to matching with 3D neutral models by 7 and 10 percentage points, respectively, on a subset of the FRGC v2.0 3D benchmark and the MSU multiview 3D face database with expression variations.

Proceedings ArticleDOI
23 Jun 2008
TL;DR: A real-time algorithm to estimate the 3D pose of a previously unseen face from a single range image is presented, based on a novel shape signature to identify noses in range images and a novel error function that compares the input range image to precomputed pose images of an average face model.
Abstract: We present a real-time algorithm to estimate the 3D pose of a previously unseen face from a single range image. Based on a novel shape signature to identify noses in range images, we generate candidates for their positions, and then generate and evaluate many pose hypotheses in parallel using modern graphics processing units (GPUs). We developed a novel error function that compares the input range image to precomputed pose images of an average face model. The algorithm is robust to large pose variations of ±90° yaw, ±45° pitch and ±30° roll rotation, facial expression, partial occlusion, and works for multiple faces in the field of view. It correctly estimates 97.8% of the poses within yaw and pitch error of 15° at 55.8 fps. To evaluate the algorithm, we built a database of range images with large pose variations and developed a method for automatic ground truth annotation.

Journal ArticleDOI
TL;DR: A generative model creates a one-to-many mapping from an idealized "identity" space to the observed data space, together with a probabilistic distance metric that allows a full posterior over possible matches to be established.
Abstract: Face recognition algorithms perform very unreliably when the pose of the probe face is different from the gallery face: typical feature vectors vary more with pose than with identity. We propose a generative model that creates a one-to-many mapping from an idealized "identity" space to the observed data space. In identity space, the representation for each individual does not vary with pose. We model the measured feature vector as being generated by a pose-contingent linear transformation of the identity variable in the presence of Gaussian noise. We term this model "tied" factor analysis. The choice of linear transformation (factors) depends on the pose, but the loadings are constant (tied) for a given individual. We use the EM algorithm to estimate the linear transformations and the noise parameters from training data. We propose a probabilistic distance metric that allows a full posterior over possible matches to be established. We introduce a novel feature extraction process and investigate recognition performance by using the FERET, XM2VTS, and PIE databases. Recognition performance compares favorably with contemporary approaches.
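
The generative half of tied factor analysis is compact enough to write down directly: a pose-independent identity vector passes through a pose-contingent linear transform plus offset and Gaussian noise. In the paper, F and m are learned by EM and recognition evaluates a posterior over matches; here they are simply assumed given:

```python
import numpy as np

def tied_fa_generate(h, pose, F, m, noise_std=0.05, rng=None):
    """h: identity vector (constant across pose for one person);
    F[pose]: pose-contingent factor loading matrix; m[pose]: pose offset.
    Returns one observed feature vector x = F_p h + m_p + noise."""
    rng = rng or np.random.default_rng()
    return F[pose] @ h + m[pose] + noise_std * rng.standard_normal(len(m[pose]))
```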

Patent
30 Dec 2008
TL;DR: The problem of automatically recognizing multiple known faces in photos or videos on a local computer storage device (a home computer) is addressed, with organization and sharing of the photos or videos driven by the graphical selection of known faces (thumbnail images of people).
Abstract: The present invention solves the problem of automatically recognizing multiple known faces in photos or videos on a local computer storage device (on a home computer). It further allows for sophisticated organization and presentation of the photos or videos based on the graphical selection of known faces (by selecting thumbnail images of people). It also solves the problem of sharing or distributing photos or videos in an automated fashion between 'friends' who are also using the same software that enables the invention. It further solves the problem of allowing a user of the invention to review the results of the automatic face detection, eye detection, and face recognition methods and to correct any errors resulting from the automated process.

Journal ArticleDOI
01 Feb 2008
TL;DR: Experimental results show that the proposed GSVD-ILDA algorithm gives the same performance as the LDA/GSVD with much smaller computational complexity, and also gives better classification performance than the other recently proposed ILDA algorithms.
Abstract: Dimensionality reduction methods have been successfully employed for face recognition. Among the various dimensionality reduction algorithms, linear (Fisher) discriminant analysis (LDA) is one of the popular supervised dimensionality reduction methods, and many LDA-based face recognition algorithms/systems have been reported in the last decade. However, the LDA-based face recognition systems suffer from the scalability problem. To overcome this limitation, an incremental approach is a natural solution. The main difficulty in developing the incremental LDA (ILDA) is to handle the inverse of the within-class scatter matrix. In this paper, based on the generalized singular value decomposition LDA (LDA/GSVD), we develop a new ILDA algorithm called GSVD-ILDA. Different from the existing techniques in which the new projection matrix is found in a restricted subspace, the proposed GSVD-ILDA determines the projection matrix in full space. Extensive experiments are performed to compare the proposed GSVD-ILDA with the LDA/GSVD as well as the existing ILDA methods using the Face Recognition Technology (FERET) face database and the Carnegie Mellon University Pose, Illumination, and Expression (PIE) face database. Experimental results show that the proposed GSVD-ILDA algorithm gives the same performance as the LDA/GSVD with much smaller computational complexity. The experimental results also show that the proposed GSVD-ILDA gives better classification performance than the other recently proposed ILDA algorithms.

Journal ArticleDOI
TL;DR: A novel multiclassifier scheme is proposed to boost the recognition performance of human emotional state from audiovisual signals based on a comparative study of different classification algorithms and specific characteristics of individual emotion.
Abstract: Machine recognition of human emotional state is an important component for efficient human-computer interaction. The majority of existing works address this problem by utilizing audio signals alone, or visual information only. In this paper, we explore a systematic approach for recognition of human emotional state from audiovisual signals. The audio characteristics of emotional speech are represented by the extracted prosodic, Mel-frequency Cepstral Coefficient (MFCC), and formant frequency features. A face detection scheme based on HSV color model is used to detect the face from the background. The visual information is represented by Gabor wavelet features. We perform feature selection by using a stepwise method based on Mahalanobis distance. The selected audiovisual features are used to classify the data into their corresponding emotions. Based on a comparative study of different classification algorithms and specific characteristics of individual emotion, a novel multiclassifier scheme is proposed to boost the recognition performance. The feasibility of the proposed system is tested over a database that incorporates human subjects from different languages and cultural backgrounds. Experimental results demonstrate the effectiveness of the proposed system. The multiclassifier scheme achieves the best overall recognition rate of 82.14%.
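
The stepwise selection can be sketched as a greedy forward search; the two-class simplification below (the paper handles several emotion classes) adds, at each step, the feature that most increases the Mahalanobis distance between the class means, with a small ridge on the pooled covariance for numerical stability:

```python
import numpy as np

def stepwise_mahalanobis(X, y, n_select):
    """X: (N, d) audiovisual features; y: (N,) binary labels.
    Greedy forward selection maximizing the Mahalanobis distance
    between the two class means on the selected feature subset."""
    selected, remaining = [], list(range(X.shape[1]))
    Xa, Xb = X[y == 0], X[y == 1]
    while len(selected) < n_select and remaining:
        best_f, best_d = None, -np.inf
        for f in remaining:
            idx = selected + [f]
            diff = Xa[:, idx].mean(axis=0) - Xb[:, idx].mean(axis=0)
            S = np.cov(np.vstack([Xa[:, idx], Xb[:, idx]]), rowvar=False)
            S = np.atleast_2d(S) + 1e-6 * np.eye(len(idx))   # ridge regularizer
            d = float(diff @ np.linalg.solve(S, diff))       # Mahalanobis dist.
            if d > best_d:
                best_f, best_d = f, d
        selected.append(best_f)
        remaining.remove(best_f)
    return selected
```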