
Showing papers on "Facial recognition system published in 1999"


Book
01 Oct 1999
TL;DR: In this article, the authors presented a method for recognizing human faces from single images out of a large database with one image per person, based on a novel approach, the bunch graph, which is constructed from a small set of sample image graphs.
Abstract: We present a system for recognizing human faces from single images out of a large database with one image per person. The task is difficult because of image variation in terms of position, size, expression, and pose. The system collapses most of this variance by extracting concise face descriptions in the form of image graphs. In these, fiducial points on the face (eyes, mouth etc.) are described by sets of wavelet components (jets). Image graph extraction is based on a novel approach, the bunch graph, which is constructed from a small set of sample image graphs. Recognition is based on a straightforward comparison of image graphs. We report recognition experiments on the FERET database and the Bochum database, including recognition across pose.

1,829 citations
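The jet extraction and comparison described above can be sketched in a few lines of Python — a minimal illustration with an assumed small filter bank (3 scales × 4 orientations) and window size, not the authors' implementation, which uses a larger bank of complex Gabor wavelets:

```python
import numpy as np

def gabor_jet(image, x, y, n_scales=3, n_orients=4, size=16):
    """Magnitude 'jet' of Gabor responses at fiducial point (x, y).

    Filter-bank parameters here are illustrative assumptions, not the
    paper's configuration.
    """
    half = size // 2
    patch = image[y - half:y + half, x - half:x + half]
    ys, xs = np.mgrid[-half:half, -half:half]
    jet = []
    for s in range(n_scales):
        freq = (np.pi / 2) / (np.sqrt(2) ** s)   # spatial frequency per scale
        sigma = 2.0 * np.sqrt(2) ** s            # Gaussian envelope width
        for o in range(n_orients):
            theta = o * np.pi / n_orients
            u = xs * np.cos(theta) + ys * np.sin(theta)
            kernel = np.exp(-(xs**2 + ys**2) / (2 * sigma**2)) * np.exp(1j * freq * u)
            jet.append(abs(np.sum(patch * kernel)))
    return np.array(jet)

def jet_similarity(j1, j2):
    """Normalized dot product: 1.0 means identical up to overall contrast."""
    return float(np.dot(j1, j2) / (np.linalg.norm(j1) * np.linalg.norm(j2) + 1e-12))
```

Recognition then amounts to summing such per-node similarities over the whole image graph and picking the gallery face with the highest total.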


Journal ArticleDOI
TL;DR: This work proposes a method for automatically classifying facial images based on labeled elastic graph matching, a 2D Gabor wavelet representation, and linear discriminant analysis, and a visual interpretation of the discriminant vectors.
Abstract: We propose a method for automatically classifying facial images based on labeled elastic graph matching, a 2D Gabor wavelet representation, and linear discriminant analysis. Results of tests with three image sets are presented for the classification of sex, "race", and expression. A visual interpretation of the discriminant vectors is provided.

1,095 citations
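The linear discriminant analysis step can be illustrated with a minimal two-class Fisher discriminant — a generic sketch on synthetic vectors, not the Gabor-based pipeline of the paper:

```python
import numpy as np

def fisher_discriminant(X1, X2):
    """Fisher's linear discriminant for two classes (rows are samples).

    Returns the projection direction w = Sw^-1 (m1 - m2) and the midpoint
    threshold; classify x as class 1 iff w @ x > threshold.
    """
    m1, m2 = X1.mean(0), X2.mean(0)
    Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)  # within-class scatter
    w = np.linalg.solve(Sw + 1e-6 * np.eye(len(m1)), m1 - m2)  # tiny ridge for stability
    threshold = w @ (m1 + m2) / 2
    return w, threshold
```

Visualizing w back in image space is what gives the paper's "visual interpretation of the discriminant vectors."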


Journal ArticleDOI
TL;DR: This paper explores and compares techniques for automatically recognizing facial actions in sequences of images and provides converging evidence for the importance of using local filters, high spatial frequencies, and statistical independence for classifying facial actions.
Abstract: The facial action coding system (FACS) is an objective method for quantifying facial movement in terms of component actions. This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. These techniques include: analysis of facial motion through estimation of optical flow; holistic spatial analysis, such as principal component analysis, independent component analysis, local feature analysis, and linear discriminant analysis; and methods based on the outputs of local filters, such as Gabor wavelet representations and local principal components. Performance of these systems is compared to naive and expert human subjects. Best performances were obtained using the Gabor wavelet representation and the independent component representation, both of which achieved 96 percent accuracy for classifying 12 facial actions of the upper and lower face. The results provide converging evidence for the importance of using local filters, high spatial frequencies, and statistical independence for classifying facial actions.

1,086 citations


Journal ArticleDOI
TL;DR: An efficient and reliable probabilistic metric derived from the Bhattacharyya distance is used in order to classify the extracted feature vectors into face or nonface areas, using some prototype face area vectors, acquired in a previous training stage.
Abstract: Detecting and recognizing human faces automatically in digital images strongly enhances content-based video indexing systems. In this paper, a novel scheme for human faces detection in color images under nonconstrained scene conditions, such as the presence of a complex background and uncontrolled illumination, is presented. Color clustering and filtering using approximations of the YCbCr and HSV skin color subspaces are applied on the original image, providing quantized skin color regions. A merging stage is then iteratively performed on the set of homogeneous skin color regions in the color quantized image, in order to provide a set of potential face areas. Constraints related to shape and size of faces are applied, and face intensity texture is analyzed by performing a wavelet packet decomposition on each face area candidate in order to detect human faces. The wavelet coefficients of the band filtered images characterize the face texture and a set of simple statistical deviations is extracted in order to form compact and meaningful feature vectors. Then, an efficient and reliable probabilistic metric derived from the Bhattacharyya distance is used in order to classify the extracted feature vectors into face or nonface areas, using some prototype face area vectors, acquired in a previous training stage.

641 citations
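The color-clustering step can be approximated with a simple box test in the CbCr plane. The bounds below are commonly quoted illustrative values for skin detection, not the paper's exact YCbCr/HSV subspace approximations:

```python
import numpy as np

def skin_mask(rgb):
    """Rough skin-color mask: convert RGB to Cb/Cr and test a box region.

    The [77, 127] x [133, 173] Cb/Cr box is an assumed illustrative range.
    """
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)
```

Connected regions of the resulting mask would then feed the merging, shape-constraint, and wavelet-texture stages described above.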


Journal ArticleDOI
TL;DR: A novel classification method, called the nearest feature line (NFL), for face recognition, based on the nearest distance from the query feature point to each FL, which achieves the lowest error rate reported for the ORL face database.
Abstract: We propose a classification method, called the nearest feature line (NFL), for face recognition. Any two feature points of the same class (person) are generalized by the feature line (FL) passing through the two points. The derived FL can capture more variations of face images than the original points and thus expands the capacity of the available database. The classification is based on the nearest distance from the query feature point to each FL. With a combined face database, the NFL error rate is about 43.7-65.4% of that of the standard eigenface method. Moreover, the NFL achieves the lowest error rate reported to date for the ORL face database.

555 citations
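The feature-line distance at the heart of NFL is a one-line projection. The sketch below (generic vectors, hypothetical class labels) shows both the distance and the resulting classification rule:

```python
import numpy as np

def nfl_distance(q, x1, x2):
    """Distance from query q to the feature line through x1 and x2.

    The line is infinite, so it extrapolates beyond the two prototypes --
    that extrapolation is what expands the effective capacity of the gallery.
    """
    d = x2 - x1
    mu = np.dot(q - x1, d) / np.dot(d, d)  # position parameter along the line
    return np.linalg.norm(q - (x1 + mu * d))

def nfl_classify(q, class_points):
    """class_points: dict label -> list of feature vectors (>= 2 per class)."""
    best = None
    for label, pts in class_points.items():
        for i in range(len(pts)):
            for j in range(i + 1, len(pts)):
                dist = nfl_distance(q, pts[i], pts[j])
                if best is None or dist < best[0]:
                    best = (dist, label)
    return best[1]
```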



Patent
12 Apr 1999
TL;DR: In this article, an image processing technique based on model graphs and bunch graphs that efficiently represent image features as jets is described. The jets are composed of wavelet transforms and are processed at nodes, or landmark locations, on an image corresponding to readily identifiable features.
Abstract: The present invention is embodied in an apparatus, and related method, for detecting and recognizing an object in an image frame. The object may be, for example, a head having particular facial characteristics. The object detection process uses robust and computationally efficient techniques. The object identification and recognition process uses an image processing technique based on model graphs and bunch graphs that efficiently represent image features as jets. The jets are composed of wavelet transforms and are processed at nodes or landmark locations on an image corresponding to readily identifiable features. The system of the invention is particularly advantageous for recognizing a person over a wide variety of pose angles.

379 citations


Journal ArticleDOI
TL;DR: In this article, a model of human face recognition which combines both a perceptual and a cognitive component is presented, which has a much wider predictive range than either perceptual or cognitive models alone.

304 citations


Journal ArticleDOI
TL;DR: The results suggest that the recognition of facial images is tuned to a relatively narrow band (< 2 octaves) of mid object spatial frequencies.

294 citations


Journal ArticleDOI
TL;DR: The proposed elastic graph matching method applied to the authentication of human faces where candidates claim an identity that is to be checked compares favorably with two methods that require a prior geometric face normalization, namely the synergetic and eigenface approaches.
Abstract: Elastic graph matching has been proposed as a practical implementation of dynamic link matching, which is a neural network with dynamically evolving links between a reference model and an input image. Each node of the graph contains features that characterize the neighborhood of its location in the image. The elastic graph matching usually consists of two consecutive steps, namely a matching with a rigid grid, followed by a deformation of the grid, which is actually the elastic part. The deformation step is introduced in order to allow for some deformation, rotation, and scaling of the object to be matched. This method is applied here to the authentication of human faces where candidates claim an identity that is to be checked. The matching error as originally suggested is not powerful enough to provide satisfying results in this case. We introduce an automatic weighting of the nodes according to their significance. We also explore the significance of the elastic deformation for an application of face-based person authentication. We compare performance results obtained with and without the second matching step. Results show that the deformation step slightly increases the performance, but has lower influence than the weighting of the nodes. The best results are obtained with the combination of both aspects. The results provided by the proposed method compare favorably with two methods that require a prior geometric face normalization, namely the synergetic and eigenface approaches.

249 citations
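The node-weighting idea can be sketched as a weighted average of per-node jet dissimilarities. This is a schematic version in which the weights are given; the paper derives them automatically from each node's significance:

```python
import numpy as np

def weighted_match_error(ref_jets, probe_jets, weights):
    """Weighted mean of per-node dissimilarities (1 - normalized dot product).

    ref_jets / probe_jets: lists of per-node feature vectors; weights: one
    nonnegative significance weight per node (assumed given here).
    """
    errs = []
    for jr, jp in zip(ref_jets, probe_jets):
        sim = np.dot(jr, jp) / (np.linalg.norm(jr) * np.linalg.norm(jp) + 1e-12)
        errs.append(1.0 - sim)
    return float(np.dot(weights, errs) / np.sum(weights))
```

For authentication, the claimed identity is accepted when this error falls below a threshold tuned on a validation set.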


Journal ArticleDOI
TL;DR: Unique evidence supporting the indispensability of the early streaming process for face recognition is provided by a rare case of developmental pure prosopagnosia with otherwise normal visual and cognitive functions.
Abstract: Computational considerations suggest that efficient face identification requires the categorization and exclusive streaming of previously encoded face visual primitives into a dedicated face recognition system. Unique evidence supporting this claim is provided by a rare case of developmental pure prosopagnosia with otherwise normal visual and cognitive functions. Despite his normal visual memory and ability to describe faces, he is extremely impaired in face recognition. An early event-related brain potential (N170) that is normally elicited exclusively by human faces showed no specificity in this person. MRI revealed a smaller than normal right temporal lobe. These data emphasize the indispensability of the early streaming process for face recognition.

Proceedings ArticleDOI
01 Sep 1999
TL;DR: Given video footage of a person's face, this work presents new techniques to automatically recover the face position and the facial expression from each frame in the video sequence using a 3D face model fitted to each frame using a continuous optimization technique.
Abstract: Given video footage of a person's face, we present new techniques to automatically recover the face position and the facial expression from each frame in the video sequence. A 3D face model is fitted to each frame using a continuous optimization technique. Our model is based on a set of 3D face models that are linearly combined using 3D morphing. Our method has the advantages over previous techniques of fitting directly a realistic 3-dimensional face model and of recovering parameters that can be used directly in an animation system. We also explore many applications, including performance-driven animation (applying the recovered position and expression of the face to a synthetic character to produce an animation that mimics the input video), relighting the face, varying the camera position, and adding facial ornaments such as tattoos and scars.

Journal ArticleDOI
TL;DR: It is found that faces evoked different MEG responses as a function of task demands, i.e., the activations recorded during facial emotion recognition were different from those recorded during simple face recognition in the control task.

Proceedings ArticleDOI
24 Oct 1999
TL;DR: The results show that the use of color information embedded in an eigen approach improves the recognition rate when compared to the same scheme using only the luminance information.
Abstract: A common feature found in practically all technical approaches proposed for face recognition is the use of only the luminance information associated with the face image. One may wonder whether this is due to the low importance of color information in face recognition or to other, less technical reasons such as the unavailability of color image databases. Motivated by this reasoning, we have performed a variety of tests using a global eigen approach developed previously, which has been modified to cope with color information. Our results show that the use of color information embedded in an eigen approach improves the recognition rate when compared to the same scheme using only the luminance information.

Proceedings ArticleDOI
15 Mar 1999
TL;DR: This paper explores the issues involved in applying SVMs to phonetic classification as a first step to speech recognition and presents results on several standard vowel and phonetic Classification tasks and shows better performance than Gaussian mixture classifiers.
Abstract: Support vector machines (SVMs) represent a new approach to pattern classification which has attracted a great deal of interest in the machine learning community. Their appeal lies in their strong connection to the underlying statistical learning theory, in particular the theory of structural risk minimization. SVMs have been shown to be particularly successful in fields such as image identification and face recognition; in many problems SVM classifiers have been shown to perform much better than other nonlinear classifiers such as artificial neural networks and k-nearest neighbors. This paper explores the issues involved in applying SVMs to phonetic classification as a first step to speech recognition. We present results on several standard vowel and phonetic classification tasks and show better performance than Gaussian mixture classifiers. We also present an analysis of the difficulties we foresee in applying SVMs to continuous speech recognition problems.
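As a minimal illustration of the SVM machinery (linear kernel only, trained by sub-gradient descent on the hinge loss; the paper evaluates full kernel SVMs, and the hyperparameters here are assumptions):

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Tiny linear SVM: minimize lam/2 ||w||^2 + mean(hinge loss).

    X: (n, d) features; y: labels in {-1, +1}.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        mask = margins < 1                                  # margin violators
        grad_w = lam * w - (y[mask, None] * X[mask]).sum(0) / n
        grad_b = -y[mask].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

Prediction is `np.sign(X @ w + b)`; structural risk minimization shows up here as the trade-off between the margin term `lam/2 ||w||^2` and the empirical hinge loss.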

Journal ArticleDOI
01 Apr 1999
TL;DR: A statistical approach that combines EST with canonical space transformation (CST) is proposed for gait recognition using temporal templates from a gait sequence as features to reduce data dimensionality and to optimise the class separability of different gait sequences simultaneously.
Abstract: A system for automatic gait recognition without segmentation of particular body parts is described. Eigenspace transformation (EST) has already proved useful for several tasks including face recognition, gait analysis, etc.; it is optimal in dimensionality reduction by maximising the total scatter of all classes but is not optimal for class separability. A statistical approach that combines EST with canonical space transformation (CST) is proposed for gait recognition using temporal templates from a gait sequence as features. This method can be used to reduce data dimensionality and to optimise the class separability of different gait sequences simultaneously. Incorporating temporal information from optical-flow changes between two consecutive spatial templates, each temporal template extracted from computation of optical flow is projected from a high-dimensional image space to a single point in a low-dimensional canonical space. Using template matching, recognition of human gait becomes much faster and simpler in this new space. As such, the combination of EST and CST is shown to be of considerable potential in an emerging new biometric.
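Eigenspace transformation is ordinary PCA; a compact SVD-based sketch on generic data (not gait templates) looks like this:

```python
import numpy as np

def eigenspace_transform(X, k):
    """Project samples (rows of X) onto the top-k eigenspace (PCA via SVD).

    Returns the projections, the k principal directions, and the mean,
    so samples can be reconstructed as Z @ basis + mean.
    """
    mean = X.mean(0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = Vt[:k]                       # top-k principal directions (rows)
    return (X - mean) @ basis.T, basis, mean
```

The CST step would then apply a second, discriminant projection inside this eigenspace to separate the classes that PCA alone only compresses.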

Proceedings ArticleDOI
15 Mar 1999
TL;DR: An embedded hidden Markov model (HMM)-based approach for face detection and recognition that uses an efficient set of observation vectors obtained from the 2D-DCT coefficients that can model the two dimensional data better than the one-dimensional HMM and is computationally less complex than the two-dimensional model.
Abstract: We describe an embedded hidden Markov model (HMM)-based approach for face detection and recognition that uses an efficient set of observation vectors obtained from the 2D-DCT coefficients. The embedded HMM can model the two dimensional data better than the one-dimensional HMM and is computationally less complex than the two-dimensional HMM. This model is appropriate for face images since it exploits an important facial characteristic: frontal faces preserve the same structure of "super states" from top to bottom, and also the same left-to-right structure of "states" inside each of these "super states".
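The observation vectors can be sketched as low-frequency 2D-DCT coefficients of overlapping blocks; the block size, overlap, and coefficient count below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def dct2(block):
    """Orthonormal 2D DCT-II of a square block, via the DCT basis matrix."""
    n = block.shape[0]
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0] /= np.sqrt(2)                       # DC row normalization
    return C @ block @ C.T

def observation_vectors(image, block=8, step=4, n_coef=6):
    """Scan blocks top-to-bottom, left-to-right; keep a few low-frequency
    coefficients per block as the HMM observation sequence."""
    obs = []
    for y in range(0, image.shape[0] - block + 1, step):
        for x in range(0, image.shape[1] - block + 1, step):
            coefs = dct2(image[y:y + block, x:x + block].astype(float))
            obs.append(coefs[:2, :3].ravel()[:n_coef])  # low-frequency corner
    return np.array(obs)
```

The top-to-bottom scan order is what lets the "super states" (hair, forehead, eyes, nose, mouth) model the vertical structure while the embedded states model each row.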

Journal ArticleDOI
TL;DR: A new approach is proposed which combines canonical space transformation (CST) based on Canonical Analysis (CA), with EST for feature extraction, which can be used to reduce data dimensionality and to optimise the class separability of different gait classes simultaneously.

01 Jan 1999
TL;DR: Discriminant analysis shows that the ICA criterion, when carried out in the properly compressed and whitened space, performs better than the eigenfaces and Fisherfaces methods for face recognition, but its performance deteriorates when augmented by additional criteria such as the Maximum A Posteriori (MAP) rule of the Bayes classifier or the FLD.
Abstract: This paper addresses the relative usefulness of Independent Component Analysis (ICA) for Face Recognition. Comparative assessments are made regarding (i) ICA sensitivity to the dimension of the space where it is carried out, and (ii) ICA discriminant performance alone or when combined with other discriminant criteria such as Bayesian framework or Fisher’s Linear Discriminant (FLD). Sensitivity analysis suggests that for enhanced performance ICA should be carried out in a compressed and whitened Principal Component Analysis (PCA) space where the small trailing eigenvalues are discarded. The reason for this finding is that during whitening the eigenvalues of the covariance matrix appear in the denominator and that the small trailing eigenvalues mostly encode noise. As a consequence the whitening component, if used in an uncompressed image space, would fit for misleading variations and thus generalize poorly to new data. Discriminant analysis shows that the ICA criterion, when carried out in the properly compressed and whitened space, performs better than the eigenfaces and Fisherfaces methods for face recognition, but its performance deteriorates when augmented by additional criteria such as the Maximum A Posteriori (MAP) rule of the Bayes classifier or the FLD. The reason for the last finding is that the Mahalanobis distance embedded in the MAP classifier duplicates to some extent the whitening component, while using FLD is counter to the independence criterion intrinsic to ICA.
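The recommended preprocessing — PCA compression followed by whitening, with the small trailing eigenvalues discarded before they can amplify noise — can be sketched as:

```python
import numpy as np

def compress_and_whiten(X, k):
    """PCA-compress samples (rows of X) to k dimensions and whiten them.

    Whitening divides by sqrt(eigenvalue), so keeping only the top-k
    eigenvalues avoids blowing up the noisy trailing directions.
    """
    Xc = X - X.mean(0)
    cov = Xc.T @ Xc / len(X)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:k]          # top-k eigenvalues only
    W = vecs[:, order] / np.sqrt(vals[order])   # combined projection + whitening
    return Xc @ W
```

ICA would then be run on the whitened output; the covariance of that output is the identity, which is exactly the duplication the paper identifies between whitening and the Mahalanobis distance in the MAP classifier.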

Book
01 Jan 1999
TL;DR: In this paper, the authors developed the vocabulary of ridges and parabolic curves, of illumination eigenfaces and elastic warpings for describing the perceptually salient features of a face and its images.
Abstract: The human face is perhaps the most familiar and easily recognized object in the world, yet both its three-dimensional shape and its two-dimensional images are complex and hard to characterize. This book develops the vocabulary of ridges and parabolic curves, of illumination eigenfaces and elastic warpings for describing the perceptually salient features of a face and its images. The book also explores the underlying mathematics and applies these mathematical techniques to the computer vision problem of face recognition, using both optical and range images.

Proceedings Article
22 Aug 1999
TL;DR: One goal of human computer interaction (HCI) is to make an adaptive, smart computer system that could possibly include gesture recognition, facial recognition, eye tracking, speech recognition, etc.
Abstract: One goal of human computer interaction (HCI) is to make an adaptive, smart computer system. This type of project could possibly include gesture recognition, facial recognition, eye tracking, speech recognition, etc. Another non-invasive way to obtain information about a person is through touch. People use their computers to obtain, store, and manipulate data. In order to start creating smart computers, the computer must first gain information about the user.

Proceedings ArticleDOI
15 May 1999
TL;DR: Expression Glasses provide a wearable "appliance-based" alternative to general-purpose machine vision face recognition systems, and use pattern recognition to identify meaningful expressions such as confusion or interest.
Abstract: Expression Glasses provide a wearable "appliance-based" alternative to general-purpose machine vision face recognition systems. The glasses sense facial muscle movements, and use pattern recognition to identify meaningful expressions such as confusion or interest. A prototype of the glasses has been built and evaluated. The prototype uses piezoelectric sensors hidden in a visor extension to a pair of glasses, providing for compactness, user control, and anonymity. On users who received no training or feedback, the glasses initially performed at 94% accuracy in detecting an expression, and at 74% accuracy in recognizing whether the expression was confusion or interest. Significant improvement beyond these numbers appears to be possible with extended use, and with a small amount of feedback (letting the user see the output of the system).

Journal ArticleDOI
TL;DR: In this paper, Simon et al. used computational models to show how the face processing specialization apparently underlying prosopagnosia and visual object agnosia could be attributed to a relatively simple competitive selection mechanism that, during development, devotes neural resources to the tasks they are best at performing.

Journal ArticleDOI
TL;DR: The self-organizing hierarchical optimal subspace learning and inference framework (SHOSLIF) system uses the theories of optimal linear projection for optimal feature derivation and a hierarchical structure to achieve logarithmic retrieval complexity.
Abstract: A self-organizing framework for object recognition is described. We describe a hierarchical database structure for image retrieval. The self-organizing hierarchical optimal subspace learning and inference framework (SHOSLIF) system uses the theories of optimal linear projection for optimal feature derivation and a hierarchical structure to achieve logarithmic retrieval complexity. A space-tessellation tree is generated using the most expressive features (MEF) and most discriminating features (MDF) at each level of the tree. The major characteristics of the analysis include: (1) avoiding the limitation of global linear features by deriving a recursively better-fitted set of features for each of the recursively subdivided sets of training samples; (2) generating a smaller tree whose cell boundaries separate the samples along the class boundaries better than the principal component analysis, thereby giving a better generalization capability (i.e., better recognition rate in a disjoint test); (3) accelerating the retrieval using a tree structure for data pruning, utilizing a different set of discriminant features at each level of the tree. We allow for perturbations in the size and position of objects in the images through learning. We demonstrate the technique on a large image database of widely varying real-world objects taken in natural settings, and show the applicability of the approach for variability in position, size, and 3D orientation. This paper concentrates on the hierarchical partitioning of the feature spaces.

Journal ArticleDOI
TL;DR: Results obtained from a testbed used to investigate different codings for automatic face recognition strongly support the suggestion that faces should be considered as lying in a high-dimensional manifold, which is locally linearly approximated by these shapes and textures, possibly with a separate system for local features.
Abstract: We describe results obtained from a testbed used to investigate different codings for automatic face recognition. An eigenface coding of shape-free faces using manually located landmarks was more effective than the corresponding coding of correctly shaped faces. Configuration also proved an effective method of recognition, with rankings given to incorrect matches relatively uncorrelated with those from shape-free faces. Both sets of information combine to improve significantly the performance of either system. The addition of a system, which directly correlated the intensity values of shape-free images, also significantly increased recognition, suggesting extra information was still available. The recognition advantage for shape-free faces reflected and depended upon high-quality representation of the natural facial variation via a disjoint ensemble of shape-free faces; if the ensemble comprised nonfaces, a shape-free disadvantage was induced. Manipulation within the shape-free coding to emphasize distinctive features of the faces, by caricaturing, allowed further increases in performance; this effect was only noticeable when the independent shape-free and configuration coding was used. Taken together, these results strongly support the suggestion that faces should be considered as lying in a high-dimensional manifold, which is locally linearly approximated by these shapes and textures, possibly with a separate system for local features. Principal components analysis is then seen as a convenient tool in this local approximation.

Proceedings ArticleDOI
12 Oct 1999
TL;DR: In order to equip the PICASSO system for the feedback loop from the observer, the eye camera system is introduced and principles to generate a facial caricature and to evaluate the generated works are explained precisely, together with demonstrations of experimental results.
Abstract: The PICASSO system for facial caricaturing is a typical KANSEI vision system in the following senses: it can process 2D, 3D and motion facial images to generate a facial sketch of line drawings; it can extract individuality features by introducing a "mean face" for enforcing the visual KANSEI impressions by which the face looks more likely to be so; and it can evaluate the generated facial caricature to be moderate for the individual visual KANSEI characteristics, based on the model of visual illusion. Image processing methods for generating a facial sketch based on the Hough transform are presented. Principles to generate a facial caricature and to evaluate the generated works are explained precisely, together with demonstrations of experimental results. In order to equip the PICASSO system for the feedback loop from the observer, we have introduced the eye camera system. This equipment is expected to provide a new method to extract the simultaneous individual eye-mark patterns which show where in the face the observer is watching. These preliminary experiments are also presented to show the possibility to extract more tightly individual KANSEI visual information.

Dissertation
01 Jan 1999
TL;DR: Compared to other methods, this proposed system offers a more flexible framework for face recognition and detection, and can be used more efficiently in scale invariant systems.
Abstract: The use of hidden Markov models (HMM) for faces is motivated by their partial invariance to variations in scaling and by the structure of faces. The most significant facial features of a frontal face include the hair, forehead, eyes, nose and mouth. These features occur in a natural order, from top to bottom, even if the images undergo small rotations in the image plane, and/or rotations in the plane perpendicular to the image plane. Therefore, the image of a face may be modeled using a one-dimensional HMM by assigning each of these regions to a state. The observation vectors are obtained from the DCT or KLT coefficients. A one-dimensional HMM may be generalized, to give it the appearance of a two-dimensional structure, by allowing each state in a one-dimensional HMM to be a HMM. In this way, the HMM consists of a set of super states, along with a set of embedded states. Therefore, this is referred to as an embedded HMM. The super states may then be used to model two-dimensional data along one direction, with the embedded HMM modeling the data along the other direction. Both the standard HMM and the embedded HMM were tested for face recognition and detection. Compared to other methods, our proposed system offers a more flexible framework for face recognition and detection, and can be used more efficiently in scale invariant systems.

Proceedings ArticleDOI
Baback Moghaddam1
20 Sep 1999
TL;DR: This work compares the recognition performance of a nearest neighbour matching rule with each principal manifold representation to that of a maximum a posteriori (MAP) matching rule using a Bayesian similarity measure derived from probabilistic subspaces, and demonstrates the superiority of the latter.
Abstract: We investigate the use of linear and nonlinear principal manifolds for learning low dimensional representations for visual recognition. Three techniques: principal component analysis (PCA), independent component analysis (ICA) and nonlinear PCA (NLPCA) are examined and tested in a visual recognition experiment using a large gallery of facial images from the "FERET" database. We compare the recognition performance of a nearest neighbour matching rule with each principal manifold representation to that of a maximum a posteriori (MAP) matching rule using a Bayesian similarity measure derived from probabilistic subspaces, and demonstrate the superiority of the latter.
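The contrast between the two matching rules can be illustrated with a toy Gaussian model of image differences — a simplified stand-in for the probabilistic-subspace similarity measure, with an assumed difference covariance:

```python
import numpy as np

def euclidean_nn(query, gallery):
    """Plain nearest-neighbour matching: index of the closest gallery vector."""
    return int(np.argmin(np.linalg.norm(gallery - query, axis=1)))

def map_match(query, gallery, cov):
    """MAP-style matching: pick the gallery entry whose difference from the
    query is most probable under a zero-mean Gaussian with covariance cov."""
    scores = [-0.5 * d @ np.linalg.solve(cov, d) for d in (gallery - query)]
    return int(np.argmax(scores))
```

With an anisotropic covariance (large variance along directions where intra-personal variation is expected, such as illumination), the two rules can disagree, and the probabilistic rule tolerates exactly the variations the model deems likely.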

Journal ArticleDOI
TL;DR: The authors compare two models that try to simulate neuropsychological findings that prosopagnosic patients, who are unable to recognise faces overtly, nonetheless show evidence of face recognition when indirect tests are used.
Abstract: We compare two models that try to simulate neuropsychological findings that prosopagnosic patients, who are unable to recognise faces overtly, nonetheless show evidence of face recognition when indirect tests are used. This “covert recognition” ability has been captured by simulation in an IAC model (Burton, Young, Bruce, Johnston, & Ellis, 1991) and a model we call FOV (Farah, O'Reilly, & Vecera, 1993). The IAC model is localist and has been developed incrementally to account for various effects in normal face recognition. The FOV model is distributed, and was created specifically to demonstrate how covert processing effects emerge as a “natural” consequence of this type of connectionist implementation. We examine the ability of these models to account for data from prosopagnosia, their plausibility as models of normal face recognition, and their general modelling styles. The FOV model is able to simulate only the data for which it was created, whereas the IAC model usually stands up well to these tests,...

Proceedings ArticleDOI
26 Sep 1999
TL;DR: This work extends SVMs to model the 2D appearance of human faces which undergo nonlinear change across the view sphere and enables simultaneous multi-view face detection and pose estimation at near-frame rate.
Abstract: Support vector machines have shown great potential for learning classification functions that can be applied to object recognition. In this work, we extend SVMs to model the 2D appearance of human faces which undergo nonlinear change across the view sphere. The model enables simultaneous multi-view face detection and pose estimation at near-frame rate.