
Showing papers on "Facial recognition system published in 2000"


Journal ArticleDOI
TL;DR: The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.
Abstract: The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques and methods imported from statistical learning theory have been receiving increasing attention. The design of a recognition system requires careful attention to the following issues: definition of pattern classes, sensing environment, pattern representation, feature extraction and selection, cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. In spite of almost 50 years of research and development in this field, the general problem of recognizing complex patterns with arbitrary orientation, location, and scale remains unsolved. New and emerging applications, such as data mining, web searching, retrieval of multimedia data, face recognition, and cursive handwriting recognition, require robust and efficient pattern recognition techniques. The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.

6,527 citations


Journal ArticleDOI
TL;DR: Two of the most critical requirements in support of producing reliable face-recognition systems are a large database of facial images and a testing procedure to evaluate systems.
Abstract: Two of the most critical requirements in support of producing reliable face-recognition systems are a large database of facial images and a testing procedure to evaluate systems. The Face Recognition Technology (FERET) program has addressed both issues through the FERET database of facial images and the establishment of the FERET tests. To date, 14,126 images from 1,199 individuals are included in the FERET database, which is divided into development and sequestered portions of the database. In September 1996, the FERET program administered the third in a series of FERET face-recognition tests. The primary objectives of the third test were to 1) assess the state of the art, 2) identify future areas of research, and 3) measure algorithm performance.

4,816 citations


Proceedings ArticleDOI
26 Mar 2000
TL;DR: The problem space for facial expression analysis is described, which includes level of description, transitions among expressions, eliciting conditions, reliability and validity of training and test data, individual differences in subjects, head orientation and scene complexity, image characteristics, and relation to non-verbal behavior.
Abstract: Within the past decade, significant effort has occurred in developing methods of facial expression analysis. Because most investigators have used relatively limited data sets, the generalizability of these various methods remains unknown. We describe the problem space for facial expression analysis, which includes level of description, transitions among expressions, eliciting conditions, reliability and validity of training and test data, individual differences in subjects, head orientation and scene complexity, image characteristics, and relation to non-verbal behavior. We then present the CMU-Pittsburgh AU-Coded Face Expression Image Database, which currently includes 2105 digitized image sequences from 182 adult subjects of varying ethnicity, performing multiple tokens of most primary FACS action units. This database is the most comprehensive testbed to date for comparative studies of facial expression analysis.

2,705 citations


Journal ArticleDOI
TL;DR: The capability of the human visual system with respect to these problems is discussed, and it is meant to serve as an ultimate goal and a guide for determining recommendations for development of an automatic facial expression analyzer.
Abstract: Humans detect and interpret faces and facial expressions in a scene with little or no effort. Still, development of an automated system that accomplishes this task is rather difficult. There are several related problems: detection of an image segment as a face, extraction of the facial expression information, and classification of the expression (e.g., in emotion categories). A system that performs these operations accurately and in real time would form a big step in achieving a human-like interaction between man and machine. The paper surveys the past work in solving these problems. The capability of the human visual system with respect to these problems is discussed, too. It is meant to serve as an ultimate goal and a guide for determining recommendations for development of an automatic facial expression analyzer.

1,872 citations


Journal ArticleDOI
TL;DR: It is proved that the most expressive vectors derived in the null space of the within-class scatter matrix using principal component analysis (PCA) are equal to the optimal discriminant vectorsderived in the original space using LDA.

1,447 citations


Proceedings Article
01 Jan 2000
TL;DR: The resulting algorithms are shown to be superior to some standard feature selection algorithms on both toy data and real-life problems of face recognition, pedestrian detection and analyzing DNA microarray data.
Abstract: We introduce a method of feature selection for Support Vector Machines. The method is based upon finding those features which minimize bounds on the leave-one-out error. This search can be efficiently performed via gradient descent. The resulting algorithms are shown to be superior to some standard feature selection algorithms on both toy data and real-life problems of face recognition, pedestrian detection and analyzing DNA microarray data.

1,112 citations
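The method above scales features and follows the gradient of leave-one-out error bounds for an SVM. As a rough, hedged stand-in for that wrapper idea, the sketch below uses greedy backward elimination driven by the exact leave-one-out error of a 1-nearest-neighbour classifier; the data, names, and the 1-NN criterion are all illustrative, not the paper's algorithm:

```python
import numpy as np

def loo_error_1nn(X, y):
    """Leave-one-out error of a 1-nearest-neighbour classifier
    (a cheap stand-in for SVM leave-one-out bounds)."""
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    np.fill_diagonal(d, np.inf)   # exclude each point from its own neighbourhood
    nearest = d.argmin(axis=1)
    return np.mean(y[nearest] != y)

def greedy_feature_selection(X, y, n_keep):
    """Backward elimination: repeatedly drop the feature whose
    removal yields the lowest leave-one-out error."""
    features = list(range(X.shape[1]))
    while len(features) > n_keep:
        errs = [loo_error_1nn(X[:, [f for f in features if f != j]], y)
                for j in features]
        features.pop(int(np.argmin(errs)))
    return features

# Toy data: feature 0 carries the class signal, features 1-2 are noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 3))
X[:, 0] += 3 * y                  # informative feature
selected = greedy_feature_selection(X, y, n_keep=1)
```

On the toy data the pure-noise features go first, since removing either barely changes the leave-one-out error while removing the informative feature raises it sharply.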


Journal ArticleDOI
TL;DR: A simple method of replacing costly computation of nonlinear (on-line) Bayesian similarity measures by inexpensive linear subspace projections and simple Euclidean norms is derived, thus resulting in a significant computational speed-up for implementation with very large databases.

660 citations


Journal ArticleDOI
TL;DR: A speech recognition system that uses both acoustic and visual speech information to improve recognition performance in noisy environments and is demonstrated on a large multispeaker database of continuously spoken digits.
Abstract: This paper describes a speech recognition system that uses both acoustic and visual speech information to improve recognition performance in noisy environments. The system consists of three components: a visual module; an acoustic module; and a sensor fusion module. The visual module locates and tracks the lip movements of a given speaker and extracts relevant speech features. This task is performed with an appearance-based lip model that is learned from example images. Visual speech features are represented by contour information of the lips and grey-level information of the mouth area. The acoustic module extracts noise-robust features from the audio signal. Finally the sensor fusion module is responsible for the joint temporal modeling of the acoustic and visual feature streams and is realized using multistream hidden Markov models (HMMs). The multistream method allows the definition of different temporal topologies and levels of stream integration and hence enables the modeling of temporal dependencies more accurately than traditional approaches. We present two different methods to learn the asynchrony between the two modalities and how to incorporate them in the multistream models. The superior performance for the proposed system is demonstrated on a large multispeaker database of continuously spoken digits. On a recognition task at 15 dB acoustic signal-to-noise ratio (SNR), acoustic perceptual linear prediction (PLP) features lead to 56% error rate, noise robust RASTA-PLP (relative spectra) acoustic features to 7.2% error rate and combined noise robust acoustic features and visual features to 2.5% error rate.

620 citations
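The multistream HMM couples the two streams during decoding; a much simpler late-fusion caricature, with made-up log-likelihood numbers (not the paper's models), still shows why down-weighting a noisy acoustic stream helps:

```python
import numpy as np

# Hypothetical per-word log-likelihoods from separately trained acoustic
# and visual models (illustrative numbers, not from the paper).
words = ["one", "two", "three"]
ll_audio  = np.array([-42.0, -40.5, -47.0])   # noisy audio: "two" barely wins
ll_visual = np.array([-30.0, -35.0, -33.0])   # lips clearly say "one"

def fuse(ll_a, ll_v, w_audio):
    """Linear log-likelihood combination with stream weight w_audio in [0, 1]."""
    return w_audio * ll_a + (1.0 - w_audio) * ll_v

# At low SNR the visual stream is weighted up.
scores = fuse(ll_audio, ll_visual, w_audio=0.3)
best = words[int(np.argmax(scores))]           # the visual evidence prevails
```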


Journal ArticleDOI
TL;DR: In this paper, the head is modeled as a texture mapped cylinder and tracking is formulated as an image registration problem in the cylinder's texture map image, which is solved by regularized weighted least squares error minimization.
Abstract: A technique for 3D head tracking under varying illumination is proposed. The head is modeled as a texture mapped cylinder. Tracking is formulated as an image registration problem in the cylinder's texture map image. The resulting dynamic texture map provides a stabilized view of the face that can be used as input to many existing 2D techniques for face recognition, facial expressions analysis, lip reading, and eye tracking. To solve the registration problem with lighting variation and head motion, the residual registration error is modeled as a linear combination of texture warping templates and orthogonal illumination templates. Fast stable online tracking is achieved via regularized weighted least-squares error minimization. The regularization tends to limit potential ambiguities that arise in the warping and illumination templates. It enables stable tracking over extended sequences. Tracking does not require a precise initial model fit; the system is initialized automatically using a simple 2D face detector. It is assumed that the target is facing the camera in the first frame. The formulation uses texture mapping hardware. The nonoptimized implementation runs at about 15 frames per second on a SGI O2 graphic workstation. Extensive experiments evaluating the effectiveness of the formulation are reported. The sensitivity of the technique to illumination, regularization parameters, errors in the initial positioning, and internal camera parameters are analyzed. Examples and applications of tracking are reported.

606 citations


Proceedings ArticleDOI
26 Mar 2000
TL;DR: The potential of SVM on the Cambridge ORL face database, which consists of 400 images of 40 individuals, containing quite a high degree of variability in expression, pose, and facial details, is illustrated.
Abstract: Support vector machines (SVM) have been recently proposed as a new technique for pattern recognition. SVM with a binary tree recognition strategy are used to tackle the face recognition problem. We illustrate the potential of SVM on the Cambridge ORL face database, which consists of 400 images of 40 individuals, containing quite a high degree of variability in expression, pose, and facial details. We also present the recognition experiment on a larger face database of 1079 images of 137 individuals. We compare the SVM-based recognition with the standard eigenface approach using the nearest center classification (NCC) criterion.

557 citations
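The eigenface-plus-NCC baseline the authors compare against is easy to sketch on synthetic data (all sizes and data below are illustrative, not the ORL database):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "face" vectors: 5 subjects, 8 images each, 100 pixels.
n_subj, n_img, n_pix = 5, 8, 100
centers_true = rng.normal(scale=5.0, size=(n_subj, n_pix))
X = np.repeat(centers_true, n_img, axis=0) + rng.normal(size=(n_subj * n_img, n_pix))
y = np.repeat(np.arange(n_subj), n_img)

# Eigenfaces: PCA on mean-centred images via SVD.
mean = X.mean(axis=0)
U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
W = Vt[:10]                       # top-10 eigenfaces
P = (X - mean) @ W.T              # training projections

# Nearest-center classification (NCC): one centroid per subject.
centers = np.array([P[y == c].mean(axis=0) for c in range(n_subj)])

def ncc_predict(img):
    p = (img - mean) @ W.T
    return int(np.argmin(((centers - p) ** 2).sum(axis=1)))

probe = centers_true[3] + rng.normal(size=n_pix)   # unseen image of subject 3
pred = ncc_predict(probe)
```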


Journal Article
TL;DR: In this article, the authors designed a set of questions to ask when evaluating a biometric system, and to assist in determining whether performance levels meet the requirements of an application, such as reducing (as opposed to eliminating) fraud.
Abstract: On the basis of media hype alone, you might conclude that biometric passwords will soon replace their alphanumeric counterparts with versions that cannot be stolen, forgotten, lost, or given to another person. But what if the actual performance of these systems falls short of the estimates? The authors designed this article to provide sufficient information to know what questions to ask when evaluating a biometric system, and to assist in determining whether performance levels meet the requirements of an application. For example, a low-performance biometric is probably sufficient for reducing, as opposed to eliminating, fraud. Likewise, completely replacing an existing security system with a biometric-based one may require a high-performance biometric system, or the required performance may be beyond what current technology can provide. Of the biometrics that give the user some control over data acquisition, voice, face, and fingerprint systems have undergone the most study and testing, and therefore occupy the bulk of this discussion. This article also covers the tools and techniques of biometric testing.

Journal ArticleDOI
TL;DR: This article goes into detail about the BioID system functions, explaining the data acquisition and preprocessing techniques for voice, facial, and lip imagery data and the classification principles used for optical features and the sensor fusion options.
Abstract: Biometric identification systems, which use physical features to check a person's identity, ensure much greater security than password and number systems. Biometric features such as the face or a fingerprint can be stored on a microchip in a credit card, for example. A single feature, however, sometimes fails to be exact enough for identification. Another disadvantage of using only one feature is that the chosen feature is not always readable. Dialog Communication Systems (DCS AG) developed BioID, a multimodal identification system that uses three different features (face, voice, and lip movement) to identify people. With its three modalities, BioID achieves much greater accuracy than single-feature systems. Even if one modality is somehow disturbed (for example, if a noisy environment drowns out the voice), the other two modalities still lead to an accurate identification. This article goes into detail about the system functions, explaining the data acquisition and preprocessing techniques for voice, facial, and lip imagery data. The authors also explain the classification principles used for optical features and the sensor fusion options (the combinations of the three results: face, voice, lip movement, to obtain varying levels of security).

Journal ArticleDOI
TL;DR: EP has better recognition performance than PCA (eigenfaces) and better generalization abilities than the Fisher linear discriminant (Fisherfaces).
Abstract: Introduces evolutionary pursuit (EP) as an adaptive representation method for image encoding and classification. In analogy to projection pursuit, EP seeks to learn an optimal basis for the dual purpose of data compression and pattern classification. It should increase the generalization ability of the learning machine as a result of seeking the trade-off between minimizing the empirical risk encountered during training and narrowing the confidence interval for reducing the guaranteed risk during testing. It therefore implements strategies characteristic of GA for searching the space of possible solutions to determine the optimal basis. It projects the original data into a lower dimensional whitened principal component analysis (PCA) space. Directed random rotations of the basis vectors in this space are searched by GA, where evolution is driven by a fitness function defined by performance accuracy (empirical risk) and class separation (confidence interval). Accuracy indicates the extent to which learning has been successful, while separation gives an indication of expected fitness. The method has been tested on face recognition using a greedy search algorithm. To assess both accuracy and generalization capability, the data includes for each subject images acquired at different times or under different illumination conditions. EP has better recognition performance than PCA (eigenfaces) and better generalization abilities than the Fisher linear discriminant (Fisherfaces).

Book
01 Jan 2000
TL;DR: An edited collection covering fingerprint recognition (feature processing, poroscopy, Gabor filtering, minutiae extraction) and face recognition (neural networks, radial basis function networks, elastic bunch graph matching), along with facial expression synthesis and recognition.
Abstract:
Introduction to Fingerprint Recognition, U. Halici, L.C. Jain, and A. Erol
Fingerprint Feature Processing Techniques and Poroscopy, A.R. Roddy and J.D. Stosz
Fingerprint Sub-Classification: A Neural Network Approach, G.A. Drets and H.G. Leljecstroem
A Gabor Filter-Based Method for Fingerprint Identification, Y. Hamamoto
Minutiae Extraction and Filtering from Gray-Scale Images, D. Maio and D. Maltoni
Feature Selective Filtering for Ridge Extraction, A. Erol, U. Halici, and G. Ongun
Introduction to Face Recognition, A.J. Howell
Neural Networks for Face Recognition, A.S. Pandya and R.R. Szabo
Face Unit Radial Basis Function Networks, A.J. Howell
Face Recognition from Correspondence Maps, R.P. Wurtz
Face Recognition by Elastic Bunch Graph Matching, L. Wiskott, J.-M. Fellous, N. Kruger, and C. von der Malsburg
Facial Expression Synthesis Using Radial Basis Function Networks, I. King and X.Q. Li
Recognition of Facial Expressions and Its Application to Human Computer Interaction, T. Onisawa and S. Kitazake

Proceedings ArticleDOI
26 Mar 2000
TL;DR: A novel face recognition algorithm based on the point signature-a representation for free-form surfaces that can be quickly and efficiently identified and ranked according to their similarity with the test face.
Abstract: We present a novel face recognition algorithm based on the point signature, a representation for free-form surfaces. We treat the face recognition problem as a non-rigid object recognition problem. The rigid parts of the face of one person are extracted after registering the range data sets of faces having different facial expressions. These rigid parts are used to create a model library for efficient indexing. For a test face, models are indexed from the library and the most appropriate models are ranked according to their similarity with the test face. Each candidate model face can then be quickly and efficiently verified. Experimental results with range data involving six human subjects, each with four different facial expressions, have demonstrated the validity and effectiveness of our algorithm.

Proceedings ArticleDOI
26 Mar 2000
TL;DR: Appearance-based methods which, unlike previous appearance-based approaches, require only a small set of training images to generate a rich representation that models pose and illumination variability, are presented.
Abstract: Image variability due to changes in pose and illumination can seriously impair object recognition. This paper presents appearance-based methods which, unlike previous appearance-based approaches, require only a small set of training images to generate a rich representation that models this variability. Specifically, from as few as three images of an object in fixed pose seen under slightly varying but unknown lighting, a surface and an albedo map are reconstructed. These are then used to generate synthetic images with large variations in pose and illumination and thus build a representation useful for object recognition. Our methods have been tested within the domain of face recognition on a subset of the Yale Face Database B containing 4050 images of 10 faces seen under variable pose and illumination. This database was specifically gathered for testing these generative methods. Their performance is shown to exceed that of popular existing methods.
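The reconstruction step rests on the Lambertian model I = b·L with b the albedo-scaled surface normal. A minimal sketch, simplified relative to the paper by assuming the three light directions are known (the paper handles unknown lighting) and ignoring shadows:

```python
import numpy as np

rng = np.random.default_rng(2)
n_pix = 50

# Ground truth: unit normals scaled by albedo, i.e. b = albedo * n.
n_true = rng.normal(size=(n_pix, 3))
n_true /= np.linalg.norm(n_true, axis=1, keepdims=True)
albedo = rng.uniform(0.5, 1.0, size=(n_pix, 1))
b_true = albedo * n_true

# Three images under three known directional lights: I = b . L
L = np.array([[0.0, 0.0, 1.0],
              [0.7, 0.0, 0.7],
              [0.0, 0.7, 0.7]])
I = b_true @ L.T                        # (n_pix, 3) intensities, shadows ignored

# Least-squares recovery of b (and hence albedo and normal) per pixel.
b_est = np.linalg.lstsq(L, I.T, rcond=None)[0].T
albedo_est = np.linalg.norm(b_est, axis=1)
```

With three or more non-coplanar lights the system is exactly determined, which is why "as few as three images" suffice in principle.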

Journal ArticleDOI
TL;DR: The authors describe the face recognition technology used, explaining the algorithms for face recognition as well as novel applications, such as behavior monitoring that assesses emotions based on facial expressions.
Abstract: Smart environments, wearable computers, and ubiquitous computing in general are the coming "fourth generation" of computing and information technology. But that technology will be a stillbirth without new interfaces for interaction, minus a keyboard or mouse. To win wide consumer acceptance, these interactions must be friendly and personalized; the next generation interfaces must recognize people in their immediate environment and, at a minimum, know who they are. In this article, the authors discuss face recognition technology, how it works, problems to be overcome, current technologies, and future developments and possible applications. Twenty years ago, the problem of face recognition was considered among the most difficult in artificial intelligence and computer vision. Today, however, there are several companies that sell commercial face recognition software that is capable of high-accuracy recognition with databases of more than 1,000 people. The authors describe the face recognition technology used, explaining the algorithms for face recognition as well as novel applications, such as behavior monitoring that assesses emotions based on facial expressions.

BookDOI
01 Aug 2000
TL;DR: Many of the issues raised are relevant to object recognition in general, and such visual learning machines have numerous potential applications in areas such as visual surveillance multimedia and visually mediated interaction.
Abstract: From the Publisher: Face recognition is a task which the human visual system seems to perform almost effortlessly, yet the goal of building machines with comparable capabilities has proven difficult to realize. The task requires the ability to locate and track faces through scenes which are often complex and dynamic. Recognition is difficult because of variations in factors such as lighting conditions, viewpoint, body movement, and facial expression. Although evidence from psychophysical and neurobiological experiments provides intriguing insights into how we might code and recognize faces, its bearing on computational and engineering solutions is far from clear. This book describes how to build learning machines to perform face recognition in dynamic scenes. The task at hand is that of engineering robust machine vision systems that can operate under poorly controlled and changing conditions. Many of the issues raised are relevant to object recognition in general, and such visual learning machines have numerous potential applications in areas such as visual surveillance, multimedia, and visually mediated interaction.

Proceedings ArticleDOI
26 Mar 2000
TL;DR: The estimation of head pose, which is achieved by using the support vector regression technique, provides crucial information for choosing the appropriate face detector, which helps to improve the accuracy and reduce the computation in multi-view face detection compared to other methods.
Abstract: A support vector machine-based multi-view face detection and recognition framework is described. Face detection is carried out by constructing several detectors, each of them in charge of one specific view. The symmetrical property of face images is employed to simplify the complexity of the modelling. The estimation of head pose, which is achieved by using the support vector regression technique, provides crucial information for choosing the appropriate face detector. This helps to improve the accuracy and reduce the computation in multi-view face detection compared to other methods. For video sequences, further computational reduction can be achieved by using a pose change smoothing strategy. When face detectors find a face in frontal view, a support vector machine-based multi-class classifier is activated for face recognition. All the above issues are integrated under a support vector machine framework. Test results on four video sequences are presented, among them the detection rate is above 95%, recognition accuracy is above 90%, average pose estimation error is around 10°, and the full detection and recognition speed is up to 4 frames/second on a Pentium II 300 PC.

Proceedings ArticleDOI
10 Sep 2000
TL;DR: This work investigates a generalization of PCA, kernel principal component analysis (kernel PCA), for learning low dimensional representations in the context of face recognition and shows that kernel PCA outperforms the eigenface method in face recognition.
Abstract: Eigenface or principal component analysis (PCA) methods have demonstrated their success in face recognition, detection, and tracking. The representation in PCA is based on the second order statistics of the image set, and does not address higher order statistical dependencies such as the relationships among three or more pixels. Higher order statistics (HOS) have been used as a more informative low dimensional representation than PCA for face and vehicle detection. We investigate a generalization of PCA, kernel principal component analysis (kernel PCA), for learning low dimensional representations in the context of face recognition. In contrast to HOS, kernel PCA computes the higher order statistics without the combinatorial explosion of time and memory complexity. While PCA aims to find a second order correlation of patterns, kernel PCA provides a replacement which takes into account higher order correlations. We compare the recognition results using kernel methods with eigenface methods on two benchmarks. Empirical results show that kernel PCA outperforms the eigenface method in face recognition.

Proceedings ArticleDOI
H. Hongo, Mitsunori Ohya1, M. Yasumoto, Y. Niwa, Kazuhiko Yamamoto1 
26 Mar 2000
TL;DR: A multi-camera system that can track multiple human faces and hands as well as focus on face and hand gestures for recognition and four directional features by using linear discriminant analysis are proposed.
Abstract: We propose a multi-camera system that can track multiple human faces and hands as well as focus on face and hand gestures for recognition. Our current system consists of four cameras. Two fixed cameras are used as a stereo system to estimate face and hand positions. The stereo camera detects faces and hands using the skin color method we propose. The distances of the targets are then estimated. Next, to track multiple targets, we estimate the positions and sizes of targets between consecutive frames. The other two cameras perform tracking of such targets as faces and hands. If a target is not the appropriate size for recognition, the tracking cameras acquire its zoomed image. Since our system has two tracking cameras, it can track two targets at the same time. To recognize faces and hand gestures, we propose four directional features by using linear discriminant analysis. Using our system, we experimented on human position estimation, multiple face tracking, and face and hand gesture recognition. These experiments showed that our system could estimate human position with the stereo camera and track multiple targets by using target positions and sizes even if the persons overlapped with each other. In addition, our system could recognize faces and hand gestures by using the four directional features.

Journal ArticleDOI
TL;DR: Two new coding schemes are introduced, probabilistic reasoning models (PRM) and enhanced FLD (Fisher linear discriminant) models (EFM), for indexing and retrieval of large image databases with applications to face recognition.
Abstract: This paper introduces two new coding schemes, probabilistic reasoning models (PRM) and enhanced FLD (Fisher linear discriminant) models (EFM), for indexing and retrieval of large image databases with applications to face recognition. The unifying theme of the new schemes is that of lowering the space dimension ("data compression") subject to increased fitness for the discrimination index.
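The Fisher linear discriminant step at the core of the EFM scheme follows directly from the scatter-matrix definitions; the small ridge term and the toy data below are illustrative assumptions, not the paper's:

```python
import numpy as np

def fisher_directions(X, y, n_components):
    """Fisher linear discriminant: directions maximising between-class
    over within-class scatter, via the eigenvectors of Sw^-1 Sb."""
    classes = np.unique(y)
    mean = X.mean(axis=0)
    Sw = np.zeros((X.shape[1], X.shape[1]))   # within-class scatter
    Sb = np.zeros_like(Sw)                    # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)
    Sw += 1e-6 * np.eye(len(Sw))              # small ridge for stability
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(vals.real)[::-1]
    return vecs.real[:, order[:n_components]]

# Toy data: 3 classes in 5-D, separated only along dimension 0.
rng = np.random.default_rng(4)
y = np.repeat([0, 1, 2], 20)
X = rng.normal(size=(60, 5))
X[:, 0] += 4 * y                              # class-discriminative direction
W = fisher_directions(X, y, n_components=2)
Z = X @ W                                     # discriminant projections
```

Lowering the dimension while keeping class separation high is exactly the "compression subject to discrimination fitness" trade-off the abstract describes.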

Proceedings ArticleDOI
01 Jun 2000
TL;DR: A new method based on symmetric shape-from-shading (SSFS) to develop a face recognition system that is robust to changes in illumination and utilizes the fact that all faces share a similar shape, making the direct computation of the prototype image from a given face image feasible.
Abstract: Sensitivity to variations in illumination is a fundamental and challenging problem in face recognition. In this paper, we describe a new method based on symmetric shape-from-shading (SSFS) to develop a face recognition system that is robust to changes in illumination. The basic idea of this approach is to use the SSFS algorithm as a tool to obtain a prototype image which is illumination-normalized. It has been shown that the SSFS algorithm has a unique point-wise solution, but it is still difficult to recover accurate shape information given a single real face image with complex shape and varying albedo. Instead, we utilize the fact that all faces share a similar shape, making the direct computation of the prototype image from a given face image feasible. Finally, to demonstrate the efficacy of our method, we have applied it to several publicly available face databases.

Proceedings ArticleDOI
26 Mar 2000
TL;DR: The findings show that the audio and video information can be combined using a rule-based system to improve the recognition rate.
Abstract: This paper describes the use of statistical techniques and hidden Markov models (HMM) in the recognition of emotions. The method aims to classify 6 basic emotions (anger, dislike, fear, happiness, sadness and surprise) from both facial expressions (video) and emotional speech (audio). The emotions of 2 human subjects were recorded and analyzed. The findings show that the audio and video information can be combined using a rule-based system to improve the recognition rate.

Journal ArticleDOI
TL;DR: The method exploits information which is complementary to gray level based approaches, enabling the fusion with those techniques, and is cheap and fast while offering a sufficient resolution for face recognition purposes.

Book
15 Jun 2000
TL;DR: 1. Early Vision, 2. From Local To Global Image Representation, and 3. The Problem of Visual Recognition.
Abstract: 1. Early Vision. 2. From Local to Global Image Representation. 3. The Problem of Visual Recognition. 4. Object Recognition. 5. Face Recognition. 6. Word Recognition. 7. Visual Attention. 8. Hemispatial Neglect. 9. Mental Imagery. 10. Visual Awareness.

Journal ArticleDOI
TL;DR: Two distinct patterns of development are found in the development of emotional facial recognition in a large sample of school‐aged children and it is suggested that these different profiles are a consequence of the very different cognitive abilities that they recruit.
Abstract: Recognition of the facial expressions of emotions is a critical communicative system early in development and continues to play an important role throughout adulthood. In the past, the results of developmental studies of emotional facial recognition have often conflicted. The present study was designed to examine the development of emotional facial recognition in a large sample of school-aged children (n = 120, ages 5-10 years). In particular, we investigate whether emotion categories, i.e., those based on the visual spatial parameters of facial expression, develop in a similar fashion to those that also recruit lexical knowledge of emotion terms. We have found two distinct patterns of development and we suggest that these different profiles are a consequence of the very different cognitive abilities that they recruit. Conclusion: Emotion cognition is a variegated domain which is differentially related to such areas of cognition as visuo-spatial and lexical semantic abilities.

Journal ArticleDOI
TL;DR: An extension of a face recognition system based on 2D DCT features and pseudo-2D hidden Markov models is capable of recognizing faces directly from JPEG-compressed image data; the results reported are the best on this database to date.

Journal ArticleDOI
TL;DR: It is argued that a fruitful direction for future research may lie in weighing information about facial features together with localized image features in order to provide a better mechanism for feature selection.

Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper proposed a subband approach to principal component analysis (PCA): a wavelet transform decomposes an image into different frequency subbands, and a midrange frequency subband is used for the PCA representation.
Abstract: Together with the growing interest in the development of human and computer interfaces and biometric identification, human face recognition has become an active research area since the early 1990s. Nowadays, principal component analysis (PCA) has been widely adopted as the most promising face recognition algorithm. Yet still, the traditional PCA approach has its limitations: poor discriminatory power and a large computational load. In view of these limitations, this article proposes a subband approach to using PCA: apply PCA on a wavelet subband. Traditionally, to represent the human face, PCA is performed on the whole facial image. In the proposed method, the wavelet transform is used to decompose an image into different frequency subbands, and a midrange frequency subband is used for the PCA representation. In comparison with the traditional use of PCA, the proposed method gives better recognition accuracy and discriminatory power; further, the proposed method reduces the computational load significantly when the image database is large, with more than 256 training images. This article details the design and implementation of the proposed method and presents the encouraging experimental results. © 2000 SPIE and IS&T. (S1017-9909(00)01702-5)
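The subband idea can be sketched with a hand-rolled one-level Haar transform. The paper selects a midrange frequency subband; the low-pass (LL) band is used below purely for brevity, and all sizes and data are illustrative:

```python
import numpy as np

def haar_subbands(img):
    """One level of a 2-D Haar wavelet transform, returning the four
    subbands (LL, LH, HL, HH), each a quarter the size of the input."""
    a = (img[0::2] + img[1::2]) / 2.0        # row low-pass
    d = (img[0::2] - img[1::2]) / 2.0        # row high-pass
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, LH, HL, HH

rng = np.random.default_rng(5)
faces = rng.normal(size=(20, 16, 16))        # stand-in 16x16 "face" images

# PCA on a chosen subband instead of the raw 256-pixel image:
# each 16x16 image shrinks to an 8x8 = 64-dimensional vector first.
subband = np.stack([haar_subbands(f)[0].ravel() for f in faces])
mean = subband.mean(axis=0)
U, s, Vt = np.linalg.svd(subband - mean, full_matrices=False)
features = (subband - mean) @ Vt[:10].T      # 10-D PCA representation
```

The computational saving the abstract claims comes from exactly this shrinkage: the covariance eigenproblem is solved in the subband's lower dimension rather than on full-resolution images.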