Proceedings ArticleDOI

Enhancing face recognition from video sequences using robust statistics

01 Sep 2005-pp 324-329
TL;DR: This work investigates a way of enhancing the performance of face recognition from video sequences by selecting only well-framed face images from those extracted from the video, based on robust statistics and, more precisely, a recently proposed robust high-dimensional data analysis method, RobPCA.
Abstract: The aim of this work is to investigate a way of enhancing the performance of face recognition from video sequences by selecting only well-framed face images from those extracted from video sequences. It is known that noisy face images (e.g. not well-centered, non-frontal poses...) significantly reduce the performance of face recognition methods and therefore need to be filtered out during both training and recognition. The proposed method is based on robust statistics and, more precisely, a recently proposed robust high-dimensional data analysis method, RobPCA. Experiments show that this filtering procedure improves the recognition rate by 10 to 20%.
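The filtering step described above lends itself to a short sketch. RobPCA is not available in common Python libraries, so the sketch below approximates it, under stated assumptions, with classical PCA for dimensionality reduction followed by MCD-based robust distances; the function name, the number of components and the chi-squared cutoff are illustrative choices, not the authors' settings.

```python
# Sketch: filter out poorly framed face images before training/recognition.
# Assumption: RobPCA is approximated by classical PCA followed by MCD-based
# robust distances (there is no RobPCA implementation in scikit-learn).
import numpy as np
from scipy.stats import chi2
from sklearn.decomposition import PCA
from sklearn.covariance import MinCovDet

def filter_face_frames(face_images, n_components=10, quantile=0.975):
    """face_images: (n_frames, n_pixels) array of vectorized face crops.
    Returns a boolean mask of frames kept as 'well-framed'."""
    # Project to a low-dimensional subspace first; MCD needs n >> p.
    scores = PCA(n_components=n_components).fit_transform(face_images)
    # Robust location/scatter of the scores (FAST-MCD under the hood).
    mcd = MinCovDet(random_state=0).fit(scores)
    d2 = mcd.mahalanobis(scores)            # squared robust distances
    cutoff = chi2.ppf(quantile, df=n_components)
    return d2 <= cutoff                     # True = keep the frame

# keep = filter_face_frames(X)   # X: stacked, vectorized face crops
# X_clean = X[keep]
```

Frames whose robust distance exceeds the cutoff would be treated as noisy (badly centered, non-frontal, etc.) and excluded from both training and recognition.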


Citations
Proceedings ArticleDOI
23 Jun 2008
TL;DR: This work addresses the problem of tracking and recognizing faces in real-world, noisy videos using a tracker that adaptively builds a target model reflecting the appearance changes typical of a video setting, and introduces visual constraints using a combination of generative and discriminative models in a particle filtering framework.
Abstract: We address the problem of tracking and recognizing faces in real-world, noisy videos. We track faces using a tracker that adaptively builds a target model reflecting changes in appearance, typical of a video setting. However, adaptive appearance trackers often suffer from drift, a gradual adaptation of the tracker to non-targets. To alleviate this problem, our tracker introduces visual constraints using a combination of generative and discriminative models in a particle filtering framework. The generative term conforms the particles to the space of generic face poses while the discriminative one ensures rejection of poorly aligned targets. This leads to a tracker that significantly improves robustness against abrupt appearance changes and occlusions, critical for the subsequent recognition phase. Identity of the tracked subject is established by fusing pose-discriminant and person-discriminant features over the duration of a video sequence. This leads to a robust video-based face recognizer with state-of-the-art recognition performance. We test the quality of tracking and face recognition on real-world noisy videos from YouTube as well as the standard Honda/UCSD database. Our approach produces successful face tracking results on over 80% of all videos without video or person-specific parameter tuning. The good tracking performance induces similarly high recognition rates: 100% on Honda/UCSD and over 70% on the YouTube set containing 35 celebrities in 1500 sequences.
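As a rough illustration of the weighting idea described in this abstract, the sketch below combines a generative term (log-likelihood under a generic face-pose model) with a discriminative term (an alignment score passed through a sigmoid) when weighting particles. The functions gen_loglik and disc_score and the mixing weight alpha are hypothetical placeholders, not the authors' learned models.

```python
import numpy as np

def combined_particle_weights(particles, gen_loglik, disc_score, alpha=1.0):
    """particles: candidate face crops proposed by the tracker.
    gen_loglik(crop) -> log-likelihood under a generic face-pose model (generative term).
    disc_score(crop) -> real-valued alignment score; a sigmoid maps it to (0, 1)
                        so poorly aligned candidates are down-weighted (discriminative term).
    Both scoring functions are placeholders for learned models."""
    logw = np.array([
        gen_loglik(p) + alpha * np.log(1.0 / (1.0 + np.exp(-disc_score(p))) + 1e-12)
        for p in particles
    ])
    logw -= logw.max()          # numerical stability before exponentiating
    w = np.exp(logw)
    return w / w.sum()          # normalized particle weights
```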

493 citations


Cites background from "Enhancing face recognition from vid..."

  • ...Heuristic temporal voting schemes such as [3, 15] aggregate data from key frames containing well-illuminated frontal poses....


Proceedings ArticleDOI
20 Jun 2011
TL;DR: An efficient patch-based face image quality assessment algorithm is proposed which quantifies the similarity of a face image to a probabilistic face model representing an ‘ideal’ face.
Abstract: In video based face recognition, face images are typically captured over multiple frames in uncontrolled conditions, where head pose, illumination, shadowing, motion blur and focus change over the sequence. Additionally, inaccuracies in face localisation can also introduce scale and alignment variations. Using all face images, including images of poor quality, can actually degrade face recognition performance. While one solution is to use only the ‘best’ images, current face selection techniques are incapable of simultaneously handling all of the abovementioned issues. We propose an efficient patch-based face image quality assessment algorithm which quantifies the similarity of a face image to a probabilistic face model, representing an ‘ideal’ face. Image characteristics that affect recognition are taken into account, including variations in geometric alignment (shift, rotation and scale), sharpness, head pose and cast shadows. Experiments on FERET and PIE datasets show that the proposed algorithm is able to identify images which are simultaneously the most frontal, aligned, sharp and well illuminated. Further experiments on a new video surveillance dataset (termed ChokePoint) show that the proposed method provides better face subsets than existing face selection techniques, leading to significant improvements in recognition accuracy.
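A minimal sketch of the patch-based scoring idea, assuming a simple per-patch Gaussian with diagonal covariance fitted on well-aligned frontal training faces as the ‘ideal’ face model; the patch size and grid are illustrative, and the authors' actual probabilistic model may differ.

```python
import numpy as np

def fit_patch_model(aligned_faces, patch=8):
    """aligned_faces: (n, H, W) well-aligned frontal training faces.
    Returns per-patch means and diagonal variances, a crude 'ideal face' model."""
    n, H, W = aligned_faces.shape
    means, variances = {}, {}
    for y in range(0, H - patch + 1, patch):
        for x in range(0, W - patch + 1, patch):
            block = aligned_faces[:, y:y+patch, x:x+patch].reshape(n, -1)
            means[(y, x)] = block.mean(axis=0)
            variances[(y, x)] = block.var(axis=0) + 1e-6   # avoid zero variance
    return means, variances

def face_quality(face, means, variances, patch=8):
    """Higher is better: sum of per-patch Gaussian log-likelihoods."""
    score = 0.0
    for (y, x), mu in means.items():
        v = variances[(y, x)]
        d = face[y:y+patch, x:x+patch].ravel() - mu
        score += -0.5 * np.sum(d * d / v + np.log(2 * np.pi * v))
    return score
```

Faces in a sequence could then be ranked by face_quality and only the top-scoring subset passed to the recognizer.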

314 citations


Cites methods from "Enhancing face recognition from vid..."

  • ...k-means clustering [13]) and statistical model approaches for outlier removal [6]....


Proceedings ArticleDOI
23 Jun 2013
TL;DR: This paper presents an end-to-end video face recognition system, addressing the difficult problem of identifying a video face track using a large dictionary of still face images of a few hundred people, while rejecting unknown individuals.
Abstract: This paper presents an end-to-end video face recognition system, addressing the difficult problem of identifying a video face track using a large dictionary of still face images of a few hundred people, while rejecting unknown individuals. A straightforward application of the popular l1-minimization for face recognition on a frame-by-frame basis is prohibitively expensive, so we propose a novel algorithm Mean Sequence SRC (MSSRC) that performs video face recognition using a joint optimization leveraging all of the available video data and the knowledge that the face track frames belong to the same individual. By adding a strict temporal constraint to the l1-minimization that forces individual frames in a face track to all reconstruct a single identity, we show the optimization reduces to a single minimization over the mean of the face track. We also introduce a new Movie Trailer Face Dataset collected from 101 movie trailers on YouTube. Finally, we show that our method matches or outperforms the state-of-the-art on three existing datasets (YouTube Celebrities, YouTube Faces, and Buffy) and our unconstrained Movie Trailer Face Dataset. More importantly, our method excels at rejecting unknown identities by at least 8% in average precision.
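The reduction described above (a single l1-regularized reconstruction over the mean of the face track) can be sketched as follows. The Lasso solver, the regularization weight and the residual-based rejection are illustrative stand-ins for the paper's l1-minimization and thresholding.

```python
import numpy as np
from sklearn.linear_model import Lasso

def classify_track(track_features, dictionary, labels, alpha=0.01):
    """track_features: (n_frames, d) features of one face track.
    dictionary: (d, n_atoms) matrix of still-image features, one column per image.
    labels: (n_atoms,) identity label of each dictionary column."""
    labels = np.asarray(labels)
    y = track_features.mean(axis=0)                 # single mean over the track
    lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
    lasso.fit(dictionary, y)                        # sparse coefficient vector
    x = lasso.coef_
    residuals = {}
    for c in np.unique(labels):
        xc = np.where(labels == c, x, 0.0)          # keep only class-c coefficients
        residuals[c] = np.linalg.norm(y - dictionary @ xc)
    best = min(residuals, key=residuals.get)
    return best, residuals[best]                    # a large residual can gate 'unknown'
```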

120 citations


Cites background from "Enhancing face recognition from vid..."

  • ...Due to the large variations in the data, key-frame selection is crucial in this paradigm [4]....


Patent
Tong Zhang1
11 Oct 2007
TL;DR: In this article, face regions (58) are detected in images and at least one respective parameter value (53) is extracted from each of the face regions, which are ranked based on the extracted parameter values (53).
Abstract: Face-based image clustering systems and methods are described. In one aspect, face regions (58) are detected in images (20). At least one respective parameter value (53) is extracted from each of the face regions (58). Ones of the face regions (58) associated with parameter values (53) satisfying a cluster seed predicate are classified as cluster seed face regions (38). The cluster seed face regions (38) are clustered into one or more clusters (44, 48). A respective face model (24) is built for each of the clusters (44, 48). The face models (24) are stored. In another aspect, face regions (58) are detected in images (20). At least one respective parameter value (53) is extracted from each of the face regions (58). The face regions (58) are ranked based on the extracted parameter values (53). The face regions (58) are clustered in rank order into one or more clusters (44, 48). Representations of ones of the clusters (44, 48) are rendered on a display (116).
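A schematic sketch of the ranking-and-seeding flow in this abstract: faces are ranked by an extracted quality parameter, faces satisfying a seed predicate start clusters, and the rest join the nearest cluster in rank order. The thresholds, the Euclidean distance and the centroid update are placeholders, not the patented method.

```python
import numpy as np

def rank_order_cluster(face_features, quality, seed_threshold=0.8, dist_threshold=0.5):
    """face_features: (n, d) descriptors; quality: (n,) extracted parameter values.
    Returns a cluster id per face; -1 means left unclustered (low quality, no match)."""
    order = np.argsort(-quality)                     # process best-quality faces first
    centroids, members = [], []
    assignment = np.full(len(quality), -1)
    for i in order:
        f = face_features[i]
        if centroids:
            dists = np.array([np.linalg.norm(f - c) for c in centroids])
            j = int(dists.argmin())
            if dists[j] < dist_threshold:            # close enough: join existing cluster
                members[j].append(i)
                centroids[j] = face_features[members[j]].mean(axis=0)
                assignment[i] = j
                continue
        if quality[i] >= seed_threshold:             # cluster seed predicate
            centroids.append(np.asarray(f, dtype=float).copy())
            members.append([i])
            assignment[i] = len(centroids) - 1
    return assignment
```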

115 citations

Journal ArticleDOI
TL;DR: A broad and deep review of recently proposed methods for overcoming the difficulties encountered in unconstrained settings is presented and connections between the ways in which humans and current algorithms recognize faces are drawn.
Abstract: Driven by key law enforcement and commercial applications, research on face recognition from video sources has intensified in recent years. The ensuing results have demonstrated that videos possess unique properties that allow both humans and automated systems to perform recognition accurately in difficult viewing conditions. However, significant research challenges remain as most video-based applications do not allow for controlled recordings. In this survey, we categorize the research in this area and present a broad and deep review of recently proposed methods for overcoming the difficulties encountered in unconstrained settings. We also draw connections between the ways in which humans and current algorithms recognize faces. An overview of the most popular and difficult publicly available face video databases is provided to complement these discussions. Finally, we cover key research challenges and opportunities that lie ahead for the field as a whole.

115 citations


Cites background from "Enhancing face recognition from vid..."

  • ...to detect outliers in video sequences that could cause recognition errors [79]. Potential outliers include face images with disruptive illumination effects, off-frontal head poses, poor alignment or any other property that causes them to deviate from a PCA-based face model....


References
Journal ArticleDOI
TL;DR: A near-real-time computer system that can locate and track a subject's head, and then recognize the person by comparing characteristics of the face to those of known individuals, and that is easy to implement using a neural network architecture.
Abstract: We have developed a near-real-time computer system that can locate and track a subject's head, and then recognize the person by comparing characteristics of the face to those of known individuals. The computational approach taken in this system is motivated by both physiology and information theory, as well as by the practical requirements of near-real-time performance and accuracy. Our approach treats the face recognition problem as an intrinsically two-dimensional (2-D) recognition problem rather than requiring recovery of three-dimensional geometry, taking advantage of the fact that faces are normally upright and thus may be described by a small set of 2-D characteristic views. The system functions by projecting face images onto a feature space that spans the significant variations among known face images. The significant features are known as "eigenfaces," because they are the eigenvectors (principal components) of the set of faces; they do not necessarily correspond to features such as eyes, ears, and noses. The projection operation characterizes an individual face by a weighted sum of the eigenface features, and so to recognize a particular face it is necessary only to compare these weights to those of known individuals. Some particular advantages of our approach are that it provides for the ability to learn and later recognize new faces in an unsupervised manner, and that it is easy to implement using a neural network architecture.
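A minimal eigenfaces sketch following this description: learn the principal components of vectorized training faces, describe each face by its projection weights, and recognize a probe by the nearest set of weights. scikit-learn's PCA stands in for the paper's formulation; the component count and the distance-based 'unknown' gate are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA

class Eigenfaces:
    def __init__(self, n_components=50):
        self.pca = PCA(n_components=n_components)

    def fit(self, faces, labels):
        """faces: (n, n_pixels) vectorized training faces; labels: (n,) identities."""
        self.weights = self.pca.fit_transform(faces)   # eigenface coefficients
        self.labels = np.asarray(labels)
        return self

    def predict(self, face):
        """Project a probe face and return the identity of the nearest training face."""
        w = self.pca.transform(face.reshape(1, -1))
        dists = np.linalg.norm(self.weights - w, axis=1)
        return self.labels[dists.argmin()], dists.min()  # distance can gate 'unknown'
```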

14,562 citations


"Enhancing face recognition from vid..." refers methods in this paper

  • ...In this evaluation, we have used two face recognition techniques, PCA (eigenfaces [10]) and LDA2D [11]....


  • ...To analyze this problem, we will focus first on the well-known method of eigenfaces [10] in the context of face recognition from still images....


  • ...We will illustrate the introduced concepts of our approach and show its effectiveness using two face databases and two methods for face recognition (PCA [10] and LDA2D [11])....


Journal ArticleDOI
TL;DR: A face recognition algorithm is developed which is insensitive to large variation in lighting direction and facial expression; based on Fisher's linear discriminant, it produces well-separated classes in a low-dimensional subspace, even under severe variations in lighting and facial expressions.
Abstract: We develop a face recognition algorithm which is insensitive to large variation in lighting direction and facial expression. Taking a pattern classification approach, we consider each pixel in an image as a coordinate in a high-dimensional space. We take advantage of the observation that the images of a particular face, under varying illumination but fixed pose, lie in a 3D linear subspace of the high dimensional image space-if the face is a Lambertian surface without shadowing. However, since faces are not truly Lambertian surfaces and do indeed produce self-shadowing, images will deviate from this linear subspace. Rather than explicitly modeling this deviation, we linearly project the image into a subspace in a manner which discounts those regions of the face with large deviation. Our projection method is based on Fisher's linear discriminant and produces well separated classes in a low-dimensional subspace, even under severe variation in lighting and facial expressions. The eigenface technique, another method based on linearly projecting the image space to a low dimensional subspace, has similar computational requirements. Yet, extensive experimental results demonstrate that the proposed "Fisherface" method has error rates that are lower than those of the eigenface technique for tests on the Harvard and Yale face databases.
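A sketch of the Fisherface pipeline as it is commonly implemented: PCA first (to avoid a singular within-class scatter matrix), then Fisher's linear discriminant, then nearest-neighbour matching in the discriminant subspace. The component counts below are illustrative assumptions.

```python
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

def make_fisherfaces(n_classes, n_pca=100):
    # The classic recipe keeps at most n_samples - n_classes PCA dimensions;
    # a fixed n_pca is used here for simplicity.
    return make_pipeline(
        PCA(n_components=n_pca),
        LinearDiscriminantAnalysis(n_components=n_classes - 1),
        KNeighborsClassifier(n_neighbors=1),
    )

# model = make_fisherfaces(n_classes=len(set(y_train)))
# model.fit(X_train, y_train)     # X_*: (n, n_pixels) vectorized faces
# y_pred = model.predict(X_test)
```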

11,674 citations


"Enhancing face recognition from vid..." refers methods in this paper

  • ...LDA2D is a recently proposed method that has the same principle as the well-known LDA method (referred to as Fisherfaces in [2])....


Journal ArticleDOI
TL;DR: In this paper, the authors provide an up-to-date critical survey of still- and video-based face recognition research and offer some insights into the studies of machine recognition of faces.
Abstract: As one of the most successful applications of image analysis and understanding, face recognition has recently received significant attention, especially during the past several years. At least two reasons account for this trend: the first is the wide range of commercial and law enforcement applications, and the second is the availability of feasible technologies after 30 years of research. Even though current machine recognition systems have reached a certain level of maturity, their success is limited by the conditions imposed by many real applications. For example, recognition of face images acquired in an outdoor environment with changes in illumination and/or pose remains a largely unsolved problem. In other words, current systems are still far away from the capability of the human perception system.This paper provides an up-to-date critical survey of still- and video-based face recognition research. There are two underlying motivations for us to write this survey paper: the first is to provide an up-to-date review of the existing literature, and the second is to offer some insights into the studies of machine recognition of faces. To provide a comprehensive survey, we not only categorize existing recognition techniques but also present detailed descriptions of representative methods within each category. In addition, relevant topics such as psychophysical studies, system evaluation, and issues of illumination and pose variation are covered.

6,384 citations

Book ChapterDOI
15 Apr 1996
TL;DR: A face recognition algorithm insensitive to gross variation in lighting direction and facial expression is developed; the proposed “Fisherface” method has error rates that are significantly lower than those of the Eigenface technique when tested on the same database.
Abstract: We develop a face recognition algorithm which is insensitive to gross variation in lighting direction and facial expression. Taking a pattern classification approach, we consider each pixel in an image as a coordinate in a high-dimensional space. We take advantage of the observation that the images of a particular face under varying illumination direction lie in a 3-D linear subspace of the high dimensional feature space — if the face is a Lambertian surface without self-shadowing. However, since faces are not truly Lambertian surfaces and do indeed produce self-shadowing, images will deviate from this linear subspace. Rather than explicitly modeling this deviation, we project the image into a subspace in a manner which discounts those regions of the face with large deviation. Our projection method is based on Fisher's Linear Discriminant and produces well separated classes in a low-dimensional subspace even under severe variation in lighting and facial expressions. The Eigenface technique, another method based on linearly projecting the image space to a low dimensional subspace, has similar computational requirements. Yet, extensive experimental results demonstrate that the proposed “Fisherface” method has error rates that are significantly lower than those of the Eigenface technique when tested on the same database.

2,428 citations

Journal ArticleDOI
TL;DR: For small datasets, FAST-MCD typically finds the exact MCD, whereas for larger datasets it gives more accurate results than existing algorithms and is faster by orders of magnitude.
Abstract: The minimum covariance determinant (MCD) method of Rousseeuw is a highly robust estimator of multivariate location and scatter. Its objective is to find h observations (out of n) whose covariance matrix has the lowest determinant. Until now, applications of the MCD were hampered by the computation time of existing algorithms, which were limited to a few hundred objects in a few dimensions. We discuss two important applications of larger size, one about a production process at Philips with n = 677 objects and p = 9 variables, and a dataset from astronomy with n = 137,256 objects and p = 27 variables. To deal with such problems we have developed a new algorithm for the MCD, called FAST-MCD. The basic ideas are an inequality involving order statistics and determinants, and techniques which we call “selective iteration” and “nested extensions.” For small datasets, FAST-MCD typically finds the exact MCD, whereas for larger datasets it gives more accurate results than existing algorithms and is faster by orders of magnitude.
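scikit-learn's MinCovDet implements the FastMCD algorithm, so the estimator described above can be exercised directly; the contaminated toy data and the ad hoc distance cutoff below are purely illustrative.

```python
import numpy as np
from sklearn.covariance import MinCovDet, EmpiricalCovariance

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))                      # clean data
X[:50] += 8.0                                      # 10% gross outliers

mcd = MinCovDet(random_state=0).fit(X)             # FAST-MCD location/scatter estimate
emp = EmpiricalCovariance().fit(X)                 # classical, non-robust estimate

print("robust location:   ", np.round(mcd.location_, 2))
print("classical location:", np.round(emp.location_, 2))   # pulled toward the outliers

robust_d2 = mcd.mahalanobis(X)                     # squared robust distances
print("flagged outliers:", int((robust_d2 > 20).sum()))    # ad hoc cutoff for illustration
```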

2,073 citations


"Enhancing face recognition from vid..." refers methods in this paper

  • ...To find these vectors, a FAST-MCD algorithm [8] is used....
