Journal ArticleDOI

On Recognizing Faces in Videos Using Clustering-Based Re-Ranking and Fusion

TL;DR: A video-based face recognition algorithm that computes a discriminative video signature as an ordered list of still face images from a large dictionary; the signature embeds diverse intra-personal variations and facilitates matching two videos with large variations.
Abstract: Video-based face recognition has gained significant attention owing to its widespread applications, the large intra-personal variations available in video, and the limited information content of still images. Unlike still face images, videos provide abundant information that can be leveraged to address variations in pose, illumination, and expression as well as to enhance face recognition performance. This paper presents a video-based face recognition algorithm that computes a discriminative video signature as an ordered list of still face images from a large dictionary. A three-stage approach is proposed for optimizing ranked lists across multiple video frames and fusing them into a single composite ordered list to compute the video signature. This signature embeds diverse intra-personal variations and facilitates matching two videos with large variations. For matching two videos, a discounted cumulative gain measure is utilized, which uses the ranking of images in the video signature as well as the usefulness of images in characterizing the individual in the video. The efficacy of the proposed algorithm is demonstrated on the YouTube Faces database and the MBGC v2 video challenge database, which comprise different types of video-based face recognition challenges such as matching still face images with videos and matching videos with videos. Performance comparison with the benchmark results on both databases and with a commercial face recognition system shows the efficiency of the proposed algorithm for video-based face recognition.
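The pipeline described in the abstract (per-frame ranked lists fused into one ordered signature, then compared with a discounted cumulative gain measure) can be illustrated with a minimal sketch. This is not the paper's method: the Borda-count fusion stands in for the three-stage clustering-based re-ranking, and the relevance definition (reciprocal rank in the other signature) is an assumption made only for illustration. All function names are hypothetical.

```python
import math
from collections import defaultdict


def borda_fuse(ranked_lists, top_k=None):
    """Fuse per-frame ranked lists of dictionary image IDs into one ordered list.

    Assumption: plain Borda count, used here as a stand-in for the paper's
    clustering-based re-ranking and fusion, which the abstract does not detail.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        n = len(ranking)
        for position, image_id in enumerate(ranking):
            scores[image_id] += n - position  # top of a list earns the most points
    # Sort by descending score; break ties by ID so the result is deterministic.
    fused = sorted(scores, key=lambda image_id: (-scores[image_id], image_id))
    return fused if top_k is None else fused[:top_k]


def dcg(relevances):
    """Discounted cumulative gain: rel_i / log2(i + 1) summed over ranks i = 1..n."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def signature_similarity(sig_a, sig_b):
    """Normalized DCG-style similarity between two video signatures.

    Assumption: an image in sig_a is "useful" in proportion to how highly it
    also ranks in sig_b (relevance = 1 / rank in sig_b, 0 if absent).
    """
    rank_in_b = {image_id: r for r, image_id in enumerate(sig_b, start=1)}
    relevances = [1.0 / rank_in_b[i] if i in rank_in_b else 0.0 for i in sig_a]
    ideal = dcg([1.0 / r for r in range(1, len(sig_a) + 1)])
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Three frames of one video, each ranking the same small dictionary.
frames = [["a", "b", "c"], ["b", "a", "c"], ["a", "c", "b"]]
signature = borda_fuse(frames)
```

Under these assumptions, two identical signatures score 1.0 and two signatures with no dictionary images in common score 0.0; the logarithmic discount makes agreement near the top of the lists count more than agreement near the bottom.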
Citations
Journal ArticleDOI
TL;DR: This paper presents a comprehensive review of techniques that incorporate ancillary information in the biometric recognition pipeline and gives an overview of the role of information fusion in biometrics.

151 citations

Journal ArticleDOI
TL;DR: Several systems and architectures for combining biometric systems, both unimodal and multimodal, are reviewed and classified according to a given taxonomy, and a case study for the experimental evaluation of methods for biometric fusion at score level is presented.

123 citations


Cites background or methods from "On Recognizing Faces in Videos Usin..."

  • ...Sometimes decision level fusion is based on ranking: a method for face recognition from video based on ranked list aggregation is proposed in [43]....

  • ...[43] 2014 Decision level fusion Video face Faces recognition from videos using clustering based reranking and fusion...

Journal ArticleDOI
TL;DR: The experimental results indicate that the proposed algorithm achieves high face recognition accuracy on RGB-D images obtained using Kinect compared with existing 2D and 3D approaches.
Abstract: Face recognition algorithms generally utilize 2D images for feature extraction and matching. To achieve higher resilience toward covariates, such as expression, illumination, and pose, 3D face recognition algorithms are developed. While it is challenging to use specialized 3D sensors due to high cost, RGB-D images can be captured by low-cost sensors such as Kinect. This research introduces a novel face recognition algorithm using RGB-D images. The proposed algorithm computes a descriptor based on the entropy of RGB-D faces along with the saliency feature obtained from a 2D face. Geometric facial attributes are also extracted from the depth image and face recognition is performed by fusing both the descriptor and attribute match scores. The experimental results indicate that the proposed algorithm achieves high face recognition accuracy on RGB-D images obtained using Kinect compared with existing 2D and 3D approaches.

84 citations

Proceedings ArticleDOI
TL;DR: A memorability based frame selection algorithm is presented that enables automatic selection of memorable frames for facial feature extraction and matching and achieves state-of-the-art performance at low false accept rates.
Abstract: Videos contain an ample amount of information in the form of frames that can be utilized for feature extraction and matching. However, not all frames contain face images that are "memorable" and useful. Therefore, utilizing all the frames available in a video for recognition does not necessarily improve performance but significantly increases computation time. In this research, we present a memorability-based frame selection algorithm that enables automatic selection of memorable frames for facial feature extraction and matching. A deep learning algorithm is then proposed that utilizes a stack of denoising autoencoders and deep Boltzmann machines to perform face recognition using the most memorable frames. The proposed algorithm, termed MDLFace, is evaluated on two publicly available video face databases, YouTube Faces and Point and Shoot Challenge. The results show that the proposed algorithm achieves state-of-the-art performance at low false accept rates.

63 citations


Cites methods from "On Recognizing Faces in Videos Usin..."

  • ...[13] have applied clustering based re-ranking and fusion to obtain 80....

References
Book
01 Jan 2008
TL;DR: In this article, the authors present an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections.
Abstract: Class-tested and coherent, this groundbreaking new textbook teaches web-era information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. Written from a computer science perspective by three leading experts in the field, it gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Although originally designed as the primary text for a graduate or advanced undergraduate course in information retrieval, the book will also create a buzz for researchers and professionals alike.

11,804 citations

Journal ArticleDOI
TL;DR: A face recognition algorithm which is insensitive to large variation in lighting direction and facial expression is developed, based on Fisher's linear discriminant and produces well separated classes in a low-dimensional subspace, even under severe variations in lighting and facial expressions.
Abstract: We develop a face recognition algorithm which is insensitive to large variation in lighting direction and facial expression. Taking a pattern classification approach, we consider each pixel in an image as a coordinate in a high-dimensional space. We take advantage of the observation that the images of a particular face, under varying illumination but fixed pose, lie in a 3D linear subspace of the high-dimensional image space, provided the face is a Lambertian surface without shadowing. However, since faces are not truly Lambertian surfaces and do indeed produce self-shadowing, images will deviate from this linear subspace. Rather than explicitly modeling this deviation, we linearly project the image into a subspace in a manner which discounts those regions of the face with large deviation. Our projection method is based on Fisher's linear discriminant and produces well separated classes in a low-dimensional subspace, even under severe variation in lighting and facial expressions. The eigenface technique, another method based on linearly projecting the image space to a low-dimensional subspace, has similar computational requirements. Yet, extensive experimental results demonstrate that the proposed "Fisherface" method has error rates that are lower than those of the eigenface technique for tests on the Harvard and Yale face databases.

11,674 citations

Journal ArticleDOI
TL;DR: In this paper, the authors provide an up-to-date critical survey of still-and video-based face recognition research, and provide some insights into the studies of machine recognition of faces.
Abstract: As one of the most successful applications of image analysis and understanding, face recognition has recently received significant attention, especially during the past several years. At least two reasons account for this trend: the first is the wide range of commercial and law enforcement applications, and the second is the availability of feasible technologies after 30 years of research. Even though current machine recognition systems have reached a certain level of maturity, their success is limited by the conditions imposed by many real applications. For example, recognition of face images acquired in an outdoor environment with changes in illumination and/or pose remains a largely unsolved problem. In other words, current systems are still far away from the capability of the human perception system. This paper provides an up-to-date critical survey of still- and video-based face recognition research. There are two underlying motivations for us to write this survey paper: the first is to provide an up-to-date review of the existing literature, and the second is to offer some insights into the studies of machine recognition of faces. To provide a comprehensive survey, we not only categorize existing recognition techniques but also present detailed descriptions of representative methods within each category. In addition, relevant topics such as psychophysical studies, system evaluation, and issues of illumination and pose variation are covered.

6,384 citations

Book
01 Feb 1975

6,068 citations

Journal ArticleDOI
TL;DR: A common theoretical framework for combining classifiers which use distinct pattern representations is developed and it is shown that many existing schemes can be considered as special cases of compound classification where all the pattern representations are used jointly to make a decision.
Abstract: We develop a common theoretical framework for combining classifiers which use distinct pattern representations and show that many existing schemes can be considered as special cases of compound classification where all the pattern representations are used jointly to make a decision. An experimental comparison of various classifier combination schemes demonstrates that the combination rule developed under the most restrictive assumptions, the sum rule, outperforms other classifier combination schemes. A sensitivity analysis of the various schemes to estimation errors is carried out to show that this finding can be justified theoretically.
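To make the sum rule mentioned in this abstract concrete, here is a minimal sketch that compares it with the product rule on hand-picked numbers. It assumes equal class priors, so each rule reduces to picking the class with the largest sum (or product) of per-classifier posteriors; the function names and the example posteriors are illustrative assumptions, not data from the paper.

```python
import math


def sum_rule(posteriors):
    """Pick the class with the largest sum of posteriors across classifiers.

    posteriors[k][c] is classifier k's posterior estimate for class c.
    Assumption: equal class priors, so the prior term can be dropped.
    """
    n_classes = len(posteriors[0])
    totals = [sum(p[c] for p in posteriors) for c in range(n_classes)]
    return max(range(n_classes), key=totals.__getitem__), totals


def product_rule(posteriors):
    """Pick the class with the largest product of posteriors across classifiers."""
    n_classes = len(posteriors[0])
    totals = [math.prod(p[c] for p in posteriors) for c in range(n_classes)]
    return max(range(n_classes), key=totals.__getitem__), totals


# Two classifiers favor class 0; a third badly underestimates it.
estimates = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.01, 0.99],
]
sum_choice, sum_totals = sum_rule(estimates)
product_choice, product_totals = product_rule(estimates)
```

With these numbers the sum rule still selects class 0, while the product rule flips to class 1 because a single near-zero estimate dominates the product. This is only a toy illustration of why a sum can be less sensitive to one classifier's estimation error; it does not reproduce the paper's experiments or analysis.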

5,670 citations


"On Recognizing Faces in Videos Usin..." refers methods in this paper

  • ...types of fusion methods proposed in the literature [22], [35] such as sensor level, feature level, score level, and decision level fusion....
