Journal Article

Deep face recognition: A survey

14 Mar 2021-Neurocomputing (Elsevier)-Vol. 429, pp 215-244
TL;DR: This paper provides a comprehensive review of recent developments in deep face recognition, covering algorithm designs, databases, protocols, and application scenes, as well as technical challenges and several promising directions.
About: This article was published in Neurocomputing on 2021-03-14 and is currently open access. It has received 353 citations to date. The article focuses on the topics: Deep learning & Feature extraction.
Citations
Reference Entry
15 Oct 2004

2,118 citations

Journal Article
TL;DR: This survey provides a comprehensive overview of a variety of object detection methods in a systematic manner, covering the one-stage and two-stage detectors, and lists the traditional and new applications.
Abstract: Object detection is one of the most important and challenging branches of computer vision. It has been widely applied in everyday life, for example in security monitoring and autonomous driving, with the purpose of locating instances of semantic objects of a certain class. With the rapid development of deep learning algorithms for detection tasks, the performance of object detectors has improved greatly. To give a thorough understanding of the current state of the object detection pipeline, this survey first analyzes the methods of existing typical detection models and describes the benchmark datasets. We then provide, as the main contribution, a comprehensive overview of a variety of object detection methods in a systematic manner, covering the one-stage and two-stage detectors. Moreover, we list the traditional and new applications. Some representative branches of object detection are analyzed as well. Finally, we discuss the architecture of exploiting these object detection methods to build an effective and efficient system and point out a set of development trends to better follow the state-of-the-art algorithms and further research.

749 citations

Journal Article
TL;DR: This article reviews deep neural network concepts in background subtraction, for novices and experts, in order to analyze the success of these methods and to suggest further directions.

278 citations

01 Jan 2006
TL;DR: It is concluded that the problem of age-progression on face recognition (FR) is not unique to the algorithm used in this work, and the efficacy of this algorithm is evaluated against the variables of gender and racial origin.
Abstract: This paper details MORPH, a longitudinal face database developed for researchers investigating all facets of adult age-progression, e.g. face modeling, photo-realistic animation, face recognition, etc. This database contributes to several active research areas, most notably face recognition, by providing: the largest set of publicly available longitudinal images; longitudinal spans from a few months to over twenty years; and the inclusion of key physical parameters that affect aging appearance. The direct contribution of this data corpus to face recognition is highlighted in the evaluation of a standard face recognition algorithm, which illustrates the impact that age-progression has on recognition rates. The efficacy of this algorithm is evaluated against the variables of gender and racial origin. This work further concludes that the problem of age-progression on face recognition (FR) is not unique to the algorithm used in this work.

139 citations

References
Journal Article
TL;DR: This survey provides a comprehensive review of established techniques and recent developments in HFR, and offers a detailed account of datasets and benchmarks commonly used for evaluation.

114 citations

Proceedings Article
07 Mar 2016
TL;DR: In this article, a bilinear CNN (B-CNN) was applied to the IARPA Janus Benchmark A (IJB-A) for face recognition.
Abstract: The recent explosive growth in convolutional neural network (CNN) research has produced a variety of new architectures for deep learning. One intriguing new architecture is the bilinear CNN (B-CNN), which has shown dramatic performance gains on certain fine-grained recognition problems [15]. We apply this new CNN to the challenging new face recognition benchmark, the IARPA Janus Benchmark A (IJB-A) [12]. It features faces from a large number of identities in challenging real-world conditions. Because the face images were not identified automatically using a computerized face detection system, it does not have the bias inherent in such a database. We demonstrate the performance of the B-CNN model beginning from an AlexNet-style network pre-trained on ImageNet. We then show results for fine-tuning using a moderate-sized and public external database, FaceScrub [17]. We also present results with additional fine-tuning on the limited training data provided by the protocol. In each case, the fine-tuned bilinear model shows substantial improvements over the standard CNN. Finally, we demonstrate how a standard CNN pre-trained on a large face database, the recently released VGG-Face model [20], can be converted into a B-CNN without any additional feature training. This B-CNN improves upon the CNN performance on the IJB-A benchmark, achieving 89.5% rank-1 recall.
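As a rough illustration of the bilinear pooling step that gives the B-CNN its name, the sketch below takes two sets of per-location feature vectors, sums their outer products over all locations, and then applies a signed square root followed by L2 normalization. The function name `bilinear_pool` and the list-of-lists input layout are assumptions made for this example; the abstract does not spell out these details, so this is a minimal sketch of the general technique and not the authors' implementation.

```python
import math


def bilinear_pool(feats_a, feats_b):
    """Pool two feature streams into one bilinear descriptor.

    feats_a: list of L vectors of dimension M (one per spatial location).
    feats_b: list of L vectors of dimension N (same locations).
    Returns a flat list of length M * N.
    """
    if not feats_a or len(feats_a) != len(feats_b):
        raise ValueError("both streams need the same, non-zero number of locations")

    m, n = len(feats_a[0]), len(feats_b[0])
    pooled = [0.0] * (m * n)

    # Sum the outer product of the two feature vectors over all locations.
    for fa, fb in zip(feats_a, feats_b):
        for i, x in enumerate(fa):
            for j, y in enumerate(fb):
                pooled[i * n + j] += x * y

    # Signed square root, which dampens very large entries.
    pooled = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]

    # L2 normalization so descriptors can be compared by dot product.
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm else pooled
```

In a real network the two streams would be convolutional feature maps and the pooled descriptor would feed a classifier; here plain Python lists stand in for those tensors.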

114 citations

Journal Article
TL;DR: This work proposes a face search system which combines a fast search procedure, coupled with a state-of-the-art commercial off-the-shelf (COTS) matcher, in a cascaded framework, and shows that the learned deep features provide complementary information over representations used in state-of-the-art face matchers.
Abstract: Given the prevalence of social media websites, one challenge facing computer vision researchers is to devise methods to search for persons of interest among the billions of shared photos on these websites. Despite significant progress in face recognition, searching a large collection of unconstrained face images remains a difficult problem. To address this challenge, we propose a face search system which combines a fast search procedure, coupled with a state-of-the-art commercial off the shelf (COTS) matcher, in a cascaded framework. Given a probe face, we first filter the large gallery of photos to find the top-k most similar faces using features learned by a convolutional neural network. The k retrieved candidates are re-ranked by combining similarities based on deep features and those output by the COTS matcher. We evaluate the proposed face search system on a gallery containing 80 million web-downloaded face images. Experimental results demonstrate that while the deep features perform worse than the COTS matcher on a mugshot dataset (93.7 percent versus 98.6 percent TAR@FAR of 0.01 percent), fusing the deep features with the COTS matcher improves the overall performance (99.5 percent TAR@FAR of 0.01 percent). This shows that the learned deep features provide complementary information over representations used in state-of-the-art face matchers. On the unconstrained face image benchmarks, the performance of the learned deep features is competitive with reported accuracies. LFW database: 98.20 percent accuracy under the standard protocol and 88.03 percent TAR@FAR of 0.1 percent under the BLUFR protocol; IJB-A benchmark: 51.0 percent TAR@FAR of 0.1 percent (verification), rank 1 retrieval of 82.2 percent (closed-set search), 61.5 percent FNIR@FAR of 1 percent (open-set search). The proposed face search system offers an excellent trade-off between accuracy and scalability on galleries with millions of images.
Additionally, in a face search experiment involving photos of the Tsarnaev brothers, convicted of the Boston Marathon bombing, the proposed cascade face search system could find the younger brother's (Dzhokhar Tsarnaev) photo at rank 1 in 1 second on a 5M gallery and at rank 8 in 7 seconds on an 80M gallery.
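The two-stage cascade described in this abstract can be sketched roughly as follows: a cheap first stage keeps the top-k gallery entries by deep-feature similarity, and a second stage re-ranks only those candidates with a fused score. Everything named here is an assumption made for illustration: `cascade_search`, the cosine similarity, the `second_matcher` callable standing in for the COTS matcher, and the weighted-sum fusion rule. The abstract does not say how the two similarities are combined.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def cascade_search(probe, gallery, second_matcher, k=3, weight=0.5):
    """Filter with deep features, then re-rank the survivors.

    probe: deep feature vector of the query face.
    gallery: dict mapping gallery id -> deep feature vector.
    second_matcher: hypothetical callable, gallery id -> similarity score
        (a stand-in for the slower, more accurate matcher).
    weight: assumed fusion weight given to the deep-feature similarity.
    Returns a list of (gallery id, fused score), best match first.
    """
    # Stage 1: fast filtering over the whole gallery.
    deep_scores = {gid: cosine(probe, feat) for gid, feat in gallery.items()}
    candidates = sorted(deep_scores, key=deep_scores.get, reverse=True)[:k]

    # Stage 2: run the expensive matcher on the k candidates only.
    fused = [
        (gid, weight * deep_scores[gid] + (1.0 - weight) * second_matcher(gid))
        for gid in candidates
    ]
    return sorted(fused, key=lambda item: item[1], reverse=True)
```

Because the second matcher only ever sees k candidates, its cost stays fixed as the gallery grows, which is the scalability argument the abstract makes. A gallery entry that the first stage filters out cannot be recovered later, however high its second-stage score would have been.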

113 citations

Proceedings Article
01 Jul 2017
TL;DR: This paper proposes an approach to extend the deep learning breakthrough for VIS face recognition to the NIR spectrum, without retraining the underlying deep models that see only VIS faces, and obtains state-of-the-art accuracy on the CASIA NIR-VIS v2.0 benchmark.
Abstract: Surveillance cameras today often capture NIR (near infrared) images in low-light environments. However, most face datasets accessible for training and verification are only collected in the VIS (visible light) spectrum. It remains a challenging problem to match NIR to VIS face images due to the different light spectrum. Recently, breakthroughs have been made for VIS face recognition by applying deep learning on a huge amount of labeled VIS face samples. The same deep learning approach cannot be simply applied to NIR face recognition for two main reasons: First, far fewer NIR face images are available for training than in the VIS spectrum. Second, face galleries to be matched are mostly available only in the VIS spectrum. In this paper, we propose an approach to extend the deep learning breakthrough for VIS face recognition to the NIR spectrum, without retraining the underlying deep models that see only VIS faces. Our approach consists of two core components, cross-spectral hallucination and low-rank embedding, to optimize respectively input and output of a VIS deep model for cross-spectral face recognition. Cross-spectral hallucination produces VIS faces from NIR images through a deep learning approach. Low-rank embedding restores a low-rank structure for deep face features across both the NIR and VIS spectra. We observe that it is often equally effective to perform hallucination on input NIR images or low-rank embedding on output deep features for a VIS deep model for cross-spectral recognition. When hallucination and low-rank embedding are deployed together, we observe significant further improvement and obtain state-of-the-art accuracy on the CASIA NIR-VIS v2.0 benchmark, without any need to re-train the recognition system.

112 citations

Proceedings Article
01 Jun 2016
TL;DR: In this paper, the authors evaluated the performance of deep learning based face representation under several conditions including the varying head pose angles, upper and lower face occlusion, changing illumination of different strengths, and misalignment due to erroneous facial feature localization.
Abstract: Deep learning based approaches have been dominating the face recognition field due to the significant performance improvement they have provided on the challenging wild datasets. These approaches have been extensively tested on such unconstrained datasets, for example Labeled Faces in the Wild and YouTube Faces. However, their capability to handle individual appearance variations caused by factors such as head pose, illumination, occlusion, and misalignment has not been thoroughly assessed until now. In this paper, we present a comprehensive study to evaluate the performance of deep learning based face representation under several conditions including the varying head pose angles, upper and lower face occlusion, changing illumination of different strengths, and misalignment due to erroneous facial feature localization. Two successful and publicly available deep learning models, namely VGG-Face and Lightened CNN, have been utilized to extract face representations. The obtained results show that although deep learning provides a powerful representation for face recognition, it can still benefit from preprocessing, for example, for pose and illumination normalization in order to achieve better performance under various conditions. Particularly, if these variations are not included in the dataset used to train the deep learning model, the role of preprocessing becomes more crucial. Experimental results also show that deep learning based representation is robust to misalignment and can tolerate facial feature localization errors up to 10% of the interocular distance.
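To make the "10% of the interocular distance" figure concrete, the following sketch expresses a landmark localization error as a fraction of the distance between the two eye centers and compares it with that threshold. The function names, the 2D pixel coordinates, and the use of a single landmark are assumptions for illustration; the abstract does not state how the error was aggregated across landmarks.

```python
import math


def normalized_landmark_error(predicted, truth, left_eye, right_eye):
    """Localization error as a fraction of the interocular distance.

    All arguments are (x, y) pixel coordinates.
    """
    interocular = math.hypot(right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])
    if interocular == 0:
        raise ValueError("eye centers must be distinct")
    error = math.hypot(predicted[0] - truth[0], predicted[1] - truth[1])
    return error / interocular


def within_tolerance(predicted, truth, left_eye, right_eye, tolerance=0.10):
    """True if the error is at most `tolerance` (default 10%, from the abstract)."""
    return normalized_landmark_error(predicted, truth, left_eye, right_eye) <= tolerance
```

With eye centers 40 pixels apart, for instance, a landmark that is off by 2 pixels falls inside the reported tolerance, while one that is off by 5 pixels falls outside it.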

110 citations