Proceedings ArticleDOI

Disguised Faces in the Wild 2019

01 Oct 2019

TL;DR: The outcome of the Disguised Faces in the Wild 2019 competition is summarized in terms of the dataset used for evaluation, a brief review of the algorithms employed by the participants for this task, and the results obtained.

Abstract: Disguised face recognition has widespread applicability in scenarios such as law enforcement, surveillance, and access control. Disguise accessories such as sunglasses, masks, scarves, or make-up modify or occlude different facial regions, which makes face recognition a challenging task. In order to understand and benchmark the state-of-the-art in face recognition under disguise variations, the Disguised Faces in the Wild 2019 (DFW2019) competition was organized. This paper summarizes the outcome of the competition in terms of the dataset used for evaluation, a brief review of the algorithms employed by the participants, and the results obtained. The DFW2019 dataset has been released with four evaluation protocols and baseline results obtained from two deep learning-based state-of-the-art face recognition models. The dataset has also been analyzed with respect to three degrees of difficulty: (i) easy, (ii) medium, and (iii) hard. It was released as part of the International Workshop on Disguised Faces in the Wild at the International Conference on Computer Vision (ICCV), 2019.
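The evaluation protocols report face verification performance as a True Acceptance Rate (TAR) at a fixed False Acceptance Rate (FAR). As a rough sketch of how such a metric is computed from raw match scores (the helper name `tar_at_far` and the toy scores are illustrative, not the official evaluation code):

```python
import numpy as np

def tar_at_far(genuine_scores, impostor_scores, target_far):
    """True Acceptance Rate at a fixed False Acceptance Rate.

    The decision threshold is taken as the (1 - target_far) quantile of
    the impostor score distribution, so that approximately a target_far
    fraction of impostor pairs is (wrongly) accepted.
    """
    threshold = np.quantile(impostor_scores, 1.0 - target_far)
    return float(np.mean(np.asarray(genuine_scores) >= threshold))

# Toy similarity scores (higher = more similar); purely illustrative.
genuine = [0.9, 0.8, 0.7, 0.2]
impostor = [0.1, 0.2, 0.3, 0.6]
print(tar_at_far(genuine, impostor, target_far=0.25))  # -> 0.75
```

In practice the scores come from millions of pairs, and the FAR points of interest (e.g. 0.1% or 0.01%) are far stricter than in this toy example.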




Citations
Journal ArticleDOI
TL;DR: A comprehensive review of the recent developments on deep face recognition can be found in this paper, covering broad topics on algorithm designs, databases, protocols, and application scenes, as well as the technical challenges and several promising directions.
Abstract: Deep learning applies multiple processing layers to learn representations of data with multiple levels of feature extraction. This emerging technique has reshaped the research landscape of face recognition (FR) since 2014, launched by the breakthroughs of DeepFace and DeepID. Since then, deep learning techniques, characterized by hierarchical architectures that stitch pixels together into an invariant face representation, have dramatically improved the state-of-the-art performance and fostered successful real-world applications. In this survey, we provide a comprehensive review of the recent developments on deep FR, covering broad topics on algorithm designs, databases, protocols, and application scenes. First, we summarize the different network architectures and loss functions proposed in the rapid evolution of deep FR methods. Second, the related face processing methods are categorized into two classes: “one-to-many augmentation” and “many-to-one normalization”. Then, we summarize and compare the commonly used databases for both model training and evaluation. Third, we review miscellaneous scenes in deep FR, such as cross-factor, heterogeneous, multiple-media, and industrial scenes. Finally, the technical challenges and several promising directions are highlighted.

169 citations

Journal ArticleDOI
TL;DR: Various applications and opportunities of SM multimodal data, latest advancements, current challenges, and future directions for the crisis informatics and other related research fields are highlighted.
Abstract: People increasingly use Social Media (SM) platforms such as Twitter and Facebook during disasters and emergencies to post situational updates including reports of injured or dead people, infrastructure damage, requests of urgent needs, and the like. Information on SM comes in many forms, such as textual messages, images, and videos. Several studies have shown the utility of SM information for disaster response and management, which encouraged humanitarian organizations to start incorporating SM data sources into their workflows. However, several challenges prevent these organizations from using SM data for response efforts. These challenges include near-real-time information processing, information overload, information extraction, summarization, and verification of both textual and visual content. We highlight various applications and opportunities of SM multimodal data, latest advancements, current challenges, and future directions for the crisis informatics and other related research fields.

29 citations

Journal ArticleDOI
03 Apr 2020
TL;DR: This paper summarizes the different ways in which the robustness of a face recognition algorithm can be challenged, which can severely affect its intended operation.
Abstract: Face recognition algorithms have demonstrated very high recognition performance, suggesting suitability for real-world applications. Despite the enhanced accuracies, the robustness of these algorithms against attacks and bias has been challenged. This paper summarizes different ways in which the robustness of a face recognition algorithm is challenged, which can severely affect its intended working. Different types of attacks, such as physical presentation attacks, disguise/make-up, digital adversarial attacks, and morphing/tampering using GANs, are discussed. We also present a discussion on the effect of bias on face recognition models and showcase that factors such as age and gender variations affect the performance of modern algorithms. The paper also presents the potential reasons for these challenges and some future research directions for increasing the robustness of face recognition models.

28 citations


Cites background or methods from "Disguised Faces in the Wild 2019"


  • ...In 2018, the Disguised Faces in the Wild (DFW) dataset (Singh et al. 2019b) was released as part of the International Workshop on DFW held in conjunction with CVPR2018....


  • ...Recently, the DFW2019 competition (Singh et al. 2019a) has also included a protocol for recognizing images under plastic-surgery variations, where deep learning-based baseline algorithms show around 50% verification accuracy at 0.01% False Acceptance Rate....



  • ...While the top-performing teams in the competition demonstrated high verification performance at higher False Acceptance Rates (Deng and Zafeiriou 2019; Singh et al. 2019a), analysis of the submissions demonstrates low performance (less than 10% True Acceptance Rate) at 0% False Acceptance Rate; a metric often used in stricter settings such as access control in highly secure locations....

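The strict operating point discussed in these excerpts can be made concrete: at 0% FAR the threshold must sit above the highest impostor score, so a single confident impostor match can drag the TAR down sharply. A minimal illustration (toy scores and a hypothetical helper name):

```python
import numpy as np

def tar_at_zero_far(genuine_scores, impostor_scores):
    """TAR when no impostor pair may be accepted: the threshold is
    pushed just above the maximum impostor score."""
    threshold = max(impostor_scores)
    return float(np.mean(np.asarray(genuine_scores) > threshold))

genuine = [0.95, 0.90, 0.85, 0.80, 0.40]
# One hard impostor pair scoring 0.88 forces a high threshold...
impostor = [0.10, 0.15, 0.20, 0.88]
print(tar_at_zero_far(genuine, impostor))  # -> 0.4: only 2 of 5 genuine pairs pass
```

This is why systems that look strong at FAR = 1% can collapse at 0% FAR, the regime relevant to access control in highly secure locations.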

Proceedings ArticleDOI
01 Oct 2019
TL;DR: By using the authors' RetinaFace for face detection and alignment and ArcFace for face feature embedding, this work achieves state-of-the-art performance on the DFW2019 challenge.
Abstract: Even though deep face recognition has been extensively explored and remarkable advances have been achieved on large-scale in-the-wild datasets, disguised face recognition receives much less attention. Face feature embedding targeting intra-class compactness and inter-class discrepancy is very challenging, as high intra-class diversity and inter-class similarity are common in disguised face recognition datasets. In this report, we give the technical details of our submission to the DFW2019 challenge. By using our RetinaFace for face detection and alignment and ArcFace for face feature embedding, we achieve state-of-the-art performance on the DFW2019 challenge.
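Once detection, alignment, and embedding are done, verification reduces to comparing embeddings by cosine similarity against a tuned threshold. A sketch of that final comparison step (random vectors stand in for real ArcFace embeddings; the threshold value is illustrative):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(emb1, emb2, threshold=0.3):
    """Declare 'same identity' when similarity clears the threshold.
    In practice the threshold is tuned on a validation set for a
    target False Acceptance Rate."""
    return cosine_similarity(emb1, emb2) >= threshold

rng = np.random.default_rng(0)
e1 = rng.normal(size=512)              # stand-in for a 512-D face embedding
e2 = e1 + 0.1 * rng.normal(size=512)   # a slightly perturbed copy of e1
print(verify(e1, e2))  # a near-duplicate embedding verifies as a match
```

The disguise setting makes this hard precisely because genuine pairs can have low similarity (obfuscation) and impostor pairs high similarity (impersonation).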

9 citations


Cites methods from "Disguised Faces in the Wild 2019"

  • ...[Truncated excerpt of a results table: methods (first row ResNet-50 [23]) compared on the Impersonation, Obfuscation, Plastic Surgery, and Overall protocols at FARs of 1e-4 through 1e-2.]...


  • ...On the Impersonation track, our solution is worse than the baseline method (LightCNN-29v2 [23]) provided by the organiser....


Journal ArticleDOI
TL;DR: In this article, a comprehensive survey of works related to the topic of makeup presentation attack detection is provided, along with a critical discussion, and the vulnerability of a commercial off-the-shelf and an open-source face recognition system against makeup presentation attacks is assessed.
Abstract: The application of facial cosmetics may cause substantial alterations in the facial appearance, which can degrade the performance of facial biometrics systems. Additionally, it was recently demonstrated that makeup can be abused to launch so-called makeup presentation attacks. More precisely, an attacker might apply heavy makeup to obtain the facial appearance of a target subject with the aim of impersonation or to conceal their own identity. We provide a comprehensive survey of works related to the topic of makeup presentation attack detection, along with a critical discussion. Subsequently, we assess the vulnerability of a commercial off-the-shelf and an open-source face recognition system against makeup presentation attacks. Specifically, we focus on makeup presentation attacks with the aim of impersonation employing the publicly available Makeup Induced Face Spoofing (MIFS) and Disguised Faces in the Wild (DFW) databases. It is shown that makeup presentation attacks might seriously impact the security of face recognition systems. Further, we propose different image pair-based, i.e. differential, attack detection schemes which analyse differences in feature representations obtained from potential makeup presentation attacks and corresponding target face images. The proposed detection systems employ various types of feature extractors including texture descriptors, facial landmarks, and deep (face) representations. To distinguish makeup presentation attacks from genuine, i.e. bona fide presentations, machine learning-based classifiers are used. The classifiers are trained with a large number of synthetically generated makeup presentation attacks utilising a generative adversarial network for facial makeup transfer in conjunction with image warping. Experimental evaluations conducted using the MIFS database and a subset of the DFW database reveal that deep face representations achieve competitive detection equal error rates of 0.7% and 1.8%, respectively.
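The differential detection idea described above compares features of a suspected attack image with those of the corresponding target image and classifies the difference vector. A minimal sketch with a hand-rolled logistic-regression-style scorer (the 4-D "features", weights, and function name are all illustrative, not the paper's actual models):

```python
import numpy as np

def differential_score(probe_feat, target_feat, w, b):
    """Score a (probe, target) pair as a makeup presentation attack.

    The classifier operates on the element-wise difference of the two
    feature vectors; higher scores indicate 'attack'.
    """
    diff = np.asarray(probe_feat) - np.asarray(target_feat)
    return 1.0 / (1.0 + np.exp(-(w @ diff + b)))  # sigmoid of a linear score

# Toy 4-D 'deep features' and a toy trained weight vector.
w = np.array([1.0, -0.5, 0.8, 0.2])
b = -0.1
bona_fide_pair = (np.array([0.2, 0.1, 0.0, 0.3]),
                  np.array([0.2, 0.1, 0.1, 0.3]))
attack_pair = (np.array([2.0, -1.0, 1.5, 0.4]),
               np.array([0.2, 0.1, 0.1, 0.3]))
print(differential_score(*bona_fide_pair, w, b))  # low score: features nearly identical
print(differential_score(*attack_pair, w, b))     # high score: large feature shift
```

The paper's actual detectors use texture descriptors, landmarks, and deep face representations as the feature extractor, with classifiers trained on GAN-synthesized makeup attacks.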

7 citations


References
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions1, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
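The core idea, learning a residual F(x) and adding it to an identity shortcut, can be sketched with a tiny fully connected residual block (random toy weights; in ResNet, F is a stack of convolutions with batch normalization):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = ReLU(x + F(x)) with F a small two-layer transform.

    The identity shortcut means the block only has to learn the
    residual F(x) = H(x) - x, which is easier to optimize when the
    desired mapping H is close to the identity.
    """
    fx = relu(x @ w1) @ w2   # the residual function F(x)
    return relu(x + fx)      # identity shortcut plus residual

rng = np.random.default_rng(1)
x = rng.normal(size=(2, 8))            # a batch of 2 feature vectors
w1 = 0.1 * rng.normal(size=(8, 8))
w2 = 0.1 * rng.normal(size=(8, 8))
y = residual_block(x, w1, w2)
print(y.shape)  # (2, 8): shape is preserved, so blocks can be stacked deeply
```

Note that if the residual weights are zero, the block degenerates to ReLU(x), so adding more blocks can never make representing the identity harder; this is the intuition behind training 100+ layer networks.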

93,356 citations

Proceedings ArticleDOI
Qiong Cao, Li Shen, Weidi Xie, Omkar M. Parkhi, Andrew Zisserman
15 May 2018
TL;DR: VGGFace2 as discussed by the authors is a large-scale face dataset with 3.31 million images of 9131 subjects, with an average of 362.6 images for each subject.
Abstract: In this paper, we introduce a new large-scale face dataset named VGGFace2. The dataset contains 3.31 million images of 9131 subjects, with an average of 362.6 images for each subject. Images are downloaded from Google Image Search and have large variations in pose, age, illumination, ethnicity and profession (e.g. actors, athletes, politicians). The dataset was collected with three goals in mind: (i) to have both a large number of identities and also a large number of images for each identity; (ii) to cover a large range of pose, age and ethnicity; and (iii) to minimise the label noise. We describe how the dataset was collected, in particular the automated and manual filtering stages to ensure a high accuracy for the images of each identity. To assess face recognition performance using the new dataset, we train ResNet-50 (with and without Squeeze-and-Excitation blocks) Convolutional Neural Networks on VGGFace2, on MS-Celeb-1M, and on their union, and show that training on VGGFace2 leads to improved recognition performance over pose and age. Finally, using the models trained on these datasets, we demonstrate state-of-the-art performance on the IJB-A and IJB-B face recognition benchmarks, exceeding the previous state-of-the-art by a large margin. The dataset and models are publicly available.

1,471 citations

Journal Article
TL;DR: A semi-automatic way to collect face images from the Internet is proposed and a large-scale dataset containing about 10,000 subjects and 500,000 images, called CASIA-WebFace, is built, based on which an 11-layer CNN is used to learn discriminative representations and obtain state-of-the-art accuracy on LFW and YTF.
Abstract: Pushed by big data and deep convolutional neural networks (CNNs), the performance of face recognition is becoming comparable to that of humans. Using private large-scale training datasets, several groups have achieved very high performance on LFW, i.e., 97% to 99%. While there are many open-source implementations of CNNs, no large-scale face dataset is publicly available. The current situation in the field of face recognition is that data is more important than the algorithm. To solve this problem, this paper proposes a semi-automatic way to collect face images from the Internet and builds a large-scale dataset containing about 10,000 subjects and 500,000 images, called CASIA-WebFace. Based on this database, we use an 11-layer CNN to learn a discriminative representation and obtain state-of-the-art accuracy on LFW and YTF.

1,425 citations


Additional excerpts


  • ...ResNet-50 [7] (pre-trained on the large-scale VGG-Face2 [1] and MS-Celeb-1M [6] datasets) and LightCNN-29v2 [20] (pre-trained on the large-scale CASIA-WebFace [21] and MS-Celeb-1M [6] datasets) have been used for evaluation....


Book ChapterDOI
Yandong Guo, Lei Zhang, Yuxiao Hu, Xiaodong He, Jianfeng Gao
08 Oct 2016
TL;DR: In this article, the authors proposed a benchmark task to recognize one million celebrities from their face images, by using all the possibly collected face images of this individual on the web as training data.
Abstract: In this paper, we design a benchmark task and provide the associated datasets for recognizing face images and linking them to corresponding entity keys in a knowledge base. More specifically, we propose a benchmark task to recognize one million celebrities from their face images, using all the face images of each individual that can be collected on the web as training data. The rich information provided by the knowledge base helps to conduct disambiguation and improve the recognition accuracy, and contributes to various real-world applications, such as image captioning and news video analysis. Associated with this task, we design and provide a concrete measurement set, evaluation protocol, and training data. We also present in detail our experimental setup and report promising baseline results. Our benchmark task could lead to one of the largest classification problems in computer vision. To the best of our knowledge, our training dataset, which contains 10M images in version 1, is the largest publicly available one in the world.

1,144 citations

Posted Content
TL;DR: This article proposed an additive angular margin loss (ArcFace) to obtain highly discriminative features for face recognition, which has a clear geometric interpretation due to the exact correspondence to the geodesic distance on the hypersphere.
Abstract: One of the main challenges in feature learning using Deep Convolutional Neural Networks (DCNNs) for large-scale face recognition is the design of appropriate loss functions that enhance discriminative power. Centre loss penalises the distance between the deep features and their corresponding class centres in the Euclidean space to achieve intra-class compactness. SphereFace assumes that the linear transformation matrix in the last fully connected layer can be used as a representation of the class centres in an angular space and penalises the angles between the deep features and their corresponding weights in a multiplicative way. Recently, a popular line of research is to incorporate margins in well-established loss functions in order to maximise face class separability. In this paper, we propose an Additive Angular Margin Loss (ArcFace) to obtain highly discriminative features for face recognition. The proposed ArcFace has a clear geometric interpretation due to the exact correspondence to the geodesic distance on the hypersphere. We present arguably the most extensive experimental evaluation of all the recent state-of-the-art face recognition methods on over 10 face recognition benchmarks including a new large-scale image database with trillion level of pairs and a large-scale video dataset. We show that ArcFace consistently outperforms the state-of-the-art and can be easily implemented with negligible computational overhead. We release all refined training data, training codes, pre-trained models and training logs, which will help reproduce the results in this paper.

1,122 citations
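The additive angular margin in ArcFace modifies only the target-class logit: the angle between the L2-normalized feature and the class weight is increased by a margin m before re-scaling. A numpy sketch of that logit adjustment (toy dimensions and random inputs; not the authors' released code):

```python
import numpy as np

def arcface_logits(features, weights, labels, s=64.0, m=0.5):
    """Compute ArcFace-style logits.

    features: (N, d) embeddings; weights: (C, d) class centres.
    Both are L2-normalized, so their inner products are cosines. For
    each sample, the target-class logit cos(theta) is replaced by
    cos(theta + m); all logits are then scaled by s.
    """
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = feats @ w.T                              # (N, C) cosine similarities
    theta = np.arccos(np.clip(cos, -1.0, 1.0))     # angles in [0, pi]
    target = np.zeros_like(cos, dtype=bool)
    target[np.arange(len(labels)), labels] = True
    # Add the angular margin m only on the ground-truth class.
    cos_margin = np.where(target, np.cos(theta + m), cos)
    return s * cos_margin

rng = np.random.default_rng(2)
feats = rng.normal(size=(4, 16))    # a batch of 4 raw embeddings
weights = rng.normal(size=(10, 16)) # 10 hypothetical identity classes
labels = np.array([0, 3, 3, 7])
print(arcface_logits(feats, weights, labels).shape)  # (4, 10)
```

Since cos(theta + m) < cos(theta) for m > 0 (while theta + m stays below pi), the margin makes the ground-truth class harder to satisfy during training, pushing features toward their class centre on the hypersphere.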