Proceedings Article

ArcFace: Additive Angular Margin Loss for Deep Face Recognition

15 Jun 2019, pp. 4690-4699
TL;DR: This paper presents arguably the most extensive experimental evaluation against all recent state-of-the-art face recognition methods on ten face recognition benchmarks, and shows that ArcFace consistently outperforms the state of the art and can be easily implemented with negligible computational overhead.
Abstract: One of the main challenges in feature learning using Deep Convolutional Neural Networks (DCNNs) for large-scale face recognition is the design of appropriate loss functions that can enhance the discriminative power. Centre loss penalises the distance between deep features and their corresponding class centres in the Euclidean space to achieve intra-class compactness. SphereFace assumes that the linear transformation matrix in the last fully connected layer can be used as a representation of the class centres in the angular space and therefore penalises the angles between deep features and their corresponding weights in a multiplicative way. Recently, a popular line of research is to incorporate margins in well-established loss functions in order to maximise face class separability. In this paper, we propose an Additive Angular Margin Loss (ArcFace) to obtain highly discriminative features for face recognition. The proposed ArcFace has a clear geometric interpretation due to its exact correspondence to geodesic distance on a hypersphere. We present arguably the most extensive experimental evaluation against all recent state-of-the-art face recognition methods on ten face recognition benchmarks, which include a new large-scale image database with trillions of pairs and a large-scale video dataset. We show that ArcFace consistently outperforms the state of the art and can be easily implemented with negligible computational overhead. To facilitate future research, the code has been made available.
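
As an illustration of the additive angular margin described in the abstract, the following NumPy sketch adds a margin m to the angle between each feature and its target class weight before a scaled softmax. The abstract does not give the formula, so the L2 normalisation of features and weights, the function names, and the values s = 64 and m = 0.5 are assumptions made for this sketch rather than details confirmed here.

```python
import numpy as np


def arcface_logits(embeddings, weights, labels, s=64.0, m=0.5):
    """Scaled logits with an additive angular margin on the target class.

    embeddings: (N, D) features; weights: (D, C), one column per class;
    labels: (N,) integer class ids. s and m are assumed hyperparameters.
    """
    # L2-normalise features (rows) and class weights (columns) so that
    # their dot product is the cosine of the angle between them.
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos_theta = np.clip(x @ w, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    # Add the margin only to the angle of the ground-truth class.
    # Note: this sketch does not treat the case theta + m > pi specially.
    rows = np.arange(len(labels))
    theta[rows, labels] += m
    return s * np.cos(theta)


def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels under a softmax over logits."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

With m = 0 the function reduces to a scaled cosine softmax; a positive m lowers the target logit, so the loss keeps pushing features towards their class weight even when they are already correctly classified.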

Citations
Journal Article
TL;DR: A subject-independent learning method for EEG-based biometrics using Hilbert spectrograms of the data is proposed, with a multi-similarity loss as the loss function and the integrated gradients method used for interpretation.
Abstract: A promising approach to overcome the various shortcomings of password systems is the use of biometric authentication, in particular the use of electroencephalogram (EEG) data. In this paper, we propose a subject-independent learning method for EEG-based biometrics using Hilbert spectrograms of the data. The proposed neural network architecture treats the spectrogram as a collection of one-dimensional series and applies one-dimensional dilated convolutions over them, and a multi-similarity loss was used as the loss function for subject-independent learning. The architecture was tested on the publicly available PhysioNet EEG Motor Movement/Imagery Dataset (PEEGMIMDB), achieving a 14.63% Equal Error Rate (EER). The proposed approach’s main advantages are subject independence and suitability for interpretation via created spectrograms and the integrated gradients method.
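
For readers unfamiliar with the metric quoted above, the Equal Error Rate is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below is one simple way to estimate it from similarity scores; the function name, the threshold sweep over observed scores, and the toy scores are assumptions for illustration and are not taken from the paper.

```python
import numpy as np


def equal_error_rate(genuine, impostor):
    """Estimate the EER from similarity scores (higher = more similar).

    genuine: scores of same-subject pairs; impostor: scores of
    different-subject pairs. A pair is accepted when score >= threshold.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    # False acceptance rate: impostor pairs wrongly accepted.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    # False rejection rate: genuine pairs wrongly rejected.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # Take the threshold where the two error rates are closest.
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2.0
```

Because the sweep only visits observed scores, the estimate is coarse on small score sets; interpolating between thresholds is a common refinement.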

4 citations

Proceedings Article
12 Nov 2022
TL;DR: GestaltMatcher-Arc replaces the GestaltMatcher DCNN with a state-of-the-art face recognition approach, iResNet with ArcFace, and proposes test-time augmentation and model ensembles that mix general face verification models with models specific to verifying disorders.
Abstract: Rare genetic disorders affect more than 6% of the global population. Reaching a diagnosis is challenging because rare disorders are very diverse. Many disorders have recognizable facial features that are hints for clinicians to diagnose patients. Previous work, such as GestaltMatcher, utilized representation vectors produced by a DCNN similar to AlexNet to match patients in high-dimensional feature space to support "unseen" ultra-rare disorders. However, the architecture and dataset used for transfer learning in GestaltMatcher have become outdated. Moreover, a way to train the model for generating better representation vectors for unseen ultra-rare disorders has not yet been studied. Because of the overall scarcity of patients with ultra-rare disorders, it is infeasible to directly train a model on them. Therefore, we first analyzed the influence of replacing GestaltMatcher DCNN with a state-of-the-art face recognition approach, iResNet with ArcFace. Additionally, we experimented with different face recognition datasets for transfer learning. Furthermore, we proposed test-time augmentation, and model ensembles that mix general face verification models and models specific for verifying disorders to improve the disorder verification accuracy of unseen ultra-rare disorders. Our proposed ensemble model achieves state-of-the-art performance on both seen and unseen disorders. Code is available at github.com/igsb/GestaltMatcher-Arc.

4 citations

Journal Article
10 Jun 2022
TL;DR: It is demonstrated that female and male hairstyles have important differences that impact face recognition accuracy, and that when the data used to estimate recognition accuracy is balanced across gender for how hairstyles occlude the face, the initially observed gender gap in accuracy largely disappears.
Abstract: It is broadly accepted that there is a “gender gap” in face recognition accuracy, with females having lower accuracy. However, relatively little is known about the cause(s) of this gender gap. We first demonstrate that female and male hairstyles have important differences that impact face recognition accuracy. In particular, variation in male facial hair contributes to a greater average difference in appearance between different male faces. We then demonstrate that when the data used to evaluate recognition accuracy is gender-balanced for how hairstyles occlude the face, the initially observed gender gap in accuracy largely disappears. We show this result for two different matchers, and for a Caucasian image dataset and an African-American dataset. Our results suggest that research on demographic variation in accuracy should include a check for balanced quality of the test data as part of the problem formulation. This new understanding of the causes of the gender gap in recognition accuracy will hopefully promote rational consideration of what might be done about it. To promote reproducible research, the matchers, attribute classifiers, and datasets used in this work are available to other researchers.

4 citations

Posted Content
TL;DR: The results show that excessive alignment is harmful and that an optimal balance point of alignment is needed; a novel joint learning approach is proposed in which alignment learning is controllable with respect to its strength and driven by recognition.
Abstract: Face alignment is crucial for face recognition and has been widely adopted. However, current practice is too simple and under-explored. There is a lack of understanding of how important face alignment is and how it should be performed for recognition. This work studies these problems and makes two contributions. First, it provides an in-depth and quantitative study of how alignment strength affects recognition accuracy. Our results show that excessive alignment is harmful and that an optimal balance point of alignment is needed. To strike the balance, our second contribution is a novel joint learning approach where alignment learning is controllable with respect to its strength and driven by recognition. Our proposed method is validated by comprehensive experiments on several benchmarks, especially the challenging ones with large pose.

4 citations


Cites methods or result from "ArcFace: Additive Angular Margin Lo..."

  • ...To further verify the effectiveness of our method, we compare the performance with several state-of-the-art face recognition methods on three public benchmarks. In this part, we use LResnet100E-IR (the same as ArcFace [26]) as our recognition model and it is trained on the MS-Celeb-1M dataset [35]....

  • ...The results in Table 4 show that our model achieves results comparable to the state-of-the-art methods CosFace [33] and ArcFace [26]....

  • ...The MegaFace dataset we utilize is refined using the same protocol as in ArcFace [26]....

  • ...† denotes results reported in ArcFace [26]....

Book Chapter
10 Sep 2021
TL;DR: Li et al. propose LFVRN, a lightweight convolutional neural network (CNN) for finger-vein recognition that is comparable or superior to the prior competition winners and does not depend on image enhancement.
Abstract: Hand-crafted approaches used to be the dominant solutions, and recently more convolutional neural network (CNN)-based methods have been proposed for finger-vein recognition. However, previous deep learning methods usually designed the network architecture with an increasing number of layers and parameters, which incurs device memory and processing speed issues. Although many researchers have devoted themselves to designing image enhancement algorithms to improve the recognition performance of hand-crafted methods, it is interesting to investigate whether deep learning methods can achieve satisfactory performance without image enhancement. This paper focuses on two different issues: lightweight CNN design and the impact of image enhancement on deep learning methods. The experimental results demonstrate that the proposed method LFVRN is comparable or superior to the prior competition winners. In addition, image enhancement is shown to be unnecessary for the proposed lightweight CNN model LFVRN.

4 citations

References
Proceedings Article
27 Jun 2016
TL;DR: This paper proposes a residual learning framework to ease the training of networks that are substantially deeper than those used previously; an ensemble of these residual nets won 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
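
The idea of learning a residual function with reference to the layer input can be sketched in a few lines. The block below is a deliberately simplified illustration using two fully connected layers in NumPy instead of convolutions; the function names, the two-layer form of F, and the placement of the ReLU are assumptions for this sketch, not the architecture evaluated in the paper.

```python
import numpy as np


def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x) with F(x) = relu(x @ w1) @ w2.

    x: (N, D) inputs; w1: (D, H) and w2: (H, D) are hypothetical weights.
    The layers only have to model the residual F(x); the input itself is
    carried forward by the shortcut addition.
    """
    residual = relu(x @ w1) @ w2
    return relu(residual + x)
```

When both weight matrices are zero, F(x) vanishes and the block passes relu(x) straight through, which is one way to see why adding such blocks need not make a deeper network harder to fit.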

123,388 citations

Journal Article
TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
Abstract: Deep neural nets with a large number of parameters are very powerful machine learning systems. However, overfitting is a serious problem in such networks. Large networks are also slow to use, making it difficult to deal with overfitting by combining the predictions of many different large neural nets at test time. Dropout is a technique for addressing this problem. The key idea is to randomly drop units (along with their connections) from the neural network during training. This prevents units from co-adapting too much. During training, dropout samples from an exponential number of different "thinned" networks. At test time, it is easy to approximate the effect of averaging the predictions of all these thinned networks by simply using a single unthinned network that has smaller weights. This significantly reduces overfitting and gives major improvements over other regularization methods. We show that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
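
The mechanism described above, randomly dropping units during training and using the full network at test time, can be sketched as follows. Note one deliberate substitution: the abstract describes compensating with smaller weights at test time, whereas this sketch uses the widely used "inverted" variant that rescales the surviving activations during training instead. The function name and parameters are assumptions for illustration.

```python
import numpy as np


def dropout(x, p_drop, rng, training=True):
    """Inverted dropout on an activation array.

    x: activations; p_drop: probability of dropping each unit (0 <= p < 1);
    rng: a numpy.random.Generator; training: apply the random mask or not.
    """
    if not training or p_drop == 0.0:
        # At test time the unthinned network is used unchanged.
        return x
    keep = 1.0 - p_drop
    # Each unit is kept independently with probability `keep`.
    mask = rng.random(x.shape) < keep
    # Rescale kept units so the expected activation matches test time.
    return x * mask / keep
```

A typical call would be `dropout(h, 0.5, np.random.default_rng(0))` on a hidden activation `h` during training, and `dropout(h, 0.5, rng, training=False)` at evaluation.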

33,597 citations

Proceedings Article
Sergey Ioffe, Christian Szegedy
06 Jul 2015
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.82% top-5 test error, exceeding the accuracy of human raters.
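
Normalising layer inputs per training mini-batch, as the abstract describes, amounts to the computation sketched below. The learnable scale `gamma` and shift `beta`, the small constant `eps`, and the function name are assumptions for this sketch (they are not mentioned in the abstract), and the running statistics needed at inference time are omitted.

```python
import numpy as np


def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Normalise each feature over the mini-batch, then scale and shift.

    x: (N, D) mini-batch of layer inputs; gamma, beta: (D,) parameters
    (assumed learnable); eps: small constant for numerical stability.
    """
    mean = x.mean(axis=0)  # per-feature mean over the batch
    var = x.var(axis=0)    # per-feature (biased) variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```

With `gamma = 1` and `beta = 0`, each output feature has approximately zero mean and unit variance across the batch, regardless of how the previous layer's parameters have shifted its inputs.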

30,843 citations

28 Oct 2017
TL;DR: An automatic differentiation module of PyTorch is described — a library designed to enable rapid research on machine learning models that focuses on differentiation of purely imperative programs, with a focus on extensibility and low overhead.
Abstract: In this article, we describe an automatic differentiation module of PyTorch — a library designed to enable rapid research on machine learning models. It builds upon a few projects, most notably Lua Torch, Chainer, and HIPS Autograd [4], and provides a high performance environment with easy access to automatic differentiation of models executed on different devices (CPU and GPU). To make prototyping easier, PyTorch does not follow the symbolic approach used in many other deep learning frameworks, but focuses on differentiation of purely imperative programs, with a focus on extensibility and low overhead. Note that this preprint is a draft of certain sections from an upcoming paper covering all PyTorch features.

13,268 citations

Posted Content
TL;DR: The TensorFlow interface and an implementation of that interface that is built at Google are described, which has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.
Abstract: TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google. The TensorFlow API and a reference implementation were released as an open-source package under the Apache 2.0 license in November, 2015 and are available at www.tensorflow.org.

10,447 citations