ArcFace: Additive Angular Margin Loss for Deep Face Recognition

doi:10.1109/CVPR.2019.00482

Citations

Proceedings Article

Hierarchical Pyramid Diverse Attention Networks for Face Recognition

Qiangchang Wang¹, Tianyi Wu², He Zheng², Guodong Guo¹

West Virginia University¹, Baidu²

14 Jun 2020

TL;DR: This work proposes a pyramid diverse attention (PDA) module to learn multi-scale diverse local representations automatically and adaptively for face recognition; the full HPDA network integrates the PDA into hierarchical bilinear pooling (HBP) to fuse information from multiple layers effectively.

Abstract: Deep learning has achieved great success in face recognition (FR); however, few existing models take hierarchical multi-scale local features into consideration. In this work, we propose a hierarchical pyramid diverse attention (HPDA) network. First, it is observed that local patches play important roles in FR when the global face appearance changes dramatically. Some recent works apply attention modules to locate local patches automatically without relying on face landmarks. Unfortunately, without considering diversity, some learned attentions tend to have redundant responses around similar local patches while neglecting other potentially discriminative facial parts. Meanwhile, local patches may appear at different scales due to pose variations or large expression changes. To alleviate these challenges, we propose a pyramid diverse attention (PDA) to learn multi-scale diverse local representations automatically and adaptively. More specifically, a pyramid attention is developed to capture multi-scale features, and a diverse learning scheme is developed to encourage models to focus on different local patches and generate diverse local features. Second, almost all existing models focus on extracting features from the last convolutional layer and therefore lack the local details or small-scale face parts found in lower layers. Instead of simple concatenation or addition, we propose to use hierarchical bilinear pooling (HBP) to fuse information from multiple layers effectively. Thus, the HPDA is developed by integrating the PDA into the HBP. Experimental results on several datasets show the effectiveness of the HPDA compared to state-of-the-art methods.

61 citations

Cites background from "ArcFace: Additive Angular Margin Lo..."

...Global faces based models usually accept whole faces as inputs [22, 34, 19, 28, 3]....
[...]
...global representations where whole faces are regarded as CNN inputs [22, 34, 19, 3]....
[...]
...To get a high-quality dataset, [3] refined the dataset and made it publicly available....
[...]

Proceedings Article

Global-Local GCN: Large-Scale Label Noise Cleansing for Face Recognition

Yaobin Zhang¹, Weihong Deng¹, Mei Wang¹, Jiani Hu¹, Xian Li, Dongyue Zhao, Dongchao Wen

Beijing University of Posts and Telecommunications¹

14 Jun 2020

TL;DR: This work proposes FaceGraph, an automatic label noise cleansing framework for face recognition datasets that performs global-to-local discrimination to select useful data in a noisy environment; Arcface trained on the newly cleansed data surpasses state-of-the-art performance on the IJB-C benchmark.

Abstract: In the field of face recognition, large-scale web-collected datasets are essential for learning discriminative representations, but they suffer from noisy identity labels, such as outliers and label flips. It is beneficial to automatically cleanse their label noise for improving recognition accuracy. Unfortunately, existing cleansing methods cannot accurately identify noise in the wild. To solve this problem, we propose an effective automatic label noise cleansing framework for face recognition datasets, FaceGraph. Using two cascaded graph convolutional networks, FaceGraph performs global-to-local discrimination to select useful data in a noisy environment. Extensive experiments show that cleansing widely used datasets, such as CASIA-WebFace, VGGFace2, MegaFace2, and MS-Celeb-1M, using the proposed method can improve the recognition performance of state-of-the-art representation learning methods like Arcface. Further, we cleanse massive self-collected celebrity data, namely MillionCelebs, to provide 18.8M images of 636K identities. Training with the new data, Arcface surpasses state-of-the-art performance by a notable margin to reach 95.62% TPR at 1e-5 FPR on the IJB-C benchmark.
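
The abstract reports its headline result as a true positive rate (TPR) at a fixed false positive rate (FPR). As a minimal sketch of how such an operating point can be read off from verification scores, the function below picks a threshold from the impostor scores and measures the fraction of genuine pairs above it. The function name, the score lists, and the strict "score > threshold" acceptance rule are illustrative assumptions, not the IJB-C evaluation protocol.

```python
def tpr_at_fpr(genuine, impostor, target_fpr):
    """Return (tpr, threshold) with the empirical FPR kept at or below target_fpr.

    genuine:  similarity scores of same-identity pairs (hypothetical data)
    impostor: similarity scores of different-identity pairs (hypothetical data)
    A pair is accepted when its score is strictly greater than the threshold.
    """
    ranked = sorted(impostor, reverse=True)
    # Largest number of impostor pairs we may accept without exceeding target_fpr.
    allowed = int(target_fpr * len(ranked))
    # Using the (allowed + 1)-th highest impostor score as the threshold means
    # at most `allowed` impostor scores lie strictly above it.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    tpr = sum(score > threshold for score in genuine) / len(genuine)
    return tpr, threshold


genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
print(tpr_at_fpr(genuine_scores, impostor_scores, 0.25))
print(tpr_at_fpr(genuine_scores, impostor_scores, 0.0))
```

At an FPR as low as 1e-5, the number of impostor pairs has to be far larger than 100,000 for the estimate to be meaningful, which is one reason large benchmarks are used for this metric.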

60 citations

Cites background or methods from "ArcFace: Additive Angular Margin Lo..."

...The effectiveness is assessed in terms of the comparative recognition performance of Arcface [15] trained on different datasets....
[...]
...For instance, the Arcface method trained by this new dataset outperforms state-of-the-art performance on the IJB-C by a notable margin....
[...]
...For the real data validation, we evaluate face recognition performance of ResNet [23] models trained on original and cleansed datasets by the Arcface loss [15]....
[...]
...In Table 4, adopting FaceScrub [37] as probe set and using the wash list provided by DeepInsight [15], the results of two MillionCelebs cleansed versions do not differ a lot, but they all outperform other training datasets by a large margin....
[...]
...Table 3: Cleanse 4 face recognition datasets and train deep models by Arcface [15] to test face verification accuracy (%)....
[...]

Proceedings Article

Fair Loss: Margin-Aware Reinforcement Learning for Deep Face Recognition

Bingyu Liu¹, Weihong Deng¹, Yaoyao Zhong¹, Mei Wang¹, Jiani Hu¹, Xunqiang Tao, Yaohai Huang

Beijing University of Posts and Telecommunications¹

01 Oct 2019

TL;DR: This paper introduces fair loss, a margin-aware loss function based on reinforcement learning, in which an agent trained with Deep Q-learning learns a margin-adaptive strategy for each class, making the additive margins for different classes more reasonable.

Abstract: Recently, large-margin softmax loss methods, such as angular softmax loss (SphereFace), large margin cosine loss (CosFace), and additive angular margin loss (ArcFace), have demonstrated impressive performance on deep face recognition. These methods apply a fixed additive margin to all classes, ignoring the class imbalance problem. However, class imbalance is widespread in real-world face datasets, in which some classes have many more samples than others. We argue that the number of samples in a class influences its demand for the additive margin. In this paper, we introduce a new margin-aware reinforcement learning based loss function, namely fair loss, in which each class learns an appropriate adaptive margin by Deep Q-learning. Specifically, we train an agent to learn a margin-adaptive strategy for each class, making the additive margins for different classes more reasonable. Our method performs better than existing large-margin loss functions on three benchmarks, Labeled Faces in the Wild (LFW), YouTube Faces (YTF) and MegaFace, which demonstrates that it can learn better face representations on imbalanced face datasets.
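
To make the three fixed-margin losses named in this abstract concrete, here is a minimal sketch of the target-class logit in the combined form s · (cos(m1·θ + m2) − m3) given in the ArcFace paper. The function names, the example vectors, and the specific margin and scale values are illustrative assumptions; the m1 case is a simplified SphereFace form without its piecewise monotonic extension, and the per-class adaptive margin of fair loss is not implemented here.

```python
import math


def cosine(x, w):
    """Cosine of the angle between a feature x and a class weight w."""
    dot = sum(a * b for a, b in zip(x, w))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_w = math.sqrt(sum(b * b for b in w))
    return dot / (norm_x * norm_w)


def target_logit(cos_theta, m1=1.0, m2=0.0, m3=0.0, s=64.0):
    """Target-class logit s * (cos(m1 * theta + m2) - m3).

    m1: multiplicative angular margin (SphereFace-style, simplified)
    m2: additive angular margin (ArcFace)
    m3: additive cosine margin (CosFace)
    s:  feature re-scaling factor applied after L2 normalization
    """
    # Clamp to guard against rounding pushing the cosine outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    return s * (math.cos(m1 * theta + m2) - m3)


c = cosine([1.0, 0.0], [1.0, 1.0])    # angle of 45 degrees
plain = target_logit(c)               # normalized softmax, no margin
arcface = target_logit(c, m2=0.5)     # margin added to the angle
cosface = target_logit(c, m3=0.35)    # margin subtracted from the cosine
print(plain, arcface, cosface)
```

Each margin lowers the target logit relative to the plain normalized softmax, so the network must pull features closer to their class center to reach the same loss. A per-class scheme would pass a different margin value for each class instead of one fixed constant.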

59 citations

Cites background or methods from "ArcFace: Additive Angular Margin Lo..."

...FairLoss Cos and FairLoss Arc represent the methods with our margin adaptive strategy used in CosFace [36] and ArcFace [5], respectively....
[...]
...The other is ResNet50 [10] with a modified structure, proposed in [5], after the last convolutional layer....
[...]
...We fix ‖xi‖ by L2 normalization and re-scale ‖xi‖ to s, following [20, 36, 5]....
[...]
...[5] directly add an angular margin in the angular space and have a more clear geometric interpretation....
[...]
...several large-margin loss functions have been proposed to improve the generalization ability of softmax loss, such as SphereFace [20], CosFace [36], and ArcFace [5]....
[...]

Proceedings Article

Domain Balancing: Face Recognition on Long-Tailed Domains

Dong Cao¹, Xiangyu Zhu¹, Xingyu Huang², Jianzhu Guo¹, Zhen Lei²

Chinese Academy of Sciences¹, Tianjin University²

14 Jun 2020

TL;DR: This work proposes a novel Domain Balancing (DB) mechanism to handle the long-tailed domain distribution problem, in which a small number of domains appear frequently while other domains appear far less often.

Abstract: The long-tailed problem has been an important topic in face recognition. However, existing methods concentrate only on the long-tailed distribution of classes. In contrast, we address the long-tailed domain distribution problem, in which a small number of domains appear frequently while other domains appear far less often. The key challenge is that domain labels are too complicated (related to race, age, pose, illumination, etc.) and inaccessible in real applications. In this paper, we propose a novel Domain Balancing (DB) mechanism to handle this problem. Specifically, we first propose a Domain Frequency Indicator (DFI) to judge whether a sample is from head domains or tail domains. Second, we formulate a lightweight Residual Balancing Mapping (RBM) block to balance the domain distribution by adjusting the network according to the DFI. Finally, we propose a Domain Balancing Margin (DBM) in the loss function to further optimize the feature space of the tail domains and improve generalization. Extensive analysis and experiments on several face recognition benchmarks demonstrate that the proposed method effectively enhances generalization capacity and achieves superior performance.

58 citations

Cites background or methods from "ArcFace: Additive Angular Margin Lo..."

...Specifically, our method achieves 95.54% average accuracy, about 0.4% average improvement over ArcFace....
[...]
...We compare the proposed method with the baseline Softmax loss and recent state-of-the-art methods, including SphereFace [18], CosFace [32] and ArcFace [4]....
[...]
...Recent years have witnessed remarkable progress in face recognition, with a variety of approaches proposed in the literature and applied in real applications [18, 32, 4, 7, 6, 42]....
[...]
...Recently, a variety of margin based softmax losses [18, 32, 4] have achieved the state-of-the-art performances....
[...]
...In particular, the proposed method surpasses the best approach ArcFace by an obvious margin (about 0.82% at Rank-1 identification rate and 0.68% verification rate)....
[...]

Proceedings Article

SDD-FIQA: Unsupervised Face Image Quality Assessment with Similarity Distribution Distance

Fu-Zhao Ou¹, Chen Xingyu², Ruixin Zhang², Yuge Huang², Shaoxin Li², Jilin Li², Yong Li³, Liujuan Cao⁴, Yuan-Gen Wang¹

Guangzhou University¹, Tencent², Nanjing University of Science and Technology³, Xiamen University⁴

20 Jun 2021

TL;DR: This paper proposes an unsupervised FIQA method that incorporates similarity distribution distance for face image quality assessment, generating quality pseudo-labels by calculating the Wasserstein distance between the intra-class and inter-class similarity distributions.

Abstract: In recent years, Face Image Quality Assessment (FIQA) has become an indispensable part of the face recognition system to guarantee the stability and reliability of recognition performance in unconstrained scenarios. For this purpose, an FIQA method should consider both the intrinsic properties and the recognizability of the face image. Most previous works estimate the sample-wise embedding uncertainty or pair-wise similarity as the quality score, which considers only partial intra-class information. These methods ignore the valuable inter-class information, which is needed for estimating the recognizability of a face image. In this work, we argue that a high-quality face image should be similar to its intra-class samples and dissimilar to its inter-class samples. Thus, we propose a novel unsupervised FIQA method that incorporates Similarity Distribution Distance for Face Image Quality Assessment (SDD-FIQA). Our method generates quality pseudo-labels by calculating the Wasserstein Distance (WD) between the intra-class and inter-class similarity distributions. With these quality pseudo-labels, we are capable of training a regression network for quality prediction. Extensive experiments on benchmark datasets demonstrate that the proposed SDD-FIQA surpasses the state of the art by an impressive margin. Meanwhile, our method shows good generalization across different recognition systems.
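
The pseudo-label idea in this abstract can be illustrated with a small sketch: for one image, collect its similarities to same-identity samples and to other-identity samples, then measure the distance between the two empirical distributions. The function below computes the one-dimensional Wasserstein-1 distance for two equal-size samples. The function name and the example scores are hypothetical, and the paper's actual sampling, normalization, and label scaling are not described in this listing, so none of that is reproduced here.

```python
def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size empirical 1-D samples.

    For uniform weights and equal sample counts, the optimal transport plan
    matches the i-th smallest value of a with the i-th smallest value of b.
    """
    if len(a) != len(b) or not a:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


# Hypothetical cosine similarities for two face images.
sharp_intra, sharp_inter = [0.8, 0.9, 0.7], [0.1, 0.2, 0.3]
blurry_intra, blurry_inter = [0.4, 0.5, 0.3], [0.3, 0.4, 0.5]

print(wasserstein_1d(sharp_intra, sharp_inter))    # well separated
print(wasserstein_1d(blurry_intra, blurry_inter))  # overlapping
```

A larger distance means the image's intra-class and inter-class similarities are better separated, which is the property the abstract associates with a high-quality face image. For unequal sample sizes or weighted samples, a general routine such as SciPy's `scipy.stats.wasserstein_distance` would be the usual choice.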

58 citations
