(Open Access) ArcFace: Additive Angular Margin Loss for Deep Face Recognition (2019) | Jiankang Deng

Citations

Journal Article

Additive Margin Softmax for Face Verification

Feng Wang¹, Jian Cheng¹, Weiyang Liu², Haijun Liu¹

University of Electronic Science and Technology of China¹, Georgia Institute of Technology²

04 Apr 2018 - IEEE Signal Processing Letters

TL;DR: The authors propose additive margin softmax, a conceptually simple learning objective for face verification that adds a margin to the softmax loss and is more intuitive and interpretable than angular softmax.

Abstract: In this letter, we propose a conceptually simple and intuitive learning objective function, i.e., additive margin softmax, for face verification. In general, face verification tasks can be viewed as metric learning problems, even though lots of face verification models are trained in classification schemes. It is possible when a large-margin strategy is introduced into the classification model to encourage intraclass variance minimization. As one alternative, angular softmax has been proposed to incorporate the margin. In this letter, we introduce another kind of margin to the softmax loss function, which is more intuitive and interpretable. Experiments on LFW and MegaFace show that our algorithm performs better when the evaluation criteria are designed for very low false alarm rate.
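
For readers unfamiliar with the loss described in this abstract, the following is a minimal sketch of an additive margin softmax for a single sample, written in plain Python. It subtracts a fixed margin from the target-class cosine similarity before scaling and applying softmax cross-entropy. The function names and the default values `scale=30.0` and `margin=0.35` are illustrative assumptions, not details given in the abstract above, and the sketch assumes all vectors are nonzero.

```python
import math


def l2_normalize(vec):
    """Scale a nonzero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def am_softmax_loss(feature, class_weights, target, scale=30.0, margin=0.35):
    """Additive margin softmax loss for one sample (illustrative sketch).

    feature:       embedding vector of the sample
    class_weights: one weight vector per class
    target:        index of the ground-truth class
    scale, margin: hyperparameters; the defaults here are assumptions
    """
    f = l2_normalize(feature)
    # Cosine similarity between the normalized feature and each class weight.
    cosines = [
        sum(a * b for a, b in zip(f, l2_normalize(w))) for w in class_weights
    ]
    # Subtract the margin from the target-class cosine only, then scale.
    logits = [
        scale * (c - margin) if j == target else scale * c
        for j, c in enumerate(cosines)
    ]
    # Numerically stable log-sum-exp for the softmax denominator.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(
        sum(math.exp(z - max_logit) for z in logits)
    )
    # Cross-entropy: negative log-probability of the target class.
    return log_sum_exp - logits[target]


if __name__ == "__main__":
    weights = [[1.0, 0.0], [0.0, 1.0]]
    print(am_softmax_loss([1.0, 1.0], weights, target=0, margin=0.0))
    print(am_softmax_loss([1.0, 1.0], weights, target=0, margin=0.35))
```

With `margin=0.0` the function reduces to a softmax loss on scaled cosine similarities. A positive margin raises the loss for the same input, so the model must separate classes by more than the margin to reach a low loss.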

936 citations

Proceedings Article

Large-Scale Long-Tailed Recognition in an Open World

Ziwei Liu¹, Zhongqi Miao², Xiaohang Zhan¹, Jiayun Wang², Boqing Gong³, Stella X. Yu²

The Chinese University of Hong Kong¹, University of California, Berkeley², Google³

15 Jun 2019

TL;DR: An integrated OLTR algorithm is developed that maps an image to a feature space such that visual concepts can easily relate to each other based on a learned metric that respects the closed-world classification while acknowledging the novelty of the open world.

Abstract: Real world data often have a long-tailed and open-ended distribution. A practical recognition system must classify among majority and minority classes, generalize from a few known instances, and acknowledge novelty upon a never seen instance. We define Open Long-Tailed Recognition (OLTR) as learning from such naturally distributed data and optimizing the classification accuracy over a balanced test set which includes head, tail, and open classes. OLTR must handle imbalanced classification, few-shot learning, and open-set recognition in one integrated algorithm, whereas existing classification approaches focus only on one aspect and deliver poorly over the entire class spectrum. The key challenges are how to share visual knowledge between head and tail classes and how to reduce confusion between tail and open classes. We develop an integrated OLTR algorithm that maps an image to a feature space such that visual concepts can easily relate to each other based on a learned metric that respects the closed-world classification while acknowledging the novelty of the open world. Our so-called dynamic meta-embedding combines a direct image feature and an associated memory feature, with the feature norm indicating the familiarity to known classes. On three large-scale OLTR datasets we curate from object-centric ImageNet, scene-centric Places, and face-centric MS1M data, our method consistently outperforms the state-of-the-art. Our code, datasets, and models enable future OLTR research and are publicly available at https://liuziwei7.github.io/projects/LongTail.html.

780 citations

Cites methods from "ArcFace: Additive Angular Margin Lo..."

...MS1M-LT: To create a long-tailed version of the MS1M-ArcFace dataset [14, 8], we sample images for each identity with a probability proportional to the image numbers of each identity....
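
One possible reading of the sampling step quoted above can be sketched as follows. The sketch keeps each image of an identity with a probability equal to that identity's image count divided by the largest count, so identities with few images become rarer still. The normalization by the largest count, the function name `sample_long_tailed`, and the toy data are assumptions; the quoted passage does not specify them.

```python
import random


def sample_long_tailed(images_by_identity, seed=0):
    """Subsample images so that small identities shrink the most.

    images_by_identity: dict mapping an identity label to its list of images
    seed:               seed for reproducible sampling (hypothetical parameter)
    """
    rng = random.Random(seed)
    max_count = max(len(images) for images in images_by_identity.values())
    subset = {}
    for identity, images in images_by_identity.items():
        # Keep probability is proportional to the identity's image count
        # (assumed normalization: divide by the largest count).
        keep_prob = len(images) / max_count
        subset[identity] = [img for img in images if rng.random() < keep_prob]
    return subset


if __name__ == "__main__":
    toy = {
        "head_id": [f"head_{i}.jpg" for i in range(100)],
        "tail_id": [f"tail_{i}.jpg" for i in range(10)],
    }
    result = sample_long_tailed(toy)
    print({identity: len(images) for identity, images in result.items()})
```

Under this assumption the largest identity keeps all of its images, while an identity with a tenth as many images keeps roughly a tenth of them, which steepens the imbalance between head and tail.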

Proceedings Article

MaskGAN: Towards Diverse and Interactive Facial Image Manipulation

Cheng-Han Lee¹, Ziwei Liu², Lingyun Wu¹, Ping Luo³

SenseTime¹, The Chinese University of Hong Kong², University of Hong Kong³

14 Jun 2020

TL;DR: MaskGAN enables diverse and interactive face manipulation by learning a style mapping between a free-form, user-modified mask and a target image.

Abstract: Facial image manipulation has achieved great progress in recent years. However, previous methods either operate on a predefined set of face attributes or leave users little freedom to interactively manipulate images. To overcome these drawbacks, we propose a novel framework termed MaskGAN, enabling diverse and interactive face manipulation. Our key insight is that semantic masks serve as a suitable intermediate representation for flexible face manipulation with fidelity preservation. MaskGAN has two main components: 1) Dense Mapping Network (DMN) and 2) Editing Behavior Simulated Training (EBST). Specifically, DMN learns style mapping between a free-form user modified mask and a target image, enabling diverse generation results. EBST models the user editing behavior on the source mask, making the overall framework more robust to various manipulated inputs. Specifically, it introduces dual-editing consistency as the auxiliary supervision signal. To facilitate extensive studies, we construct a large-scale high-resolution face dataset with fine-grained mask annotations named CelebAMask-HQ. MaskGAN is comprehensively evaluated on two challenging tasks: attribute transfer and style copy, demonstrating superior performance over other state-of-the-art methods. The code, models, and dataset are available at https://github.com/switchablenorms/CelebAMask-HQ.

727 citations

Proceedings Article

RetinaFace: Single-Shot Multi-Level Face Localisation in the Wild

Jiankang Deng¹, Jia Guo, Evangelos Ververas¹, Irene Kotsia², Stefanos Zafeiriou¹

Imperial College London¹, Middlesex University²

14 Jun 2020

TL;DR: RetinaFace is a novel single-shot, multi-level face localisation method that unifies face box prediction, 2D facial landmark localisation and 3D vertices regression under one common target: point regression on the image plane.

Abstract: Though tremendous strides have been made in uncontrolled face detection, accurate and efficient 2D face alignment and 3D face reconstruction in-the-wild remain an open challenge. In this paper, we present a novel single-shot, multi-level face localisation method, named RetinaFace, which unifies face box prediction, 2D facial landmark localisation and 3D vertices regression under one common target: point regression on the image plane. To fill the data gap, we manually annotated five facial landmarks on the WIDER FACE dataset and employed a semi-automatic annotation pipeline to generate 3D vertices for face images from the WIDER FACE, AFLW and FDDB datasets. Based on extra annotations, we propose a mutually beneficial regression target for 3D face reconstruction, that is predicting 3D vertices projected on the image plane constrained by a common 3D topology. The proposed 3D face reconstruction branch can be easily incorporated, without any optimisation difficulty, in parallel with the existing box and 2D landmark regression branches during joint training. Extensive experimental results show that RetinaFace can simultaneously achieve stable face detection, accurate 2D face alignment and robust 3D face reconstruction while being efficient through single-shot inference.

683 citations

Cites background from "ArcFace: Additive Angular Margin Lo..."

...expression [64] and age [41, 39]) and facial identity recognition [18, 12, 56]....

Proceedings Article

ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification

Brecht Desplanques¹, Jenthe Thienpondt¹, Kris Demuynck¹

Ghent University¹

14 May 2020

TL;DR: The proposed ECAPA-TDNN architecture significantly outperforms state-of-the-art TDNN based systems on the VoxCeleb test sets and the 2019 VoxCeleb Speaker Recognition Challenge.

Abstract: Current speaker verification techniques rely on a neural network to extract speaker representations. The successful x-vector architecture is a Time Delay Neural Network (TDNN) that applies statistics pooling to project variable-length utterances into fixed-length speaker characterizing embeddings. In this paper, we propose multiple enhancements to this architecture based on recent trends in the related fields of face verification and computer vision. Firstly, the initial frame layers can be restructured into 1-dimensional Res2Net modules with impactful skip connections. Similarly to SE-ResNet, we introduce Squeeze-and-Excitation blocks in these modules to explicitly model channel interdependencies. The SE block expands the temporal context of the frame layer by rescaling the channels according to global properties of the recording. Secondly, neural networks are known to learn hierarchical features, with each layer operating on a different level of complexity. To leverage this complementary information, we aggregate and propagate features of different hierarchical levels. Finally, we improve the statistics pooling module with channel-dependent frame attention. This enables the network to focus on different subsets of frames during each of the channel’s statistics estimation. The proposed ECAPA-TDNN architecture significantly outperforms state-of-the-art TDNN based systems on the VoxCeleb test sets and the 2019 VoxCeleb Speaker Recognition Challenge.

617 citations

Cites background or methods from "ArcFace: Additive Angular Margin Lo..."

...All systems are trained using AAM softmax [3, 23] with a margin of 0....
...The rising popularity of the x-vector system has resulted in significant architectural improvements and optimized training procedures [3] over the original approach....