Unmasking Face Embeddings by Self-restrained Triplet Loss for Accurate Masked Face Recognition.

Posted Content•

Fadi Boutros¹, Naser Damer, Florian Kirchbuchner, Arjan Kuijper•Institutions (1)

02 Mar 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper proposes the Embedding Unmasking Model (EUM), which operates on top of existing face recognition models, and the Self-restrained Triplet (SRT) loss, which enables the EUM to produce embeddings similar to those of unmasked faces of the same identities.

Abstract: Using the face as a biometric identity trait is motivated by the contactless nature of the capture process and the high accuracy of the recognition algorithms. Since the onset of the COVID-19 pandemic, wearing a face mask has been imposed in public places to keep the pandemic under control. However, face occlusion due to wearing a mask presents an emerging challenge for face recognition systems. In this paper, we present a solution to improve masked face recognition performance. Specifically, we propose the Embedding Unmasking Model (EUM), operated on top of existing face recognition models. We also propose a novel loss function, the Self-restrained Triplet (SRT), which enables the EUM to produce embeddings similar to those of unmasked faces of the same identities. The evaluation results on three face recognition models, two real masked datasets, and two synthetically generated masked face datasets show that our proposed approach significantly improves the performance in most experimental settings.
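
For orientation, the following is a minimal sketch of the conventional triplet loss on embedding vectors, the general family of loss that SRT builds on. It is not the SRT loss itself, whose exact form is not given in this abstract. The squared-distance formulation, the margin value, and the toy 2-D embeddings are assumptions for illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Conventional triplet loss: max(d(a, p) - d(a, n) + margin, 0).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`.
    """
    return max(
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
        0.0,
    )


# Hypothetical toy embeddings (assumed values, not taken from the paper).
anchor = [0.0, 0.0]
positive = [3.0, 4.0]   # squared distance to anchor: 25
negative = [6.0, 8.0]   # squared distance to anchor: 100

satisfied = triplet_loss(anchor, positive, negative, margin=1.0)
violated = triplet_loss(anchor, negative, positive, margin=1.0)
```

In the first call the negative is already far enough away, so the loss vanishes; swapping the positive and negative roles produces a positive penalty.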

Citations

Proceedings Article•DOI•

MixFaceNets: Extremely Efficient Face Recognition Networks

Fadi Boutros¹, Naser Damer¹, Meiling Fang¹, Florian Kirchbuchner¹, Arjan Kuijper¹•Institutions (1)

Fraunhofer Society¹

04 Aug 2021-International Joint Conference on Biometrics (IJCB)

TL;DR: MixFaceNets are a set of extremely efficient, high-throughput models for accurate face verification, inspired by Mixed Depthwise Convolutional Kernels.

Abstract: In this paper, we present a set of extremely efficient and high-throughput models for accurate face verification, MixFaceNets, which are inspired by Mixed Depthwise Convolutional Kernels. Extensive experimental evaluations on the Labeled Faces in the Wild (LFW), AgeDB, MegaFace, and IARPA Janus Benchmark IJB-B and IJB-C datasets have shown the effectiveness of our MixFaceNets for applications requiring extremely low computational complexity. Under the same level of computational complexity (≤ 500M FLOPs), our MixFaceNets outperform MobileFaceNets on all the evaluated datasets, achieving 99.60% accuracy on LFW, 97.05% accuracy on AgeDB-30, 93.60 TAR (at FAR = 1e-6) on MegaFace, 90.94 TAR (at FAR = 1e-4) on IJB-B, and 93.08 TAR (at FAR = 1e-4) on IJB-C. With computational complexity between 500M and 1G FLOPs, our MixFaceNets achieved results comparable to the top-ranked models while using significantly fewer FLOPs and less computational overhead, which demonstrates the practical value of our proposed MixFaceNets. All training code, pre-trained models, and training logs are available at https://github.com/fdbtrs/mixfacenets.
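
The "TAR at FAR" figures quoted above are true accept rates measured at a decision threshold chosen so that the false accept rate does not exceed a target value. The sketch below shows one way such a number could be computed from similarity scores. The function name, the strict "score > threshold" accept rule, and the toy scores are assumptions, not the evaluation code used by the authors.

```python
def tar_at_far(genuine_scores, impostor_scores, far):
    """True accept rate at the threshold admitting at most `far` of impostors.

    Higher scores mean "more similar". A comparison is accepted when its
    score is strictly greater than the threshold.
    """
    ranked = sorted(impostor_scores, reverse=True)
    allowed = int(far * len(ranked))  # impostor accepts permitted at this FAR
    # Scores strictly above ranked[allowed] can only come from the first
    # `allowed` impostor entries, so the FAR target is never exceeded.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    accepted = sum(1 for score in genuine_scores if score > threshold)
    return accepted / len(genuine_scores)


# Hypothetical similarity scores, for illustration only.
impostor = [0.1, 0.3, 0.6, 0.8]
genuine = [0.5, 0.7, 0.9, 0.95]

tar = tar_at_far(genuine, impostor, far=0.25)
```

With four impostor scores and a target FAR of 0.25, one impostor accept is tolerated, the threshold becomes 0.6, and three of the four genuine comparisons pass.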

44 citations

Proceedings Article•DOI•

MFR 2021: Masked Face Recognition Competition

Fadi Boutros¹, Naser Damer¹, Jan Niklas Kolf¹, Kiran B. Raja², Florian Kirchbuchner¹, Raghavendra Ramachandra², Arjan Kuijper¹, Pengcheng Fang, Chao Zhang, Fei Wang, David Montero, Naiara Aginako³, Basilio Sierra³, Marcos Nieto, Mustafa Ekrem Erakin⁴, Ugur Demir⁴, Hazim Kemal Ekenel⁴, Asaki Kataoka⁵, Kohei Ichikawa⁵, Shizuma Kubo⁵, Jie Zhang⁶, Mingjie He⁶, Dan Han⁶, Shiguang Shan⁶, Klemen Grm⁷, Vitomir Struc⁷, Sachith Seneviratne, Nuran Kasthuriarachchi⁸, Sanka Rasnayaka⁹, Pedro C. Neto, Ana F. Sequeira¹⁰, Joao Ribeiro Pinto, Mohsen Saffari, Jaime S. Cardoso•Institutions (10)

Fraunhofer Society¹, Norwegian University of Science and Technology², University of the Basque Country³, Istanbul Technical University⁴, San Antonio College⁵, Chinese Academy of Sciences⁶, University of Ljubljana⁷, University of Moratuwa⁸, National University of Singapore⁹, University of Porto¹⁰

04 Aug 2021-International Joint Conference on Biometrics (IJCB)

TL;DR: The Masked Face Recognition Competition (MFR), held within the 2021 International Joint Conference on Biometrics (IJCB 2021), attracted a total of 10 participating teams with valid submissions.

Abstract: This paper presents a summary of the Masked Face Recognition Competitions (MFR) held within the 2021 International Joint Conference on Biometrics (IJCB 2021). The competition attracted a total of 10 participating teams with valid submissions. The affiliations of these teams are diverse and associated with academia and industry in nine different countries. These teams successfully submitted 18 valid solutions. The competition is designed to motivate solutions aiming at enhancing the face recognition accuracy of masked faces. Moreover, the competition considered the deployability of the proposed solutions by taking the compactness of the face recognition models into account. A private dataset representing a collaborative, multisession, real masked capture scenario is used to evaluate the submitted solutions. In comparison to one of the top-performing academic face recognition solutions, 10 out of the 18 submitted solutions achieved higher masked face verification accuracy.

37 citations

Proceedings Article•DOI•

My Eyes Are Up Here: Promoting Focus on Uncovered Regions in Masked Face Recognition

Pedro C. Neto, Fadi Boutros¹, Joao Ribeiro Pinto, Mohsen Saffari, Naser Damer¹, Ana F. Sequeira, Jaime S. Cardoso•Institutions (1)

Fraunhofer Society¹

27 Sep 2021

TL;DR: In this article, the authors proposed a methodology that combines the traditional triplet loss and the mean squared error (MSE) intending to improve the robustness of an MFR system in the masked-unmasked comparison mode.

Abstract: The recent COVID-19 pandemic and the fact that wearing masks in public is now mandatory in several countries have created challenges in the use of face recognition systems (FRS). In this work, we address the challenge of masked face recognition (MFR) and focus on evaluating the verification performance in FRS when verifying masked vs. unmasked faces compared to verifying only unmasked faces. We propose a methodology that combines the traditional triplet loss and the mean squared error (MSE), intending to improve the robustness of an MFR system in the masked-unmasked comparison mode. The results obtained by our proposed method show improvements in a detailed step-wise ablation study. The conducted study showed significant performance gains induced by our proposed training paradigm and modified triplet loss on two evaluation databases.
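
The abstract says the method combines a triplet loss with a mean squared error term but does not say how. The sketch below assumes the simplest possible combination, a weighted sum, with the MSE taken between the anchor and positive embeddings. The weighting scheme, the choice of MSE operands, and all names and values are hypothetical and should not be read as the authors' formulation.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_squared_error(a, b):
    """Mean of the element-wise squared differences."""
    return squared_distance(a, b) / len(a)


def combined_loss(anchor, positive, negative, margin=0.5, mse_weight=1.0):
    """Assumed combination: triplet hinge term plus a weighted MSE term."""
    triplet_term = max(
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
        0.0,
    )
    mse_term = mean_squared_error(anchor, positive)
    return triplet_term + mse_weight * mse_term


# Hypothetical embeddings: the triplet constraint is already satisfied,
# so only the MSE term contributes.
loss = combined_loss([1.0, 0.0], [1.0, 1.0], [1.0, 2.0])
```

One motivation for a term of this kind is that a hinge-style triplet loss stops producing gradient once the margin is met, whereas an MSE term keeps pulling matching embeddings together.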

15 citations

Posted Content•

Masked Face Recognition: Human vs. Machine

Naser Damer¹, Fadi Boutros, Marius Süßmilch, Meiling Fang, Florian Kirchbuchner, Arjan Kuijper•Institutions (1)

Fraunhofer Society¹

02 Mar 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work provides a joint evaluation and in-depth analysis of the face verification performance of human experts in comparison to state-of-the-art automatic face recognition solutions, and concludes with a set of take-home messages on the correlation between the verification behavior of humans and machines.

Abstract: The recent COVID-19 pandemic has increased the focus on hygienic and contactless identity verification methods. However, the pandemic led to the wide use of face masks, essential to keep the pandemic under control. The effect of wearing a mask on face recognition in a collaborative environment is currently a sensitive yet understudied issue. Recent reports have tackled this by evaluating the masked probe effect on the performance of automatic face recognition solutions. However, such solutions can fail in certain processes, leading to the verification task being performed by a human expert. This work provides a joint evaluation and in-depth analyses of the face verification performance of human experts in comparison to state-of-the-art automatic face recognition solutions. This involves an extensive evaluation with 12 human experts and 4 automatic recognition solutions. The study concludes with a set of take-home messages on different aspects of the correlation between the verification behavior of humans and machines.

6 citations

Posted Content•

The Effect of Wearing a Face Mask on Face Image Quality

Biying Fu¹, Florian Kirchbuchner, Naser Damer•Institutions (1)

Fraunhofer Society¹

21 Oct 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work studies the effect of wearing a face mask on face image quality, using state-of-the-art face image quality assessment methods of different natures, and relates it to the effect on face verification performance in a collaborative environment.

Abstract: Due to the COVID-19 situation, face masks have become a main part of our daily life. Wearing mouth-and-nose protection has been made a mandate in many public places, to prevent the spread of the COVID-19 virus. However, face masks affect the performance of face recognition, since a large area of the face is covered. The effect of wearing a face mask on the different components of the face recognition system in a collaborative environment is a problem that is still to be fully studied. This work studies, for the first time, the effect of wearing a face mask on face image quality by utilising state-of-the-art face image quality assessment methods of different natures. This aims at providing better understanding on the effect of face masks on the operation of face recognition as a whole system. In addition, we further studied the effect of simulated masks on face image utility in comparison to real face masks. We discuss the correlation between the mask effect on face image quality and that on the face verification performance by automatic systems and human experts, indicating a consistent trend between both factors. The evaluation is conducted on the database containing (1) no-masked faces, (2) real face masks, and (3) simulated face masks, by synthetically generating digital facial masks on no-masked faces according to the NIST protocols [1, 23]. Finally, a visual interpretation of the face areas contributing to the quality score of a selected set of quality assessment methods is provided to give a deeper insight into the difference of network decisions in masked and non-masked faces, among other variations.

3 citations

References

Proceedings Article•DOI•

Deep Residual Learning for Image Recognition

Kaiming He¹, Xiangyu Zhang¹, Shaoqing Ren¹, Jian Sun¹•Institutions (1)

Microsoft¹

27 Jun 2016

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.

Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers, which is 8× deeper than VGG nets [40] but still has lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
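
The idea of "learning residual functions with reference to the layer inputs" can be illustrated with a toy block that computes y = ReLU(F(x) + x). The sketch below uses two small dense layers for F instead of the convolutions used in the actual networks. The layer type, sizes, and weights are assumptions chosen to keep the example dependency-free.

```python
def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(value, 0.0) for value in vector]


def linear(weights, vector):
    """Matrix-vector product; `weights` is a list of rows."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def residual_block(x, w1, w2):
    """y = relu(F(x) + x), with the residual branch F(x) = W2 * relu(W1 * x).

    The identity shortcut adds the input back to the branch output, so the
    layers only have to model the difference from the identity mapping.
    """
    residual = linear(w2, relu(linear(w1, x)))
    return relu([r + xi for r, xi in zip(residual, x)])


# With an all-zero residual branch the block reduces to the identity
# (for non-negative inputs), which is the easy-to-learn default.
zeros = [[0.0, 0.0], [0.0, 0.0]]
passthrough = residual_block([1.0, 2.0], zeros, zeros)
```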

123,388 citations

Proceedings Article•

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe¹, Christian Szegedy¹•Institutions (1)

Google¹

06 Jul 2015

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.

Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.82% top-5 test error, exceeding the accuracy of human raters.
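
As a rough illustration of "normalizing layer inputs" per mini-batch, the sketch below applies the training-time batch normalization transform to a single scalar feature: subtract the batch mean, divide by the batch standard deviation, then apply a scale and shift. The default parameter values and the toy batch are assumptions. Running statistics for inference and the backward pass are deliberately left out.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift.

    `gamma` and `beta` stand in for the learned scale and shift parameters;
    `eps` guards against division by zero for constant batches.
    """
    mean = sum(batch) / len(batch)
    variance = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(variance + eps) + beta for x in batch]


# Hypothetical activations of one feature across a mini-batch of four samples.
normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
```

After the transform the feature has approximately zero mean and unit variance within the batch, regardless of the scale of the incoming activations.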

30,843 citations

Proceedings Article•DOI•

MobileNetV2: Inverted Residuals and Linear Bottlenecks

Mark Sandler¹, Andrew Howard¹, Menglong Zhu¹, Andrey Zhmoginov¹, Liang-Chieh Chen¹•Institutions (1)

Google¹

18 Jun 2018

TL;DR: MobileNetV2 is based on an inverted residual structure in which the shortcut connections are between the thin bottleneck layers, and the intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity.

Abstract: In this paper we describe a new mobile architecture, MobileNetV2, that improves the state-of-the-art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes. We also describe efficient ways of applying these mobile models to object detection in a novel framework we call SSDLite. Additionally, we demonstrate how to build mobile semantic segmentation models through a reduced form of DeepLabv3 which we call Mobile DeepLabv3. MobileNetV2 is based on an inverted residual structure where the shortcut connections are between the thin bottleneck layers. The intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity. Additionally, we find that it is important to remove non-linearities in the narrow layers in order to maintain representational power. We demonstrate that this improves performance and provide an intuition that led to this design. Finally, our approach allows decoupling of the input/output domains from the expressiveness of the transformation, which provides a convenient framework for further analysis. We measure our performance on ImageNet [1] classification, COCO object detection [2], and VOC image segmentation [3]. We evaluate the trade-offs between accuracy and the number of operations measured by multiply-adds (MAdd), as well as actual latency and the number of parameters.
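
To give a feel for why depthwise convolutions are described as lightweight, the sketch below counts multiply-adds for a standard convolution and for a depthwise convolution followed by a 1×1 pointwise convolution. It assumes stride 1 and padding that preserves the spatial size, and it does not model the full inverted residual block with its expansion layer. The layer dimensions are hypothetical.

```python
def standard_conv_madds(height, width, c_in, c_out, kernel):
    """Multiply-adds of a standard kernel x kernel convolution."""
    return height * width * c_in * c_out * kernel * kernel


def depthwise_separable_madds(height, width, c_in, c_out, kernel):
    """Multiply-adds of a depthwise convolution plus a 1x1 pointwise convolution."""
    depthwise = height * width * c_in * kernel * kernel  # one filter per channel
    pointwise = height * width * c_in * c_out            # 1x1 channel mixing
    return depthwise + pointwise


# Hypothetical layer: 14x14 feature map, 64 -> 128 channels, 3x3 kernel.
standard = standard_conv_madds(14, 14, 64, 128, 3)
separable = depthwise_separable_madds(14, 14, 64, 128, 3)
saving = standard / separable
```

For this layer the factorized form needs roughly one eighth of the multiply-adds, and the saving approaches the kernel area (9 for a 3×3 kernel) as the number of output channels grows.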

9,381 citations

Proceedings Article•DOI•

FaceNet: A unified embedding for face recognition and clustering

Florian Schroff¹, Dmitry Kalenichenko¹, James Philbin¹•Institutions (1)

Google¹

07 Jun 2015

TL;DR: A system that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity, and achieves state-of-the-art face recognition performance using only 128 bytes per face.

Abstract: Despite significant recent advances in the field of face recognition [10, 14, 15, 17], implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. In this paper we present a system, called FaceNet, that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. Once this space has been produced, tasks such as face recognition, verification and clustering can be easily implemented using standard techniques with FaceNet embeddings as feature vectors.
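
Once distances in the embedding space track face similarity, verification reduces to thresholding a distance. The sketch below normalizes two embeddings to unit length and compares their squared Euclidean distance against a threshold. The function names, the threshold value, and the 2-D vectors are hypothetical, and a real system would tune the threshold on held-out data.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def same_identity(embedding_a, embedding_b, threshold=1.0):
    """Declare a match when the squared distance between the unit-length
    embeddings falls below `threshold` (an assumed, untuned value)."""
    a = l2_normalize(embedding_a)
    b = l2_normalize(embedding_b)
    distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return distance < threshold


# Hypothetical embeddings: the first pair points in nearly the same
# direction, the second pair is orthogonal.
match = same_identity([1.0, 0.0], [2.0, 0.1])
non_match = same_identity([1.0, 0.0], [0.0, 1.0])
```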

8,289 citations

Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments

Gary B. Huang¹, Marwan Mattar¹, Tamara L. Berg², Eric Learned-Miller¹•Institutions (2)

University of Massachusetts Amherst¹, Stony Brook University²

01 Oct 2008

TL;DR: The database contains labeled face photographs spanning the range of conditions typically encountered in everyday life, and exhibits “natural” variability in factors such as pose, lighting, race, accessories, occlusions, and background.

Abstract: Most face databases have been created under controlled conditions to facilitate the study of specific parameters on the face recognition problem. These parameters include such variables as position, pose, lighting, background, camera quality, and gender. While there are many applications for face recognition technology in which one can control the parameters of image acquisition, there are also many applications in which the practitioner has little or no control over such parameters. This database, Labeled Faces in the Wild, is provided as an aid in studying the latter, unconstrained, recognition problem. The database contains labeled face photographs spanning the range of conditions typically encountered in everyday life. The database exhibits “natural” variability in factors such as pose, lighting, race, accessories, occlusions, and background. In addition to describing the details of the database, we provide specific experimental paradigms for which the database is suitable. This is done in an effort to make research performed with the database as consistent and comparable as possible. We provide baseline results, including results of a state of the art face recognition system combined with a face alignment system. To facilitate experimentation on the database, we provide several parallel databases, including an aligned version.

5,742 citations