
Showing papers on "Facial recognition system published in 2020"


Proceedings ArticleDOI
Lingzhi Li, Jianmin Bao, Ting Zhang, Hao Yang, Dong Chen, Fang Wen, Baining Guo
14 Jun 2020
TL;DR: A novel image representation called face X-ray is proposed; it assumes only the existence of a blending step, does not rely on any knowledge of the artifacts associated with a specific face manipulation technique, and can be trained without fake images generated by any state-of-the-art face manipulation method.
Abstract: In this paper we propose a novel image representation called face X-ray for detecting forgery in face images. The face X-ray of an input face image is a greyscale image that reveals whether the input image can be decomposed into the blending of two images from different sources. It does so by showing the blending boundary for a forged image and the absence of blending for a real image. We observe that most existing face manipulation methods share a common step: blending the altered face into an existing background image. For this reason, face X-ray provides an effective way for detecting forgery generated by most existing face manipulation algorithms. Face X-ray is general in the sense that it only assumes the existence of a blending step and does not rely on any knowledge of the artifacts associated with a specific face manipulation technique. Indeed, the algorithm for computing face X-ray can be trained without fake images generated by any of the state-of-the-art face manipulation methods. Extensive experiments show that face X-ray remains effective when applied to forgery generated by unseen face manipulation techniques, while most existing face forgery detection or deepfake detection algorithms experience a significant performance drop.
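To make the representation concrete, here is a minimal sketch of synthesizing a training pair, assuming my reading of the paper's definition of the face X-ray as B = 4·M·(1−M) for a soft blending mask M in [0, 1]; the images and mask below are toy data.

```python
import numpy as np

def synthesize_xray_pair(foreground, background, mask):
    """foreground/background: HxWx3 floats in [0,1]; mask: HxW floats in [0,1]."""
    m = mask[..., None]                        # broadcast the mask over channels
    blended = m * foreground + (1.0 - m) * background
    xray = 4.0 * mask * (1.0 - mask)           # peaks along the blending boundary
    return blended, xray

# toy usage: random "faces" blended under a soft circular mask
h = w = 64
yy, xx = np.mgrid[0:h, 0:w]
dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
mask = np.clip(1.0 - dist / (w / 3), 0.0, 1.0)
fake, xray = synthesize_xray_pair(np.random.rand(h, w, 3),
                                  np.random.rand(h, w, 3), mask)
```

A real (unblended) image yields an all-zero X-ray, which is why such a detector can be trained from real images alone.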

479 citations


Journal ArticleDOI
TL;DR: Introduces a very large-scale audio-visual dataset collected from open-source media using a fully automated pipeline, and develops and compares different CNN architectures with various aggregation methods and training loss functions that can effectively recognise identities from voice under various conditions.

443 citations


Posted Content
TL;DR: Proposes three masked face datasets, of which RMFRD is currently the world's largest real-world masked face dataset, and develops a multi-granularity masked face recognition model that achieves 95% accuracy, exceeding the results reported by industry.
Abstract: In order to effectively prevent the spread of COVID-19 virus, almost everyone wears a mask during the coronavirus epidemic. This almost makes conventional facial recognition technology ineffective in many cases, such as community access control, face access control, facial attendance, facial security checks at train stations, etc. Therefore, it is very urgent to improve the recognition performance of the existing face recognition technology on the masked faces. Most current advanced face recognition approaches are designed based on deep learning, which depend on a large number of face samples. However, at present, there are no publicly available masked face recognition datasets. To this end, this work proposes three types of masked face datasets, including Masked Face Detection Dataset (MFDD), Real-world Masked Face Recognition Dataset (RMFRD) and Simulated Masked Face Recognition Dataset (SMFRD). Among them, to the best of our knowledge, RMFRD is currently the world's largest real-world masked face dataset. These datasets are freely available to industry and academia, based on which various applications on masked faces can be developed. The multi-granularity masked face recognition model we developed achieves 95% accuracy, exceeding the results reported by the industry. Our datasets are available at: this https URL.

277 citations


Journal ArticleDOI
11 Aug 2020-Sensors
TL;DR: This work evaluates the speed–accuracy tradeoff of three popular deep learning-based face detectors on the WIDER Face and UFDD data sets on several CPUs and GPUs, and develops a regression model capable of estimating performance in terms of both processing time and accuracy.
Abstract: Face recognition is a valuable forensic tool for criminal investigators since it certainly helps in identifying individuals in scenarios of criminal activity like fugitives or child sexual abuse. It is, however, a very challenging task as it must be able to handle low-quality images of real world settings and fulfill real time requirements. Deep learning approaches for face detection have proven to be very successful but they require large computation power and processing time. In this work, we evaluate the speed-accuracy tradeoff of three popular deep-learning-based face detectors on the WIDER Face and UFDD data sets on several CPUs and GPUs. We also develop a regression model capable of estimating the performance, in terms of both processing time and accuracy. We expect this to become a very useful tool for end users in forensic laboratories in order to estimate the performance of different face detection options. Experimental results showed that the best speed-accuracy tradeoff is achieved with images resized to 50% of the original size on GPUs and images resized to 25% of the original size on CPUs. Moreover, performance can be estimated using multiple linear regression models with a Mean Absolute Error (MAE) of 0.113, which is very promising for the forensic field.
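The regression idea is simple to make concrete: fit processing time (or accuracy) against factors such as the resize factor and the hardware class. The features and toy numbers below are illustrative assumptions, not the paper's measurements.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# features: [resize_factor, is_gpu]; target: processing time per image (seconds)
X = np.array([[1.00, 1], [0.50, 1], [0.25, 1],
              [1.00, 0], [0.50, 0], [0.25, 0]], dtype=float)
y = np.array([0.120, 0.045, 0.020, 1.900, 0.600, 0.180])

model = LinearRegression().fit(X, y)
est = model.predict([[0.5, 0]])[0]        # estimated CPU time at 50% resize
mae = np.mean(np.abs(model.predict(X) - y))
print(f"estimated time: {est:.3f}s  (fit MAE: {mae:.3f})")
```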

267 citations


Journal ArticleDOI
07 Jan 2020-Sensors
TL;DR: This survey reviews well-known techniques for each approach, gives a taxonomy of their categories, and offers a solid discussion of future directions in terms of techniques to be used for face recognition.
Abstract: Over the past few decades, interest in theories and algorithms for face recognition has been growing rapidly. Video surveillance, criminal identification, building access control, and unmanned and autonomous vehicles are just a few examples of concrete applications that are gaining traction in industry. Various techniques are being developed, including local, holistic, and hybrid approaches, which describe a face image using only a few of its features or the whole face. The main contribution of this survey is to review some well-known techniques for each approach and to give a taxonomy of their categories. A detailed comparison between these techniques is presented by listing the advantages and the disadvantages of their schemes in terms of robustness, accuracy, complexity, and discrimination. The paper also examines the databases used for face recognition, giving an overview of the most commonly used ones, including those for supervised and unsupervised learning. Numerical results of the most interesting techniques are given, along with the context of the experiments and the challenges handled by these techniques. Finally, the paper offers a solid discussion about future directions in terms of techniques to be used for face recognition.

257 citations


Proceedings ArticleDOI
14 Jun 2020
TL;DR: Wang et al. propose a Self-Cure Network that suppresses annotation uncertainty via a self-attention mechanism that weights each training sample under a ranking regularization, together with a careful relabeling mechanism that modifies the labels of samples in the lowest-ranked group.
Abstract: Annotating a qualitative large-scale facial expression dataset is extremely difficult due to the uncertainties caused by ambiguous facial expressions, low-quality facial images, and the subjectiveness of annotators. These uncertainties suspend the progress of large-scale Facial Expression Recognition (FER) in the data-driven deep learning era. To address this problem, this paper proposes to suppress the uncertainties with a simple yet efficient Self-Cure Network (SCN). Specifically, SCN suppresses the uncertainty from two different aspects: 1) a self-attention mechanism over the FER dataset that weights each training sample with a ranking regularization, and 2) a careful relabeling mechanism that modifies the labels of samples in the lowest-ranked group. Experiments on synthetic FER datasets and our collected WebEmotion dataset validate the effectiveness of our method. Results on public benchmarks demonstrate that our SCN outperforms current state-of-the-art methods with 88.14% on RAF-DB, 60.23% on AffectNet, and 89.35% on FERPlus.
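The ranking-regularization component can be sketched compactly. Under one reading of SCN, the per-sample attention weights are sorted, split into a high- and a low-importance group, and a margin is enforced between the group means; the split ratio and margin below are assumed hyperparameters.

```python
import torch

def rank_regularization(weights, high_ratio=0.7, margin=0.15):
    """weights: (B,) self-attention importance scores for one mini-batch."""
    sorted_w, _ = torch.sort(weights, descending=True)
    k = int(high_ratio * weights.numel())
    mean_high, mean_low = sorted_w[:k].mean(), sorted_w[k:].mean()
    # push the high-importance group's mean above the low group's by a margin
    return torch.clamp(margin - (mean_high - mean_low), min=0.0)

loss_rr = rank_regularization(torch.rand(32))
```

Samples landing in the low group are then candidates for the relabeling step described above.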

220 citations


Proceedings ArticleDOI
14 Jun 2020
TL;DR: Two learning methods are proposed that are easy to use and outperform existing deterministic methods as well as PFE on challenging unconstrained scenarios, along with analysis of how uncertainty estimation helps reduce the adverse effects of noisy samples and affects feature learning.
Abstract: Modeling data uncertainty is important for noisy images, but seldom explored for face recognition. The pioneering work, PFE, considers uncertainty by modeling each face image embedding as a Gaussian distribution. It is quite effective. However, it uses a fixed feature (the mean of the Gaussian) from an existing model, only estimates the variance, and relies on an ad-hoc and costly metric. Thus, it is not easy to use, and it is unclear how uncertainty affects feature learning. This work applies data uncertainty learning to face recognition such that the feature (mean) and uncertainty (variance) are learnt simultaneously, for the first time. Two learning methods are proposed. They are easy to use and outperform existing deterministic methods as well as PFE on challenging unconstrained scenarios. We also provide insightful analysis on how incorporating uncertainty estimation helps reduce the adverse effects of noisy samples and affects feature learning.
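A minimal sketch of the idea, assuming a reparameterization-trick sampler and a KL-style regularizer on the predicted variance (the layer sizes and the exact regularizer are assumptions; the paper's two methods differ in detail):

```python
import torch
import torch.nn as nn

class UncertainEmbedding(nn.Module):
    def __init__(self, feat_dim=512, emb_dim=128):
        super().__init__()
        self.mu_head = nn.Linear(feat_dim, emb_dim)      # identity feature (mean)
        self.logvar_head = nn.Linear(feat_dim, emb_dim)  # data uncertainty (variance)

    def forward(self, x):
        mu, logvar = self.mu_head(x), self.logvar_head(x)
        std = torch.exp(0.5 * logvar)
        z = mu + torch.randn_like(std) * std             # sampled embedding
        # KL(N(mu, std) || N(0, I)), averaged over the batch
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
        return z, kl

z, kl = UncertainEmbedding()(torch.randn(8, 512))
```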

205 citations


Journal ArticleDOI
TL;DR: A dual-path network with a novel bi-directional dual-constrained top-ranking (BDTR) loss to learn discriminative feature representations; extensive experiments on two cross-modality re-ID datasets demonstrate the superiority of the proposed method over state-of-the-art methods.
Abstract: Visible thermal person re-identification (VT-REID) is the task of matching person images captured by thermal and visible cameras, which is an extremely important issue in night-time surveillance applications. Existing cross-modality recognition works mainly focus on learning sharable feature representations to handle the cross-modality discrepancies. However, apart from the cross-modality discrepancy caused by different camera spectra, VT-REID also suffers from large cross-modality and intra-modality variations caused by different camera environments, human poses, and so on. In this paper, we propose a dual-path network with a novel bi-directional dual-constrained top-ranking (BDTR) loss to learn discriminative feature representations. It is featured in two aspects: 1) end-to-end learning without an extra metric learning step, and 2) the dual constraint simultaneously handles the cross-modality and intra-modality variations to ensure feature discriminability. Meanwhile, a bi-directional center-constrained top-ranking (eBDTR) is proposed to incorporate the previous two constraints into a single formula, which preserves the properties to handle both cross-modality and intra-modality variations. Extensive experiments on two cross-modality re-ID datasets demonstrate the superiority of the proposed method compared to state-of-the-art methods.
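A hedged sketch of the bi-directional cross-modality top-ranking term (the intra-modality constraints of the full BDTR loss are omitted for brevity, and the margin is illustrative):

```python
import torch

def bidirectional_top_ranking(feat_v, feat_t, labels_v, labels_t, margin=0.5):
    """feat_v/feat_t: (B, D) visible/thermal features; labels: (B,) identity ids."""
    d = torch.cdist(feat_v, feat_t)                  # cross-modality distance matrix
    same = labels_v[:, None].eq(labels_t[None, :])
    # visible -> thermal: furthest positive must beat closest negative by a margin
    pos_v = d.masked_fill(~same, float("-inf")).max(dim=1).values
    neg_v = d.masked_fill(same, float("inf")).min(dim=1).values
    # thermal -> visible: the same constraint in the reverse direction
    pos_t = d.masked_fill(~same, float("-inf")).max(dim=0).values
    neg_t = d.masked_fill(same, float("inf")).min(dim=0).values
    return (torch.clamp(margin + pos_v - neg_v, min=0).mean()
            + torch.clamp(margin + pos_t - neg_t, min=0).mean())

loss = bidirectional_top_ranking(torch.randn(16, 128), torch.randn(16, 128),
                                 torch.randint(0, 4, (16,)), torch.randint(0, 4, (16,)))
```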

171 citations


Journal ArticleDOI
TL;DR: The history of face recognition technology, the current state-of-the-art methodologies, and future directions are presented, concentrating specifically on the most recent databases and on 2D and 3D face recognition methods.
Abstract: Face recognition is one of the most active research fields of computer vision and pattern recognition, with many practical and commercial applications including identification, access control, forensics, and human-computer interaction. However, identifying a face in a crowd raises serious questions about individual freedoms and poses ethical issues. Significant methods, algorithms, approaches, and databases have been proposed over recent years to study constrained and unconstrained face recognition. 2D approaches have reached some degree of maturity and report very high recognition rates. This performance is achieved in controlled environments where the acquisition parameters, such as lighting, angle of view, and camera–subject distance, are controlled. However, if the ambient conditions (e.g., lighting) or the facial appearance (e.g., pose or facial expression) change, this performance degrades dramatically. 3D approaches were proposed as an alternative solution to the problems mentioned above. The advantage of 3D data lies in its invariance to pose and lighting conditions, which has enhanced the efficiency of recognition systems. 3D data, however, is somewhat sensitive to changes in facial expression. This review presents the history of face recognition technology, the current state-of-the-art methodologies, and future directions. We concentrate specifically on the most recent databases and on 2D and 3D face recognition methods. In addition, we pay particular attention to deep learning approaches, as they represent the current state of this field. Open issues are examined, and potential directions for research in facial recognition are proposed in order to provide the reader with a point of reference for topics that deserve consideration.

155 citations


Journal ArticleDOI
14 May 2020-Sensors
TL;DR: A new facemask-wearing condition identification method by combining image super-resolution and classification networks (SRCNet), which quantifies a three-category classification problem based on unconstrained 2D facial images, thus having potential applications in epidemic prevention involving COVID-19.
Abstract: The rapid worldwide spread of Coronavirus Disease 2019 (COVID-19) has resulted in a global pandemic. Correct facemask wearing is valuable for infectious disease control, but the effectiveness of facemasks has been diminished, mostly due to improper wearing. However, there have not been any published reports on the automatic identification of facemask-wearing conditions. In this study, we develop a new facemask-wearing condition identification method by combining image super-resolution and classification networks (SRCNet), which quantifies a three-category classification problem based on unconstrained 2D facial images. The proposed algorithm contains four main steps: Image pre-processing, facial detection and cropping, image super-resolution, and facemask-wearing condition identification. Our method was trained and evaluated on the public dataset Medical Masks Dataset containing 3835 images with 671 images of no facemask-wearing, 134 images of incorrect facemask-wearing, and 3030 images of correct facemask-wearing. Finally, the proposed SRCNet achieved 98.70% accuracy and outperformed traditional end-to-end image classification methods using deep learning without image super-resolution by over 1.5% in kappa. Our findings indicate that the proposed SRCNet can achieve high-accuracy identification of facemask-wearing conditions, thus having potential applications in epidemic prevention involving COVID-19.
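The four-stage pipeline reads naturally as function composition. Below is a minimal sketch using OpenCV's stock Haar cascade as the detector; `sr_model` and `cls_model` are hypothetical stand-ins for the paper's super-resolution and classification networks, not its actual API.

```python
import cv2
import numpy as np

face_det = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def classify_facemask_wearing(bgr_image, sr_model, cls_model):
    gray = cv2.equalizeHist(cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY))  # pre-process
    faces = face_det.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    results = []
    for (x, y, w, h) in faces:                   # detect and crop each face
        crop = bgr_image[y:y + h, x:x + w]
        hi_res = sr_model(crop)                  # image super-resolution
        label = cls_model(hi_res)                # none / incorrect / correct wearing
        results.append(((x, y, w, h), label))
    return results

# toy usage with identity stand-ins for the two networks
img = (np.random.rand(160, 160, 3) * 255).astype("uint8")
print(classify_facemask_wearing(img, sr_model=lambda c: c, cls_model=lambda c: "correct"))
```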

150 citations


Proceedings ArticleDOI
14 Jun 2020
TL;DR: A new approach to detect presentation attacks from multiple frames based on two insights, able to capture discriminative details via a Residual Spatial Gradient Block (RSGB) and to efficiently encode spatio-temporal information via a Spatio-Temporal Propagation Module (STPM).
Abstract: Face anti-spoofing is critical to the security of face recognition systems. Depth supervised learning has proven to be one of the most effective methods for face anti-spoofing. Despite the great success, most previous works still formulate the problem as a single-frame multi-task one by simply augmenting the loss with depth, while neglecting the detailed fine-grained information and the interplay between facial depths and moving patterns. In contrast, we design a new approach to detect presentation attacks from multiple frames based on two insights: 1) detailed discriminative clues (e.g., spatial gradient magnitude) between living and spoofing faces may be discarded through stacked vanilla convolutions, and 2) the dynamics of 3D moving faces provide important clues in detecting the spoofing faces. The proposed method is able to capture discriminative details via a Residual Spatial Gradient Block (RSGB) and efficiently encode spatio-temporal information via a Spatio-Temporal Propagation Module (STPM). Moreover, a novel Contrastive Depth Loss is presented for more accurate depth supervision. To assess the efficacy of our method, we also collect a Double-modal Anti-spoofing Dataset (DMAD) which provides actual depth for each sample. The experiments demonstrate that the proposed approach achieves state-of-the-art results on five benchmark datasets including OULU-NPU, SiW, CASIA-MFSD, Replay-Attack, and the new DMAD. Codes will be available at https://github.com/clks-wzz/FAS-SGTD.
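The "spatial gradient magnitude" clue the authors name is easy to make concrete; a minimal Sobel-based sketch (the RSGB itself learns residual gradient features, which this does not attempt to reproduce):

```python
import cv2
import numpy as np

def spatial_gradient_magnitude(gray):
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)   # horizontal gradients
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)   # vertical gradients
    return np.sqrt(gx ** 2 + gy ** 2)                 # fine-grained texture detail

mag = spatial_gradient_magnitude(np.random.randint(0, 256, (112, 112), np.uint8))
```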

Journal ArticleDOI
TL;DR: Experimental results validate the efficiency of the proposed method in accurate detection of faces compared to state-of-the-art face detection and recognition methods, and verify its effectiveness for enhancing law-enforcement services in smart cities.

Proceedings ArticleDOI
14 Jun 2020
TL;DR: This paper proposes a novel face recognition method via meta-learning named Meta Face Recognition (MFR), which synthesizes the source/target domain shift with a meta-optimization objective that requires the model to learn effective representations not only on synthesized source domains but also on synthesized target domains.
Abstract: Face recognition systems are usually faced with unseen domains in real-world applications and show unsatisfactory performance due to their poor generalization. For example, a model well trained on webface data cannot deal with the ID vs. Spot task in a surveillance scenario. In this paper, we aim to learn a generalized model that can directly handle new unseen domains without any model updating. To this end, we propose a novel face recognition method via meta-learning named Meta Face Recognition (MFR). MFR synthesizes the source/target domain shift with a meta-optimization objective, which requires the model to learn effective representations not only on synthesized source domains but also on synthesized target domains. Specifically, we build domain-shift batches through a domain-level sampling strategy and obtain back-propagated gradients/meta-gradients on synthesized source/target domains by optimizing multi-domain distributions. The gradients and meta-gradients are further combined to update the model and improve generalization. In addition, we propose two benchmarks for generalized face recognition evaluation. Experiments on our benchmarks validate the generalization of our method compared to several baselines and other state-of-the-art methods. The proposed benchmarks and code will be available at https://github.com/cleardusk/MFR.
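A first-order sketch of the meta-optimization idea: update a copy of the model on synthesized source domains, then require the updated copy to also do well on a synthesized target domain. The real MFR combines gradients and meta-gradients more carefully; the losses, learning rates, and toy model below are assumptions.

```python
import copy
import torch

def meta_step(model, src_batches, tgt_batch, loss_fn, inner_lr=0.01, outer_lr=0.001):
    fast = copy.deepcopy(model)
    inner = torch.optim.SGD(fast.parameters(), lr=inner_lr)
    for x, y in src_batches:                     # meta-train on synthesized sources
        inner.zero_grad()
        loss_fn(fast(x), y).backward()
        inner.step()
    fast.zero_grad()
    x_t, y_t = tgt_batch                         # meta-test on a synthesized target
    meta_loss = loss_fn(fast(x_t), y_t)
    meta_loss.backward()
    with torch.no_grad():                        # first-order update of the base model
        for p, fp in zip(model.parameters(), fast.parameters()):
            if fp.grad is not None:
                p -= outer_lr * fp.grad
    return float(meta_loss)

model = torch.nn.Linear(8, 4)                    # toy "recognizer"
batch = (torch.randn(16, 8), torch.randint(0, 4, (16,)))
meta_step(model, [batch], batch, torch.nn.functional.cross_entropy)
```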

Journal ArticleDOI
TL;DR: A deep regression network termed DepressNet is presented to learn a depression representation with visual explanation; the depression activation map (DAM) induced by the learned deep model may help reveal visual depression patterns on faces and provide insight into automated depression diagnosis.
Abstract: Recent evidence in mental health assessment has demonstrated that facial appearance could be highly indicative of depressive disorder. While previous methods based on facial analysis promise to advance clinical diagnosis of depressive disorder in a more efficient and objective manner, challenges in the visual representation of complex depression patterns prevent widespread practice of automated depression diagnosis. In this paper, we present a deep regression network termed DepressNet to learn a depression representation with visual explanation. Specifically, a deep convolutional neural network equipped with a global average pooling layer is first trained with facial depression data, which allows for identifying salient regions of the input image in terms of its severity score based on the generated depression activation map (DAM). We then propose a multi-region DepressNet, with which multiple local deep regression models for different face regions are jointly learned and their responses are fused to improve the overall recognition performance. We evaluate our method on two benchmark datasets, and the results show that our method significantly boosts state-of-the-art performance of visual-based depression recognition. Most importantly, the DAM induced by our learned deep model may help reveal the visual depression pattern on faces and understand the insights of automated depression diagnosis.
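The depression activation map is CAM-style: the last convolutional feature maps are weighted by the regression head that follows global average pooling. A minimal sketch with illustrative shapes; the real DepressNet details may differ.

```python
import torch

def depression_activation_map(conv_feats, head_weight):
    """conv_feats: (C, H, W) last-conv features; head_weight: (C,) regressor weights."""
    dam = torch.einsum("chw,c->hw", conv_feats, head_weight)
    dam -= dam.min()
    return dam / (dam.max() + 1e-8)      # normalized saliency over face regions

dam = depression_activation_map(torch.rand(512, 7, 7), torch.rand(512))
```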

Journal ArticleDOI
TL;DR: It is argued that school-based facial recognition presents a number of social challenges and concerns that merit specific attention, including the likelihood of facial recognition technology altering the nature of schools and schooling along divisive, authoritarian and oppressive lines.
Abstract: Facial recognition technology is now being introduced across various aspects of public life. This includes the burgeoning integration of facial recognition and facial detection into compuls...

Proceedings ArticleDOI
14 Jun 2020
TL;DR: This work proposes a novel approach named Label Distribution Learning on Auxiliary Label Space Graphs (LDL-ALSG) that leverages the topological information of the labels from related but more distinct tasks, such as action unit recognition and facial landmark detection.
Abstract: Many existing studies reveal that annotation inconsistency widely exists among a variety of facial expression recognition (FER) datasets. The reason might be the subjectivity of human annotators and the ambiguous nature of the expression labels. One promising strategy for tackling this problem is a recently proposed learning paradigm called Label Distribution Learning (LDL), which allows multiple labels with different intensities to be linked to one expression. However, it is often impractical to apply label distribution learning directly because numerous existing datasets only contain one-hot labels rather than label distributions. To solve the problem, we propose a novel approach named Label Distribution Learning on Auxiliary Label Space Graphs (LDL-ALSG) that leverages the topological information of the labels from related but more distinct tasks, such as action unit recognition and facial landmark detection. The underlying assumption is that facial images should have similar expression distributions to their neighbours in the label space of action unit recognition and facial landmark detection. Our proposed method is evaluated on a variety of datasets and consistently outperforms state-of-the-art methods by a large margin.
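One simplified reading of that assumption: derive a label distribution for each image by averaging the one-hot labels of its nearest neighbours in an auxiliary feature space (e.g., action-unit features). A minimal sketch with toy data; the actual LDL-ALSG graph construction is more involved.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def soft_labels_from_auxiliary_space(aux_feats, onehot, k=5):
    """aux_feats: (N, D) auxiliary-task features; onehot: (N, C) one-hot labels."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(aux_feats)
    _, idx = nn.kneighbors(aux_feats)      # idx[:, 0] is the sample itself
    return onehot[idx].mean(axis=1)        # average over self + k neighbours

N, C = 100, 7
dist = soft_labels_from_auxiliary_space(np.random.rand(N, 16),
                                        np.eye(C)[np.random.randint(0, C, N)])
```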

Proceedings Article
01 Jan 2020
TL;DR: Fawkes is a system that helps individuals inoculate their images against unauthorized facial recognition models by helping users add imperceptible pixel-level changes to their own photos before releasing them, and is robust against a variety of countermeasures that try to detect or disrupt image cloaks.
Abstract: Today's proliferation of powerful facial recognition systems poses a real threat to personal privacy. As this http URL demonstrated, anyone can canvass the Internet for data and train highly accurate facial recognition models of individuals without their knowledge. We need tools to protect ourselves from potential misuses of unauthorized facial recognition systems. Unfortunately, no practical or effective solutions exist. In this paper, we propose Fawkes, a system that helps individuals inoculate their images against unauthorized facial recognition models. Fawkes achieves this by helping users add imperceptible pixel-level changes (we call them "cloaks") to their own photos before releasing them. When used to train facial recognition models, these "cloaked" images produce functional models that consistently cause normal images of the user to be misidentified. We experimentally demonstrate that Fawkes provides 95+% protection against user recognition regardless of how trackers train their models. Even when clean, uncloaked images are "leaked" to the tracker and used for training, Fawkes can still maintain an 80+% protection success rate. We achieve 100% success in experiments against today's state-of-the-art facial recognition services. Finally, we show that Fawkes is robust against a variety of countermeasures that try to detect or disrupt image cloaks.
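Feature-space cloaking of this kind can be sketched as a small constrained optimization: nudge the photo so a feature extractor maps it near a different identity while bounding the pixel change. The extractor, step count, and budget below are illustrative assumptions, not the Fawkes implementation.

```python
import torch

def cloak(image, target_feat, extractor, steps=50, lr=0.01, budget=0.03):
    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.norm(extractor(image + delta) - target_feat)  # move towards target
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-budget, budget)    # keep the change imperceptible
    return (image + delta).detach()

# toy usage with a linear stand-in for a face feature extractor
extractor = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 128))
cloaked = cloak(torch.rand(1, 3, 32, 32), torch.randn(1, 128), extractor)
```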

Journal ArticleDOI
TL;DR: This paper proposes a two-stream convolutional neural network (TSCNN), which works on two complementary spaces: RGB space (original imaging space) and multi-scale retinex (MSR) space (illumination-invariant space), and proposes an attention-based fusion method, which can effectively capture the complementarity of two features.
Abstract: Since the human face preserves the richest information for recognizing individuals, face recognition has been widely investigated and achieved great success in various applications in the past decades. However, face spoofing attacks (e.g., face video replay attack) remain a threat to modern face recognition systems. Though many effective methods have been proposed for anti-spoofing, we find that the performance of many existing methods is degraded by illuminations. It motivates us to develop illumination-invariant methods for anti-spoofing. In this paper, we propose a two-stream convolutional neural network (TSCNN), which works on two complementary spaces: RGB space (original imaging space) and multi-scale retinex (MSR) space (illumination-invariant space). Specifically, the RGB space contains the detailed facial textures, yet it is sensitive to illumination; MSR is invariant to illumination, yet it contains less detailed facial information. In addition, the MSR images can effectively capture the high-frequency information, which is discriminative for face spoofing detection. Images from two spaces are fed to the TSCNN to learn the discriminative features for anti-spoofing. To effectively fuse the features from two sources (RGB and MSR), we propose an attention-based fusion method, which can effectively capture the complementarity of two features. We evaluate the proposed framework on various databases, i.e., CASIA-FASD, REPLAY-ATTACK, and OULU, and achieve very competitive performance. To further verify the generalization capacity of the proposed strategies, we conduct cross-database experiments, and the results show the great effectiveness of our method.
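The MSR stream is straightforward to compute: subtract a log-blurred image from the log image at several Gaussian scales and average. The scales below follow common retinex practice and are assumptions, not necessarily the paper's exact settings.

```python
import cv2
import numpy as np

def multi_scale_retinex(gray, sigmas=(15, 80, 250)):
    img = gray.astype(np.float32) + 1.0          # avoid log(0)
    msr = np.zeros_like(img)
    for s in sigmas:                             # average single-scale retinex maps
        blur = cv2.GaussianBlur(img, (0, 0), s)
        msr += np.log(img) - np.log(blur)
    return msr / len(sigmas)

msr = multi_scale_retinex(np.random.randint(0, 256, (112, 112), np.uint8))
```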

Proceedings ArticleDOI
14 Jun 2020
TL;DR: This work proposes a universal representation learning framework that can deal with larger variation unseen in the given training data without leveraging target domain knowledge.
Abstract: Recognizing wild faces is extremely hard as they appear with all kinds of variations. Traditional methods either train with specifically annotated variation data from target domains, or by introducing unlabeled target variation data to adapt from the training data. Instead, we propose a universal representation learning framework that can deal with larger variation unseen in the given training data without leveraging target domain knowledge. We firstly synthesize training data alongside some semantically meaningful variations, such as low resolution, occlusion and head pose. However, directly feeding the augmented data for training will not converge well as the newly introduced samples are mostly hard examples. We propose to split the feature embedding into multiple sub-embeddings, and associate different confidence values for each sub-embedding to smooth the training procedure. The sub-embeddings are further decorrelated by regularizing variation classification loss and variation adversarial loss on different partitions of them. Experiments show that our method achieves top performance on general face recognition datasets such as LFW and MegaFace, while significantly better on extreme benchmarks such as TinyFace and IJB-S.
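A hedged sketch of the sub-embedding idea at matching time: split each embedding into k chunks and aggregate per-chunk cosine similarities with the associated confidences. The weighting rule here is my illustrative reading, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def subembedding_score(emb_a, emb_b, conf_a, conf_b, k=4):
    """emb: (D,) embeddings; conf: (k,) confidences; D must be divisible by k."""
    a, b = emb_a.view(k, -1), emb_b.view(k, -1)
    cos = F.cosine_similarity(a, b, dim=1)       # per-sub-embedding similarity
    w = conf_a * conf_b
    return (w * cos).sum() / w.sum()

s = subembedding_score(torch.randn(512), torch.randn(512), torch.rand(4), torch.rand(4))
```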

Journal ArticleDOI
TL;DR: The objective here is to explore the possibility of identifying diseases from uncontrolled 2D face images by deep learning techniques by using deep transfer learning from face recognition to perform the computer-aided facial diagnosis on various diseases.
Abstract: The relationship between face and disease has been discussed for thousands of years, which led to the emergence of facial diagnosis. The objective here is to explore the possibility of identifying diseases from uncontrolled 2D face images by deep learning techniques. In this paper, we propose using deep transfer learning from face recognition to perform the computer-aided facial diagnosis on various diseases. In the experiments, we perform the computer-aided facial diagnosis on single (beta-thalassemia) and multiple diseases (beta-thalassemia, hyperthyroidism, Down syndrome, and leprosy) with a relatively small dataset. The overall top-1 accuracy by deep transfer learning from face recognition can reach over 90%, which outperforms the performance of both traditional machine learning methods and clinicians in the experiments. In practice, collecting disease-specific face images is complex, expensive and time consuming, and imposes ethical limitations due to personal data treatment. Therefore, the datasets of facial diagnosis related research are private and generally small compared with those of other machine learning application areas. The success of deep transfer learning applications in facial diagnosis with a small dataset could provide a low-cost and noninvasive way for disease screening and detection.
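The transfer recipe itself is compact: reuse a pretrained backbone and retrain only a small disease-classification head. The paper transfers from face-recognition weights; the ImageNet ResNet below is a runnable stand-in, and the head size is an assumption.

```python
import torch.nn as nn
from torchvision import models

def build_facial_diagnosis_model(num_diseases=4):
    backbone = models.resnet50(weights="IMAGENET1K_V2")  # stand-in pretrained weights
    for p in backbone.parameters():
        p.requires_grad = False                  # freeze the transferred features
    backbone.fc = nn.Linear(backbone.fc.in_features, num_diseases)  # new head
    return backbone
```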

Journal ArticleDOI
TL;DR: Reviews face recognition technology from multiple perspectives, introduces the general evaluation standards and common databases of face recognition, and argues that face recognition remains a key future development direction with many potential applications.
Abstract: Face recognition technology is a biometric technology based on the identification of the facial features of a person. Face images are collected, and recognition equipment automatically processes the images. The paper introduces related research on face recognition from different perspectives: it describes the development stages and the related technologies of face recognition, introduces research on face recognition under real conditions, and presents the general evaluation standards and the common databases of face recognition. Finally, we give a forward-looking view: face recognition remains a key future development direction with many potential applications.

Journal ArticleDOI
TL;DR: The Tufts Face Database is introduced that includes images acquired in various modalities: photograph images, thermal images, near infrared images, a recorded video, a computerized facial sketch, and 3D images of each volunteer's face.
Abstract: Cross-modality face recognition is an emerging topic due to the wide-spread usage of different sensors in day-to-day life applications. The development of face recognition systems relies greatly on existing databases for evaluation and obtaining training examples for data-hungry machine learning algorithms. However, currently, there is no publicly available face database that includes more than two modalities for the same subject. In this work, we introduce the Tufts Face Database that includes images acquired in various modalities: photograph images, thermal images, near infrared images, a recorded video, a computerized facial sketch, and 3D images of each volunteer's face. An Institutional Research Board protocol was obtained and images were collected from students, staff, faculty, and their family members at Tufts University. The database includes over 10,000 images from 113 individuals from more than 15 different countries, various gender identities, ages, and ethnic backgrounds. The contributions of this work are: 1) Detailed description of the content and acquisition procedure for images in the Tufts Face Database; 2) The Tufts Face Database is publicly available to researchers worldwide, which will allow assessment and creation of more robust, consistent, and adaptable recognition algorithms; 3) A comprehensive, up-to-date review on face recognition systems and face datasets.

Journal ArticleDOI
18 Feb 2020
TL;DR: Using two different deep convolutional neural network face matchers, it is shown that for a fixed decision threshold, the African-American image cohort has a higher false match rate (FMR), and the Caucasian cohort has a higher false non-match rate.
Abstract: Face recognition technology has recently become controversial over concerns about possible bias due to accuracy varying based on race or skin tone. We explore three important aspects of face recognition technology related to this controversy. Using two different deep convolutional neural network face matchers, we show that for a fixed decision threshold, the African-American image cohort has a higher false match rate (FMR), and the Caucasian cohort has a higher false nonmatch rate. We present an analysis of the impostor distribution designed to test the premise that darker skin tone causes a higher FMR, and find no clear evidence to support this premise. Finally, we explore how using face recognition for one-to-many identification can have a very low false-negative identification rate and still present concerns related to the false-positive identification rate. Both the ArcFace and VGGFace2 matchers and the MORPH dataset used in our experiments are available to the research community so that others should be able to reproduce or reanalyze our results.
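The fixed-threshold analysis is easy to reproduce in outline: compute the false match rate and false non-match rate per cohort from impostor and genuine similarity scores at one shared threshold. The score distributions below are synthetic.

```python
import numpy as np

def fmr_fnmr(genuine_scores, impostor_scores, threshold):
    fmr = float(np.mean(np.asarray(impostor_scores) >= threshold))   # false matches
    fnmr = float(np.mean(np.asarray(genuine_scores) < threshold))    # false non-matches
    return fmr, fnmr

rng = np.random.default_rng(0)
for cohort, shift in [("cohort_A", 0.00), ("cohort_B", 0.05)]:
    fmr, fnmr = fmr_fnmr(rng.normal(0.7, 0.1, 1000),
                         rng.normal(0.3 + shift, 0.1, 10000), threshold=0.55)
    print(cohort, f"FMR={fmr:.4f} FNMR={fnmr:.4f}")
```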

Posted Content
TL;DR: This paper addresses a methodology to use current facial datasets by augmenting them with tools that enable masked faces to be recognized with low false-positive rates and high overall accuracy, without requiring the user dataset to be recreated by taking new pictures for authentication.
Abstract: With the recent worldwide COVID-19 pandemic, wearing face masks has become an important part of our lives. People are encouraged to cover their faces when in public areas to avoid the spread of infection. The use of these face masks has raised serious questions about the accuracy of the facial recognition systems used for tracking school/office attendance and for unlocking phones. Many organizations use facial recognition as a means of authentication and have already developed the necessary datasets in-house to be able to deploy such a system. Unfortunately, masked faces are difficult to detect and recognize, threatening to make these in-house datasets invalid and such facial recognition systems inoperable. This paper addresses a methodology for using current facial datasets by augmenting them with tools that enable masked faces to be recognized with low false-positive rates and high overall accuracy, without requiring the user dataset to be recreated by taking new pictures for authentication. We present an open-source tool, MaskTheFace, to mask faces effectively, creating a large dataset of masked faces. The dataset generated with this tool is then used towards training an effective facial recognition system with target accuracy for masked faces. We report an increase of 38% in the true positive rate for the Facenet system. We also test the accuracy of the re-trained system on a custom real-world dataset, MFR2, and report similar accuracy.
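The augmentation idea can be sketched generically (this is not the MaskTheFace API): overlay a synthetic mask over the lower half of a detected face box so existing gallery photos can be reused for masked-face training.

```python
import cv2
import numpy as np

def add_synthetic_mask(bgr, face_box, color=(200, 200, 235)):
    x, y, w, h = face_box
    out = bgr.copy()
    # cover from mid-face down to the chin, across the full face width
    cv2.rectangle(out, (x, y + h // 2), (x + w, y + h), color, thickness=-1)
    return out

masked = add_synthetic_mask(np.zeros((160, 160, 3), np.uint8), (40, 40, 80, 80))
```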

Journal ArticleDOI
TL;DR: A robust, effective, and gait-related loss function, called angle center loss (ACL), is proposed to learn discriminative gait features, and long short-term memory units are introduced as a temporal attention model to learn an attention score for each frame.
Abstract: Recently, deep learning-based cross-view gait recognition has become popular owing to the strong capacity of convolutional neural networks (CNNs). Current deep learning methods often rely on loss functions used widely in the task of face recognition, e.g., contrastive loss and triplet loss. These loss functions have the problem of hard negative mining. In this paper, a robust, effective, and gait-related loss function, called angle center loss (ACL), is proposed to learn discriminative gait features. The proposed loss function is robust to different local parts and temporal window sizes. Different from center loss, which learns a center for each identity, the proposed loss function learns multiple sub-centers for each angle of the same identity. Only the largest distance between the anchor feature and the corresponding cross-view sub-centers is penalized, which achieves better intra-subject compactness. We also propose to extract discriminative spatial–temporal features by local feature extractors and a temporal attention model. A simplified spatial transformer network is proposed to localize the suitable horizontal parts of the human body. Local gait features for each horizontal part are extracted and then concatenated as the descriptor. We introduce long short-term memory (LSTM) units as the temporal attention model to learn the attention score for each frame, e.g., focusing more on discriminative frames and less on frames of bad quality. The temporal attention model shows better performance than temporal average pooling or gait energy images (GEI). By combining the three aspects, we achieve state-of-the-art results on several cross-view gait recognition benchmarks.
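The core of the loss can be sketched directly from the description: keep one sub-center per (identity, view angle) and penalize only the anchor's largest distance to its own identity's sub-centers. Shapes and setup are illustrative assumptions.

```python
import torch

def angle_center_loss(feat, identity, centers):
    """feat: (D,) anchor feature; centers: dict identity -> (A, D) per-angle sub-centers."""
    d = torch.norm(centers[identity] - feat[None, :], dim=1)
    return d.max()                       # penalize only the worst cross-view distance

centers = {0: torch.randn(11, 128)}      # e.g., 11 view angles for identity 0
loss = angle_center_loss(torch.randn(128), 0, centers)
```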

Journal ArticleDOI
01 Mar 2020
TL;DR: A new method using the Local Binary Pattern (LBP) algorithm combined with advanced image processing techniques, such as contrast adjustment, bilateral filtering, histogram equalization and image blending, to address some of the issues hampering face recognition accuracy, improving the LBP codes and thus the accuracy of the overall face recognition system.
Abstract: Face recognition is a computer application that is capable of detecting, tracking, identifying or verifying human faces from an image or video captured using a digital camera. Although a lot of progress has been made in the domain of face detection and recognition for security, identification and attendance purposes, there are still issues hindering progress towards or beyond human-level accuracy. These issues are variations in human facial appearance, such as varying lighting conditions, noise in face images, scale, and pose. This research paper presents a new method using the Local Binary Pattern (LBP) algorithm combined with advanced image processing techniques, such as contrast adjustment, bilateral filtering, histogram equalization and image blending, to address some of the issues hampering face recognition accuracy, so as to improve the LBP codes and thus the accuracy of the overall face recognition system. Our experimental results show that our method is very accurate, reliable and robust, and can be practically implemented in a real-life environment as an automatic attendance management system.
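The described pre-processing + LBP chain maps directly onto OpenCV and scikit-image primitives; the parameter values below are common defaults, not the paper's tuned settings.

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern

def preprocess_and_lbp(gray):
    smooth = cv2.bilateralFilter(gray, d=9, sigmaColor=75, sigmaSpace=75)  # denoise
    equal = cv2.equalizeHist(smooth)                     # even out lighting contrast
    lbp = local_binary_pattern(equal, P=8, R=1, method="uniform")
    hist, _ = np.histogram(lbp, bins=10, range=(0, 10))  # uniform-LBP histogram
    return hist / hist.sum()

desc = preprocess_and_lbp(np.random.randint(0, 256, (100, 100), np.uint8))
```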

Proceedings ArticleDOI
15 Oct 2020
TL;DR: Reviews face recognition and describes a lightweight hybrid high-performance face recognition framework whose hybrid design enables switching among state-of-the-art face recognition models.
Abstract: Face recognition is a popular research area, pursued by everyone from the rule-makers of social media to the top universities in the world. These frontrunners have recently designed custom deep learning based face recognition models. A modern face recognition pipeline consists of four common stages: detection, alignment, representation and verification. However, face recognition studies mainly address the representation stage of the pipeline. In this paper, we first review face recognition and then describe the lightweight hybrid high-performance face recognition framework we have developed. Its hybrid design enables switching among state-of-the-art face recognition models.

Journal ArticleDOI
12 Feb 2020
TL;DR: A novel multi-modal multi-scale fusion method is presented as a strong baseline, which performs feature re-weighting to select the more informative channel features while suppressing the less useful ones for each modality across different scales.
Abstract: Face anti-spoofing is essential to prevent face recognition systems from a security breach. Much of the recent progress has been made possible by the availability of face anti-spoofing benchmark datasets. However, existing face anti-spoofing benchmarks have a limited number of subjects (≤170) and modalities (≤2), which hinders the further development of the academic community. To facilitate face anti-spoofing research, we introduce a large-scale multi-modal dataset, namely CASIA-SURF, which is the largest publicly available dataset for face anti-spoofing in terms of both subjects and modalities. Specifically, it consists of 1,000 subjects with 21,000 videos, and each sample has 3 modalities (i.e., RGB, Depth and IR). We also provide comprehensive evaluation metrics, diverse evaluation protocols, training/validation/testing subsets and a measurement tool, developing a new benchmark for face anti-spoofing. Moreover, we present a novel multi-modal multi-scale fusion method as a strong baseline, which performs feature re-weighting to select the more informative channel features while suppressing the less useful ones for each modality across different scales. Extensive experiments have been conducted on the proposed dataset to verify its significance and generalization capability. The dataset is available at https://sites.google.com/qq.com/face-anti-spoofing/welcome/challengecvpr2019?authuser=0 .
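The baseline's "feature re-weighting" reads like squeeze-and-excitation style channel gating applied per modality before fusion; the sketch below is that simplified reading, with illustrative layer sizes.

```python
import torch
import torch.nn as nn

class ReweightFuse(nn.Module):
    def __init__(self, channels=128, reduction=8):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, streams):                  # list of (B, C, H, W) modality features
        out = []
        for x in streams:
            w = self.gate(x.mean(dim=(2, 3)))    # squeeze: global average pool
            out.append(x * w[:, :, None, None])  # excite: per-channel re-weighting
        return torch.cat(out, dim=1)             # fuse RGB / depth / IR by concat

fused = ReweightFuse()([torch.randn(2, 128, 14, 14) for _ in range(3)])
```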

Journal ArticleDOI
TL;DR: Summarizes the methods used to obtain and classify facial biometric data in the literature and gives a taxonomy of image-based and video-based face recognition methods, outlining the major historical developments and the main processing steps.

Proceedings ArticleDOI
14 Jun 2020
TL;DR: In this article, the authors reveal critical insights into problems of bias in state-of-the-art facial recognition (FR) systems using a novel Balanced Faces In the Wild (BFW) dataset: data balanced for gender and ethnic groups.
Abstract: We reveal critical insights into problems of bias in state-of-the-art facial recognition (FR) systems using a novel Balanced Faces In the Wild (BFW) dataset: data balanced for gender and ethnic groups. We show variations in the optimal scoring threshold for face-pairs across different subgroups. Thus, the conventional approach of learning a global threshold for all pairs results in performance gaps between subgroups. By learning subgroup-specific thresholds, we reduce performance gaps, and also show a notable boost in overall performance. Furthermore, we do a human evaluation to measure bias in humans, which supports the hypothesis that an analogous bias exists in human perception. For the BFW database, source code, and more, visit https://github.com/visionjo/facerec-bias-bfw.
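Subgroup-specific thresholding itself is a one-liner once per-subgroup impostor scores are available: pick each group's threshold at a shared target false match rate. The score data below is synthetic.

```python
import numpy as np

def subgroup_thresholds(impostor_scores_by_group, target_fmr=1e-3):
    return {g: float(np.quantile(scores, 1.0 - target_fmr))
            for g, scores in impostor_scores_by_group.items()}

rng = np.random.default_rng(1)
groups = {"group_A": rng.normal(0.32, 0.1, 50000),
          "group_B": rng.normal(0.28, 0.1, 50000)}
print(subgroup_thresholds(groups))   # each subgroup gets its own operating point
```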