
Showing papers by "Maneet Singh" published in 2019


Journal ArticleDOI
TL;DR: A comprehensive review of techniques incorporating ancillary information in the biometric recognition pipeline is presented, providing an overview of the role of information fusion in biometrics.

151 citations


Posted Content
TL;DR: A better understanding of state-of-the-art deep learning networks would enable researchers to address the challenge of bias in AI and develop fairer systems.
Abstract: Do very high accuracies of deep networks suggest pride of effective AI or are deep networks prejudiced? Do they suffer from in-group biases (own-race-bias and own-age-bias), and mimic the human behavior? Is in-group specific information being encoded sub-consciously by the deep networks? This research attempts to answer these questions and presents an in-depth analysis of 'bias' in deep learning based face recognition systems. This is the first work which decodes if and where bias is encoded for face recognition. Taking cues from cognitive studies, we inspect if deep networks are also affected by social in- and out-group effects. Networks are analyzed for own-race and own-age bias, both of which have been well established in human beings. The sub-conscious behavior of face recognition models is examined to understand if they encode race or age specific features for face recognition. Analysis is performed based on 36 experiments conducted on multiple datasets. Four deep learning networks, either trained from scratch or pre-trained on over 10M images, are used. Variations across class activation maps and feature visualizations provide novel insights into the functioning of deep learning systems, suggesting behavior similar to humans. It is our belief that a better understanding of state-of-the-art deep learning networks would enable researchers to address the challenge of bias in AI, and develop fairer systems.
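
The analysis above hinges on class activation maps. As a rough illustration of how such maps can be extracted for inspection (this is not the authors' code; the ResNet-18 backbone and the hooked layer are assumptions), here is a minimal Grad-CAM-style sketch in PyTorch:

```python
# Minimal Grad-CAM-style sketch (PyTorch). The backbone and layer choice are
# illustrative assumptions, not the models used in the paper.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None).eval()  # stand-in for a face model
activations, gradients = {}, {}

def fwd_hook(module, inp, out):
    activations["feat"] = out.detach()

def bwd_hook(module, grad_in, grad_out):
    gradients["feat"] = grad_out[0].detach()

layer = model.layer4  # last conv block; assumed inspection point
layer.register_forward_hook(fwd_hook)
layer.register_full_backward_hook(bwd_hook)

x = torch.randn(1, 3, 224, 224)  # a face crop in practice
logits = model(x)
logits[0, logits.argmax()].backward()  # gradient of the predicted class

# Weight each channel by its average gradient, then combine and ReLU.
w = gradients["feat"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((w * activations["feat"]).sum(dim=1))
cam = cam / (cam.max() + 1e-8)  # normalized map, comparable across groups
```

Comparing such normalized maps across demographic groups is one way the kind of in-group analysis described above can be made concrete.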

68 citations


Journal ArticleDOI
08 Mar 2019
TL;DR: The disguised faces in the wild (DFW) dataset as discussed by the authors contains over 11,000 images of 1000 identities with variations across different types of disguise accessories, including impersonator and genuine obfuscated face images for each subject.
Abstract: Research in face recognition has seen tremendous growth over the past couple of decades. Beginning from algorithms capable of performing recognition in constrained environments, existing face recognition systems achieve very high accuracies on large-scale unconstrained face datasets. While upcoming algorithms continue to achieve improved performance, many of them are susceptible to reduced performance under disguise variations, one of the most challenging covariates of face recognition. In this paper, the disguised faces in the wild (DFW) dataset is presented, which contains over 11,000 images of 1000 identities with variations across different types of disguise accessories (the DFW dataset link: http://iab-rubric.org/resources/dfw.html ). The images are collected from the Internet, resulting in unconstrained variations similar to real-world settings. This is a unique dataset that contains impersonator and genuine obfuscated face images for each subject. The DFW dataset has been analyzed in terms of three levels of difficulty: 1) easy; 2) medium; and 3) hard, in order to showcase the challenging nature of the problem. The dataset was released as part of the First International Workshop and Competition on DFW at the Conference on Computer Vision and Pattern Recognition, 2018. This paper presents the DFW dataset in detail, including the evaluation protocols, baseline results, performance analysis of the submissions received as part of the competition, and the three levels of difficulty of the DFW challenge dataset.

52 citations


Posted Content
TL;DR: The purpose of this article is to provide readers a comprehensive overview of the role of information fusion in biometrics, with specific focus on three questions: what to fuse, when to fuse, and how to fuse.
Abstract: The performance of a biometric system that relies on a single biometric modality (e.g., fingerprints only) is often stymied by various factors such as poor data quality or limited scalability. Multibiometric systems utilize the principle of fusion to combine information from multiple sources in order to improve recognition accuracy whilst addressing some of the limitations of single-biometric systems. The past two decades have witnessed the development of a large number of biometric fusion schemes. This paper presents an overview of biometric fusion with specific focus on three questions: what to fuse, when to fuse, and how to fuse. A comprehensive review of techniques incorporating ancillary information in the biometric recognition pipeline is also presented. In this regard, the following topics are discussed: (i) incorporating data quality in the biometric recognition pipeline; (ii) combining soft biometric attributes with primary biometric identifiers; (iii) utilizing contextual information to improve biometric recognition accuracy; and (iv) performing continuous authentication using ancillary information. In addition, the use of information fusion principles for presentation attack detection and multibiometric cryptosystems is also discussed. Finally, some of the research challenges in biometric fusion are enumerated. The purpose of this article is to provide readers a comprehensive overview of the role of information fusion in biometrics.
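
As a concrete illustration of the "how to fuse" question, below is a minimal sketch of score-level fusion (min-max normalization followed by a weighted sum), one of the classic schemes surveyed in work of this kind. The scores, weights, and function names are illustrative, not taken from the paper.

```python
# Minimal score-level fusion sketch: min-max normalization followed by a
# weighted sum. All values below are illustrative.
import numpy as np

def min_max_normalize(scores):
    s = np.asarray(scores, dtype=float)
    return (s - s.min()) / (s.max() - s.min() + 1e-12)

def fuse_scores(score_lists, weights):
    """Fuse per-matcher score lists (one list per modality) into one score set."""
    normalized = [min_max_normalize(s) for s in score_lists]
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # keep fused scores on a comparable scale
    return sum(wi * si for wi, si in zip(w, normalized))

# Example: fingerprint and face match scores for the same candidate list.
fingerprint = [120, 340, 95, 410]      # raw matcher scores (arbitrary scale)
face        = [0.62, 0.91, 0.40, 0.77]
fused = fuse_scores([fingerprint, face], weights=[0.6, 0.4])
print(fused.argmax())  # index of the best-matching candidate
```

Normalization before fusion matters because the two matchers report scores on incompatible scales; without it, the fingerprint scores would dominate.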

47 citations


Proceedings ArticleDOI
14 May 2019
TL;DR: This research presents a novel large-scale drone dataset, DroneSURF: Drone Surveillance of Faces, to facilitate research on face recognition using drones, along with information regarding the data distribution, protocols for evaluation, and baseline results.
Abstract: Unmanned Aerial Vehicles (UAVs) or drones are often used to reach remote areas or regions which are inaccessible to humans. Equipped with a large field of view, compact size, and remote control abilities, drones are deemed suitable for monitoring crowded or disaster-hit areas, and performing aerial surveillance. While research has focused on area monitoring, object detection, and tracking, limited attention has been given to person identification, especially face recognition, using drones. This research presents a novel large-scale drone dataset, DroneSURF: Drone Surveillance of Faces, to facilitate research on face recognition. The dataset contains 200 videos of 58 subjects, captured across 411K frames, with over 786K face annotations. The proposed dataset demonstrates variations across two surveillance use cases: (i) active and (ii) passive, two locations, and two acquisition times. DroneSURF encapsulates challenges due to the effect of motion, variations in pose, illumination, background, altitude, and resolution, especially due to the large and varying distance between the drone and the subjects. This research presents a detailed description of the proposed DroneSURF dataset, along with information regarding the data distribution, protocols for evaluation, and baseline results.

37 citations


Proceedings ArticleDOI
01 Oct 2019
TL;DR: DirectCapsNet, as discussed by the authors, utilizes a combination of capsule and convolutional layers for learning an effective very low resolution (VLR) recognition model, and incorporates two novel loss functions: (i) an HR-anchor loss and (ii) a targeted reconstruction loss, in order to overcome the challenges of limited information content in VLR images.
Abstract: Very low resolution (VLR) image recognition corresponds to classifying images with resolution 16x16 or less. Though it has widespread applicability when objects are captured at a very large stand-off distance (e.g. surveillance scenario) or from wide angle mobile cameras, it has received limited attention. This research presents a novel Dual Directed Capsule Network model, termed as DirectCapsNet, for addressing VLR digit and face recognition. The proposed architecture utilizes a combination of capsule and convolutional layers for learning an effective VLR recognition model. The architecture also incorporates two novel loss functions: (i) the proposed HR-anchor loss and (ii) the proposed targeted reconstruction loss, in order to overcome the challenges of limited information content in VLR images. The proposed losses use high resolution images as auxiliary data during training to "direct" discriminative feature learning. Multiple experiments for VLR digit classification and VLR face recognition are performed along with comparisons with state-of-the-art algorithms. The proposed DirectCapsNet consistently showcases state-of-the-art results; for example, on the UCCS face database, it shows over 95% face recognition accuracy when 16x16 images are matched with 80x80 images.
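
The abstract describes the two losses only at a high level. Below is a hedged PyTorch sketch of the underlying ideas as stated (VLR features pulled toward high-resolution "anchor" features of the same identity, and reconstructions targeted at the HR counterpart rather than the VLR input). The exact formulations in the paper may differ, and all tensor names, shapes, and weightings are assumptions.

```python
# Hedged sketch of the two loss ideas from the abstract, not the paper's
# exact formulation. Shapes and the 0.5 weighting are illustrative.
import torch
import torch.nn.functional as F

def hr_anchor_loss(vlr_feat, hr_anchor_feat):
    """Pull VLR features toward same-identity high-resolution anchor features."""
    return F.mse_loss(vlr_feat, hr_anchor_feat.detach())

def targeted_reconstruction_loss(reconstruction, hr_image):
    """Reconstruct toward the high-resolution counterpart of the VLR input,
    rather than toward the low-resolution input itself."""
    return F.mse_loss(reconstruction, hr_image)

# Illustrative shapes: 128-D embeddings and 80x80 HR targets (as in the
# UCCS 16x16-vs-80x80 setting mentioned above).
vlr_feat = torch.randn(8, 128)
hr_feat  = torch.randn(8, 128)
recon    = torch.rand(8, 3, 80, 80)
hr_img   = torch.rand(8, 3, 80, 80)
total = hr_anchor_loss(vlr_feat, hr_feat) \
        + 0.5 * targeted_reconstruction_loss(recon, hr_img)
```

The key design point is that both terms use HR images purely as auxiliary training signals; at test time only the VLR branch is needed.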

24 citations


Posted Content
TL;DR: This research presents a novel Dual Directed Capsule Network model, termed as DirectCapsNet, for addressing VLR digit and face recognition, which utilizes a combination of capsule and convolutional layers for learning an effective VLR recognition model.
Abstract: Very low resolution (VLR) image recognition corresponds to classifying images with resolution 16x16 or less. Though it has widespread applicability when objects are captured at a very large stand-off distance (e.g. surveillance scenario) or from wide angle mobile cameras, it has received limited attention. This research presents a novel Dual Directed Capsule Network model, termed as DirectCapsNet, for addressing VLR digit and face recognition. The proposed architecture utilizes a combination of capsule and convolutional layers for learning an effective VLR recognition model. The architecture also incorporates two novel loss functions: (i) the proposed HR-anchor loss and (ii) the proposed targeted reconstruction loss, in order to overcome the challenges of limited information content in VLR images. The proposed losses use high resolution images as auxiliary data during training to "direct" discriminative feature learning. Multiple experiments for VLR digit classification and VLR face recognition are performed along with comparisons with state-of-the-art algorithms. The proposed DirectCapsNet consistently showcases state-of-the-art results; for example, on the UCCS face database, it shows over 95% face recognition accuracy when 16x16 images are matched with 80x80 images.

23 citations


Proceedings ArticleDOI
01 Oct 2019
TL;DR: The outcome of the Disguised Faces in the Wild 2019 competition is summarized in terms of the dataset used for evaluation, a brief review of the algorithms employed by the participants for this task, and the results obtained.
Abstract: Disguised face recognition has widespread applicability in scenarios such as law enforcement, surveillance, and access control. Disguise accessories such as sunglasses, masks, scarves, or make-up modify or occlude different facial regions, which makes face recognition a challenging task. In order to understand and benchmark the state-of-the-art on face recognition in the presence of disguise variations, the Disguised Faces in the Wild 2019 (DFW2019) competition has been organized. This paper summarizes the outcome of the competition in terms of the dataset used for evaluation, a brief review of the algorithms employed by the participants for this task, and the results obtained. The DFW2019 dataset has been released with four evaluation protocols and baseline results obtained from two deep learning-based state-of-the-art face recognition models. The DFW2019 dataset has also been analyzed with respect to degrees of difficulty: (i) easy, (ii) medium, and (iii) hard. The dataset has been released as part of the International Workshop on Disguised Faces in the Wild at the International Conference on Computer Vision (ICCV), 2019.
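
For context on what evaluation protocols of this kind typically measure, here is a minimal sketch of a standard verification-style evaluation: cosine similarity between face embeddings, then the genuine accept rate (GAR) at a fixed false accept rate (FAR). The random embeddings and threshold choice are stand-ins, not the DFW2019 specification.

```python
# Sketch of a generic face-verification evaluation (cosine scores, then
# GAR at a fixed FAR). Data below is random; protocol details are assumed.
import numpy as np

def cosine_scores(a, b):
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return (a * b).sum(axis=1)

def gar_at_far(genuine, impostor, far=0.01):
    """Genuine accept rate at the score threshold giving the requested FAR."""
    threshold = np.quantile(impostor, 1.0 - far)
    return float((genuine >= threshold).mean())

rng = np.random.default_rng(0)
# Genuine pairs share a common component; impostor pairs do not.
gen = cosine_scores(rng.normal(1, 1, (500, 128)), rng.normal(1, 1, (500, 128)))
imp = cosine_scores(rng.normal(0, 1, (500, 128)), rng.normal(1, 1, (500, 128)))
print(f"GAR@1%FAR = {gar_at_far(gen, imp):.3f}")
```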

20 citations


Journal ArticleDOI
TL;DR: In this paper, a novel deep learning formulation, termed the R-Codean autoencoder, was proposed for facial attribute prediction, utilizing shortcut connections in deep models to facilitate the learning of optimal parameters.

16 citations


Proceedings ArticleDOI
01 Jun 2019
TL;DR: This is the first research to present a deep learning based expression classification approach for children; a novel supervised deep learning formulation, termed the Mean Supervised Deep Boltzmann Machine (msDBM), is proposed, which classifies an input face image into one of seven expression classes.
Abstract: Automated facial expression classification has widespread application in multiple domains such as human computer interaction, health and entertainment, biometrics, and security. There are six basic facial expressions: Anger, Disgust, Fear, Happiness, Sadness, and Surprise, apart from a neutral state. Most of the research in expression classification has focused on adult face images, with no dedicated research on automating expression classification for children. To the best of our knowledge, this is the first research which presents a deep learning based expression classification approach for children. A novel supervised deep learning formulation, termed as Mean Supervised Deep Boltzmann Machine (msDBM), is proposed, which classifies an input face image into one of the seven expression classes. The proposed approach has been evaluated on two child face datasets - Radboud Faces and CAFE, along with experiments on the adult face images of the Radboud Faces dataset. Experimental results and analysis reinforce the challenging nature of the task at hand, and the effectiveness of the proposed msDBM model.

11 citations


Proceedings ArticleDOI
14 May 2019
TL;DR: The proposed FaceSurv database contains over 142K face images, spread across videos captured in both visible and near-infrared spectra, offering a plethora of challenges common to surveillance settings.
Abstract: Existing face recognition algorithms achieve high recognition performance for frontal face images with good illumination and close proximity to the imaging device. However, most of the existing algorithms fail to perform equally well in surveillance scenarios, where videos are captured across varying resolutions and spectra. In surveillance settings, cameras are usually placed far away from the subjects, thereby resulting in variations across pose, illumination, occlusion, and resolution. Current video datasets used for face recognition are often captured in constrained environments, and thus fail to simulate real-world scenarios. In this paper, we present the FaceSurv database featuring 252 subjects in 460 videos. The proposed dataset contains over 142K face images, spread across videos captured in both visible and near-infrared spectra. Each video contains a group of individuals walking from 36 ft towards the imaging device, offering a plethora of challenges common to surveillance settings. A benchmark experimental protocol and baseline results have been reported with state-of-the-art algorithms for face detection and recognition. It is our assertion that the availability of such a challenging database will facilitate the development of robust face recognition systems relevant to real-world surveillance scenarios.

Journal ArticleDOI
TL;DR: A novel deep learning based formulation, termed the Class Specific Mean Autoencoder, which learns intra-class similarity and extracts class-specific features to aid in learning class-specific representations.

Proceedings ArticleDOI
01 Sep 2019
TL;DR: A novel Triplet Transform Learning (TTL) model for learning discriminative representations of primate faces is proposed, which outperforms existing approaches and attains state-of-the-art performance on the primate database.
Abstract: Automated primate face recognition has enormous potential in the effective conservation of species facing endangerment or extinction. The task is characterized by lack of training data, low inter-class variations, and large intra-class differences. Owing to the challenging nature of the problem, limited research has been performed to automate the process of primate face recognition. In this research, we propose a novel Triplet Transform Learning (TTL) model for learning discriminative representations of primate faces. The proposed model reduces the intra-class variations and increases the inter-class variations to obtain robust sparse representations for the primate faces. It is utilized to present a novel framework for primate face recognition, which is evaluated on the primate dataset comprising 80 identities, including monkeys, gorillas, and chimpanzees. Experimental results demonstrate the efficacy of the proposed approach, which outperforms the existing approaches and attains state-of-the-art performance on the primate database.
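
As a rough sketch of the triplet idea described here (not the paper's transform-learning formulation, whose sparsity term and solver are more involved), the following PyTorch snippet pairs a learned linear transform producing sparse codes with a standard triplet margin loss. The dimensions, sparsity surrogate, and weighting are illustrative assumptions.

```python
# Hedged sketch: a linear transform maps faces to sparse codes, and a
# triplet margin loss pulls same-identity codes together while pushing
# different-identity codes apart. All names and dimensions are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearTransform(nn.Module):
    def __init__(self, in_dim=4096, code_dim=256):
        super().__init__()
        self.T = nn.Linear(in_dim, code_dim, bias=False)

    def forward(self, x):
        return F.relu(self.T(x))  # non-negative codes; ReLU encourages sparsity

model = LinearTransform()
triplet = nn.TripletMarginLoss(margin=1.0)

# Each batch row: (anchor, same-identity positive, different-identity negative).
anchor, positive, negative = (torch.randn(16, 4096) for _ in range(3))
codes = [model(t) for t in (anchor, positive, negative)]

sparsity = sum(c.abs().mean() for c in codes)  # simple l1 surrogate
loss = triplet(*codes) + 0.01 * sparsity
loss.backward()
```

The triplet term directly encodes the stated goal of reducing intra-class variation while increasing inter-class variation; the sparsity term stands in for the transform-learning regularization.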