Proceedings ArticleDOI

On Matching Faces with Alterations due to Plastic Surgery and Disguise

TL;DR: A novel framework is proposed that transfers fundamental visual features learnt from a generic image dataset to supplement a supervised face recognition model; it combines an off-the-shelf supervised classifier with a generic, task-independent network that encodes basic visual cues such as color, shape, and texture.
Abstract: Plastic surgery and disguise variations are two of the most challenging covariates of face recognition. State-of-the-art deep learning models are not sufficiently successful on them due to the limited availability of training samples. In this paper, a novel framework is proposed which transfers fundamental visual features learnt from a generic image dataset to supplement a supervised face recognition model. The proposed algorithm combines an off-the-shelf supervised classifier and a generic, task-independent network which encodes information related to basic visual cues such as color, shape, and texture. Experiments are performed on the IIITD plastic surgery face dataset and the Disguised Faces in the Wild (DFW) dataset. Results showcase that the proposed algorithm achieves state-of-the-art results on both datasets. Specifically, on the DFW database the proposed algorithm yields over 87% verification accuracy at 1% false accept rate, which is 53.8% better than baseline results computed using VGG Face.
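The DFW result above is reported as verification accuracy at a fixed false accept rate (FAR). A minimal sketch of how that metric is computed from similarity scores; the function name and the synthetic score distributions below are illustrative, not from the paper:

```python
# Hypothetical illustration: verification accuracy (genuine accept rate)
# at a fixed false accept rate, the metric reported on DFW.
import numpy as np

def gar_at_far(genuine_scores, impostor_scores, far=0.01):
    """Genuine accept rate at the threshold that admits `far` impostors."""
    # Threshold chosen so that a fraction `far` of impostor pairs pass.
    threshold = np.quantile(impostor_scores, 1.0 - far)
    return float(np.mean(np.asarray(genuine_scores) > threshold))

rng = np.random.default_rng(0)
genuine = rng.normal(0.7, 0.1, 1000)   # synthetic match scores
impostor = rng.normal(0.3, 0.1, 1000)  # synthetic non-match scores
print(round(gar_at_far(genuine, impostor, far=0.01), 3))
```

With well-separated score distributions the rate approaches 1; the 87% figure above corresponds to this quantity at FAR = 1% on real DFW scores.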
Citations
Journal ArticleDOI
TL;DR: A comprehensive review of recent developments in deep face recognition can be found in this paper, covering broad topics on algorithm designs, databases, protocols, and application scenarios, as well as the technical challenges and several promising directions.

353 citations

Journal ArticleDOI
TL;DR: A critical review of the different issues of face recognition systems is presented, and different approaches to solving these issues are analyzed by presenting existing techniques that have been proposed in the literature.
Abstract: Face recognition is an efficient technique and one of the most preferred biometric modalities for the identification and verification of individuals, as compared to voice, fingerprint, iris, retina eye scan, gait, ear and hand geometry. This has over the years led researchers in both academia and industry to come up with several face recognition techniques, making it one of the most studied research areas in computer vision. A major reason why it remains a fast-growing research area lies in its application in unconstrained environments, where most existing techniques do not perform optimally. Such conditions include pose, illumination, ageing, occlusion, expression, plastic surgery and low resolution. In this paper, a critical review of the different issues of face recognition systems is presented, and different approaches to solving these issues are analyzed by presenting existing techniques that have been proposed in the literature. Furthermore, the major and challenging face datasets that consist of the different facial constraints depicting real-life scenarios are also discussed, stating the shortcomings associated with them. Recognition performance on the different datasets reported by researchers is also summarized. The paper concludes with directions for future work.

52 citations


Cites methods from "On Matching Faces with Alterations ..."

  • ...[83] The proposed algorithm combines off-the-shelf supervised classifier and a generic, task independent network which encodes information related to basic visual cues such as color, shape, and texture....


Journal ArticleDOI
08 Mar 2019
TL;DR: The disguised faces in the wild (DFW) dataset, as discussed by the authors, contains over 11,000 images of 1,000 identities with variations across different types of disguise accessories, including impersonator and genuine obfuscated face images for each subject.
Abstract: Research in face recognition has seen tremendous growth over the past couple of decades. Beginning from algorithms capable of performing recognition only in constrained environments, existing face recognition systems achieve very high accuracies on large-scale unconstrained face datasets. While upcoming algorithms continue to achieve improved performance, many of them are susceptible to reduced performance under disguise variations, one of the most challenging covariates of face recognition. In this paper, the disguised faces in the wild (DFW) dataset is presented, which contains over 11,000 images of 1,000 identities with variations across different types of disguise accessories (the DFW dataset link: http://iab-rubric.org/resources/dfw.html ). The images are collected from the Internet, resulting in unconstrained variations similar to real-world settings. This is a unique dataset that contains impersonator and genuine obfuscated face images for each subject. The DFW dataset has been analyzed in terms of three levels of difficulty: 1) easy; 2) medium; and 3) hard, in order to showcase the challenging nature of the problem. The dataset was released as part of the First International Workshop and Competition on DFW at the Conference on Computer Vision and Pattern Recognition, 2018. This paper presents the DFW dataset in detail, including the evaluation protocols, baseline results, performance analysis of the submissions received as part of the competition, and the three difficulty levels of the DFW challenge dataset.

52 citations

Journal ArticleDOI
TL;DR: This paper studies the efficacy of deep learning methods incorporating simple noise-based data augmentation for disguise invariant face recognition (DIFR), and compares four different pre-trained 2D CNNs on classification accuracy and execution time to select a suitable model for DIFR.
Abstract: Face recognition is widely used in modern biometric and security applications. Most current face recognition techniques show good results in a constrained environment, but face many problems in real-world scenarios, such as low-quality images, temporal variations and facial disguises that alter facial features. These deteriorating results stem from the use of handcrafted features with weak generalization capabilities, and from neglecting the complexities of domain adaptation in the case of deep learning models. In this paper, we study the efficacy of deep learning methods incorporating simple noise-based data augmentation for disguise invariant face recognition (DIFR). The proposed method detects a face in an image using the Viola-Jones face detector and classifies it using a pre-trained Convolutional Neural Network (CNN) fine-tuned for DIFR. During transfer learning, a pre-trained CNN learns generalized disguise-invariant features from facial images of several subjects to correctly identify them under varying facial disguises. We compare four different pre-trained 2D CNNs, each with a different number of learnable parameters, based on their classification accuracy and execution time in order to select a suitable model for DIFR. Comprehensive experiments and comparative analysis have been conducted on six challenging facial disguise datasets. ResNet-18 gives the best trade-off between accuracy and efficiency, achieving an average accuracy of 98.19% with an average execution time of 0.32 seconds. The promising results reflect the efficiency of the proposed method, which outperforms existing methods in all aspects.
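The "simple noise-based data augmentation" mentioned above can be sketched as additive Gaussian noise on normalized face images; the image size, noise level, and number of copies below are illustrative assumptions, not the authors' exact settings:

```python
# Sketch (not the authors' exact pipeline): noise-based augmentation,
# producing several noisy variants of a face image before fine-tuning.
import numpy as np

def augment_with_noise(image, sigma=0.05, n_copies=4, seed=0):
    """Return `n_copies` noisy variants of `image` (pixel values in [0, 1])."""
    rng = np.random.default_rng(seed)
    noisy = image[None] + rng.normal(0.0, sigma, (n_copies,) + image.shape)
    return np.clip(noisy, 0.0, 1.0)  # keep a valid pixel range

face = np.full((112, 112, 3), 0.5)  # placeholder grey "face" image
batch = augment_with_noise(face)
print(batch.shape)                  # (4, 112, 112, 3)
```

Each variant is then fed to the CNN during fine-tuning, enlarging the effective training set for DIFR.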

15 citations

Posted Content
TL;DR: A review of deep learning algorithms for small sample size problems, in which the algorithms are segregated according to the space in which they operate (input space, model space, or feature space), together with a Dynamic Attention Pooling approach that focuses on extracting global information from the most discriminative sub-part of the feature map.
Abstract: The growth and success of deep learning approaches can be attributed to two major factors: availability of hardware resources and availability of a large number of training samples. For problems with large training databases, deep learning models have achieved superlative performance. However, there are many small sample size (S³) problems for which it is not feasible to collect large training databases. It has been observed that deep learning models do not generalize well on S³ problems, and specialized solutions are required. In this paper, we first present a review of deep learning algorithms for small sample size problems in which the algorithms are segregated according to the space in which they operate, i.e. input space, model space, and feature space. Secondly, we present a Dynamic Attention Pooling approach which focuses on extracting global information from the most discriminative sub-part of the feature map. The performance of the proposed dynamic attention pooling is analyzed with a state-of-the-art ResNet model on relatively small publicly available datasets such as SVHN, C10, C100, and TinyImageNet.
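A rough sketch of the dynamic attention pooling idea described above: pool global information only from the most discriminative sub-part of the feature map. The 2x2 grid of sub-parts and the mean-activation criterion below are illustrative assumptions, not the paper's exact formulation:

```python
# Toy sketch of pooling over the most discriminative sub-part of a
# feature map (grid layout and selection criterion are assumptions).
import numpy as np

def dynamic_attention_pool(fmap, grid=2):
    """fmap: (H, W, C). Average-pool the grid cell with the largest
    mean activation, giving a (C,) descriptor."""
    H, W, _ = fmap.shape
    h, w = H // grid, W // grid
    cells = [fmap[i*h:(i+1)*h, j*w:(j+1)*w]
             for i in range(grid) for j in range(grid)]
    best = max(cells, key=lambda c: c.mean())  # most active sub-part
    return best.mean(axis=(0, 1))              # global info from that cell

fmap = np.zeros((8, 8, 16))
fmap[0:4, 0:4] = 1.0   # make the top-left cell the most active
print(dynamic_attention_pool(fmap).shape)   # (16,)
```

Contrast with ordinary global average pooling, which would mix the discriminative region with the three empty cells.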

13 citations


Cites background from "On Matching Faces with Alterations ..."

  • ...Input space refers to the set of algorithms which increases the database by generating more samples or perturb the samples to optimize the feature space [7], [8], [9], [10]....


References
Proceedings ArticleDOI
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, Li Fei-Fei
20 Jun 2009
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Abstract: The explosion of image data on the Internet has the potential to foster more sophisticated and robust models and algorithms to index, retrieve, organize and interact with images and multimedia data. But exactly how such data can be harnessed and organized remains a critical problem. We introduce here a new database called “ImageNet”, a large-scale ontology of images built upon the backbone of the WordNet structure. ImageNet aims to populate the majority of the 80,000 synsets of WordNet with an average of 500-1000 clean and full resolution images. This will result in tens of millions of annotated images organized by the semantic hierarchy of WordNet. This paper offers a detailed analysis of ImageNet in its current state: 12 subtrees with 5247 synsets and 3.2 million images in total. We show that ImageNet is much larger in scale and diversity and much more accurate than the current image datasets. Constructing such a large-scale database is a challenging task. We describe the data collection scheme with Amazon Mechanical Turk. Lastly, we illustrate the usefulness of ImageNet through three simple applications in object recognition, image classification and automatic object clustering. We hope that the scale, accuracy, diversity and hierarchical structure of ImageNet can offer unparalleled opportunities to researchers in the computer vision community and beyond.

49,639 citations


"On Matching Faces with Alterations ..." refers methods in this paper

  • ...The DenseNet is pre-trained on the ImageNet dataset [6] and is further fine-tuned on the different datasets used in this research....


Journal ArticleDOI
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Abstract: We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
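The constrained estimator described above can be written explicitly; for predictors x_{ij} and response y_i, the lasso solves:

```latex
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta}
  \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^{2}
\quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le t .
```

The bound t controls the amount of shrinkage: as t decreases, more coefficients are driven exactly to zero, which is the source of the interpretable sparse models noted in the abstract.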

40,785 citations


"On Matching Faces with Alterations ..." refers methods in this paper

  • ...The basis pursuit [4] and LASSO [25] are two popular greedy approaches used to replace the ℓ0-norm with the ℓ1-norm, but with the trade off of having a high computational complexity....


Proceedings ArticleDOI
21 Jul 2017
TL;DR: DenseNet as mentioned in this paper proposes to connect each layer to every other layer in a feed-forward fashion, which can alleviate the vanishing gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters.
Abstract: Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections—one between each layer and its subsequent layer—our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, whilst requiring less memory and computation to achieve high performance. Code and pre-trained models are available at https://github.com/liuzhuang13/DenseNet.
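The connectivity pattern described above, where each layer consumes the concatenated feature maps of all preceding layers and an L-layer block has L(L+1)/2 direct connections, can be illustrated with a toy sketch; plain vectors and random linear maps stand in here for feature maps and convolutions:

```python
# Toy illustration of DenseNet-style connectivity: each "layer" receives
# the concatenation of ALL preceding outputs. Real DenseNet layers are
# convolutions; fixed random linear maps are used here for brevity.
import numpy as np

growth_rate, num_layers = 4, 5
rng = np.random.default_rng(0)

features = [rng.normal(size=growth_rate)]          # the block's input
for _ in range(num_layers):
    concat = np.concatenate(features)              # inputs from every earlier layer
    weight = rng.normal(size=(growth_rate, concat.size))
    features.append(np.maximum(weight @ concat, 0.0))  # linear + ReLU

# With L layers, the number of direct connections is L(L+1)/2.
L = num_layers
print(len(features), L * (L + 1) // 2)             # 6 15
```

Each layer adds only `growth_rate` new channels while reusing everything before it, which is why the architecture can be parameter-efficient.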

27,821 citations

Journal ArticleDOI
TL;DR: A publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates is described.
Abstract: The purpose of model selection algorithms such as All Subsets, Forward Selection and Backward Elimination is to choose a linear model on the basis of the same set of data to which the model will be applied. Typically we have available a large collection of possible covariates from which we hope to select a parsimonious set for the efficient prediction of a response variable. Least Angle Regression (LARS), a new model selection algorithm, is a useful and less greedy version of traditional forward selection methods. Three main properties are derived: (1) A simple modification of the LARS algorithm implements the Lasso, an attractive version of ordinary least squares that constrains the sum of the absolute regression coefficients; the LARS modification calculates all possible Lasso estimates for a given problem, using an order of magnitude less computer time than previous methods. (2) A different LARS modification efficiently implements Forward Stagewise linear regression, another promising new model selection method; this connection explains the similar numerical results previously observed for the Lasso and Stagewise, and helps us understand the properties of both methods, which are seen as constrained versions of the simpler LARS algorithm. (3) A simple approximation for the degrees of freedom of a LARS estimate is available, from which we derive a Cp estimate of prediction error; this allows a principled choice among the range of possible LARS estimates. LARS and its variants are computationally efficient: the paper describes a publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates.
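Forward Stagewise linear regression, which the abstract connects to LARS, can be sketched in a few lines: repeatedly take a small step on the coefficient of the predictor most correlated with the current residual. The step size and iteration count below are arbitrary illustrative choices:

```python
# Minimal sketch of Forward Stagewise regression (the incremental
# method LARS efficiently implements, per the abstract above).
import numpy as np

def forward_stagewise(X, y, eps=0.01, steps=2000):
    beta = np.zeros(X.shape[1])
    residual = y.astype(float).copy()
    for _ in range(steps):
        corr = X.T @ residual                 # correlation with residual
        j = np.argmax(np.abs(corr))           # most correlated predictor
        step = eps * np.sign(corr[j])         # tiny step toward it
        beta[j] += step
        residual -= step * X[:, j]
    return beta

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
true_beta = np.array([2.0, 0.0, -1.5, 0.0, 0.0])
y = X @ true_beta                             # noiseless toy problem
print(np.round(forward_stagewise(X, y), 1))
```

On this noiseless toy problem the coefficient path creeps toward the least-squares solution; LARS computes the same kind of path in a single pass rather than by thousands of tiny steps.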

7,828 citations


"On Matching Faces with Alterations ..." refers methods in this paper

  • ...The representation is learnt using an unsupervised dictionary learning method based on stagewise least angle regression (st-LARS) [9] approach....


Proceedings ArticleDOI
07 Dec 2015
TL;DR: A novel deep learning framework for attribute prediction in the wild that cascades two CNNs, LNet and ANet, which are fine-tuned jointly with attribute tags, but pre-trained differently.
Abstract: Predicting face attributes in the wild is challenging due to complex face variations. We propose a novel deep learning framework for attribute prediction in the wild. It cascades two CNNs, LNet and ANet, which are fine-tuned jointly with attribute tags, but pre-trained differently. LNet is pre-trained by massive general object categories for face localization, while ANet is pre-trained by massive face identities for attribute prediction. This framework not only outperforms the state-of-the-art by a large margin, but also reveals valuable facts about learning face representations. (1) It shows how the performances of face localization (LNet) and attribute prediction (ANet) can be improved by different pre-training strategies. (2) It reveals that although the filters of LNet are fine-tuned only with image-level attribute tags, their response maps over entire images have strong indication of face locations. This fact enables training LNet for face localization with only image-level annotations, but without face bounding boxes or landmarks, which are required by all attribute recognition works. (3) It also demonstrates that the high-level hidden neurons of ANet automatically discover semantic concepts after pre-training with massive face identities, and such concepts are significantly enriched after fine-tuning with attribute tags. Each attribute can be well explained with a sparse linear combination of these concepts.

6,273 citations