
Showing papers on "Three-dimensional face recognition" published in 2016


Book ChapterDOI
08 Oct 2016
TL;DR: This paper proposes a new supervision signal, called center loss, for the face recognition task, which simultaneously learns a center for the deep features of each class and penalizes the distances between the deep features and their corresponding class centers.
Abstract: Convolutional neural networks (CNNs) have been widely used in the computer vision community, significantly improving the state-of-the-art. In most of the available CNNs, the softmax loss function is used as the supervision signal to train the deep model. In order to enhance the discriminative power of the deeply learned features, this paper proposes a new supervision signal, called center loss, for the face recognition task. Specifically, the center loss simultaneously learns a center for the deep features of each class and penalizes the distances between the deep features and their corresponding class centers. More importantly, we prove that the proposed center loss function is trainable and easy to optimize in CNNs. With the joint supervision of softmax loss and center loss, we can train robust CNNs to obtain deep features with the two key learning objectives, inter-class dispersion and intra-class compactness, which are essential to face recognition. It is encouraging to see that our CNNs (with such joint supervision) achieve state-of-the-art accuracy on several important face recognition benchmarks: Labeled Faces in the Wild (LFW), YouTube Faces (YTF), and the MegaFace Challenge. In particular, our new approach achieves the best results on MegaFace (the largest public-domain face benchmark) under the small-training-set protocol (under 500,000 images and under 20,000 persons), significantly improving on previous results and setting a new state-of-the-art for both face recognition and face verification tasks.
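To make the joint-supervision idea concrete, here is a minimal PyTorch sketch of a center-loss term, assuming a fixed class count and feature dimension. The paper pairs it with a softmax loss and uses a dedicated center-update rule; this sketch simply lets autograd move the centers, and the lambda weighting shown is a hypothetical hyperparameter.

```python
import torch
import torch.nn as nn

class CenterLoss(nn.Module):
    """Keeps one learnable center per class and penalizes the squared
    distance between each deep feature and its class center."""
    def __init__(self, num_classes, feat_dim):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features, labels):
        # Pick the center of each sample's class: (B, feat_dim).
        batch_centers = self.centers[labels]
        return ((features - batch_centers) ** 2).sum(dim=1).mean() / 2

# Joint supervision as described in the abstract (lam is a tunable knob):
# total_loss = softmax_loss + lam * center_loss(features, labels)
```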

3,464 citations


Proceedings ArticleDOI
07 Mar 2016
TL;DR: OpenFace is the first open source tool capable of facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation and allows for easy integration with other applications and devices through a lightweight messaging system.
Abstract: Over the past few years, there has been an increased interest in automatic facial behavior analysis and understanding. We present OpenFace — an open source tool intended for computer vision and machine learning researchers, affective computing community and people interested in building interactive applications based on facial behavior analysis. OpenFace is the first open source tool capable of facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation. The computer vision algorithms which represent the core of OpenFace demonstrate state-of-the-art results in all of the above mentioned tasks. Furthermore, our tool is capable of real-time performance and is able to run from a simple webcam without any specialist hardware. Finally, OpenFace allows for easy integration with other applications and devices through a lightweight messaging system.

1,151 citations


Proceedings ArticleDOI
07 Mar 2016
TL;DR: This paper proposes a deep neural network architecture to address the FER problem across multiple well-known standard face datasets; it performs comparably to or better than the state-of-the-art methods and better than traditional convolutional neural networks in both accuracy and training time.
Abstract: Automated Facial Expression Recognition (FER) has remained a challenging and interesting problem in computer vision. Despite efforts made in developing various methods for FER, existing approaches lack generalizability when applied to unseen images or those captured in the wild (i.e., the results are not significant). Most of the existing approaches are based on engineered features (e.g., HOG, LBPH, and Gabor), where the classifier's hyper-parameters are tuned to give the best recognition accuracy on a single database or a small collection of similar databases. This paper proposes a deep neural network architecture to address the FER problem across multiple well-known standard face datasets. Specifically, our network consists of two convolutional layers, each followed by max pooling, and then four Inception layers. The network is a single-component architecture that takes registered facial images as input and classifies them into one of the six basic expressions or the neutral expression. We conducted comprehensive experiments on seven publicly available facial expression databases, viz. MultiPIE, MMI, CK+, DISFA, FERA, SFEW, and FER2013. The results of our proposed architecture are comparable to or better than the state-of-the-art methods and better than traditional convolutional neural networks in both accuracy and training time.
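As a rough PyTorch sketch of the described shape (two conv + max-pool stages, four Inception-style blocks, a 7-way classifier for the six basic expressions plus neutral): all channel widths and kernel sizes below are illustrative guesses, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Simplified Inception block: parallel 1x1, 3x3, 5x5 convolutions
    plus a pooled 1x1 branch, concatenated along the channel axis."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        b = out_ch // 4
        self.b1 = nn.Conv2d(in_ch, b, 1)
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, b, 1), nn.Conv2d(b, b, 3, padding=1))
        self.b5 = nn.Sequential(nn.Conv2d(in_ch, b, 1), nn.Conv2d(b, b, 5, padding=2))
        self.bp = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1), nn.Conv2d(in_ch, b, 1))

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1)

class FERNet(nn.Module):
    """Two conv+maxpool stages, four Inception blocks, 7-way classifier."""
    def __init__(self, num_classes=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 64, 7, stride=2, padding=3), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            InceptionBlock(128, 128), InceptionBlock(128, 256),
            InceptionBlock(256, 256), InceptionBlock(256, 512),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(512, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))
```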

816 citations


Proceedings ArticleDOI
07 Mar 2016
TL;DR: This data set isolates the factor of pose variation, in terms of extreme poses like profile where many features are occluded, from other 'in the wild' variations; the results suggest that there is a gap between human performance and automatic face recognition methods for large pose variations in unconstrained images.
Abstract: We have collected a new face data set that will facilitate research in the problem of frontal to profile face verification ‘in the wild’. The aim of this data set is to isolate the factor of pose variation, in terms of extreme poses like profile where many features are occluded, from other ‘in the wild’ variations. We call this data set the Celebrities in Frontal-Profile (CFP) data set. We find that human performance on Frontal-Profile verification in this data set is only slightly worse (94.57% accuracy) than that on Frontal-Frontal verification (96.24% accuracy). However, when we evaluated many state-of-the-art algorithms, including Fisher Vector, Sub-SML, and a deep learning algorithm, we observed that all of them degrade by more than 10% from Frontal-Frontal to Frontal-Profile verification. The deep learning implementation, which performs comparably to humans on Frontal-Frontal, performs significantly worse (84.91% accuracy) on Frontal-Profile. This suggests that there is a gap between human performance and automatic face recognition methods for large pose variations in unconstrained images.
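The verification protocol behind these accuracy numbers can be summarized in a few lines of Python. This is a generic cosine-similarity sketch with hypothetical names, not the paper's evaluation code:

```python
import numpy as np

def verification_accuracy(emb_a, emb_b, same, threshold):
    """Cosine-similarity face verification: a pair is declared 'same
    person' when similarity exceeds the threshold. emb_a/emb_b are
    (N, D) embeddings of the pairs, same is a boolean (N,) array."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = (a * b).sum(axis=1)
    return ((sim > threshold) == same).mean()
```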

618 citations


Journal ArticleDOI
TL;DR: This paper introduces a novel and appealing approach for detecting face spoofing using a colour texture analysis that exploits the joint colour-texture information from the luminance and the chrominance channels by extracting complementary low-level feature descriptions from different colour spaces.
Abstract: Research on non-intrusive software-based face spoofing detection schemes has been mainly focused on the analysis of the luminance information of the face images, hence discarding the chroma component, which can be very useful for discriminating fake faces from genuine ones. This paper introduces a novel and appealing approach for detecting face spoofing using a colour texture analysis. We exploit the joint colour-texture information from the luminance and the chrominance channels by extracting complementary low-level feature descriptions from different colour spaces. More specifically, the feature histograms are computed over each image band separately. Extensive experiments on the three most challenging benchmark data sets, namely, the CASIA face anti-spoofing database, the replay-attack database, and the MSU mobile face spoof database, showed excellent results compared with the state of the art. More importantly, unlike most of the methods proposed in the literature, our proposed approach is able to achieve stable performance across all the three benchmark data sets. The promising results of our cross-database evaluation suggest that the facial colour texture representation is more stable in unknown conditions compared with its gray-scale counterparts.
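A minimal Python sketch of the colour-texture idea, using OpenCV colour conversions and scikit-image's uniform LBP: per-band texture histograms from luminance-chrominance colour spaces, concatenated into one descriptor. The specific descriptor, colour spaces, and parameters here are illustrative assumptions, not the paper's configuration.

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern

def color_texture_descriptor(bgr_face, P=8, R=1):
    """Uniform-LBP histograms computed separately over each band of the
    YCrCb and HSV representations, then concatenated."""
    bands = []
    for code in (cv2.COLOR_BGR2YCrCb, cv2.COLOR_BGR2HSV):
        converted = cv2.cvtColor(bgr_face, code)
        bands.extend(cv2.split(converted))
    n_bins = P + 2  # number of 'uniform' LBP codes for P neighbors
    hists = []
    for band in bands:
        lbp = local_binary_pattern(band, P, R, method="uniform")
        h, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
        hists.append(h)
    return np.concatenate(hists)
```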

449 citations


Journal ArticleDOI
TL;DR: Experimental results indicate that DCP outperforms the state-of-the-art local descriptors for both face identification and face verification tasks, and the best performance is achieved on the challenging LFW and FRGC 2.0 databases by deploying MDML-DCPs in a simple recognition scheme.
Abstract: To perform unconstrained face recognition robust to variations in illumination, pose and expression, this paper presents a new scheme to extract “Multi-Directional Multi-Level Dual-Cross Patterns” (MDML-DCPs) from face images. Specifically, the MDML-DCPs scheme exploits the first derivative of Gaussian operator to reduce the impact of differences in illumination and then computes the DCP feature at both the holistic and component levels. DCP is a novel face image descriptor inspired by the unique textural structure of human faces. It is computationally efficient and only doubles the cost of computing local binary patterns, yet is extremely robust to pose and expression variations. MDML-DCPs comprehensively yet efficiently encodes the invariant characteristics of a face image from multiple levels into patterns that are highly discriminative of inter-personal differences but robust to intra-personal variations. Experimental results on the FERET, CAS-PEAL-R1, FRGC 2.0, and LFW databases indicate that DCP outperforms the state-of-the-art local descriptors (e.g., LBP, LTP, LPQ, POEM, tLBP, and LGXP) for both face identification and face verification tasks. More impressively, the best performance is achieved on the challenging LFW and FRGC 2.0 databases by deploying MDML-DCPs in a simple recognition scheme.

344 citations


Book ChapterDOI
08 Oct 2016
TL;DR: In this paper, the authors propose a domain-specific data augmentation method to enrich an existing dataset with important facial appearance variations by manipulating the faces it contains; this synthesis is also used when matching query images represented by standard convolutional neural networks.
Abstract: Face recognition capabilities have recently made extraordinary leaps. Though this progress is at least partially due to ballooning training set sizes – huge numbers of face images downloaded and labeled for identity – it is not clear if the formidable task of collecting so many images is truly necessary. We propose a far more accessible means of increasing training data sizes for face recognition systems: domain-specific data augmentation. We describe novel methods of enriching an existing dataset with important facial appearance variations by manipulating the faces it contains. This synthesis is also used when matching query images represented by standard convolutional neural networks. The effect of training and testing with synthesized images is tested on the LFW and IJB-A (verification and identification) benchmarks and Janus CS2. The performances obtained by our approach match state-of-the-art results reported by systems trained on millions of downloaded images.

338 citations


Posted Content
TL;DR: This paper proposes a Neural Aggregation Network (NAN) for video face recognition, whose aggregation module consists of two attention blocks that adaptively aggregate the feature vectors to form a single feature inside the convex hull spanned by them.
Abstract: This paper presents a Neural Aggregation Network (NAN) for video face recognition. The network takes a face video or a face image set of a person, with a variable number of face images, as its input and produces a compact, fixed-dimension feature representation for recognition. The whole network is composed of two modules. The feature embedding module is a deep Convolutional Neural Network (CNN) which maps each face image to a feature vector. The aggregation module consists of two attention blocks which adaptively aggregate the feature vectors to form a single feature inside the convex hull spanned by them. Due to the attention mechanism, the aggregation is invariant to the image order. Our NAN is trained with a standard classification or verification loss without any extra supervision signal, and we found that it automatically learns to advocate high-quality face images while repelling low-quality ones such as blurred, occluded and improperly exposed faces. The experiments on the IJB-A, YouTube Faces, and Celebrity-1000 video face recognition benchmarks show that it consistently outperforms naive aggregation methods and achieves state-of-the-art accuracy.
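The aggregation module reduces to very little code. Below is a PyTorch sketch of a single NAN-style attention block (the paper stacks two, with the second block's query adapted from the first's output); all names are chosen for illustration.

```python
import torch
import torch.nn as nn

class AttentionAggregation(nn.Module):
    """A learned query q scores each frame's feature; softmax weights
    them; the output is a convex combination of the inputs."""
    def __init__(self, feat_dim):
        super().__init__()
        self.q = nn.Parameter(torch.randn(feat_dim))

    def forward(self, feats):            # feats: (num_frames, feat_dim)
        scores = feats @ self.q          # one scalar score per frame
        weights = torch.softmax(scores, dim=0)
        return weights @ feats           # (feat_dim,) aggregated feature
```

Because the output is a softmax-weighted average of the inputs, it is invariant to frame order and always lies in their convex hull, matching the properties claimed in the abstract.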

291 citations


Proceedings ArticleDOI
01 Jun 2016
TL;DR: A method to push the frontiers of unconstrained face recognition in the wild by using multiple pose-specific models and rendered face images, called Pose-Aware Models (PAMs), which achieve remarkably better performance than commercial products and surprisingly also outperform methods that are specifically fine-tuned on the target dataset.
Abstract: We propose a method to push the frontiers of unconstrained face recognition in the wild, focusing on the problem of extreme pose variations. As opposed to current techniques, which either expect a single model to learn pose invariance through massive amounts of training data or normalize images to a single frontal pose, our method explicitly tackles pose variation by using multiple pose-specific models and rendered face images. We leverage deep Convolutional Neural Networks (CNNs) to learn discriminative representations we call Pose-Aware Models (PAMs) using 500K images from the CASIA WebFace dataset. We present a comparative evaluation on the new IARPA Janus Benchmark A (IJB-A) and PIPA datasets. On these datasets PAMs achieve remarkably better performance than commercial products and surprisingly also outperform methods that are specifically fine-tuned on the target dataset.
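One plausible way to read "multiple pose-specific models" operationally is sketched below in Python: each face is represented by one feature per pose-specific CNN, and pair similarity fuses the per-model cosine similarities. The averaging fusion and the dict layout are assumptions for illustration, not the paper's exact pipeline.

```python
import numpy as np

def pam_similarity(query_feats, gallery_feats):
    """Fuse per-pose-model similarities into a single pair score.
    Each argument maps a pose-model name to the (D,) feature vector
    produced by that pose-specific CNN for the same face."""
    sims = []
    for pose in query_feats:
        q, g = query_feats[pose], gallery_feats[pose]
        sims.append(q @ g / (np.linalg.norm(q) * np.linalg.norm(g)))
    return float(np.mean(sims))
```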

281 citations


Journal ArticleDOI
TL;DR: The inherent difficulties in PIFR are discussed and a comprehensive review of established techniques is presented, covering pose-robust feature extraction approaches, multiview subspace learning approaches, face synthesis approaches, and hybrid approaches.
Abstract: The capacity to recognize faces under varied poses is a fundamental human ability that presents a unique challenge for computer vision systems. Compared to frontal face recognition, which has been intensively studied and has gradually matured in the past few decades, Pose-Invariant Face Recognition (PIFR) remains a largely unsolved problem. However, PIFR is crucial to realizing the full potential of face recognition for real-world applications, since face recognition is intrinsically a passive biometric technology for recognizing uncooperative subjects. In this article, we discuss the inherent difficulties in PIFR and present a comprehensive review of established techniques. Existing PIFR methods can be grouped into four categories, that is, pose-robust feature extraction approaches, multiview subspace learning approaches, face synthesis approaches, and hybrid approaches. The motivations, strategies, pros/cons, and performance of representative approaches are described and compared. Moreover, promising directions for future research are discussed.

269 citations


Proceedings ArticleDOI
07 Mar 2016
TL;DR: This paper introduces self-taught object localization, a novel approach that leverages deep convolutional networks trained for whole-image recognition to localize objects in images without additional human supervision, i.e., without using any ground-truth bounding boxes for training.
Abstract: This paper introduces self-taught object localization, a novel approach that leverages deep convolutional networks trained for whole-image recognition to localize objects in images without additional human supervision, i.e., without using any ground-truth bounding boxes for training. The key idea is to analyze the change in the recognition scores when artificially masking out different regions of the image. Masking out a region that includes the object typically causes a significant drop in the recognition score. This idea is embedded into an agglomerative clustering technique that generates self-taught localization hypotheses. Our object localization scheme outperforms existing proposal methods in both precision and recall for a small number of subwindow proposals (e.g., on ILSVRC-2012 it produces a relative gain of 23.4% over the state-of-the-art for the top-1 hypothesis). Furthermore, our experiments show that the annotations automatically generated by our method can be used to train object detectors, yielding recognition results remarkably close to those obtained by training on manually annotated bounding boxes.
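The masking idea can be sketched in a few lines of PyTorch: occlude a candidate region, re-run the classifier, and record the drop in the top class's score. The agglomerative clustering that turns these drops into localization hypotheses is omitted, and all function and argument names are hypothetical.

```python
import torch

def mask_drop_scores(model, image, regions, top_class):
    """Score each candidate region by how much masking it out lowers
    the recognition score of the image's top class; large drops
    indicate the region likely contains the object.
    image: (1, 3, H, W) tensor; regions: list of (x0, y0, x1, y1)."""
    model.eval()
    drops = []
    with torch.no_grad():
        base = torch.softmax(model(image), dim=1)[0, top_class]
        for (x0, y0, x1, y1) in regions:
            masked = image.clone()
            masked[:, :, y0:y1, x0:x1] = 0  # zero out the region
            score = torch.softmax(model(masked), dim=1)[0, top_class]
            drops.append((base - score).item())
    return drops
```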

Proceedings ArticleDOI
26 May 2016
TL;DR: This work presents a novel approach to Facial Action Unit detection using a combination of Convolutional and Bi-directional Long Short-Term Memory Neural Networks (CNN-BLSTM), which jointly learns shape, appearance and dynamics in a deep learning manner.
Abstract: Spontaneous facial expression recognition under uncontrolled conditions is a hard task. It depends on multiple factors including shape, appearance and dynamics of the facial features, all of which are adversely affected by environmental noise and low intensity signals typical of such conditions. In this work, we present a novel approach to Facial Action Unit detection using a combination of Convolutional and Bi-directional Long Short-Term Memory Neural Networks (CNN-BLSTM), which jointly learns shape, appearance and dynamics in a deep learning manner. In addition, we introduce a novel way to encode shape features using binary image masks computed from the locations of facial landmarks. We show that the combination of dynamic CNN features and Bi-directional Long Short-Term Memory excels at modelling the temporal information. We thoroughly evaluate the contributions of each component in our system and show that it achieves state-of-the-art performance on the FERA-2015 Challenge dataset.
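A compact PyTorch sketch of the CNN-BLSTM pattern follows: a small per-frame CNN feeds a bidirectional LSTM that models the temporal dynamics, with per-frame action-unit logits on top. Layer sizes, the AU count, and the single-channel input are illustrative assumptions; the binary shape masks described above would enter as extra input channels.

```python
import torch
import torch.nn as nn

class CNNBLSTM(nn.Module):
    """Per-frame CNN embedding -> bidirectional LSTM over time ->
    per-frame action-unit (AU) predictions."""
    def __init__(self, num_aus=12, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.blstm = nn.LSTM(64, hidden, bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * hidden, num_aus)

    def forward(self, frames):              # frames: (B, T, 1, H, W)
        B, T = frames.shape[:2]
        per_frame = self.cnn(frames.flatten(0, 1)).view(B, T, -1)
        seq, _ = self.blstm(per_frame)
        return self.head(seq)               # (B, T, num_aus) AU logits
```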

Proceedings ArticleDOI
27 Jun 2016
TL;DR: This work proposes a novel deep face recognition framework to learn age-invariant deep face features through a carefully designed CNN model, and is the first attempt to show the effectiveness of deep CNNs in advancing the state-of-the-art of AIFR.
Abstract: While considerable progress has been made on face recognition, age-invariant face recognition (AIFR) still remains a major challenge in real-world applications of face recognition systems. The major difficulty of AIFR arises from the fact that facial appearance is subject to significant intra-personal changes caused by the aging process over time. In order to address this problem, we propose a novel deep face recognition framework to learn age-invariant deep face features through a carefully designed CNN model. To the best of our knowledge, this is the first attempt to show the effectiveness of deep CNNs in advancing the state-of-the-art of AIFR. Extensive experiments are conducted on several public-domain face aging datasets (MORPH Album2, FGNET, and CACD-VS) to demonstrate the effectiveness of the proposed model over the state-of-the-art. We also verify the excellent generalization of our new model on the famous LFW dataset.

Posted Content
TL;DR: In this paper, Robust Partially Coupled Networks (RPCN) are proposed to solve the Very Low Resolution Recognition (VLRR) problem using deep learning methods, taking advantage of techniques primarily in super resolution, domain adaptation and robust regression.
Abstract: Visual recognition research often assumes a sufficient resolution of the region of interest (ROI). That assumption is usually violated in practice, inspiring us to explore the Very Low Resolution Recognition (VLRR) problem. Typically, the ROI in a VLRR problem can be smaller than 16 × 16 pixels, and is challenging to recognize even for human experts. We attempt to solve the VLRR problem using deep learning methods. Taking advantage of techniques primarily in super resolution, domain adaptation and robust regression, we formulate a dedicated deep learning method and demonstrate how these techniques are incorporated step by step. Any extra complexity, when introduced, is fully justified by both analysis and simulation results. The resulting Robust Partially Coupled Networks achieve feature enhancement and recognition simultaneously, allowing both the flexibility to combat the LR-HR domain mismatch and robustness to outliers. Finally, the effectiveness of the proposed models is evaluated on three different VLRR tasks, including face identification, digit recognition and font recognition, all of which obtain very impressive performance.

Journal ArticleDOI
TL;DR: A fully automatic face normalization and recognition system robust to the most common face variations in unconstrained environments; it improves the performance of AAM fitting by initializing the AAM with estimates of the facial landmark locations obtained by a method based on a flexible mixture of parts.
Abstract: We present a fully automatic face normalization and recognition system. It normalizes face images for both in-plane and out-of-plane pose variations. The performance of AAM fitting is improved using a novel initialization technique. HOG and Gabor features are fused using CCA to obtain more discriminative features. The proposed system recognizes non-frontal faces using only a single gallery sample. Single-sample face recognition has become an important problem because of the limitations on the availability of gallery images. In many real-world applications, such as passport or driver license identification, there is only a single facial image per subject available. The variations between the single gallery face image and the probe face images, captured in unconstrained environments, make single-sample face recognition even more difficult. In this paper, we present a fully automatic face recognition system robust to the most common face variations in unconstrained environments. Our proposed system is capable of recognizing faces from non-frontal views and under different illumination conditions using only a single gallery sample for each subject. It normalizes the face images for both in-plane and out-of-plane pose variations using an enhanced technique based on active appearance models (AAMs). We improve the performance of AAM fitting not only by training it with in-the-wild images and using a powerful optimization technique, but also by initializing the AAM with estimates of the locations of the facial landmarks obtained by a method based on a flexible mixture of parts. The proposed initialization technique results in a significant improvement of AAM fitting to non-frontal poses and makes the normalization process robust, fast and reliable. Owing to the proper alignment of the face images made possible by this approach, we can use local feature descriptors, such as Histograms of Oriented Gradients (HOG), for matching. The use of HOG features makes the system robust against illumination variations. In order to improve the discriminating information content of the feature vectors, we also extract Gabor features from the normalized face images and fuse them with HOG features using Canonical Correlation Analysis (CCA). Experimental results on various databases outperform the state-of-the-art methods and show the effectiveness of our proposed method in the normalization and recognition of face images obtained in unconstrained environments.
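The CCA fusion step maps directly onto scikit-learn. This sketch, with hypothetical names and an arbitrary component count, projects HOG and Gabor features into a shared correlated subspace and concatenates the projections:

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def cca_fuse(hog_feats, gabor_feats, n_components=64):
    """Feature-level fusion with Canonical Correlation Analysis.
    Inputs are (N, D1) and (N, D2) feature matrices for the same N
    faces; n_components must not exceed the smaller dimension."""
    cca = CCA(n_components=n_components)
    hog_c, gabor_c = cca.fit_transform(hog_feats, gabor_feats)
    return np.hstack([hog_c, gabor_c])  # (N, 2 * n_components)
```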

Proceedings ArticleDOI
13 Jun 2016
TL;DR: A deep TransfeR NIR-VIS heterogeneous facE recognition neTwork (TRIVET), which uses a deep convolutional neural network with ordinal measures to learn discriminative models, achieves state-of-the-art recognition performance on the most challenging CASIA NIR-VIS 2.0 Face Database.
Abstract: One task of heterogeneous face recognition is to match a near infrared (NIR) face image to a visible light (VIS) image. In practice, there are often only a few pairwise NIR-VIS face images, but it is easy to collect lots of VIS face images. Therefore, how to use these unpaired VIS images to improve NIR-VIS recognition accuracy is an ongoing issue. This paper presents a deep TransfeR NIR-VIS heterogeneous facE recognition neTwork (TRIVET) for NIR-VIS face recognition. First, to utilize large numbers of unpaired VIS face images, we employ a deep convolutional neural network (CNN) with ordinal measures to learn discriminative models. The ordinal activation function (Max-Feature-Map) is used to select discriminative features and make the models robust and lightweight. Second, we transfer these models to the NIR-VIS domain by fine-tuning with two types of NIR-VIS triplet loss. The triplet loss not only reduces intra-class NIR-VIS variations but also augments the number of positive training sample pairs, making it possible to fine-tune deep models on a small dataset. The proposed method achieves state-of-the-art recognition performance on the most challenging CASIA NIR-VIS 2.0 Face Database. It achieves a new record with a rank-1 accuracy of 95.74% and a verification rate of 91.03% at FAR = 0.001, cutting the error rate by 69% in comparison with the previous best result [27].
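The fine-tuning objective is a standard triplet loss applied across modalities. A generic PyTorch sketch is shown below (PyTorch also ships nn.TripletMarginLoss); the margin value is an arbitrary choice, not the paper's setting, and the cross-modal sampling of anchors/positives/negatives is left to the caller.

```python
import torch.nn.functional as F

def nir_vis_triplet_loss(anchor, positive, negative, margin=0.2):
    """Generic triplet loss: pull the anchor towards a positive of the
    same identity (possibly from the other modality) and push it away
    from a negative by at least `margin`. All inputs are (B, D)."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```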

Proceedings ArticleDOI
07 Mar 2016
TL;DR: A novel representation of face recognition using multiple pose-aware deep learning models achieves better results than the state-of-the-art on IARPA's CS2 and NIST's IJB-A in both verification and identification tasks.
Abstract: We introduce our method and system for face recognition using multiple pose-aware deep learning models. In our representation, a face image is processed by several pose-specific deep convolutional neural network (CNN) models to generate multiple pose-specific features. 3D rendering is used to generate multiple face poses from the input image. Sensitivity of the recognition system to pose variations is reduced since we use an ensemble of pose-specific CNN features. The paper presents extensive experimental results on the effect of landmark detection, CNN layer selection and pose model selection on the performance of the recognition pipeline. Our novel representation achieves better results than the state-of-the-art on IARPA's CS2 and NIST's IJB-A in both verification and identification (i.e. search) tasks.

Journal ArticleDOI
TL;DR: A new partial face recognition approach to recognize persons of interest from their partial faces using a robust point set matching method, where both the textural information and geometrical information of local features are explicitly used for matching simultaneously.
Abstract: Over the past three decades, a number of face recognition methods have been proposed in computer vision, and most of them use holistic face images for person identification. In many real-world scenarios, especially in unconstrained environments, human faces might be occluded by other objects, and it is difficult to obtain fully holistic face images for recognition. To address this, we propose a new partial face recognition approach to recognize persons of interest from their partial faces. Given a gallery image and a probe face patch, we first detect keypoints and extract their local textural features. Then, we propose a robust point set matching method to discriminatively match the two extracted local feature sets, where both the textural information and the geometrical information of the local features are explicitly used for matching simultaneously. Finally, the similarity of two faces is computed as the distance between these two aligned feature sets. Experimental results on four public face data sets show the effectiveness of the proposed approach.
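A bare-bones OpenCV sketch of keypoint-based partial matching is given below. It uses ORB descriptors and Lowe's ratio test purely for illustration, and omits the geometric-consistency constraint that is central to the paper's point set matching method.

```python
import cv2

def match_partial_face(gallery_img, probe_patch, ratio=0.75):
    """Detect keypoints, extract local descriptors, and count matches
    that pass the ratio test as a crude similarity score."""
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(gallery_img, None)
    kp2, des2 = orb.detectAndCompute(probe_patch, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = matcher.knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]
    return len(good)
```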

Book ChapterDOI
08 Oct 2016
TL;DR: Experimental results show that CNNs trained on visible spectrum images can be used to obtain results that are on par or improve over the state-of-the-art for heterogeneous recognition with near-infrared images and sketches.
Abstract: Heterogeneous face recognition aims to recognize faces across different sensor modalities. Typically, gallery images are normal visible spectrum images, and probe images are infrared images or sketches. Recently significant improvements in visible spectrum face recognition have been obtained by CNNs learned from very large training datasets. In this paper, we are interested in the question to what extent the features from a CNN pre-trained on visible spectrum face images can be used to perform heterogeneous face recognition. We explore different metric learning strategies to reduce the discrepancies between the different modalities. Experimental results show that we can use CNNs trained on visible spectrum images to obtain results that are on par or improve over the state-of-the-art for heterogeneous recognition with near-infrared images and sketches.

Journal ArticleDOI
TL;DR: A novel method to automatically produce approximately axis-symmetrical virtual face images; it is mathematically tractable, easy to implement, and verified in comparison with state-of-the-art dictionary learning algorithms.

Proceedings ArticleDOI
01 Dec 2016
TL;DR: This paper investigates the application of deep features extracted from VGG-Net to iris recognition, showing promising results with a best accuracy rate of 99.4%, which outperforms the previous best result.
Abstract: The iris is a popular biometric that is widely used for identity authentication. Different features have been used to perform iris recognition in the past, most of them hand-crafted features designed by biometrics experts. Due to the tremendous success of deep learning in computer vision problems, there has been a lot of interest in applying features learned by convolutional neural networks on general image recognition to other tasks such as segmentation, face recognition, and object detection. In this paper, we have investigated the application of deep features extracted from VGG-Net for iris recognition. The proposed scheme has been tested on two well-known iris databases, and has shown promising results with a best accuracy rate of 99.4%, which outperforms the previous best result.
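Extracting off-the-shelf deep features of this kind takes only a few lines of PyTorch/torchvision. The particular truncation point below is an illustrative assumption; as the paper's experiments suggest, which layer works best is an empirical choice.

```python
import torch
from torchvision import models

def vgg_deep_features(batch):
    """Run ImageNet-normalized images (B, 3, 224, 224) through a
    pre-trained VGG-16 truncated at an intermediate conv block and
    return the flattened activations as feature vectors."""
    vgg = models.vgg16(pretrained=True).eval()
    truncated = vgg.features[:17]  # arbitrary intermediate cut-off
    with torch.no_grad():
        feats = truncated(batch)
    return feats.flatten(1)  # (B, D) descriptors
```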

Journal ArticleDOI
TL;DR: This work presents a comprehensive framework for understanding person recognition as it happens in the real world, and concludes that dynamic information plays the central role in binding multi-modal information from the face, body, and the voice to achieve robust and highly accurate recognition.

Proceedings ArticleDOI
01 Dec 2016
TL;DR: This paper is the first work to explore the possible use of deep learning for the micro-expression recognition task; it extends evolutionary algorithms to search for an optimal set of deep features that does not overfit the training data and generalizes well to the test data.
Abstract: Micro-expression recognition is a challenging task in computer vision due to the repressed facial appearance and short duration. Previous work on micro-expression recognition has used hand-crafted features like LBP-TOP, Gabor filters and optical flow. This paper is the first work to explore the possible use of deep learning for the micro-expression recognition task. Due to the lack of micro-expression data, training a CNN model from micro-expression data alone is not feasible. Instead, transfer learning from CNN models trained on objects and facial expressions is used. The aim is to use feature selection to remove the deep features that are irrelevant to our task. This work extends evolutionary algorithms to search for an optimal set of deep features that does not overfit the training data and generalizes well to the test data. Promising results are presented for various micro-expression datasets.

Book ChapterDOI
01 Jan 2016
TL;DR: This chapter analyzes the effects of intentional or unintentional face image alterations on face recognition algorithms and the human capabilities to deal with altered images in scenarios where the user template is created from printed photographs rather than from images acquired live during enrollment.
Abstract: Face recognition in controlled environments is nowadays considered rather reliable, and if the face is acquired in proper conditions, a good accuracy level can be achieved by state-of-the-art systems. However, we show that, even under these desirable conditions, some intentional or unintentional face image alterations can significantly affect the recognition performance. In particular, in scenarios where the user template is created from printed photographs rather than from images acquired live during enrollment (e.g., identity documents), digital image alterations can severely affect the recognition results. In this chapter, we analyze both the effects of such alterations on face recognition algorithms and the human capabilities to deal with altered images.

Proceedings ArticleDOI
27 Jun 2016
TL;DR: A novel face alignment method that cascades several Deep Regression networks coupled with De-corrupt Autoencoders (denoted as DRDA) to explicitly handle the partial occlusion problem, significantly outperforming the state-of-the-art methods.
Abstract: Face alignment or facial landmark detection plays an important role in many computer vision applications, e.g., face recognition, facial expression recognition, face animation, etc. However, the performance of face alignment system degenerates severely when occlusions occur. In this work, we propose a novel face alignment method, which cascades several Deep Regression networks coupled with De-corrupt Autoencoders (denoted as DRDA) to explicitly handle partial occlusion problem. Different from the previous works that can only detect occlusions and discard the occluded parts, our proposed de-corrupt autoencoder network can automatically recover the genuine appearance for the occluded parts and the recovered parts can be leveraged together with those non-occluded parts for more accurate alignment. By coupling de-corrupt autoencoders with deep regression networks, a deep alignment model robust to partial occlusions is achieved. Besides, our method can localize occluded regions rather than merely predict whether the landmarks are occluded. Experiments on two challenging occluded face datasets demonstrate that our method significantly outperforms the state-of-the-art methods.

Journal ArticleDOI
TL;DR: This paper gives an overview of face recognition methods based on artificial neural networks (ANNs), covering the strengths and limitations of the surveyed studies and systems and analysing the performance of different ANN approaches and algorithms.
Abstract: Face recognition from real data, such as captured images, sensor images and database images, is a challenging problem due to the wide variation in face appearance, illumination effects and the complexity of the image background. Face recognition is one of the most effective and relevant applications of image processing and biometric systems. In this paper, we discuss the face recognition methods and algorithms proposed by many researchers using artificial neural networks (ANNs), which have been widely used in the fields of image processing and pattern recognition. We also discuss how ANNs are used for face recognition systems and why they are more effective than other methods. Accordingly, this research includes a general review of face detection studies and systems based on different ANN approaches and algorithms. The strengths and limitations of these studies and systems are included, and the performance of the different ANN approaches and algorithms is analysed.

Proceedings ArticleDOI
13 Jun 2016
TL;DR: This work presents a novel, cross-modal approach that enhances existing solutions for face verification and uses multispectral short wave infrared (SWIR) imaging to ensure the authenticity of a face even in the presence of partial disguises and masks.
Abstract: Recent studies point out that spoofing attacks using facial masks are still a severe problem for current biometric face recognition (FR) systems. As such systems are becoming more frequently used, for example, for automated border crossing or access control to critical infrastructure, advanced anti-spoofing techniques are necessary to counter these attacks. This work presents a novel, cross-modal approach that enhances existing solutions for face verification and uses multispectral short wave infrared (SWIR) imaging to ensure the authenticity of a face even in the presence of partial disguises and masks. It is evaluated on a dataset containing 137 subjects and a variety of spoofing attacks. Using a commercial FR system, it successfully rejects all attempts to counterfeit a foreign face with a false acceptance rate FAR_cf = 0% and most attempts to disguise one's own identity with FAR_dg = 1%, at a false rejection rate FRR < 5%, using SWIR images for verification.

Journal ArticleDOI
TL;DR: A novel method for automatically recognizing facial expressions using Deep Convolutional Neural Network (DCNN) features is proposed; using these features, it achieves a state-of-the-art recognition rate.

Posted Content
TL;DR: This paper proposes an alternative way of employing the power of deep representations from CNNs: combined with conventional face localization techniques, off-the-shelf architectures trained for face recognition are used to build facial descriptors.
Abstract: Predicting attributes from face images in the wild is a challenging computer vision problem. To automatically describe face attributes from face-containing images, one traditionally needs to cascade three technical blocks (face localization, facial descriptor construction, and attribute classification) in a pipeline. As a typical classification problem, face attribute prediction has been addressed using deep learning. Current state-of-the-art performance was achieved by using two cascaded Convolutional Neural Networks (CNNs), which were specifically trained to learn face localization and attribute description. In this paper, we experiment with an alternative way of employing the power of deep representations from CNNs. Combining with conventional face localization techniques, we use off-the-shelf architectures trained for face recognition to build facial descriptors. Recognizing that the describable face attributes are diverse, our face descriptors are constructed from different levels of the CNNs for different attributes to best facilitate face attribute prediction. Experiments on two large datasets, LFWA and CelebA, show that our approach is entirely comparable to the state-of-the-art. Our findings not only demonstrate an efficient face attribute prediction approach, but also raise an important question: how to leverage the power of off-the-shelf CNN representations for novel tasks.

Journal ArticleDOI
TL;DR: The benefits of, as well as the challenges to, the use of face recognition as a biometric tool are exposed, and a detailed survey of some well-known methods is provided, expressing each method's principle.
Abstract: Despite the existence of various biometric techniques, such as fingerprints, iris scans and hand geometry, the most efficient and most widely used one is face recognition. This is because it is inexpensive, non-intrusive and natural. Therefore, researchers have developed dozens of face recognition techniques over the last few years. These techniques can generally be divided into three categories, based on the face data processing methodology: methods that use the entire face as input data for the proposed recognition system, methods that do not consider the whole face but only some features or areas of the face, and methods that use global and local face characteristics simultaneously. In this paper, we present an overview of some well-known methods in each of these categories. First, we expose the benefits of, as well as the challenges to, the use of face recognition as a biometric tool. Then, we present a detailed survey of the well-known methods by expressing each method's principle. After that, a comparison between the three categories of face recognition techniques is provided. Furthermore, the databases used in face recognition are mentioned, and some results of the applications of these methods on face recognition databases are presented. Finally, we highlight some new promising research directions that have recently appeared.