
Showing papers on "Three-dimensional face recognition" published in 2017


Proceedings ArticleDOI
01 Jul 2017
TL;DR: Quantitative and qualitative evaluations on both controlled and in-the-wild databases demonstrate the superiority of DR-GAN over the state of the art.
Abstract: The large pose discrepancy between two face images is one of the key challenges in face recognition. Conventional approaches for pose-invariant face recognition either perform face frontalization on, or learn a pose-invariant representation from, a non-frontal face image. We argue that it is more desirable to perform both tasks jointly to allow them to leverage each other. To this end, this paper proposes Disentangled Representation learning-Generative Adversarial Network (DR-GAN) with three distinct novelties. First, the encoder-decoder structure of the generator allows DR-GAN to learn a generative and discriminative representation, in addition to image synthesis. Second, this representation is explicitly disentangled from other face variations such as pose, through the pose code provided to the decoder and pose estimation in the discriminator. Third, DR-GAN can take one or multiple images as input, and generate one unified representation along with an arbitrary number of synthetic images. Quantitative and qualitative evaluations on both controlled and in-the-wild databases demonstrate the superiority of DR-GAN over the state of the art.
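To make the encoder-decoder idea concrete, here is a minimal PyTorch sketch of a DR-GAN-style generator: an encoder maps the face to an identity representation, a one-hot pose code and a noise vector are concatenated to it, and a decoder synthesizes a face at the target pose. All layer sizes, the 13-pose code, and the 320-D feature are illustrative assumptions, not the paper's exact architecture, and the adversarial training loop is omitted.

```python
# Minimal sketch of a DR-GAN-style generator (illustrative sizes, not the paper's).
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, feat_dim=320, n_poses=13, noise_dim=50):
        super().__init__()
        # Encoder: face image -> identity representation f(x).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 96 -> 48
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 48 -> 24
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(), # 24 -> 12
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, feat_dim),
        )
        # Decoder: [identity feature | pose code | noise] -> synthetic face.
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim + n_poses + noise_dim, 128 * 12 * 12), nn.ReLU(),
            nn.Unflatten(1, (128, 12, 12)),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, x, pose_code, noise):
        identity = self.encoder(x)                    # pose-disentangled representation
        z = torch.cat([identity, pose_code, noise], dim=1)
        return identity, self.decoder(z)

g = Generator()
x = torch.randn(4, 3, 96, 96)                         # batch of face crops
pose = torch.eye(13)[torch.randint(0, 13, (4,))]      # one-hot target pose
ident, synth = g(x, pose, torch.randn(4, 50))
print(ident.shape, synth.shape)                       # (4, 320) and (4, 3, 96, 96)
```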

1,016 citations


Journal ArticleDOI
TL;DR: A simple solution for facial expression recognition that uses a combination of a Convolutional Neural Network and specific image pre-processing steps to extract only expression-specific features from a face image, and explores the presentation order of the samples during training.

639 citations


Proceedings ArticleDOI
21 Jul 2017
TL;DR: In this article, the authors explore three aspects of the problem in the context of finding small faces: the role of scale invariance, image resolution, and contextual reasoning. They train separate detectors for different scales.
Abstract: Though tremendous strides have been made in object recognition, one of the remaining open challenges is detecting small objects. We explore three aspects of the problem in the context of finding small faces: the role of scale invariance, image resolution, and contextual reasoning. While most recognition approaches aim to be scale-invariant, the cues for recognizing a 3px tall face are fundamentally different from those for recognizing a 300px tall face. We take a different approach and train separate detectors for different scales. To maintain efficiency, detectors are trained in a multi-task fashion: they make use of features extracted from multiple layers of a single (deep) feature hierarchy. While training detectors for large objects is straightforward, the crucial challenge remains training detectors for small objects. We show that context is crucial, and define templates that make use of massively-large receptive fields (where 99% of the template extends beyond the object of interest). Finally, we explore the role of scale in pre-trained deep networks, providing ways to extrapolate networks tuned for limited scales to rather extreme ranges. We demonstrate state-of-the-art results on massively-benchmarked face datasets (FDDB and WIDER FACE). In particular, when compared to prior art on WIDER FACE, our results reduce error by a factor of 2 (our models produce an AP of 82% while prior art ranges from 29-64%).
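A rough sketch of the shared-hierarchy idea: one backbone exposes features at two depths, the deeper (contextual) features are upsampled and concatenated with the shallower (high-resolution) ones, and a separate small head scores each face scale. The layer sizes and three-scale split are assumptions for illustration; the paper's templates and training procedure are far more elaborate.

```python
# Sketch of scale-specific face templates sharing one feature hierarchy
# (illustrative; not the paper's architecture or training code).
import torch
import torch.nn as nn

class SharedBackbone(nn.Module):
    """Returns features mixing two depths so each scale-specific head sees both."""
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                                    nn.MaxPool2d(2))           # stride 2
        self.stage2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                                    nn.MaxPool2d(2))           # stride 4

    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        # Upsample deeper features and concatenate: shallow layers keep
        # resolution for tiny faces, deep layers add context.
        f2_up = nn.functional.interpolate(f2, size=f1.shape[-2:],
                                          mode="bilinear", align_corners=False)
        return torch.cat([f1, f2_up], dim=1)                   # (B, 96, H/2, W/2)

# One small detection head per face scale; each outputs a per-location score map.
backbone = SharedBackbone()
heads = nn.ModuleDict({
    "tiny":   nn.Conv2d(96, 1, 1),   # e.g. faces a few pixels tall
    "medium": nn.Conv2d(96, 1, 1),
    "large":  nn.Conv2d(96, 1, 1),
})

img = torch.randn(1, 3, 128, 128)
feats = backbone(img)
scores = {name: head(feats) for name, head in heads.items()}
print({k: tuple(v.shape) for k, v in scores.items()})
```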

579 citations


Proceedings ArticleDOI
21 Jul 2017
TL;DR: This paper uses a CNN to regress 3DMM shape and texture parameters directly from an input photo and achieves state-of-the-art results on the LFW, YTF, and IJB-A benchmarks.
Abstract: The 3D shapes of faces are well known to be discriminative. Yet despite this, they are rarely used for face recognition and always under controlled viewing conditions. We claim that this is a symptom of a serious but often overlooked problem with existing methods for single view 3D face reconstruction: when applied in the wild, their 3D estimates are either unstable and change for different photos of the same subject or they are over-regularized and generic. In response, we describe a robust method for regressing discriminative 3D morphable face models (3DMM). We use a convolutional neural network (CNN) to regress 3DMM shape and texture parameters directly from an input photo. We overcome the shortage of training data required for this purpose by offering a method for generating huge numbers of labeled examples. The 3D estimates produced by our CNN surpass state of the art accuracy on the MICC data set. Coupled with a 3D-3D face matching pipeline, we show the first competitive face recognition results on the LFW, YTF and IJB-A benchmarks using 3D face shapes as representations, rather than the opaque deep feature vectors used by other modern systems.
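The regression setup can be sketched in a few lines: a CNN trunk pools to a global feature and a linear head emits shape and texture coefficient vectors. The trunk and the 99+99 coefficient split are assumed for illustration; the paper's network and its generated training labels are not reproduced here.

```python
# Sketch: a CNN that regresses 3DMM shape and texture coefficients from a photo
# (layer sizes and coefficient counts are illustrative assumptions).
import torch
import torch.nn as nn

N_SHAPE, N_TEXTURE = 99, 99   # assumed number of 3DMM coefficients

class MorphableModelRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(128, N_SHAPE + N_TEXTURE)

    def forward(self, x):
        p = self.head(self.features(x))
        return p[:, :N_SHAPE], p[:, N_SHAPE:]   # shape coeffs, texture coeffs

model = MorphableModelRegressor()
shape, texture = model(torch.randn(2, 3, 224, 224))
# Training would minimize e.g. an L2 loss against coefficients fitted to the
# generated labeled examples; the 3D mesh is then mean + basis @ shape.
print(shape.shape, texture.shape)
```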

451 citations


Journal ArticleDOI
16 Mar 2017-Sensors
TL;DR: The experimental results show that the proposed person recognition method using the information extracted from body images is effective at enhancing recognition accuracy compared to systems that use only visible light or thermal images of the human body.
Abstract: The human body contains identity information that can be used for the person recognition (verification/recognition) problem. In this paper, we propose a person recognition method using the information extracted from body images. Our research is novel in the following three ways compared to previous studies. First, we use images of the human body for recognizing individuals. To overcome the limitations of previous studies on body-based person recognition that use only visible light images for recognition, we use human body images captured by two different kinds of camera: a visible light camera and a thermal camera. The use of two different kinds of body image helps us to reduce the effects of noise, background, and variation in the appearance of the human body. Second, we apply a state-of-the-art method, the convolutional neural network (CNN), for image feature extraction in order to overcome the limitations of traditional hand-designed image feature extraction methods. Finally, with the image features extracted from body images, the recognition task is performed by measuring the distance between the input and enrolled samples. The experimental results show that the proposed method is effective at enhancing recognition accuracy compared to systems that use only visible light or thermal images of the human body.
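A compact sketch of the two-camera pipeline described above, with stand-in CNNs rather than the paper's trained networks: each modality gets its own feature extractor, the features are concatenated into one body descriptor, and a probe is matched to the enrolled gallery by smallest Euclidean distance.

```python
# Sketch of the two-stream idea: separate CNN features for visible-light and
# thermal body images, then nearest-enrolled-sample matching by distance.
import torch
import torch.nn as nn

def make_stream():
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),          # 32-D per stream
    )

visible_cnn, thermal_cnn = make_stream(), make_stream()

def embed(visible, thermal):
    # Concatenate per-modality features into one body descriptor.
    return torch.cat([visible_cnn(visible), thermal_cnn(thermal)], dim=1)

# Enrolled gallery: one descriptor per known person.
gallery = embed(torch.randn(5, 3, 128, 64), torch.randn(5, 3, 128, 64))
probe = embed(torch.randn(1, 3, 128, 64), torch.randn(1, 3, 128, 64))

dists = torch.cdist(probe, gallery)        # Euclidean distance to each enrollee
print("matched identity:", dists.argmin(dim=1).item())
```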

335 citations


Book ChapterDOI
TL;DR: A face detection approach named Contextual Multi-Scale Region-based Convolutional Neural Network (CMS-RCNN) is presented to robustly solve the problems mentioned above; it allows explicit body contextual reasoning in the network, inspired by the intuition of the human vision system.
Abstract: Robust face detection in the wild is one of the ultimate components supporting various facial-related problems, i.e., unconstrained face recognition, facial periocular recognition, facial landmarking and pose estimation, facial expression recognition, 3D facial model construction, etc. Although the face detection problem has been intensely studied for decades with various commercial applications, it still meets problems in some real-world scenarios due to numerous challenges, e.g., heavy facial occlusions, extremely low resolutions, strong illumination, exceptional pose variations, image or video compression artifacts, etc. In this paper, we present a face detection approach named Contextual Multi-Scale Region-based Convolutional Neural Network (CMS-RCNN) to robustly solve the problems mentioned above. Similar to region-based CNNs, our proposed network consists of a region proposal component and a region-of-interest (RoI) detection component. However, unlike those networks, our proposed network makes two main contributions that play a significant role in achieving state-of-the-art performance in face detection. First, multi-scale information is grouped in both region proposal and RoI detection to deal with tiny face regions. Second, our proposed network allows explicit body contextual reasoning, inspired by the intuition of the human vision system. The proposed approach is benchmarked on two recent challenging face detection databases, i.e., the WIDER FACE Dataset, which contains a high degree of variability, as well as the Face Detection Dataset and Benchmark (FDDB). The experimental results show that our proposed approach trained on the WIDER FACE Dataset outperforms strong baselines on the WIDER FACE Dataset by a large margin, and consistently achieves competitive results on FDDB against the recent state-of-the-art face detection methods.

256 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: In this article, a 3D Convolutional Neural Network (CNN) is proposed for facial expression recognition in videos, consisting of 3D Inception-ResNet layers followed by an LSTM unit that together extract the spatial relations within facial images as well as the temporal relations between different frames in the video.
Abstract: Deep Neural Networks (DNNs) have been shown to outperform traditional methods in various visual recognition tasks, including Facial Expression Recognition (FER). In spite of efforts made to improve the accuracy of FER systems using DNNs, existing methods are still not generalizable enough in practical applications. This paper proposes a 3D Convolutional Neural Network method for FER in videos. The new network architecture consists of 3D Inception-ResNet layers followed by an LSTM unit that together extract the spatial relations within facial images as well as the temporal relations between different frames in the video. Facial landmark points are also used as inputs to our network, which emphasizes the importance of facial components over facial regions that may not contribute significantly to generating facial expressions. Our proposed method is evaluated using four publicly available databases in subject-independent and cross-database tasks and outperforms state-of-the-art methods.
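A toy version of the 3D-conv + LSTM pattern (standing in for the paper's 3D Inception-ResNet blocks; sizes are illustrative): 3D convolutions mix space and time within the clip, the spatial dimensions are pooled away while the time axis is kept, and an LSTM models the longer-range frame-to-frame relations before classification.

```python
# Sketch of the 3D-conv + LSTM pattern for video expression recognition.
import torch
import torch.nn as nn

class Video3DNet(nn.Module):
    def __init__(self, n_classes=7, hidden=128):
        super().__init__()
        # 3D convolutions mix space and time within short clips.
        self.conv3d = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)),   # pool space, keep the time axis
        )
        # The LSTM then models longer-range temporal relations across frames.
        self.lstm = nn.LSTM(input_size=32, hidden_size=hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, clip):                      # clip: (B, C, T, H, W)
        f = self.conv3d(clip)                     # (B, 32, T, 1, 1)
        f = f.squeeze(-1).squeeze(-1).transpose(1, 2)   # (B, T, 32)
        out, _ = self.lstm(f)
        return self.classifier(out[:, -1])        # predict from the last step

logits = Video3DNet()(torch.randn(2, 3, 16, 64, 64))
print(logits.shape)                               # torch.Size([2, 7])
```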

220 citations


Journal ArticleDOI
TL;DR: A novel method called the Facial Dynamics Map is proposed to characterize the movements of a microexpression at different granularities, and a classifier is developed to identify the presence of microexpressions and to categorize the different types.
Abstract: Unlike conventional facial expressions, microexpressions are instantaneous and involuntary reflections of human emotion. Because microexpressions are fleeting, lasting only a few frames within a video sequence, they are difficult to perceive and interpret correctly, and they are highly challenging to identify and categorize automatically. Existing recognition methods are often ineffective at handling subtle face displacements, which can be prevalent in typical microexpression applications due to the constant movements of the individuals being observed. To address this problem, a novel method called the Facial Dynamics Map is proposed to characterize the movements of a microexpression at different granularities. Specifically, an algorithm based on optical flow estimation is used to perform pixel-level alignment for microexpression sequences. Each expression sequence is then divided into spatiotemporal cuboids at the chosen granularity. We also present an iterative optimal strategy to calculate the principal optical flow direction of each cuboid for better representation of the local facial dynamics. With these principal directions, the resulting Facial Dynamics Map can characterize a microexpression sequence. Finally, a classifier is developed to identify the presence of microexpressions and to categorize the different types. Experimental results on four benchmark datasets demonstrate higher recognition performance and improved interpretability.
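A simplified sketch of the Facial Dynamics Map computation, assuming OpenCV's Farneback optical flow and using a plain per-cuboid mean as a stand-in for the paper's iterative principal-direction strategy:

```python
# Sketch: estimate optical flow between frames, split the sequence into
# spatiotemporal cuboids, and keep one dominant flow direction per cuboid.
import numpy as np
import cv2

def facial_dynamics_map(frames, grid=4):
    """frames: list of equally sized grayscale uint8 face images."""
    h, w = frames[0].shape
    ch, cw = h // grid, w // grid
    flows = [cv2.calcOpticalFlowFarneback(a, b, None, 0.5, 3, 15, 3, 5, 1.2, 0)
             for a, b in zip(frames[:-1], frames[1:])]      # (h, w, 2) each
    stacked = np.stack(flows)                               # (T-1, h, w, 2)
    fdm = np.zeros((grid, grid, 2))
    for i in range(grid):
        for j in range(grid):
            cuboid = stacked[:, i*ch:(i+1)*ch, j*cw:(j+1)*cw, :]
            fdm[i, j] = cuboid.reshape(-1, 2).mean(axis=0)  # direction proxy
    return fdm

frames = [np.random.randint(0, 255, (64, 64), np.uint8) for _ in range(5)]
print(facial_dynamics_map(frames).shape)                    # (4, 4, 2)
```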

217 citations


Journal ArticleDOI
TL;DR: It is suggested that the performance of pain assessment can be enhanced by feeding the raw frames to deep learning models, outperforming the latest state-of-the-art results while also directly facing the problem of imbalanced data.
Abstract: Pain is an unpleasant feeling that has been shown to be an important factor in the recovery of patients. Since assessing pain is costly in human resources and difficult to do objectively, there is a need for automatic systems to measure it. In this paper, contrary to current state-of-the-art techniques in pain assessment, which are based on facial features only, we suggest that performance can be enhanced by feeding the raw frames to deep learning models, outperforming the latest state-of-the-art results while also directly facing the problem of imbalanced data. As a baseline, our approach first uses convolutional neural networks (CNNs) to learn facial features from VGG_Faces, which are then linked to a long short-term memory to exploit the temporal relation between video frames. We further compare the performance of the popular schema based on the canonically normalized appearance versus taking the whole image into account. As a result, we outperform the current state-of-the-art area under the curve performance on the UNBC-McMaster Shoulder Pain Expression Archive Database. In addition, to evaluate the generalization properties of our proposed methodology on facial motion recognition, we also report competitive results on the Cohn-Kanade+ facial expression database.

216 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: The inherent correlation between face detection and facial expression recognition is exploited, and the results of facial expression recognition based on MTCNN are reported.
Abstract: The Multi-task Cascaded Convolutional Networks (MTCNN) framework has recently demonstrated impressive results on joint face detection and alignment. By using hard sample mining and training a model on the FER2013 dataset, we exploit the inherent correlation between face detection and facial expression recognition, and report the results of facial expression recognition based on MTCNN.
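A sketch of the detect-then-classify pipeline, assuming the third-party facenet-pytorch package for MTCNN; the expression classifier is an untrained stand-in shaped for FER2013-style 48x48 grayscale crops:

```python
# Sketch: MTCNN crops and aligns the face, a small CNN predicts the expression.
import torch
import torch.nn as nn
import torch.nn.functional as F
from PIL import Image
from facenet_pytorch import MTCNN   # assumed third-party dependency

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

detector = MTCNN(image_size=48, margin=0)       # detection + aligned 48x48 crop

classifier = nn.Sequential(                      # FER2013-style grayscale input
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(64 * 12 * 12, len(EMOTIONS)),
)

def predict_expression(path):
    face = detector(Image.open(path).convert("RGB"))   # (3, 48, 48) or None
    if face is None:
        return None
    gray = face.mean(dim=0, keepdim=True).unsqueeze(0) # (1, 1, 48, 48)
    probs = F.softmax(classifier(gray), dim=1)
    return EMOTIONS[probs.argmax().item()]

# print(predict_expression("face.jpg"))
```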

170 citations


Journal ArticleDOI
06 Jun 2017-Sensors
TL;DR: A finger-vein recognition method that is robust to various database types and environmental changes, based on the convolutional neural network (CNN), is proposed and shows better performance than conventional methods.
Abstract: Conventional finger-vein recognition systems perform recognition based on finger-vein lines extracted from the input images, or on image enhancement and texture feature extraction from the finger-vein images. In these cases, however, inaccurate detection of finger-vein lines lowers the recognition accuracy, and in the case of texture feature extraction, the developer must experimentally decide on the form of the optimal filter for extraction considering the characteristics of the image database. To address this problem, this research proposes a finger-vein recognition method that is robust to various database types and environmental changes, based on the convolutional neural network (CNN). In experiments using the two finger-vein databases constructed in this research and the SDUMLA-HMT finger-vein database, which is an open database, the proposed method showed better performance than conventional methods.

Journal ArticleDOI
TL;DR: This survey presents the state of the art in 3D face recognition using local features, with the main focus on the extraction of these features.

Journal ArticleDOI
TL;DR: This paper proposes a novel scene text recognition technique that performs word level recognition without character segmentation and adapts the recurrent neural network with Long Short Term Memory, the technique that has been widely used for handwriting recognition in recent years.

Proceedings ArticleDOI
04 Apr 2017
TL;DR: The vulnerability of biometric systems to morphed face attacks is investigated by evaluating the techniques proposed to detect morphed face images, and two new databases are created to study the vulnerability of state-of-the-art face recognition systems with a comprehensive evaluation.
Abstract: Morphed face images are artificially generated images, which blend the facial images of two or more different data subjects into one. The resulting morphed image resembles the constituent faces, both in visual and feature representation. If a morphed image is enrolled as a probe in a biometric system, the data subjects contributing to the morphed image will be verified against the enrolled probe. As a result of this infiltration, which is referred to as a morphed face attack, the unambiguous assignment of data subjects is not warranted, i.e., the unique link between subject and probe is annulled. In this work, we investigate the vulnerability of biometric systems to such morphed face attacks by evaluating the techniques proposed to detect morphed face images. We create two new databases by printing and scanning digitally morphed images using two different types of scanners, a flatbed scanner and a line scanner. Further, the newly created databases are employed to study the vulnerability of state-of-the-art face recognition systems with a comprehensive evaluation.
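The simplest way to see why a morph resembles both contributors is a pixel-wise alpha blend of two aligned faces; real morphing tools additionally warp facial landmarks before blending, and the paper then prints and scans the result. A minimal sketch:

```python
# Sketch of the simplest morph: an alpha blend of two aligned face images.
import numpy as np
from PIL import Image

def alpha_morph(path_a, path_b, alpha=0.5, size=(256, 256)):
    a = np.asarray(Image.open(path_a).convert("RGB").resize(size), dtype=np.float32)
    b = np.asarray(Image.open(path_b).convert("RGB").resize(size), dtype=np.float32)
    morph = alpha * a + (1.0 - alpha) * b   # blends appearance of both subjects
    return Image.fromarray(morph.astype(np.uint8))

# alpha_morph("subject1.png", "subject2.png").save("morph.png")
# The paper's print-and-scan step would then re-digitize morph.png with a
# flatbed or line scanner before attacking the recognition system.
```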

Journal ArticleDOI
TL;DR: To better exploit the nonlinearity of face samples from different image sets, a deep SFDL (D-SFDL) method is proposed by jointly learning hierarchical non-linear transformations and class-specific dictionaries to further improve the recognition performance.
Abstract: In this paper, we propose a simultaneous feature and dictionary learning (SFDL) method for image set-based face recognition, where each training and testing example contains a set of face images captured from different variations of pose, illumination, expression, resolution, and motion. While a variety of feature learning and dictionary learning methods have been proposed in recent years, and some of them have been successfully applied to image set-based face recognition, most of them learn features and dictionaries for facial image sets individually, which may not be powerful enough because some discriminative information for dictionary learning may be compromised in the feature learning stage if they are applied sequentially, and vice versa. To address this, we propose an SFDL method to learn discriminative features and dictionaries simultaneously from raw face pixels so that discriminative information from facial image sets can be jointly exploited by a one-stage learning procedure. To better exploit the nonlinearity of face samples from different image sets, we propose a deep SFDL (D-SFDL) method that jointly learns hierarchical non-linear transformations and class-specific dictionaries to further improve the recognition performance. Extensive experimental results on five widely used face data sets clearly show that our SFDL and D-SFDL achieve very competitive or even better performance compared with the state of the art.

Journal ArticleDOI
TL;DR: A highly efficient pose-invariant face recognition (PIFR) algorithm that effectively handles the main challenges caused by pose variation is proposed, together with an effective approach for occlusion detection that enables face recognition with only the visible patches.

Proceedings ArticleDOI
01 Jul 2017
TL;DR: This paper proposes an approach to extend the deep learning breakthrough for VIS face recognition to the NIR spectrum, without retraining the underlying deep models that see only VIS faces, and obtains state-of-the-art accuracy on the CASIA NIR-VIS v2.0 benchmark.
Abstract: Surveillance cameras today often capture NIR (near infrared) images in low-light environments. However, most face datasets accessible for training and verification are only collected in the VIS (visible light) spectrum. It remains a challenging problem to match NIR to VIS face images due to the different light spectra. Recently, breakthroughs have been made for VIS face recognition by applying deep learning to a huge amount of labeled VIS face samples. The same deep learning approach cannot be simply applied to NIR face recognition for two main reasons: first, far fewer NIR face images are available for training compared to the VIS spectrum; second, face galleries to be matched are mostly available only in the VIS spectrum. In this paper, we propose an approach to extend the deep learning breakthrough for VIS face recognition to the NIR spectrum, without retraining the underlying deep models that see only VIS faces. Our approach consists of two core components, cross-spectral hallucination and low-rank embedding, to optimize respectively the input and output of a VIS deep model for cross-spectral face recognition. Cross-spectral hallucination produces VIS faces from NIR images through a deep learning approach. Low-rank embedding restores a low-rank structure for the deep features of faces across both the NIR and VIS spectra. We observe that it is often equally effective to perform hallucination on input NIR images or low-rank embedding on output deep features for a VIS deep model for cross-spectral recognition. When hallucination and low-rank embedding are deployed together, we observe a significant further improvement and obtain state-of-the-art accuracy on the CASIA NIR-VIS v2.0 benchmark, without any need to re-train the recognition system.
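One simple reading of the low-rank embedding component, sketched with synthetic features: project deep features of both spectra onto the top singular directions fitted on paired training features, then match across spectra by cosine similarity. This is an illustration of the idea, not the paper's solver.

```python
# Sketch of a low-rank embedding for cross-spectral features (illustrative).
import numpy as np

rng = np.random.default_rng(0)
d, n, rank = 256, 100, 32

# Paired deep features for the same subjects (stand-ins for a VIS CNN's output).
vis_feats = rng.normal(size=(n, d))
nir_feats = vis_feats + 0.3 * rng.normal(size=(n, d))   # spectrum-shifted copies

# Fit the projection on the stacked pairs and keep the top-`rank` directions.
_, _, vt = np.linalg.svd(np.vstack([vis_feats, nir_feats]), full_matrices=False)
project = lambda f: f @ vt[:rank].T

# After projection, matching is plain cosine similarity across spectra.
v, m = project(vis_feats), project(nir_feats)
cos = (v @ m.T) / (np.linalg.norm(v, axis=1, keepdims=True)
                   * np.linalg.norm(m, axis=1))
print("rank-1 cross-spectral accuracy:",
      (cos.argmax(axis=1) == np.arange(n)).mean())
```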

Journal ArticleDOI
TL;DR: Five data augmentation methods dedicated to face images are proposed, including landmark perturbation and four synthesis methods (hairstyles, glasses, poses, illuminations); these effectively enlarge the training dataset, alleviating the impacts of misalignment, pose variance, illumination changes, and partial occlusions.
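Landmark perturbation, the first of the five methods, can be sketched as jittering the alignment landmarks so each epoch sees slightly differently aligned crops. The two-eye similarity-transform aligner and the jitter magnitude below are assumptions for illustration:

```python
# Sketch of landmark-perturbation augmentation: jitter the landmarks used for
# alignment, producing a slightly different crop each time.
import numpy as np
import cv2

def perturbed_align(img, left_eye, right_eye, out_size=112, sigma=2.0):
    rng = np.random.default_rng()
    src = (np.float32([left_eye, right_eye])
           + rng.normal(0, sigma, (2, 2)).astype(np.float32))  # jittered landmarks
    dst = np.float32([[0.35 * out_size, 0.4 * out_size],
                      [0.65 * out_size, 0.4 * out_size]])      # canonical eye spots
    m, _ = cv2.estimateAffinePartial2D(src, dst)               # similarity transform
    return cv2.warpAffine(img, m, (out_size, out_size))

face = np.random.randint(0, 255, (200, 200, 3), np.uint8)
crops = [perturbed_align(face, (70, 90), (130, 90)) for _ in range(4)]
print(crops[0].shape)   # (112, 112, 3), four slightly different alignments
```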

Journal ArticleDOI
TL;DR: This paper presents a new face descriptor, local directional ternary pattern (LDTP), for facial expression recognition that uses a two-level grid to construct the face descriptor while sampling expression-related information at different scales, and shows that the approaches improve the overall accuracy of facial expression recognition on six data sets.
Abstract: This paper presents a new face descriptor, local directional ternary pattern (LDTP), for facial expression recognition. LDTP efficiently encodes information of emotion-related features (i.e., eyes, eyebrows, upper nose, and mouth) by using directional information and a ternary pattern in order to take advantage of the robustness of edge patterns in edge regions while overcoming the weaknesses of edge-based methods in smooth regions. Our proposal, unlike existing histogram-based face description methods that divide the face into several regions and sample the codes uniformly, uses a two-level grid to construct the face descriptor while sampling expression-related information at different scales. We use a coarse grid for stable codes (highly related to non-expression) and a finer one for active codes (highly related to expression). This multi-level approach enables a finer-grained description of facial motions while still characterizing the coarse features of the expression. Moreover, we learn the active LDTP codes from the emotion-related facial regions. We tested our method by using person-dependent and independent cross-validation schemes to evaluate the performance. We show that our approaches improve the overall accuracy of facial expression recognition on six data sets.
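A rough sketch of a directional ternary code in the LDTP spirit, using Kirsch compass masks as stand-ins for the paper's directional operators: per pixel, keep the strongest of eight directional responses and ternary-quantize it around a threshold. The paper's exact encoding and two-level grid sampling are more involved.

```python
# Sketch of a directional ternary code (Kirsch masks assumed for illustration).
import numpy as np
from scipy.ndimage import convolve

def kirsch_masks():
    base = np.array([[-3, -3, 5], [-3, 0, 5], [-3, -3, 5]], float)
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    vals = [base[r, c] for r, c in ring]
    masks = []
    for k in range(8):                       # rotate the border 45 degrees at a time
        m = np.zeros((3, 3))
        for (r, c), v in zip(ring, np.roll(vals, k)):
            m[r, c] = v
        masks.append(m)
    return masks

def ldtp_like(img, tau=15.0):
    responses = np.stack([convolve(img, m) for m in kirsch_masks()])  # (8, H, W)
    direction = np.abs(responses).argmax(axis=0)                      # 0..7
    peak = np.take_along_axis(responses, direction[None], axis=0)[0]
    ternary = np.where(peak > tau, 1, np.where(peak < -tau, -1, 0))
    return direction, ternary   # histogram these over grid cells for the descriptor

img = np.random.rand(64, 64) * 255
d, t = ldtp_like(img)
print(d.shape, np.unique(t))
```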

Proceedings ArticleDOI
01 Sep 2017
TL;DR: A new deep learning based face recognition attendance system that is composed of several essential steps developed using today's most advanced techniques: CNN cascade for face detection and CNN for generating face embeddings.
Abstract: In the interest of recent accomplishments in the development of deep convolutional neural networks (CNNs) for face detection and recognition tasks, a new deep learning based face recognition attendance system is proposed in this paper. The entire process of developing a face recognition model is described in detail. The model is composed of several essential steps developed using today's most advanced techniques: a CNN cascade for face detection and a CNN for generating face embeddings. The primary goal of this research was the practical employment of these state-of-the-art deep learning approaches for face recognition tasks. Because CNNs achieve the best results for larger datasets, which is not the case in a production environment, the main challenge was applying these methods to smaller datasets. A new approach for image augmentation for face recognition tasks is proposed. The overall accuracy was 95.02% on a small dataset of original face images of employees in a real-time environment. The proposed face recognition model could be integrated into another system, with or without some minor alterations, as a supporting or main component for monitoring purposes.

Proceedings ArticleDOI
Ke Shan, Junqi Guo, Wenwan You, Di Lu, Rongfang Bie
07 Jun 2017
TL;DR: A deep convolutional neural network is employed to devise a facial expression recognition system capable of discovering deeper feature representations of facial expressions to achieve automatic recognition.
Abstract: Facial expression recognition, to which many researchers have devoted much effort, is an important part of affective computing and artificial intelligence. However, human facial expressions change so subtly that the recognition accuracy of most traditional approaches largely depends on feature extraction. Meanwhile, deep learning is a hot research topic in the field of machine learning, aiming to simulate the organizational structure of the human brain's nervous system and combine low-level features into more abstract ones. In this paper, we employ a deep convolutional neural network (CNN) to devise a facial expression recognition system capable of discovering deeper feature representations of facial expressions and thereby achieving automatic recognition. The proposed system is composed of the Input Module, the Pre-processing Module, the Recognition Module, and the Output Module. We introduce both the Japanese Female Facial Expression Database (JAFFE) and the Extended Cohn-Kanade Dataset (CK+) to simulate and evaluate the recognition performance under the influence of different factors (network structure, learning rate, and pre-processing). We also introduce a K-nearest neighbor (KNN) algorithm as a comparison with the CNN to make the results more convincing. The accuracy of the proposed system reaches 76.7442% and 80.303% on JAFFE and CK+, respectively, which demonstrates the feasibility and effectiveness of our system.

Journal ArticleDOI
TL;DR: Experimental validation on the standard Adience, Images of Groups, and MORPH II benchmarks show that including attention mechanisms enhances the performance of CNNs in terms of robustness and accuracy.

Proceedings ArticleDOI
01 May 2017
TL;DR: This work outlines the evaluation protocol, the data used, and the results of a baseline method for both sub-challenges of FERA 2017, the third challenge in automatic recognition of facial expressions, held in conjunction with the 12th IEEE Conference on Face and Gesture Recognition, May 2017.
Abstract: The field of Automatic Facial Expression Analysis has grown rapidly in recent years. However, despite progress in new approaches as well as benchmarking efforts, most evaluations still focus on either posed expressions, near-frontal recordings, or both. This makes it hard to tell how existing expression recognition approaches perform under conditions where faces appear in a wide range of poses (or camera views), displaying ecologically valid expressions. The main obstacle for assessing this is the availability of suitable data, and the challenge proposed here addresses this limitation. The FG 2017 Facial Expression Recognition and Analysis challenge (FERA 2017) extends FERA 2015 to the estimation of Action Units occurrence and intensity under different camera views. In this paper we present the third challenge in automatic recognition of facial expressions, to be held in conjunction with the 12th IEEE conference on Face and Gesture Recognition, May 2017, in Washington, United States. Two sub-challenges are defined: the detection of AU occurrence, and the estimation of AU intensity. In this work we outline the evaluation protocol, the data used, and the results of a baseline method for both sub-challenges.

Journal ArticleDOI
TL;DR: The results show that the optical flow information between the emotional face and the neutral face is a useful complement to spatial features and can effectively improve the performance of facial expression recognition from static images.

Journal ArticleDOI
TL;DR: This paper proposes a new feature descriptor called common encoding model for heterogeneous face recognition, which is able to capture common discriminant information, such that the large modality gap can be significantly reduced at the feature extraction stage.
Abstract: Heterogeneous face recognition is an important, yet challenging problem in face recognition community. It refers to matching a probe face image to a gallery of face images taken from alternate imaging modality. The major challenge of heterogeneous face recognition lies in the great discrepancies between different image modalities. Conventional face feature descriptors, e.g., local binary patterns, histogram of oriented gradients, and scale-invariant feature transform, are mostly designed in a handcrafted way and thus generally fail to extract the common discriminant information from the heterogeneous face images. In this paper, we propose a new feature descriptor called common encoding model for heterogeneous face recognition, which is able to capture common discriminant information, such that the large modality gap can be significantly reduced at the feature extraction stage. Specifically, we turn a face image into an encoded one with the encoding model learned from the training data, where the difference of the encoded heterogeneous face images of the same person can be minimized. Based on the encoded face images, we further develop a discriminant matching method to infer the hidden identity information of the cross-modality face images for enhanced recognition performance. The effectiveness of the proposed approach is demonstrated (on several public-domain face datasets) in two typical heterogeneous face recognition scenarios: matching NIR faces to VIS faces and matching sketches to photographs.

Journal ArticleDOI
TL;DR: This article proposes a new micro- expression recognition approach based on the Eulerian motion magnification technique, which could reveal the hidden information and accentuate the subtle changes in micro-expression motion.
Abstract: Facial expression recognition has been intensively studied for decades, notably by the psychology community and more recently the pattern recognition community. What is more challenging, and the subject of more recent research, is the problem of recognizing subtle emotions exhibited by so-called micro-expressions. Recognizing a micro-expression is substantially more challenging than conventional expression recognition because these micro-expressions are only temporally exhibited in a fraction of a second and involve minute spatial changes. Until now, work in this field has been at a nascent stage, with only a few existing micro-expression databases and methods. In this article, we propose a new micro-expression recognition approach based on the Eulerian motion magnification technique, which can reveal hidden information and accentuate the subtle changes in micro-expression motion. Validation of our proposal was done on the recently proposed CASME II dataset in comparison with baseline and state-of-the-art methods. We achieve a good recognition accuracy of up to 75.30% using the leave-one-out cross-validation protocol. Extensive experiments on the various factors at play further demonstrate the effectiveness of our proposed approach.
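The temporal core of Eulerian magnification can be sketched in a few lines: band-pass filter each pixel's intensity over time and add the amplified band back. The full method operates on a spatial pyramid; the band limits and amplification factor below are illustrative:

```python
# Sketch of Eulerian magnification on a pixel time series.
import numpy as np
from scipy.signal import butter, filtfilt

def magnify(frames, fps=30.0, lo=0.4, hi=4.0, amplification=10.0):
    """frames: (T, H, W) float array of a cropped face sequence."""
    b, a = butter(2, [lo / (fps / 2), hi / (fps / 2)], btype="band")
    filtered = filtfilt(b, a, frames, axis=0)     # per-pixel temporal band-pass
    return np.clip(frames + amplification * filtered, 0, 255)

video = np.random.rand(60, 48, 48) * 255          # stand-in for a CASME II clip
print(magnify(video).shape)                        # (60, 48, 48)
```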

Journal ArticleDOI
TL;DR: The system uses a Microsoft Kinect sensor as a wearable device, performs face detection, and uses temporal coherence along with a simple biometric procedure to generate a sound associated with the identified person, virtualized at his/her estimated 3-D location.
Abstract: In this paper, we introduce a real-time face recognition (and announcement) system targeted at aiding blind and low-vision people. The system uses a Microsoft Kinect sensor as a wearable device, performs face detection, and uses temporal coherence along with a simple biometric procedure to generate a sound associated with the identified person, virtualized at his/her estimated 3-D location. Our approach uses a variation of the K-nearest neighbors algorithm over histogram of oriented gradient descriptors dimensionally reduced by principal component analysis. The results show that our approach, on average, outperforms traditional face recognition methods while requiring far fewer computational resources (memory, processing power, and battery life) than existing techniques in the literature, making it suitable for the wearable hardware constraints. We also show the performance of the system in the dark, using depth-only information acquired with Kinect's infrared camera. The validation uses a new dataset available for download, with 600 videos of 30 people, containing variation of illumination, background, and movement patterns. Experiments with existing datasets in the literature are also considered. Finally, we conducted user experience evaluations on both blindfolded and visually impaired users, showing encouraging results.
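The recognition core described above (HOG descriptors, PCA reduction, KNN classification) maps directly onto standard scikit-image and scikit-learn calls; the sketch below uses toy data and illustrative sizes:

```python
# Sketch of HOG + PCA + KNN recognition on toy data.
import numpy as np
from skimage.feature import hog
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

def describe(face_gray):
    return hog(face_gray, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

rng = np.random.default_rng(0)
faces = rng.random((30, 64, 64))                 # stand-in face crops
labels = np.repeat(np.arange(10), 3)             # 10 people, 3 crops each

model = make_pipeline(PCA(n_components=20), KNeighborsClassifier(n_neighbors=3))
model.fit(np.stack([describe(f) for f in faces]), labels)
print(model.predict(describe(faces[0])[None]))   # predicted identity, first crop
```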

Journal ArticleDOI
TL;DR: Experimental results on the JAFFE and Cohn-Kanade data sets show that the proposed TFP method outperforms some state-of-the-art LBP-based feature extraction methods for facial expression feature extraction and is suitable for real-time applications.
Abstract: The aim of an automatic video-based facial expression recognition system is to detect and classify human facial expressions from an image sequence. An integrated automatic system often involves two components: 1) peak expression frame detection and 2) expression feature extraction. In comparison with image-based expression recognition systems, a video-based recognition system often performs online detection, which prefers low-dimensional feature representation for cost-effectiveness. Moreover, effective feature extraction is needed for classification. Many recent recognition systems incorporate rich additional subjective information and thus become less efficient for real-time applications. In our facial expression recognition system, we first propose the double local binary pattern (DLBP) to detect the peak expression frame from the video. The proposed DLBP method has a much lower-dimensional size and can successfully reduce detection time. Besides, to handle the illumination variations in LBP, the Logarithm-Laplace (LL) domain is further proposed to obtain a more robust facial feature for detection. Finally, the Taylor expansion theorem is employed in our system for the first time to extract facial expression features. We propose the Taylor feature pattern (TFP), based on LBP and Taylor expansion, to obtain an effective facial feature from the Taylor feature map. Experimental results on the JAFFE and Cohn-Kanade data sets show that the proposed TFP method outperforms some state-of-the-art LBP-based feature extraction methods for facial expression feature extraction and is suitable for real-time applications.
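A bare-bones version of LBP-based peak-frame detection: take the frame whose LBP histogram differs most from the first (assumed neutral) frame as the apex. The paper's DLBP and Taylor feature pattern refine this basic idea considerably:

```python
# Sketch of LBP-based peak (apex) frame detection.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_hist(img, p=8, r=1):
    codes = local_binary_pattern(img, p, r, method="uniform")
    hist, _ = np.histogram(codes, bins=p + 2, range=(0, p + 2), density=True)
    return hist

def peak_frame(frames):
    neutral = lbp_hist(frames[0])                 # first frame assumed neutral
    dists = [np.abs(lbp_hist(f) - neutral).sum() for f in frames]
    return int(np.argmax(dists))

clip = [np.random.rand(64, 64) for _ in range(10)]
print("apex frame index:", peak_frame(clip))
```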

Proceedings ArticleDOI
01 Jan 2017
TL;DR: Various face detection algorithms are discussed and analyzed, including Viola-Jones, SMQT features & SNOW classifier, neural network-based face detection, and support vector machine-based face detection; all these methods are compared based on precision and recall values calculated using the DetEval software, which deals with precise values of the bounding boxes around the faces to give accurate results.
Abstract: With the tremendous increase in video and image databases, there is a great need for automatic understanding and examination of data by intelligent systems, as manual analysis is becoming out of reach. Narrowing it down to one specific domain, one of the most specific objects that can be traced in images is people, i.e., faces. Face detection is becoming a challenge due to its increasing use in a number of applications. It is the first step for face recognition, face analysis, and the detection of other facial features. In this paper, various face detection algorithms are discussed and analyzed, including Viola-Jones, SMQT features & SNOW classifier, neural network-based face detection, and support vector machine-based face detection. All these face detection methods are compared based on precision and recall values calculated using the DetEval software, which deals with precise values of the bounding boxes around the faces to give accurate results.
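The comparison rests on bounding-box precision and recall; here is a minimal sketch of that scoring, matching detections to ground truth greedily by IoU (DetEval's actual protocol handles split and merged boxes more carefully):

```python
# Sketch: greedy IoU matching, then precision and recall.
def iou(a, b):
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter) if inter else 0.0

def precision_recall(detections, truths, thresh=0.5):
    unmatched, tp = list(truths), 0
    for det in detections:
        best = max(unmatched, key=lambda t: iou(det, t), default=None)
        if best is not None and iou(det, best) >= thresh:
            unmatched.remove(best)    # each ground-truth box matches at most once
            tp += 1
    return tp / len(detections), tp / len(truths)   # precision, recall

dets = [(10, 10, 50, 50), (100, 100, 140, 140)]      # (x1, y1, x2, y2)
gts = [(12, 12, 52, 52), (200, 200, 240, 240)]
print(precision_recall(dets, gts))                   # (0.5, 0.5)
```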

Proceedings ArticleDOI
01 Oct 2017
TL;DR: The results show a significant improvement in the use of pre-trained models over randomly initialized Convolutional Neural Networks on the facial expression recognition problem, for example achieving 88.58%, 67.03%, 85.97%, and 72.55% average accuracy testing on CK+, MMI, RaFD, and KDEF, respectively.
Abstract: Facial expression recognition is a very important research field for understanding human emotions. Many facial expression recognition systems have been proposed in the literature over the years. Some of these methods use neural network approaches with deep architectures to address the problem. Although it may seem that the facial expression recognition problem has been solved, there is a large difference between the results achieved using the same database to train and test the network and those of the cross-database protocol. In this paper, we extensively investigate the performance influence of fine-tuning with a cross-database approach. To perform the study, the VGG-Face Deep Convolutional Network model (pre-trained for face recognition) was fine-tuned to recognize facial expressions considering different well-established databases in the literature: CK+, JAFFE, MMI, RaFD, KDEF, BU3DFE, and AR Face. The cross-database experiments were organized so that one of the databases was separated as the test set and the others as training, and each experiment was run multiple times to ensure the consistency of the results. Our results show a significant improvement in the use of pre-trained models over randomly initialized Convolutional Neural Networks on the facial expression recognition problem, for example achieving 88.58%, 67.03%, 85.97%, and 72.55% average accuracy testing on CK+, MMI, RaFD, and KDEF, respectively. Additionally, in absolute terms, the results show an improvement in the literature for cross-database facial expression recognition with the use of pre-trained models.
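The fine-tuning setup can be sketched as: load a pretrained VGG, swap the last classifier layer for a 7-way expression head, and train with a small learning rate. torchvision's ImageNet VGG-16 stands in for VGG-Face here, and freezing the convolutional features is an illustrative choice, not the paper's protocol:

```python
# Sketch of fine-tuning a pretrained VGG for 7-class expression recognition.
import torch
import torch.nn as nn
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

# Optionally freeze the convolutional features and fine-tune the classifier.
for p in model.features.parameters():
    p.requires_grad = False

model.classifier[6] = nn.Linear(4096, 7)     # 7 basic expressions

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch:
x, y = torch.randn(4, 3, 224, 224), torch.randint(0, 7, (4,))
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
print(float(loss))
```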