
Showing papers on "Face detection" published in 2018


Proceedings ArticleDOI
01 Feb 2018
TL;DR: The IARPA Janus Benchmark–C (IJB-C) face dataset advances the goal of robust unconstrained face recognition, improving upon the previous public domain IJB-B dataset, by increasing dataset size and variability, and by introducing end-to-end protocols that more closely model operational face recognition use cases.
Abstract: Although considerable work has been done in recent years to drive the state of the art in facial recognition towards operation on fully unconstrained imagery, research has always been restricted by a lack of datasets in the public domain. In addition, traditional biometrics experiments such as single image verification and closed set recognition do not adequately evaluate the ways in which unconstrained face recognition systems are used in practice. The IARPA Janus Benchmark–C (IJB-C) face dataset advances the goal of robust unconstrained face recognition, improving upon the previous public domain IJB-B dataset, by increasing dataset size and variability, and by introducing end-to-end protocols that more closely model operational face recognition use cases. IJB-C adds 1,661 new subjects to the 1,870 subjects released in IJB-B, with increased emphasis on occlusion and diversity of subject occupation and geographic origin with the goal of improving representation of the global population. Annotations on IJB-C imagery have been expanded to allow for further covariate analysis, including a spatial occlusion grid to standardize analysis of occlusion. Due to these enhancements, the IJB-C dataset is significantly more challenging than other datasets in the public domain and will advance the state of the art in unconstrained face recognition.

510 citations


Journal ArticleDOI
TL;DR: This work improves the state-of-the-art Faster RCNN framework by combining a number of strategies, including feature concatenation, hard negative mining, multi-scale training, model pre-training, and proper calibration of key parameters.

461 citations


Proceedings ArticleDOI
01 Oct 2018
TL;DR: The survey provides a clear, structured presentation of the principal, state-of-the-art (SOTA) face recognition techniques appearing within the past five years in top computer vision venues with some open issues currently overlooked by the community.
Abstract: Face recognition made tremendous leaps in the last five years with a myriad of systems proposing novel techniques substantially backed by deep convolutional neural networks (DCNN). Although face recognition performance sky-rocketed using deep-learning in classic datasets like LFW, leading to the belief that this technique reached human performance, it still remains an open problem in unconstrained environments as demonstrated by the newly released IJB datasets. This survey aims to summarize the main advances in deep face recognition and, more in general, in learning face representations for verification and identification. The survey provides a clear, structured presentation of the principal, state-of-the-art (SOTA) face recognition techniques appearing within the past five years in top computer vision venues. The survey is broken down into multiple parts that follow a standard face recognition pipeline: (a) how SOTA systems are trained and which public data sets have they used; (b) face preprocessing part (detection, alignment, etc.); (c) architecture and loss functions used for transfer learning (d) face recognition for verification and identification. The survey concludes with an overview of the SOTA results at a glance along with some open issues currently overlooked by the community.

347 citations


Book ChapterDOI
Xu Tang1, Daniel K. Du1, Zeqiang He1, Jingtuo Liu1
08 Sep 2018
TL;DR: This paper proposes a context-assisted single shot face detector, named PyramidBox, to handle the hard face detection problem, improving the utilization of contextual information in three aspects.
Abstract: Face detection has been well studied for many years, and one of the remaining challenges is to detect small, blurred and partially occluded faces in uncontrolled environments. This paper proposes a novel context-assisted single shot face detector, named PyramidBox, to handle the hard face detection problem. Observing the importance of the context, we improve the utilization of contextual information in the following three aspects. First, we design a novel context anchor to supervise high-level contextual feature learning by a semi-supervised method, which we call PyramidAnchors. Second, we propose the Low-level Feature Pyramid Network to combine adequate high-level context semantic features and low-level facial features together, which also allows PyramidBox to predict faces of all scales in a single shot. Third, we introduce a context-sensitive structure to increase the capacity of the prediction network and improve the final accuracy of the output. In addition, we use the method of Data-anchor-sampling to augment the training samples across different scales, which increases the diversity of training data for smaller faces. By exploiting the value of context, PyramidBox achieves superior performance among the state-of-the-art over the two common face detection benchmarks, FDDB and WIDER FACE. Our code is available in PaddlePaddle: https://github.com/PaddlePaddle/models/tree/develop/fluid/face_detection.

208 citations


Proceedings ArticleDOI
18 Jun 2018
TL;DR: An algorithm is proposed to directly generate a clear high-resolution face from a blurry small one by adopting a generative adversarial network (GAN), and the resulting detection performance outperforms other state-of-the-art methods.
Abstract: Face detection techniques have been developed for decades, and one of the remaining open challenges is detecting small faces in unconstrained conditions. The reason is that tiny faces often lack detailed information and are blurry. In this paper, we propose an algorithm to directly generate a clear high-resolution face from a blurry small one by adopting a generative adversarial network (GAN). Toward this end, the basic GAN formulation achieves it by super-resolving and refining sequentially (e.g. SR-GAN and cycle-GAN). However, we design a novel network to address the problem of super-resolving and refining jointly. We also introduce new training losses to guide the generator network to recover fine details and to promote the discriminator network to distinguish real vs. fake and face vs. non-face simultaneously. Extensive experiments on the challenging dataset WIDER FACE demonstrate the effectiveness of our proposed method in restoring a clear high-resolution face from a blurry small one, and show that the detection performance outperforms other state-of-the-art methods.

195 citations


Journal ArticleDOI
TL;DR: An overview of deep-learning methods used for face recognition is provided, and the different modules involved in designing an automatic face recognition system, along with the role of deep learning for each of them, are discussed.
Abstract: Recent developments in deep convolutional neural networks (DCNNs) have shown impressive performance improvements on various object detection/recognition problems. This has been made possible due to the availability of large annotated data and a better understanding of the nonlinear mapping between images and class labels, as well as the affordability of powerful graphics processing units (GPUs). These developments in deep learning have also improved the capabilities of machines in understanding faces and automatically executing the tasks of face detection, pose estimation, landmark localization, and face recognition from unconstrained images and videos. In this article, we provide an overview of deep-learning methods used for face recognition. We discuss different modules involved in designing an automatic face recognition system and the role of deep learning for each of them. Some open issues regarding DCNNs for face recognition problems are then discussed. This article should prove valuable to scientists, engineers, and end users working in the fields of face recognition, security, visual surveillance, and biometrics.

183 citations


Journal ArticleDOI
TL;DR: The proposed FER method outperforms the state-of-the-art FER methods based on the hand-crafted features or deep networks using one channel, and can achieve comparable performance with easier procedures.
Abstract: Facial expression recognition (FER) is a significant task for the machines to understand the emotional changes in human beings. However, accurate hand-crafted features that are highly related to changes in expression are difficult to extract because of the influences of individual difference and variations in emotional intensity. Therefore, features that can accurately describe the changes in facial expressions are urgently required. Method: A weighted mixture deep neural network (WMDNN) is proposed to automatically extract the features that are effective for FER tasks. Several pre-processing approaches, such as face detection, rotation rectification, and data augmentation, are implemented to restrict the regions for FER. Two channels of facial images, including facial grayscale images and their corresponding local binary pattern (LBP) facial images, are processed by WMDNN. Expression-related features of facial grayscale images are extracted by fine-tuning a partial VGG16 network, the parameters of which are initialized using VGG16 model trained on ImageNet database. Features of LBP facial images are extracted by a shallow convolutional neural network (CNN) built based on DeepID. The outputs of both channels are fused in a weighted manner. The result of final recognition is calculated using softmax classification. Results: Experimental results indicate that the proposed algorithm can recognize six basic facial expressions (happiness, sadness, anger, disgust, fear, and surprise) with high accuracy. The average recognition accuracies for benchmarking data sets “CK+,” “JAFFE,” and “Oulu-CASIA” are 0.970, 0.922, and 0.923, respectively. Conclusions: The proposed FER method outperforms the state-of-the-art FER methods based on the hand-crafted features or deep networks using one channel. Compared with the deep networks that use multiple channels, our proposed network can achieve comparable performance with easier procedures. 
Fine-tuning is effective for FER tasks with a well pre-trained model if sufficient samples cannot be collected.
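The LBP channel mentioned in the abstract above can be illustrated with a minimal sketch of the basic 3x3 local binary pattern operator. The abstract does not state which LBP variant the authors use, so the neighbour ordering, the `>=` comparison, and the function name `lbp_image` are assumptions made only for illustration.

```python
# Offsets of the 8 neighbours, clockwise from the top-left pixel.
# The bit ordering is a convention chosen for this sketch (an assumption).
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_image(gray):
    """Return basic 3x3 LBP codes for the interior pixels of `gray`.

    `gray` is a list of equal-length rows of intensities. Border pixels
    are skipped, so the output has two fewer rows and two fewer columns.
    """
    rows, cols = len(gray), len(gray[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            centre = gray[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOUR_OFFSETS):
                # Set the bit when the neighbour is at least as bright
                # as the centre pixel.
                if gray[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes
```

Each output value is an 8-bit code describing local texture around one pixel; an image of such codes is what a second network channel could consume alongside the grayscale image.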

160 citations


Posted Content
TL;DR: This work trains GoogLeNet to detect tampering artifacts in a face classification stream, and train a patch based triplet network to leverage features capturing local noise residuals and camera characteristics as a second stream for face tampering detection.
Abstract: We propose a two-stream network for face tampering detection. We train GoogLeNet to detect tampering artifacts in a face classification stream, and train a patch based triplet network to leverage features capturing local noise residuals and camera characteristics as a second stream. In addition, we use two different online face swapping applications to create a new dataset that consists of 2010 tampered images, each of which contains a tampered face. We evaluate the proposed two-stream network on our newly collected dataset. Experimental results demonstrate the effectiveness of our method.

146 citations


Journal ArticleDOI
TL;DR: A method is proposed to generate very large training data sets of synthetic images by compositing real face images in a given data set, which enables learning models from as few as 10,000 training images that perform on par with models trained from 500,000 images.
Abstract: Deep convolutional neural networks have recently proven extremely effective for difficult face recognition problems in uncontrolled settings. To train such networks, very large training sets are needed with millions of labeled images. For some applications, such as near-infrared (NIR) face recognition, such large training data sets are not publicly available and difficult to collect. In this paper, we propose a method to generate very large training data sets of synthetic images by compositing real face images in a given data set. We show that this method enables learning models from as few as 10,000 training images, which perform on par with models trained from 500,000 images. Using our approach, we also obtain state-of-the-art results on the CASIA NIR-VIS2.0 heterogeneous face recognition data set.

112 citations


Journal ArticleDOI
TL;DR: A deep convolutional neural network is proposed for face detection leveraging on facial attributes based supervision and achieves promising performance on popular benchmarks including FDDB, PASCAL Faces, AFW, and WIDER FACE.
Abstract: We propose a deep convolutional neural network (CNN) for face detection leveraging on facial attributes based supervision. We observe a phenomenon that part detectors emerge within CNN trained to classify attributes from uncropped face images, without any explicit part supervision. The observation motivates a new method for finding faces through scoring facial parts responses by their spatial structure and arrangement. The scoring mechanism is data-driven, and carefully formulated considering challenging cases where faces are only partially visible. This consideration allows our network to detect faces under severe occlusion and unconstrained pose variations. Our method achieves promising performance on popular benchmarks including FDDB, PASCAL Faces, AFW, and WIDER FACE.

112 citations


Journal ArticleDOI
TL;DR: This paper performs the first, to the best of the authors' knowledge, thorough evaluation of state-of-the-art deformable face tracking pipelines using the recently introduced 300 VW benchmark and reveals future avenues for further research on the topic.
Abstract: Recently, technologies such as face detection, facial landmark localisation and face recognition and verification have matured enough to provide effective and efficient solutions for imagery captured under arbitrary conditions (referred to as "in-the-wild"). This is partially attributed to the fact that comprehensive "in-the-wild" benchmarks have been developed for face detection, landmark localisation and recognition/verification. A very important technology that has not been thoroughly evaluated yet is deformable face tracking "in-the-wild". Until now, the performance has mainly been assessed qualitatively by visually assessing the result of a deformable face tracking technology on short videos. In this paper, we perform the first, to the best of our knowledge, thorough evaluation of state-of-the-art deformable face tracking pipelines using the recently introduced 300 VW benchmark. We evaluate many different architectures focusing mainly on the task of on-line deformable face tracking. In particular, we compare the following general strategies: (a) generic face detection plus generic facial landmark localisation, (b) generic model free tracking plus generic facial landmark localisation, as well as (c) hybrid approaches using state-of-the-art face detection, model free tracking and facial landmark localisation technologies. Our evaluation reveals future avenues for further research on the topic.

Proceedings ArticleDOI
03 Apr 2018
TL;DR: Various Object Detection Algorithms such as face detection, skin detection, colour detection, shape detection, and target detection are simulated and implemented using MATLAB 2017b to detect various types of objects for video surveillance applications with improved accuracy.
Abstract: Object Detection algorithms find application in various fields such as defence, security, and healthcare. In this paper various Object Detection Algorithms such as face detection, skin detection, colour detection, shape detection, target detection are simulated and implemented using MATLAB 2017b to detect various types of objects for video surveillance applications with improved accuracy. Further, various challenges and applications of Object Detection methods are elaborated.

Proceedings ArticleDOI
01 Sep 2018
TL;DR: A new database of more than 53,000 images, from 150 videos, originating from multiple sources of digitally generated fakes including Computer Graphics Image (CGI) generation and many tampering based approaches is collected to answer if the current fake face detection methods can be generalizable.
Abstract: With advancements in technology, it is now possible to create representations of human faces in a seamless manner for fake media, leveraging the large-scale availability of videos. These fake faces can be used to conduct personation attacks on the targeted subjects. Availability of open source software and a variety of commercial applications provides an opportunity to generate fake videos of a particular target subject in a number of ways. In this article, we evaluate the generalizability of the fake face detection methods through a series of studies to benchmark the detection accuracy. To this extent, we have collected a new database of more than 53,000 images, from 150 videos, originating from multiple sources of digitally generated fakes including Computer Graphics Image (CGI) generation and many tampering based approaches. In addition, we have also included images (with more than 3,200) from the predominantly used Swap-Face application that is commonly available on smartphones. Extensive experiments are carried out using both texture-based handcrafted detection methods and deep learning based detection methods to find the suitability of detection methods. Through the set of evaluation, we attempt to answer if the current fake face detection methods can be generalizable.

Proceedings ArticleDOI
18 Jun 2018
TL;DR: Progressive Calibration Networks (PCN) perform rotation-invariant face detection with arbitrary rotation-in-plane (RIP) angles in a coarse-to-fine manner, dividing the calibration process into several progressive steps and predicting only coarse orientations in early stages.
Abstract: Rotation-invariant face detection, i.e. detecting faces with arbitrary rotation-in-plane (RIP) angles, is widely required in unconstrained applications but still remains as a challenging task, due to the large variations of face appearances. Most existing methods compromise with speed or accuracy to handle the large RIP variations. To address this problem more efficiently, we propose Progressive Calibration Networks (PCN) to perform rotation-invariant face detection in a coarse-to-fine manner. PCN consists of three stages, each of which not only distinguishes the faces from non-faces, but also calibrates the RIP orientation of each face candidate to upright progressively. By dividing the calibration process into several progressive steps and only predicting coarse orientations in early stages, PCN can achieve precise and fast calibration. By performing binary classification of face vs. non-face with gradually decreasing RIP ranges, PCN can accurately detect faces with full 360° RIP angles. Such designs lead to a real-time rotation-invariant face detector. The experiments on multi-oriented FDDB and a challenging subset of WIDER FACE containing rotated faces in the wild show that our PCN achieves quite promising performance.
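The coarse-to-fine calibration idea in the abstract above can be sketched as a toy simulation. The specific split used here (a 180° flip, then a ±90° turn, then a fine residual) is an assumption that the abstract does not spell out, and the true angle stands in for the network predictions, so this shows only the arithmetic of progressive calibration, not the detector itself. The function names are hypothetical.

```python
def wrap(angle):
    """Map an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def progressive_calibration(true_angle):
    """Decompose a rotation-in-plane angle into three calibration steps.

    Returns (steps, total), where `steps` holds the rotation applied at
    each stage and `total` is their wrapped sum.
    """
    residual = wrap(true_angle)

    # Stage 1 (assumed): decide only whether the face is upside down.
    step1 = 180.0 if abs(residual) > 90.0 else 0.0
    residual = wrap(residual - step1)  # now within [-90, 90]

    # Stage 2 (assumed): decide between left-tilted, upright, right-tilted.
    if residual > 45.0:
        step2 = 90.0
    elif residual < -45.0:
        step2 = -90.0
    else:
        step2 = 0.0
    residual = wrap(residual - step2)  # now within [-45, 45]

    # Stage 3 (assumed): estimate the small remaining angle directly.
    step3 = residual

    steps = [step1, step2, step3]
    return steps, wrap(sum(steps))
```

For example, a face rotated by 200° is handled as a 180° flip, no quarter turn, and a 20° fine correction. The early stages only make coarse decisions, which is the property the abstract credits for fast and precise calibration.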

Proceedings ArticleDOI
18 Jun 2018
TL;DR: This paper introduces a novel anchor design principle to support anchor-based face detection for superior scale-invariant performance, especially on tiny faces, and proposes the Expected Max Overlapping (EMO) score, which can theoretically explain the low overlapping issue and inspires several effective strategies of new anchor design leading to higher face overlaps, including anchor stride reduction, extra shifted anchors, and stochastic face shifting.
Abstract: This paper introduces a novel anchor design principle to support anchor-based face detection for superior scale-invariant performance, especially on tiny faces. To achieve this, we explicitly address the problem that anchor-based detectors drop performance drastically on faces with tiny sizes, e.g. less than 16 × 16 pixels. In this paper, we investigate why this is the case. We discover that current anchor design cannot guarantee high overlaps between tiny faces and anchor boxes, which increases the difficulty of training. The new Expected Max Overlapping (EMO) score is proposed which can theoretically explain the low overlapping issue and inspire several effective strategies of new anchor design leading to higher face overlaps, including anchor stride reduction with new network architectures, extra shifted anchors, and stochastic face shifting. Comprehensive experiments show that our proposed method significantly outperforms the baseline anchor-based detector, while consistently achieving state-of-the-art results on challenging face detection datasets with competitive runtime speed.
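The low-overlap problem described in the abstract above can be illustrated with a small calculation. This is a minimal sketch, assuming square anchors centred in each cell of a regular stride grid; it computes the best intersection-over-union (IoU) for one face position and is not the paper's EMO score. The function names and the 64-pixel image size are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def best_anchor_iou(face, anchor_size, stride, image_size=64):
    """Highest IoU between `face` and square anchors on a stride grid.

    Anchor centres are assumed to sit in the middle of each stride cell.
    """
    half = anchor_size / 2.0
    centres = [stride / 2.0 + k * stride for k in range(image_size // stride)]
    best = 0.0
    for cy in centres:
        for cx in centres:
            anchor = (cx - half, cy - half, cx + half, cy + half)
            best = max(best, iou(face, anchor))
    return best


# A 16 x 16 face that does not line up with the stride-16 grid.
face = (22.0, 22.0, 38.0, 38.0)
coarse = best_anchor_iou(face, anchor_size=16, stride=16)  # about 0.24
fine = best_anchor_iou(face, anchor_size=16, stride=8)     # about 0.62
```

With a stride equal to the anchor size, a badly placed tiny face overlaps its best anchor by less than a typical matching threshold, while halving the stride raises the best overlap substantially. This is the intuition behind the stride-reduction strategy named in the abstract.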

Proceedings ArticleDOI
23 Jul 2018
TL;DR: Experimental results show that the face detection method based on YOLO has stronger robustness and faster detection speed than other similar target detection systems and can meet real-time detection requirements.
Abstract: As a target detection system, YOLO has a fast detection speed and is suitable for target detection in real-time environments. Compared with other similar target detection systems, it has better detection accuracy and faster detection time. In this paper, the YOLO target detection system is applied to face detection. Experimental results show that the face detection method based on YOLO has stronger robustness and faster detection speed, and that it maintains high detection accuracy even in complex environments. At the same time, the detection speed can meet real-time detection requirements.

Journal ArticleDOI
01 Jul 2018
TL;DR: Experimental results show that the recognition performance of the FERS technique can be better than that of other existing methods under consideration in the paper.
Abstract: This paper presents a novel facial expression recognition (FER) technique based on a support vector machine (SVM), here called the FERS technique. First, the FERS technique develops a face detection method that combines the Haar-like features method with the self-quotient image (SQI) filter. As a result, the FERS technique possesses a better detection rate because the face detection method gets more accurate in locating face regions of an image. The main reason is that the SQI filter can overcome insufficient light and shade light. Subsequently, three schemes, the angular radial transform (ART), the discrete cosine transform (DCT) and the Gabor filter (GF), are simultaneously employed in the design of the feature extraction for facial expression in the FERS technique. More specifically, they are employed in constructing a set of training patterns for the training of an SVM. The FERS technique then exploits the trained SVM to recognize the facial expression for a query face image. Finally, experimental results show that the recognition performance of the FERS technique can be better than that of other existing methods under consideration in the paper.

Proceedings ArticleDOI
31 May 2018
TL;DR: This paper proposes a novel strategy to craft adversarial examples by solving a constrained optimization problem using an adversarial generator network, and shows that the same trained generator is capable of attacking new images without explicitly optimizing on them.
Abstract: Adversarial attacks involve adding small, often imperceptible, perturbations to inputs with the goal of getting a machine learning model to misclassify them. While many different adversarial attack strategies have been proposed on image classification models, object detection pipelines have been much harder to break. In this paper, we propose a novel strategy to craft adversarial examples by solving a constrained optimization problem using an adversarial generator network. Our approach is fast and scalable, requiring only a forward pass through our trained generator network to craft an adversarial sample. Unlike in many attack strategies, we show that the same trained generator is capable of attacking new images without explicitly optimizing on them. We evaluate our attack on a trained Faster R-CNN face detector on the cropped 300-W face dataset, where we manage to reduce the number of detected faces to 0.5% of all originally detected faces. In a different experiment, also on 300-W, we demonstrate the robustness of our attack to a JPEG compression based defense: a typical JPEG compression level of 75% reduces the effectiveness of our attack from only 0.5% of detected faces to a modest 5.0%.

Journal ArticleDOI
TL;DR: A novel deep transfer neural network method based on multi-label learning for facial attribute classification, termed FMTNet, which consists of three sub-networks: the Face detection Network (FNet), the Multi-label learning Network (MNet) and the Transfer learning Network (TNet).

Proceedings ArticleDOI
01 Nov 2018
TL;DR: This paper compares the accuracy of detecting the face in an efficient manner with respect to the traditional approach and uses the convolutional neural network as an approach of deep learning for detecting faces from videos.
Abstract: Deep learning is nowadays a buzzword and is considered a new era of machine learning which trains computers to find patterns in massive amounts of data. It mainly describes learning at multiple levels of representation, which helps to make sense of data consisting of text, sound and images. Many organizations are using a type of deep learning known as a convolutional neural network to deal with the objects in a video sequence. Deep Convolutional Neural Networks (CNNs) have proved to be impressive in terms of performance for object detection, image classification and semantic segmentation. Object detection is defined as a combination of classification and localization. Face detection is one of the most challenging problems of pattern recognition. Various face related applications like face verification, facial recognition, face clustering etc. are a part of face detection. Effective training needs to be carried out for detection and recognition. Face detection using the traditional approach did not yield good accuracy. This paper focuses on improving the accuracy of detecting the face using a deep learning model. YOLO (You only look once), a popular deep learning library, is used to implement the proposed work. The paper compares the accuracy of detecting the face in an efficient manner with respect to the traditional approach. The proposed model uses the convolutional neural network as an approach of deep learning for detecting faces from videos. The FDDB dataset is used for training and testing of our model. The model is fine-tuned on various performance parameters and the best suitable values are taken into consideration. The training time and the performance of the model are also compared on two different GPUs.

Journal ArticleDOI
TL;DR: A critical review of various types of face recognition techniques and challenges is presented, to improve efficiency and recognition rate for identifying face images in large database, with comparison of accuracy or recognition rate.

Posted Content
TL;DR: A carefully designed Faster RCNN method named FDNet1.0 for face detection is proposed, which achieves two 1st places and one 2nd place in three tasks over the WIDER FACE validation dataset.
Abstract: Faster RCNN has achieved great success for generic object detection including PASCAL object detection and MS COCO object detection. In this report, we propose a carefully designed Faster RCNN method named FDNet1.0 for face detection. Several techniques were employed including multi-scale training, multi-scale testing, light-designed RCNN, some tricks for inference and a vote-based ensemble method. Our method achieves two 1st places and one 2nd place in three tasks over the WIDER FACE validation dataset (easy set, medium set, hard set).

Journal ArticleDOI
TL;DR: An efficient and fast facial expression recognition system that outperforms existing methods is presented and a new feature called W_HOG where W indicates discrete wavelet transform and HOG indicates histogram of oriented gradients feature is introduced.
Abstract: Facial expression recognition plays a significant role in human behavior detection. In this study, we present an efficient and fast facial expression recognition system. We introduce a new feature called W_HOG where W indicates discrete wavelet transform and HOG indicates histogram of oriented gradients feature. The proposed framework comprises four stages: (i) Face processing, (ii) Domain transformation, (iii) Feature extraction and (iv) Expression recognition. Face processing is composed of face detection, cropping and normalization steps. In domain transformation, spatial domain features are transformed into the frequency domain by applying discrete wavelet transform (DWT). Feature extraction is performed by retrieving the Histogram of Oriented Gradients (HOG) feature in the DWT domain, which is termed the W_HOG feature. For expression recognition, the W_HOG feature is supplied to a well-designed tree based multiclass support vector machine (SVM) classifier with one-versus-all architecture. The proposed system is trained and tested with the benchmark CK+, JAFFE and Yale facial expression datasets. Experimental results show that the proposed method is effective for facial expression recognition and outperforms existing methods.
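The domain transformation and feature extraction stages described above (a DWT followed by HOG) can be sketched in a heavily simplified form. The abstract does not name the wavelet, the sub-bands used, or the HOG cell layout, so the choices below are all assumptions for illustration: a single-level Haar transform, only the approximation (LL) sub-band, and one global 9-bin orientation histogram instead of per-cell histograms. The function names are hypothetical.

```python
import math


def haar_ll(image):
    """Approximation (LL) sub-band of a single-level 2D Haar DWT.

    With the orthonormal Haar filter, each output value is the sum of a
    2x2 block divided by 2. `image` must have even height and width.
    """
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1]
              + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 2.0
             for c in range(cols)]
            for r in range(rows)]


def orientation_histogram(image, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            gx = image[r][c + 1] - image[r][c - 1]  # central differences
            gy = image[r + 1][c] - image[r - 1][c]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle / bin_width), bins - 1)
            hist[index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


def w_hog(image, bins=9):
    """Toy W_HOG-style feature: orientation histogram of the Haar LL band."""
    return orientation_histogram(haar_ll(image), bins)
```

In a full pipeline, a feature vector of this kind would be computed per face crop and passed to the SVM classifier stage.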

Posted Content
TL;DR: This paper provides a review of deep learning-based object detection frameworks and focuses on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further.
Abstract: Due to object detection's close relationship with video analysis and image understanding, it has attracted much research attention in recent years. Traditional object detection methods are built on handcrafted features and shallow trainable architectures. Their performance easily stagnates by constructing complex ensembles which combine multiple low-level image features with high-level context from object detectors and scene classifiers. With the rapid development in deep learning, more powerful tools, which are able to learn semantic, high-level, deeper features, are introduced to address the problems existing in traditional architectures. These models behave differently in network architecture, training strategy and optimization function, etc. In this paper, we provide a review on deep learning based object detection frameworks. Our review begins with a brief introduction on the history of deep learning and its representative tool, namely Convolutional Neural Network (CNN). Then we focus on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further. As distinct specific detection tasks exhibit different characteristics, we also briefly survey several specific tasks, including salient object detection, face detection and pedestrian detection. Experimental analyses are also provided to compare various methods and draw some meaningful conclusions. Finally, several promising directions and tasks are provided to serve as guidelines for future work in both object detection and relevant neural network based learning systems.

Proceedings ArticleDOI
01 Oct 2018
TL;DR: In this article, the authors identify the next set of challenges that require attention from the research community and collect a new dataset of face images that involve these issues such as weather-based degradations, motion blur, focus blur and several others.
Abstract: Face detection has witnessed immense progress in the last few years, with new milestones being surpassed every year. While many challenges such as large variations in scale, pose, appearance are successfully addressed, there still exist several issues which are not specifically captured by existing methods or datasets. In this work, we identify the next set of challenges that requires attention from the research community and collect a new dataset of face images that involve these issues such as weather-based degradations, motion blur, focus blur and several others. We demonstrate that there is a considerable gap in the performance of state-of-the-art detectors and real-world requirements. Hence, in an attempt to fuel further research in unconstrained face detection, we present a new annotated Unconstrained Face Detection Dataset (UFDD) with several challenges and benchmark recent methods. Additionally, we provide an in-depth analysis of the results and failure cases of these methods. The UFDD dataset as well as baseline results, evaluation code and image source are available at: www.ufdd.info/

Proceedings Article
15 Jun 2018
TL;DR: Results show that the tools are generally proficient at determining gender, with accuracy rates greater than 90%, except for IBM Bluemix, while inferring age appears to be a challenging problem, as all four tools performed poorly.
Abstract: In this research, we evaluate four widely used face detection tools, which are Face++, IBM Bluemix Visual Recognition, AWS Rekognition, and Microsoft Azure Face API, using multiple datasets to determine their accuracy in inferring user attributes, including gender, race, and age. Results show that the tools are generally proficient at determining gender, with accuracy rates greater than 90%, except for IBM Bluemix. Concerning race, only one of the four tools provides this capability, Face++, with an accuracy rate of greater than 90%, although the evaluation was performed on a high-quality dataset. Inferring age appears to be a challenging problem, as all four tools performed poorly. The findings of our quantitative evaluation are helpful for future computational social science research using these tools, as their accuracy needs to be taken into account when applied to classifying individuals on social media and other contexts. Triangulation and manual verification are suggested for researchers employing these tools.

Proceedings ArticleDOI
10 May 2018
TL;DR: The proposed face recognition process uses a hybrid of Haar Cascades and Eigenface methods, which can detect multiple faces (55 faces) in a single detection process and recognized multiple faces with 91.67% accuracy.
Abstract: Face recognition can be considered one of the most successful biometric identification methods among several types of biometric identification including fingerprints, DNA, palm print, hand geometry, iris recognition, retina and odour/scent. Face recognition provides biometric identification that utilizes the uniqueness of faces for security purposes. The problem with face recognition using biometric identification is its lengthy process and the accuracy of the results. This paper proposes solutions for a faster face recognition process with accurate results. The proposed face recognition process was done using a hybrid process of Haar Cascades and Eigenface methods, which can detect multiple faces (55 faces) in a single detection process. This improved face recognition approach was able to recognize multiple faces with 91.67% accuracy level.

Journal ArticleDOI
TL;DR: In this paper, the authors present the design details of a deep learning system for unconstrained face recognition, including modules for face detection, association, alignment, and face verification.
Abstract: Over the last 5 years, methods based on Deep Convolutional Neural Networks (DCNNs) have shown impressive performance improvements for object detection and recognition problems. This has been made possible due to the availability of large annotated datasets, a better understanding of the non-linear mapping between input images and class labels as well as the affordability of GPUs. In this paper, we present the design details of a deep learning system for unconstrained face recognition, including modules for face detection, association, alignment and face verification. The quantitative performance evaluation is conducted using the IARPA Janus Benchmark A (IJB-A), the JANUS Challenge Set 2 (JANUS CS2), and the Labeled Faces in the Wild (LFW) dataset. The IJB-A dataset includes real-world unconstrained faces of 500 subjects with significant pose and illumination variations which are much harder than the LFW and Youtube Face datasets. JANUS CS2 is the extended version of IJB-A which contains not only all the images/frames of IJB-A but also includes the original videos. Some open issues regarding DCNNs for face verification problems are then discussed.

Journal ArticleDOI
TL;DR: The aim of this research is to provide a comprehensive literature review of face recognition and its applications; some of the major findings are given in the conclusion.
Abstract: With the rapid growth of multimedia content, face recognition has received much attention, especially in the past few years. The face as an object consists of distinct features for detection; therefore, it remains one of the most challenging research areas for scholars in computer vision and image processing. In this survey paper, we address the most challenging factors affecting face features, such as pose invariance, aging, illumination and partial occlusion. They are considered indispensable factors in a face recognition system operating on facial images. This paper also studies state-of-the-art face detection techniques and approaches, viz. Eigenface, Artificial Neural Networks (ANN), Support Vector Machines (SVM), Principal Component Analysis (PCA), Independent Component Analysis (ICA), Gabor Wavelets, Elastic Bunch Graph Matching, 3D Morphable Model and Hidden Markov Models. In addition, we describe different face databases used for testing, including AT&T (ORL), AR, FERET, LFW, YTF, and Yale, for results analysis. The aim of this research is to provide a comprehensive literature review of face recognition along with its applications. After in-depth discussion, some of the major findings are given in the conclusion.

Book ChapterDOI
08 Sep 2018
TL;DR: It is shown how large numbers of hard negatives can be obtained automatically by analyzing the output of a trained detector on video sequences, and how retraining detectors on these automatically obtained examples often significantly improves performance.
Abstract: Important gains have recently been obtained in object detection by using training objectives that focus on hard negative examples, i.e., negative examples that are currently rated as positive or ambiguous by the detector. These examples can strongly influence parameters when the network is trained to correct them. Unfortunately, they are often sparse in the training data, and are expensive to obtain. In this work, we show how large numbers of hard negatives can be obtained automatically by analyzing the output of a trained detector on video sequences. In particular, detections that are isolated in time, i.e., that have no associated preceding or following detections, are likely to be hard negatives. We describe simple procedures for mining large numbers of such hard negatives (and also hard positives) from unlabeled video data. Our experiments show that retraining detectors on these automatically obtained examples often significantly improves performance. We present experiments on multiple architectures and multiple data sets, including face detection, pedestrian detection and other object categories.
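The temporal-isolation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the IoU-based matching between adjacent frames, the `iou_thresh` value, and the names `iou` and `isolated_detections` are all assumptions made for illustration. A detection is flagged as a candidate hard negative when no box in the preceding or following frame overlaps it.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def isolated_detections(
    frames: List[List[Box]], iou_thresh: float = 0.3  # assumed threshold
) -> List[Tuple[int, Box]]:
    """Return (frame_index, box) pairs with no overlapping detection
    in either adjacent frame. Boundary frames have one neighbour only,
    so a missing neighbour frame counts as "no match" on that side."""
    isolated = []
    for t, boxes in enumerate(frames):
        prev_boxes = frames[t - 1] if t > 0 else []
        next_boxes = frames[t + 1] if t + 1 < len(frames) else []
        for box in boxes:
            has_prev = any(iou(box, p) >= iou_thresh for p in prev_boxes)
            has_next = any(iou(box, n) >= iou_thresh for n in next_boxes)
            if not has_prev and not has_next:
                isolated.append((t, box))
    return isolated


# Toy input: one box drifts slowly across three frames (a consistent
# detection), while a second box appears in the middle frame only.
frames = [
    [(0, 0, 10, 10)],
    [(1, 1, 11, 11), (50, 50, 60, 60)],
    [(2, 2, 12, 12)],
]
print(isolated_detections(frames))  # [(1, (50, 50, 60, 60))]
```

Only the box that flickers on for a single frame is returned. A real pipeline would also have to decide how to handle detector confidence and camera motion, which this sketch ignores.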