scispace - formally typeset
Search or ask a question
Proceedings ArticleDOI

IARPA Janus Benchmark-B Face Dataset

TL;DR: The IARPA Janus Benchmark-B (NIST IJB-B) dataset is introduced, a superset of IJB -A that represents operational use cases including access point identification, forensic quality media searches, surveillance video searches, and clustering.
Abstract: Despite the importance of rigorous testing data for evaluating face recognition algorithms, all major publicly available faces-in-the-wild datasets are constrained by the use of a commodity face detector, which limits, among other conditions, pose, occlusion, expression, and illumination variations. In 2015, the NIST IJB-A dataset, which consists of 500 subjects, was released to mitigate these constraints. However, the relatively low number of impostor and genuine matches per split in the IJB-A protocol limits the evaluation of an algorithm at operationally relevant assessment points. This paper builds upon IJB-A and introduces the IARPA Janus Benchmark-B (NIST IJB-B) dataset, a superset of IJB-A. IJB-B consists of 1,845 subjects with human-labeled ground truth face bounding boxes, eye/nose locations, and covariate metadata such as occlusion, facial hair, and skintone for 21,798 still images and 55,026 frames from 7,011 videos. IJB-B was also designed to have a more uniform geographic distribution of subjects across the globe than that of IJB-A. Test protocols for IJB-B represent operational use cases including access point identification, forensic quality media searches, surveillance video searches, and clustering. Finally, all images and videos in IJB-B are published under a Creative Commons distribution license and, therefore, can be freely distributed among the research community.

Content maybe subject to copyright    Report

Citations
More filters
Proceedings ArticleDOI
15 Jun 2019
TL;DR: This paper presents arguably the most extensive experimental evaluation against all recent state-of-the-art face recognition methods on ten face recognition benchmarks, and shows that ArcFace consistently outperforms the state of the art and can be easily implemented with negligible computational overhead.
Abstract: One of the main challenges in feature learning using Deep Convolutional Neural Networks (DCNNs) for large-scale face recognition is the design of appropriate loss functions that can enhance the discriminative power. Centre loss penalises the distance between deep features and their corresponding class centres in the Euclidean space to achieve intra-class compactness. SphereFace assumes that the linear transformation matrix in the last fully connected layer can be used as a representation of the class centres in the angular space and therefore penalises the angles between deep features and their corresponding weights in a multiplicative way. Recently, a popular line of research is to incorporate margins in well-established loss functions in order to maximise face class separability. In this paper, we propose an Additive Angular Margin Loss (ArcFace) to obtain highly discriminative features for face recognition. The proposed ArcFace has a clear geometric interpretation due to its exact correspondence to geodesic distance on a hypersphere. We present arguably the most extensive experimental evaluation against all recent state-of-the-art face recognition methods on ten face recognition benchmarks which includes a new large-scale image database with trillions of pairs and a large-scale video dataset. We show that ArcFace consistently outperforms the state of the art and can be easily implemented with negligible computational overhead. To facilitate future research, the code has been made available.

4,312 citations


Cites background or methods from "IARPA Janus Benchmark-B Face Datase..."

  • ...The IJB-C dataset [37] is a further extension of IJB-B, having 3, 531 subjects with 31....

    [...]

  • ...The IJB-B dataset [37] Methods Id (%) Ver (%)...

    [...]

  • ...MegaFace [12], IJB-B [37], IJB-C [18] and Trillion-Pairs [1]) and video datasets (iQIYI-VID [17])....

    [...]

  • ...MegaFace [12] 530 (P) 1M (G) IJB-B [37] 1,845 76....

    [...]

Proceedings ArticleDOI
Qiong Cao1, Li Shen1, Weidi Xie1, Omkar M. Parkhi1, Andrew Zisserman1 
15 May 2018
TL;DR: VGGFace2 as discussed by the authors is a large-scale face dataset with 3.31 million images of 9131 subjects, with an average of 362.6 images for each subject.
Abstract: In this paper, we introduce a new large-scale face dataset named VGGFace2. The dataset contains 3.31 million images of 9131 subjects, with an average of 362.6 images for each subject. Images are downloaded from Google Image Search and have large variations in pose, age, illumination, ethnicity and profession (e.g. actors, athletes, politicians). The dataset was collected with three goals in mind: (i) to have both a large number of identities and also a large number of images for each identity; (ii) to cover a large range of pose, age and ethnicity; and (iii) to minimise the label noise. We describe how the dataset was collected, in particular the automated and manual filtering stages to ensure a high accuracy for the images of each identity. To assess face recognition performance using the new dataset, we train ResNet-50 (with and without Squeeze-and-Excitation blocks) Convolutional Neural Networks on VGGFace2, on MS-Celeb-1M, and on their union, and show that training on VGGFace2 leads to improved recognition performance over pose and age. Finally, using the models trained on these datasets, we demonstrate state-of-the-art performance on the IJB-A and IJB-B face recognition benchmarks, exceeding the previous state-of-the-art by a large margin. The dataset and models are publicly available.

2,365 citations

Posted Content
TL;DR: This article proposed an additive angular margin loss (ArcFace) to obtain highly discriminative features for face recognition, which has a clear geometric interpretation due to the exact correspondence to the geodesic distance on the hypersphere.
Abstract: One of the main challenges in feature learning using Deep Convolutional Neural Networks (DCNNs) for large-scale face recognition is the design of appropriate loss functions that enhance discriminative power. Centre loss penalises the distance between the deep features and their corresponding class centres in the Euclidean space to achieve intra-class compactness. SphereFace assumes that the linear transformation matrix in the last fully connected layer can be used as a representation of the class centres in an angular space and penalises the angles between the deep features and their corresponding weights in a multiplicative way. Recently, a popular line of research is to incorporate margins in well-established loss functions in order to maximise face class separability. In this paper, we propose an Additive Angular Margin Loss (ArcFace) to obtain highly discriminative features for face recognition. The proposed ArcFace has a clear geometric interpretation due to the exact correspondence to the geodesic distance on the hypersphere. We present arguably the most extensive experimental evaluation of all the recent state-of-the-art face recognition methods on over 10 face recognition benchmarks including a new large-scale image database with trillion level of pairs and a large-scale video dataset. We show that ArcFace consistently outperforms the state-of-the-art and can be easily implemented with negligible computational overhead. We release all refined training data, training codes, pre-trained models and training logs, which will help reproduce the results in this paper.

1,122 citations

Proceedings ArticleDOI
01 Feb 2018
TL;DR: The IARPA Janus Benchmark–C (IJB-C) face dataset advances the goal of robust unconstrained face recognition, improving upon the previous public domain IJB-B dataset, by increasing dataset size and variability, and by introducing end-to-end protocols that more closely model operational face recognition use cases.
Abstract: Although considerable work has been done in recent years to drive the state of the art in facial recognition towards operation on fully unconstrained imagery, research has always been restricted by a lack of datasets in the public domain In addition, traditional biometrics experiments such as single image verification and closed set recognition do not adequately evaluate the ways in which unconstrained face recognition systems are used in practice The IARPA Janus Benchmark–C (IJB-C) face dataset advances the goal of robust unconstrained face recognition, improving upon the previous public domain IJB-B dataset, by increasing dataset size and variability, and by introducing end-to-end protocols that more closely model operational face recognition use cases IJB-C adds 1,661 new subjects to the 1,870 subjects released in IJB-B, with increased emphasis on occlusion and diversity of subject occupation and geographic origin with the goal of improving representation of the global population Annotations on IJB-C imagery have been expanded to allow for further covariate analysis, including a spatial occlusion grid to standardize analysis of occlusion Due to these enhancements, the IJB-C dataset is significantly more challenging than other datasets in the public domain and will advance the state of the art in unconstrained face recognition

510 citations

Journal ArticleDOI
TL;DR: A comprehensive review of the recent developments on deep face recognition can be found in this paper, covering broad topics on algorithm designs, databases, protocols, and application scenes, as well as the technical challenges and several promising directions.

353 citations

References
More filters
Journal ArticleDOI
TL;DR: In this paper, a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates is described. But the detection performance is limited to 15 frames per second.
Abstract: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the “Integral Image” which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algorithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection performance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.

13,037 citations

Proceedings ArticleDOI
23 Jun 2014
TL;DR: This work revisits both the alignment step and the representation step by employing explicit 3D face modeling in order to apply a piecewise affine transformation, and derive a face representation from a nine-layer deep neural network.
Abstract: In modern face recognition, the conventional pipeline consists of four stages: detect => align => represent => classify. We revisit both the alignment step and the representation step by employing explicit 3D face modeling in order to apply a piecewise affine transformation, and derive a face representation from a nine-layer deep neural network. This deep network involves more than 120 million parameters using several locally connected layers without weight sharing, rather than the standard convolutional layers. Thus we trained it on the largest facial dataset to-date, an identity labeled dataset of four million facial images belonging to more than 4, 000 identities. The learned representations coupling the accurate model-based alignment with the large facial database generalize remarkably well to faces in unconstrained environments, even with a simple classifier. Our method reaches an accuracy of 97.35% on the Labeled Faces in the Wild (LFW) dataset, reducing the error of the current state of the art by more than 27%, closely approaching human-level performance.

6,132 citations

01 Oct 2008
TL;DR: The database contains labeled face photographs spanning the range of conditions typically encountered in everyday life, and exhibits “natural” variability in factors such as pose, lighting, race, accessories, occlusions, and background.
Abstract: Most face databases have been created under controlled conditions to facilitate the study of specific parameters on the face recognition problem. These parameters include such variables as position, pose, lighting, background, camera quality, and gender. While there are many applications for face recognition technology in which one can control the parameters of image acquisition, there are also many applications in which the practitioner has little or no control over such parameters. This database, Labeled Faces in the Wild, is provided as an aid in studying the latter, unconstrained, recognition problem. The database contains labeled face photographs spanning the range of conditions typically encountered in everyday life. The database exhibits “natural” variability in factors such as pose, lighting, race, accessories, occlusions, and background. In addition to describing the details of the database, we provide specific experimental paradigms for which the database is suitable. This is done in an effort to make research performed with the database as consistent and comparable as possible. We provide baseline results, including results of a state of the art face recognition system combined with a face alignment system. To facilitate experimentation on the database, we provide several parallel databases, including an aligned version.

5,742 citations


"IARPA Janus Benchmark-B Face Datase..." refers background in this paper

  • ...The release of the “Labeled Faces in the Wild” (LFW) dataset in 2007 [7] spurred significant research activity in unconstrained face recognition....

    [...]

  • ...Figure 1: Sample imagery from (a) IJB-B; (b) IJB-A; and (c) LFW [7] datasets....

    [...]

  • ...Extensive research in face recognition have now resulted in algorithms whose performance on the LFW benchmark has become saturated....

    [...]

  • ...Another issue with the LFW dataset in 1 particular is the overlap of subjects in training and testing splits, which can result in training bias....

    [...]

  • ...To remedy this, the Benchmark of Large Scale Unconstrained Face Recognition (BLUFR) protocol was developed [11] as an alternative to the LFW protocol....

    [...]

Proceedings ArticleDOI
01 Jan 2015
TL;DR: It is shown how a very large scale dataset can be assembled by a combination of automation and human in the loop, and the trade off between data purity and time is discussed.
Abstract: The goal of this paper is face recognition – from either a single photograph or from a set of faces tracked in a video. Recent progress in this area has been due to two factors: (i) end to end learning for the task using a convolutional neural network (CNN), and (ii) the availability of very large scale training datasets. We make two contributions: first, we show how a very large scale dataset (2.6M images, over 2.6K people) can be assembled by a combination of automation and human in the loop, and discuss the trade off between data purity and time; second, we traverse through the complexities of deep network training and face recognition to present methods and procedures to achieve comparable state of the art results on the standard LFW and YTF face benchmarks.

5,308 citations


"IARPA Janus Benchmark-B Face Datase..." refers background or methods in this paper

  • ...Indeed, the best performance under the LFW protocol [15] exceeds 99% true accept rate at a 1....

    [...]

  • ...While several large training datasets have become available in the public domain [15], few unconstrained datasets have been released with operationally relevant protocols....

    [...]

  • ...0 0 0 true limited VGG [15] 2,622 982,803 375 0 0 – limited MegaFace [13] – 1M – 0 – – full WIDER FACE [19] – 32,203 – 0 – true full CASIA Webface [20] 10,575 494,414 46....

    [...]

  • ...The CNN benchmark used an open-source, state-of-the-art model trained on the VGG face dataset [15]....

    [...]

  • ...After deduplication with the publicly available VGG dataset [15] and the CASIA Webface dataset [20], 106 overlapping subjects were removed to keep the subjects in external training sets and IJB-B disjoint....

    [...]

Journal Article
TL;DR: A semi-automatical way to collect face images from Internet is proposed and a large scale dataset containing about 10,000 subjects and 500,000 images, called CASIAWebFace is built, based on which a 11-layer CNN is used to learn discriminative representation and obtain state-of-theart accuracy on LFW and YTF.
Abstract: Pushing by big data and deep convolutional neural network (CNN), the performance of face recognition is becoming comparable to human. Using private large scale training datasets, several groups achieve very high performance on LFW, ie, 97% to 99%. While there are many open source implementations of CNN, none of large scale face dataset is publicly available. The current situation in the field of face recognition is that data is more important than algorithm. To solve this problem, this paper proposes a semi- automatical way to collect face images from Internet and builds a large scale dataset containing about 10,000 subjects and 500,000 images, called CASIAWebFace. Based on the database, we use a 11-layer CNN to learn discriminative representation and obtain state- of-theart accuracy on LFW and YTF.

1,705 citations


"IARPA Janus Benchmark-B Face Datase..." refers methods in this paper

  • ...0 0 0 true limited VGG [15] 2,622 982,803 375 0 0 – limited MegaFace [13] – 1M – 0 – – full WIDER FACE [19] – 32,203 – 0 – true full CASIA Webface [20] 10,575 494,414 46....

    [...]

  • ...After deduplication with the publicly available VGG dataset [15] and the CASIA Webface dataset [20], 106 overlapping subjects were removed to keep the subjects in external training sets and IJB-B disjoint....

    [...]