scispace - formally typeset
Open AccessProceedings ArticleDOI

Neural Aggregation Network for Video Face Recognition

TLDR
This NAN is trained with a standard classification or verification loss without any extra supervision signal, and it is found that it automatically learns to advocate high-quality face images while repelling low-quality ones such as blurred, occluded and improperly exposed faces.
Abstract
This paper presents a Neural Aggregation Network (NAN) for video face recognition. The network takes a face video or face image set of a person with a variable number of face images as its input, and produces a compact, fixed-dimension feature representation for recognition. The whole network is composed of two modules. The feature embedding module is a deep Convolutional Neural Network (CNN) which maps each face image to a feature vector. The aggregation module consists of two attention blocks which adaptively aggregate the feature vectors to form a single feature inside the convex hull spanned by them. Due to the attention mechanism, the aggregation is invariant to the image order. Our NAN is trained with a standard classification or verification loss without any extra supervision signal, and we found that it automatically learns to advocate high-quality face images while repelling low-quality ones such as blurred, occluded and improperly exposed faces. The experiments on IJB-A, YouTube Face, Celebrity-1000 video face recognition benchmarks show that it consistently outperforms naive aggregation methods and achieves the state-of-the-art accuracy.

read more

Content maybe subject to copyright    Report

Citations
More filters
Proceedings ArticleDOI

VGGFace2: A Dataset for Recognising Faces across Pose and Age

TL;DR: VGGFace2 as discussed by the authors is a large-scale face dataset with 3.31 million images of 9131 subjects, with an average of 362.6 images for each subject.
Proceedings ArticleDOI

Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks

TL;DR: A new loss function, namely Wing loss, for robust facial landmark localisation with Convolutional Neural Networks (CNNs) is presented, and the superiority of the proposed method over the state-of-the-art approaches is proved.
Journal ArticleDOI

Deep face recognition: A survey

TL;DR: A comprehensive review of the recent developments on deep face recognition can be found in this paper, covering broad topics on algorithm designs, databases, protocols, and application scenes, as well as the technical challenges and several promising directions.
Proceedings ArticleDOI

Deep Face Recognition: A Survey

TL;DR: The survey provides a clear, structured presentation of the principal, state-of-the-art (SOTA) face recognition techniques appearing within the past five years in top computer vision venues with some open issues currently overlooked by the community.
Proceedings ArticleDOI

Dual Attention Matching Network for Context-Aware Feature Sequence Based Person Re-identification

TL;DR: A novel end-to-end trainable framework, called Dual ATtention Matching network (DuATM), to learn context-aware feature sequences and perform attentive sequence comparison simultaneously, in which both intrasequence and inter-sequence attention strategies are used for feature refinement and feature-pair alignment.
References
More filters
Proceedings ArticleDOI

Going deeper with convolutions

TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Posted Content

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Batch Normalization as mentioned in this paper normalizes layer inputs for each training mini-batch to reduce the internal covariate shift in deep neural networks, and achieves state-of-the-art performance on ImageNet.
Proceedings ArticleDOI

FaceNet: A unified embedding for face recognition and clustering

TL;DR: A system that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure offace similarity, and achieves state-of-the-art face recognition performance using only 128-bytes perface.
Proceedings ArticleDOI

DeepFace: Closing the Gap to Human-Level Performance in Face Verification

TL;DR: This work revisits both the alignment step and the representation step by employing explicit 3D face modeling in order to apply a piecewise affine transformation, and derive a face representation from a nine-layer deep neural network.
Proceedings ArticleDOI

Deep face recognition

TL;DR: It is shown how a very large scale dataset can be assembled by a combination of automation and human in the loop, and the trade off between data purity and time is discussed.