Author

Tal Hassner

Bio: Tal Hassner is an academic researcher from Facebook. The author has contributed to research on topics including Facial recognition system and Face (geometry). The author has an h-index of 42 and has co-authored 113 publications receiving 9225 citations. Previous affiliations of Tal Hassner include Open University and University of Southern California.


Papers
Proceedings ArticleDOI
20 Jun 2011
TL;DR: A comprehensive database of labeled videos of faces in challenging, uncontrolled conditions, the ‘YouTube Faces’ database, along with benchmark, pair-matching tests are presented and a novel set-to-set similarity measure, the Matched Background Similarity (MBGS), is described.
Abstract: Recognizing faces in unconstrained videos is a task of mounting importance. While obviously related to face recognition in still images, it has its own unique characteristics and algorithmic requirements. Over the years several methods have been suggested for this problem, and a few benchmark data sets have been assembled to facilitate its study. However, there is a sizable gap between the actual application needs and the current state of the art. In this paper we make the following contributions. (a) We present a comprehensive database of labeled videos of faces in challenging, uncontrolled conditions (i.e., ‘in the wild’), the ‘YouTube Faces’ database, along with benchmark, pair-matching tests. (b) We employ our benchmark to survey and compare the performance of a large variety of existing video face recognition techniques. Finally, (c) we describe a novel set-to-set similarity measure, the Matched Background Similarity (MBGS). This similarity is shown to considerably improve performance on the benchmark tests.

1,423 citations

Journal ArticleDOI
TL;DR: This paper presents a robust face alignment technique, which explicitly considers the uncertainties of facial feature detectors, and describes the dropout-support vector machine approach used by the system for face attribute estimation, in order to avoid over-fitting.
Abstract: This paper concerns the estimation of facial attributes—namely, age and gender—from images of faces acquired in challenging, in the wild conditions. This problem has received far less attention than the related problem of face recognition, and in particular, has not enjoyed the same dramatic improvement in capabilities demonstrated by contemporary face recognition systems. Here, we address this problem by making the following contributions. First, in answer to one of the key problems of age estimation research—absence of data—we offer a unique data set of face images, labeled for age and gender, acquired by smart-phones and other mobile devices, and uploaded without manual filtering to online image repositories. We show the images in our collection to be more challenging than those offered by other face-photo benchmarks. Second, we describe the dropout-support vector machine approach used by our system for face attribute estimation, in order to avoid over-fitting. This method, inspired by the dropout learning techniques now popular with deep belief networks, is applied here for training support vector machines, to the best of our knowledge, for the first time. Finally, we present a robust face alignment technique, which explicitly considers the uncertainties of facial feature detectors. We report extensive tests analyzing both the difficulty levels of contemporary benchmarks as well as the capabilities of our own system. These show our method to outperform state-of-the-art by a wide margin.

710 citations

Proceedings ArticleDOI
07 Jun 2015
TL;DR: This work explores the simpler approach of using a single, unmodified, 3D surface as an approximation to the shape of all input faces, and shows that this leads to a straightforward, efficient and easy to implement method for frontalization.
Abstract: “Frontalization” is the process of synthesizing frontal facing views of faces appearing in single unconstrained photos. Recent reports have suggested that this process may substantially boost the performance of face recognition systems. This, by transforming the challenging problem of recognizing faces viewed from unconstrained viewpoints to the easier problem of recognizing faces in constrained, forward facing poses. Previous frontalization methods did this by attempting to approximate 3D facial shapes for each query image. We observe that 3D face shape estimation from unconstrained photos may be a harder problem than frontalization and can potentially introduce facial misalignments. Instead, we explore the simpler approach of using a single, unmodified, 3D surface as an approximation to the shape of all input faces. We show that this leads to a straightforward, efficient and easy to implement method for frontalization. More importantly, it produces aesthetic new frontal views and is surprisingly effective when used for face recognition and gender estimation.

548 citations

01 Oct 2008
TL;DR: This paper explores how well this performance carries over to the related task of multi-option face identification, specifically on the Labeled Faces in the Wild (LFW) image set, and seeks to compare the performance of similarity learning methods to descriptor based methods.
Abstract: Recent methods for learning similarity between images have presented impressive results in the problem of pair matching (same/not-same classification) of face images. In this paper we explore how well this performance carries over to the related task of multi-option face identification, specifically on the Labeled Faces in the Wild (LFW) image set. In addition, we seek to compare the performance of similarity learning methods to descriptor-based methods. We present the following results: (1) Descriptor-based approaches that efficiently encode the appearance of each face image as a vector outperform the leading similarity-based method in the task of multi-option face identification. (2) Straightforward use of Euclidean distance on the descriptor vectors performs somewhat worse than the similarity learning methods on the task of pair matching. (3) With an added learning stage, the performance of descriptor-based methods matches and exceeds that of similarity methods on the pair matching task. (4) A novel patch-based descriptor we propose is able to improve the performance of the successful Local Binary Pattern (LBP) descriptor in both multi-option identification and same/not-same classification.

504 citations

Posted Content
TL;DR: In this article, the authors explore the simpler approach of using a single, unmodified, 3D surface as an approximation to the shape of all input faces and show that this leads to a straightforward, efficient and easy to implement method for frontalization.
Abstract: "Frontalization" is the process of synthesizing frontal facing views of faces appearing in single unconstrained photos. Recent reports have suggested that this process may substantially boost the performance of face recognition systems. This, by transforming the challenging problem of recognizing faces viewed from unconstrained viewpoints to the easier problem of recognizing faces in constrained, forward facing poses. Previous frontalization methods did this by attempting to approximate 3D facial shapes for each query image. We observe that 3D face shape estimation from unconstrained photos may be a harder problem than frontalization and can potentially introduce facial misalignments. Instead, we explore the simpler approach of using a single, unmodified, 3D surface as an approximation to the shape of all input faces. We show that this leads to a straightforward, efficient and easy to implement method for frontalization. More importantly, it produces aesthetic new frontal views and is surprisingly effective when used for face recognition and gender estimation.

486 citations


Cited by
Proceedings Article
21 Jun 2010
TL;DR: The binary stochastic hidden units of restricted Boltzmann machines are generalized to "Stepped Sigmoid Units", approximated efficiently by noisy rectified linear units, which learn features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset.
Abstract: Restricted Boltzmann machines were developed using binary stochastic hidden units. These can be generalized by replacing each binary unit by an infinite number of copies that all have the same weights but have progressively more negative biases. The learning and inference rules for these "Stepped Sigmoid Units" are unchanged. They can be approximated efficiently by noisy, rectified linear units. Compared with binary units, these units learn features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset. Unlike binary units, rectified linear units preserve information about relative intensities as information travels through multiple layers of feature detectors.

14,799 citations
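
The "Stepped Sigmoid Units" in the abstract above can be illustrated numerically. The following is a minimal Python sketch, not the authors' code. It assumes the copies of a unit have biases shifted by -0.5, -1.5, -2.5, and so on, and that the noisy rectified linear approximation adds Gaussian noise with variance sigmoid(x); both details are recalled from the paper rather than stated in the abstract. The names `stepped_sigmoid`, `softplus` and `noisy_relu` are hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def stepped_sigmoid(x: float, copies: int = 200) -> float:
    """Sum of `copies` sigmoid units that share the input x.

    Assumption: the i-th copy has its bias shifted by -(i - 0.5),
    i.e. progressively more negative biases.
    """
    return sum(sigmoid(x - i + 0.5) for i in range(1, copies + 1))


def softplus(x: float) -> float:
    """Smooth closed form that the stepped sum approximates: log(1 + e^x)."""
    return math.log1p(math.exp(x))


def noisy_relu(x: float, rng: random.Random) -> float:
    """Noisy rectified linear unit: max(0, x + noise).

    Assumption: the noise is zero-mean Gaussian with variance sigmoid(x).
    """
    noise = rng.gauss(0.0, math.sqrt(sigmoid(x)))
    return max(0.0, x + noise)


if __name__ == "__main__":
    rng = random.Random(0)
    for x in (-3.0, 0.0, 2.0, 5.0):
        print(x, stepped_sigmoid(x), softplus(x), noisy_relu(x, rng))
```

For moderate inputs the stepped sum and the softplus curve should agree closely, which is what makes the cheaper rectified linear form a reasonable stand-in.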

Christopher M. Bishop
01 Jan 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Proceedings ArticleDOI
07 Jun 2015
TL;DR: A system that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity, and achieves state-of-the-art face recognition performance using only 128 bytes per face.
Abstract: Despite significant recent advances in the field of face recognition [10, 14, 15, 17], implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. In this paper we present a system, called FaceNet, that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. Once this space has been produced, tasks such as face recognition, verification and clustering can be easily implemented using standard techniques with FaceNet embeddings as feature vectors.

8,289 citations
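
The FaceNet abstract above says that verification can be implemented with standard techniques once distances in the embedding space measure face similarity. A minimal Python sketch of that idea follows, under stated assumptions: the embeddings are L2-normalized before comparison, the measure is squared Euclidean distance, and the threshold value of 1.0 is purely illustrative (a real system would tune it on held-out pairs). The function names are hypothetical and no trained network is involved.

```python
import math
from typing import Sequence


def l2_normalize(v: Sequence[float]) -> list[float]:
    """Scale a vector to unit length (assumed preprocessing step)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def same_identity(emb_a: Sequence[float], emb_b: Sequence[float],
                  threshold: float = 1.0) -> bool:
    """Declare a match when the normalized embeddings are close.

    The threshold is a hypothetical placeholder, not a published value.
    """
    dist = squared_distance(l2_normalize(emb_a), l2_normalize(emb_b))
    return dist < threshold


if __name__ == "__main__":
    # Toy 2-D "embeddings"; real embeddings would come from a trained model.
    print(same_identity([1.0, 0.0], [1.0, 0.1]))   # nearby directions
    print(same_identity([1.0, 0.0], [-1.0, 0.0]))  # opposite directions
```

Recognition and clustering can reuse the same distance function, for example through nearest-neighbour lookup over stored embeddings.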

Proceedings ArticleDOI
07 Dec 2015
TL;DR: The learned features, namely C3D (Convolutional 3D), with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with current best methods on the other 2 benchmarks.
Abstract: We propose a simple, yet effective approach for spatiotemporal feature learning using deep 3-dimensional convolutional networks (3D ConvNets) trained on a large scale supervised video dataset. Our findings are three-fold: 1) 3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets, 2) A homogeneous architecture with small 3x3x3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets, and 3) Our learned features, namely C3D (Convolutional 3D), with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with current best methods on the other 2 benchmarks. In addition, the features are compact: achieving 52.8% accuracy on UCF101 dataset with only 10 dimensions and also very efficient to compute due to the fast inference of ConvNets. Finally, they are conceptually very simple and easy to train and use.

7,091 citations

Proceedings ArticleDOI
23 Jun 2014
TL;DR: This work revisits both the alignment step and the representation step by employing explicit 3D face modeling in order to apply a piecewise affine transformation, and derive a face representation from a nine-layer deep neural network.
Abstract: In modern face recognition, the conventional pipeline consists of four stages: detect => align => represent => classify. We revisit both the alignment step and the representation step by employing explicit 3D face modeling in order to apply a piecewise affine transformation, and derive a face representation from a nine-layer deep neural network. This deep network involves more than 120 million parameters using several locally connected layers without weight sharing, rather than the standard convolutional layers. Thus we trained it on the largest facial dataset to-date, an identity labeled dataset of four million facial images belonging to more than 4,000 identities. The learned representations coupling the accurate model-based alignment with the large facial database generalize remarkably well to faces in unconstrained environments, even with a simple classifier. Our method reaches an accuracy of 97.35% on the Labeled Faces in the Wild (LFW) dataset, reducing the error of the current state of the art by more than 27%, closely approaching human-level performance.

6,132 citations