Open Access Proceedings ArticleDOI

It’s Written All Over Your Face: Full-Face Appearance-Based Gaze Estimation

TLDR
In this paper, the authors propose an appearance-based method that only takes the full face image as input, which achieves improvements of up to 14.3% on MPIIGaze and 27.7% on EYEDIAP for person-independent 3D gaze estimation.
Abstract
Eye gaze is an important non-verbal cue for human affect analysis. Recent gaze estimation work indicated that information from the full face region can benefit performance. Pushing this idea further, we propose an appearance-based method that, in contrast to a long-standing line of work in computer vision, only takes the full face image as input. Our method encodes the face image using a convolutional neural network with spatial weights applied on the feature maps to flexibly suppress or enhance information in different facial regions. Through extensive evaluation, we show that our full-face method significantly outperforms the state of the art for both 2D and 3D gaze estimation, achieving improvements of up to 14.3% on MPIIGaze and 27.7% on EYEDIAP for person-independent 3D gaze estimation. We further show that this improvement is consistent across different illumination conditions and gaze directions and particularly pronounced for the most challenging extreme head poses.
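To make the spatial-weights idea concrete, below is a minimal PyTorch sketch of the mechanism the abstract describes: a small branch of 1x1 convolutions produces a single-channel weight map that re-scales the backbone's feature maps at every spatial location, letting the network suppress or enhance individual facial regions. The backbone, layer sizes, and output parameterization (yaw/pitch angles) are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class SpatialWeightsCNN(nn.Module):
    """Sketch of a full-face gaze CNN with a spatial-weights branch.

    Backbone and layer sizes are illustrative assumptions, not the
    authors' exact architecture.
    """

    def __init__(self, feat_channels: int = 256):
        super().__init__()
        # Toy convolutional backbone: full-face image -> (N, C, H, W) features.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, feat_channels, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Spatial-weights branch: 1x1 convolutions map the feature stack
        # to a single-channel, non-negative weight map over locations.
        self.spatial_weights = nn.Sequential(
            nn.Conv2d(feat_channels, 64, kernel_size=1), nn.ReLU(),
            nn.Conv2d(64, 1, kernel_size=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(feat_channels, 2),  # e.g. 3D gaze as yaw/pitch angles
        )

    def forward(self, face: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(face)            # (N, C, H, W)
        weights = self.spatial_weights(feats)  # (N, 1, H, W)
        # Re-weight every feature channel at each spatial location, so the
        # network can flexibly suppress or enhance facial regions.
        weighted = feats * weights
        return self.head(weighted)

# Usage: estimate gaze for a batch of 224x224 full-face crops.
model = SpatialWeightsCNN()
gaze = model(torch.randn(4, 3, 224, 224))  # -> (4, 2) yaw/pitch predictions
```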


Citations
Journal ArticleDOI

MPIIGaze: Real-World Dataset and Deep Appearance-Based Gaze Estimation

TL;DR: GazeNet, a deep appearance-based gaze estimation method, performs unconstrained gaze estimation from a monocular RGB camera without assumptions regarding the user, environment, or camera.
Book ChapterDOI

RT-GENE: Real-Time Eye Gaze Estimation in Natural Environments

TL;DR: This work addresses ground-truth annotation by measuring head pose with a motion capture system and eye gaze with mobile eye-tracking glasses, and applies semantic image inpainting to the area covered by the glasses, bridging the gap between training and testing images by removing the glasses' obtrusiveness.
Posted Content

MPIIGaze: Real-World Dataset and Deep Appearance-Based Gaze Estimation

TL;DR: It is shown that image resolution and the use of both eyes affect gaze estimation performance, while head pose and pupil-centre information are less informative; GazeNet, the first deep appearance-based gaze estimation method, is proposed.
Proceedings ArticleDOI

Gaze360: Physically Unconstrained Gaze Estimation in the Wild

TL;DR: Gaze360 is a large-scale remote gaze-tracking dataset and method for robust 3D gaze estimation in unconstrained images; it comprises 238 subjects in indoor and outdoor environments with labelled three-dimensional (3D) gaze across a wide range of head poses and distances.
Proceedings ArticleDOI

Learning to find eye region landmarks for remote gaze estimation in unconstrained settings

TL;DR: This work presents a novel learning-based method for eye region landmark localization that makes conventional methods competitive with the latest appearance-based methods and exceeds the state of the art for iris localization and eye-shape registration on real-world imagery.
References
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: A deep convolutional neural network, consisting of five convolutional layers (some followed by max-pooling layers) and three fully-connected layers with a final 1000-way softmax, achieved state-of-the-art performance on ImageNet classification.
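As a concrete illustration of the topology this summary describes, here is a condensed PyTorch sketch with five convolutional layers, interleaved max-pooling, and three fully-connected layers ending in 1000 class logits. Channel counts follow the commonly cited AlexNet configuration; the exact strides and padding here are assumptions.

```python
import torch
import torch.nn as nn

# Five convolutional layers (some followed by max-pooling) and three
# fully-connected layers with a final 1000-way classification output.
alexnet_like = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Flatten(),
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),  # class logits; softmax is applied in the loss
)

logits = alexnet_like(torch.randn(1, 3, 224, 224))  # -> (1, 1000)
```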
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: In this paper, the authors investigated the effect of convolutional network depth on accuracy in the large-scale image recognition setting and showed that a significant improvement over prior-art configurations can be achieved by pushing the depth to 16–19 layers.
Journal ArticleDOI

Gradient-based learning applied to document recognition

TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can synthesize a complex decision surface that classifies high-dimensional patterns such as handwritten characters.
Proceedings ArticleDOI

Fast R-CNN

TL;DR: Fast R-CNN, a Fast Region-based Convolutional Network method for object detection, employs several innovations to improve training and testing speed while increasing detection accuracy, achieving a higher mAP on PASCAL VOC 2012.
Book ChapterDOI

Visualizing and Understanding Convolutional Networks

TL;DR: A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large convolutional network models; used in a diagnostic role, it finds model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark.