Proceedings Article

Identifying individuals in video by combining 'generative' and discriminative head models

Mark Everingham, +1 more
- Vol. 2, pp 1103-1110
Abstract
The objective of this work is automatic detection and identification of individuals in unconstrained consumer video, given a minimal number of labelled faces as training data. Whilst much work has been done on (mainly frontal) face detection and recognition, current methods are not sufficiently robust to deal with the wide variations in pose and appearance found in such video. These include variations in scale, illumination, expression, partial occlusion, motion blur, etc. We describe two areas of innovation: the first is to capture the 3-D appearance of the entire head, rather than just the face region, so that visual features such as the hairline can be exploited. The second is to combine discriminative and 'generative' approaches for detection and recognition. Images rendered using the head model are used to train a discriminative tree-structured classifier giving efficient detection and pose estimates over a very wide pose range with three degrees of freedom. Subsequent verification of the identity is obtained using the head model in a 'generative' framework. We demonstrate excellent performance in detecting and identifying three characters and their poses in a TV situation comedy.
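The two-stage idea in the abstract (a discriminative pose estimator trained on rendered images, followed by 'generative' identity verification) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy `render` function, the 4-element appearance vectors, the character names, the single yaw angle, the threshold, and the nearest-neighbour classifier (standing in for the tree-structured classifier) are all assumptions made for illustration.

```python
import math

# Hypothetical toy "head models": one appearance vector per character.
# The paper uses 3-D head models; these vectors are illustrative only.
HEAD_MODELS = {
    "alice": [0.9, 0.1, 0.4, 0.7],
    "bob": [0.2, 0.8, 0.6, 0.1],
    "carol": [0.5, 0.5, 0.9, 0.3],
}
POSES = [-60, -30, 0, 30, 60]  # assumed yaw bins in degrees (one DOF, not three)


def render(identity, yaw):
    """Toy renderer: produce a feature vector for an identity at a given yaw."""
    base = HEAD_MODELS[identity]
    c = math.cos(math.radians(yaw))
    s = math.sin(math.radians(yaw))
    n = len(base)
    return [b * c + 0.5 * s * (i + 1) / n for i, b in enumerate(base)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Stage 1 training data: rendered samples labelled with pose only.
TRAINING = [(render(name, yaw), yaw) for name in HEAD_MODELS for yaw in POSES]


def estimate_pose(features):
    """Discriminative stage: 1-nearest-neighbour pose estimate.

    A stand-in for the tree-structured classifier described in the abstract.
    """
    _, yaw = min(TRAINING, key=lambda sample: sq_dist(sample[0], features))
    return yaw


def verify_identity(features, yaw, threshold=0.05):
    """Generative stage: re-render each model at the estimated pose and
    accept the best-matching identity only if its residual is small."""
    errors = {name: sq_dist(render(name, yaw), features) for name in HEAD_MODELS}
    best = min(errors, key=errors.get)
    return best if errors[best] < threshold else None


def identify(features):
    """Run both stages and return (identity or None, estimated yaw)."""
    yaw = estimate_pose(features)
    return verify_identity(features, yaw), yaw
```

In this sketch the first stage only proposes a pose; the identity decision is deferred to the second stage, which can also reject an observation that no head model explains well.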


Citations
Proceedings Article

"Hello! My name is... Buffy" - Automatic Naming of Characters in TV Video

TL;DR: It is demonstrated that high precision can be achieved by combining multiple sources of information, both visual and textual, by automatic generation of time stamped character annotation by aligning subtitles and transcripts.
Journal Article

A survey of appearance models in visual object tracking

TL;DR: A detailed review of the existing 2D appearance models for visual object tracking can be found in this article, where the authors decompose the problem of appearance modeling into two different processing stages: visual representation and statistical modeling.
Posted Content

A Survey of Appearance Models in Visual Object Tracking

TL;DR: This survey provides a detailed review of the existing 2D appearance models for visual object tracking and takes a module-based architecture that enables readers to easily grasp the key points of visual object tracking.
Journal Article

Taking the bite out of automated naming of characters in TV video

TL;DR: It is demonstrated that high precision can be achieved by combining multiple sources of information, both visual and textual, by automatic generation of time stamped character annotation by aligning subtitles and transcripts.
Journal Article

Detecting People Looking at Each Other in Videos

TL;DR: The objective of this work is to determine if people are interacting in TV video by detecting whether they are looking at each other or not and to determine both the temporal period of the interaction and also spatially localize the relevant people.