Identifying individuals in video by combining 'generative' and discriminative head models

doi:10.1109/ICCV.2005.116

Proceedings Article

Mark Everingham, +1 more

- Vol. 2, pp 1103-1110

TL;DR

Two areas of innovation are described: the first is to capture the 3-D appearance of the entire head, rather than just the face region, so that visual features such as the hairline can be exploited, and the second is to combine discriminative and 'generative' approaches for detection and recognition.

Abstract:

The objective of this work is automatic detection and identification of individuals in unconstrained consumer video, given a minimal number of labelled faces as training data. Whilst much work has been done on (mainly frontal) face detection and recognition, current methods are not sufficiently robust to deal with the wide variations in pose and appearance found in such video. These include variations in scale, illumination, expression, partial occlusion, motion blur, etc. We describe two areas of innovation: the first is to capture the 3-D appearance of the entire head, rather than just the face region, so that visual features such as the hairline can be exploited. The second is to combine discriminative and 'generative' approaches for detection and recognition. Images rendered using the head model are used to train a discriminative tree-structured classifier giving efficient detection and pose estimates over a very wide pose range with three degrees of freedom. Subsequent verification of the identity is obtained using the head model in a 'generative' framework. We demonstrate excellent performance in detecting and identifying three characters and their poses in a TV situation comedy.
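
The two-stage idea in the abstract (train a discriminative pose classifier on images rendered from the head model, then verify identity by re-rendering the model at the estimated pose) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `HEAD_MODELS` data, the one-dimensional `render` function, the reduction of the three-degree-of-freedom pose to a single yaw index, the nearest-neighbour classifier standing in for the tree-structured classifier, and normalised cross-correlation as the matching score.

```python
import math

# Hypothetical toy "head models": intensity samples taken around each head
# at 8 yaw steps. Real head models would be textured 3-D surfaces.
HEAD_MODELS = {
    "alice": [0.9, 0.8, 0.2, 0.1, 0.1, 0.3, 0.7, 0.9],
    "bob":   [0.2, 0.9, 0.9, 0.4, 0.1, 0.1, 0.2, 0.3],
}
WINDOW = 4  # number of samples visible from a single viewpoint


def render(identity, yaw):
    """Toy renderer: the model samples visible at yaw step `yaw`."""
    model = HEAD_MODELS[identity]
    return [model[(yaw + i) % len(model)] for i in range(WINDOW)]


def ncc(a, b):
    """Normalised cross-correlation of two equal-length vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [x - mb for x in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def train_pose_classifier():
    """Discriminative stage: pose-labelled examples rendered from the models."""
    n_yaw = len(next(iter(HEAD_MODELS.values())))
    return [(render(name, yaw), yaw)
            for name in HEAD_MODELS for yaw in range(n_yaw)]


def predict_pose(train, patch):
    """Nearest-neighbour pose estimate (stand-in for the tree classifier)."""
    def dist(example):
        return sum((x - y) ** 2 for x, y in zip(example[0], patch))
    return min(train, key=dist)[1]


def verify_identity(patch, yaw, threshold=0.8):
    """Generative stage: re-render every model at `yaw` and compare."""
    scores = {name: ncc(render(name, yaw), patch) for name in HEAD_MODELS}
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else None), scores
```

A caller would first run `predict_pose(train_pose_classifier(), patch)` on an observed patch and then pass the resulting yaw to `verify_identity`, which returns the best-matching name, or `None` when no rendering matches the patch well enough. Splitting the work this way keeps the cheap classifier responsible for pose and leaves the identity decision to a direct comparison against the model.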

Citations

Proceedings Article

"Hello! My name is... Buffy" - Automatic Naming of Characters in TV Video

Mark Everingham, +2 more

TL;DR: It is demonstrated that high precision can be achieved by combining multiple sources of information, both visual and textual, by automatic generation of time stamped character annotation by aligning subtitles and transcripts.

Journal Article

A survey of appearance models in visual object tracking

Xi Li, +5 more

- 08 Oct 2013 -

ACM Transactions on Intelligent Systems ...

TL;DR: A detailed review of the existing 2D appearance models for visual object tracking can be found in this article, where the authors decompose the problem of appearance modeling into two different processing stages: visual representation and statistical modeling.

Posted Content

A Survey of Appearance Models in Visual Object Tracking

Xi Li, +5 more

- 20 Mar 2013 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This survey provides a detailed review of the existing 2D appearance models for visual object tracking and takes a module-based architecture that enables readers to easily grasp the key points of visual object tracking.

Journal Article

Taking the bite out of automated naming of characters in TV video

Mark Everingham, +2 more

- 01 Apr 2009 -

Image and Vision Computing

TL;DR: It is demonstrated that high precision can be achieved by combining multiple sources of information, both visual and textual, by automatic generation of time stamped character annotation by aligning subtitles and transcripts.

Journal Article

Detecting People Looking at Each Other in Videos

Manuel J. Marín-Jiménez, +3 more

- 01 Feb 2014 -

International Journal of Computer Vision

TL;DR: The objective of this work is to determine if people are interacting in TV video by detecting whether they are looking at each other or not and to determine both the temporal period of the interaction and also spatially localize the relevant people.

References

Journal Article

Distinctive Image Features from Scale-Invariant Keypoints

David G. Lowe

- 01 Nov 2004 -

International Journal of Computer Vision

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

Proceedings Article

Rapid object detection using a boosted cascade of simple features

Paul A. Viola, +1 more

TL;DR: A machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates and the introduction of a new image representation called the "integral image" which allows the features used by the detector to be computed very quickly.

Journal Article

Eigenfaces vs. Fisherfaces: recognition using class specific linear projection

Peter N. Belhumeur, +2 more

- 01 Jul 1997 -

IEEE Transactions on Pattern Analysis an...

TL;DR: A face recognition algorithm which is insensitive to large variation in lighting direction and facial expression is developed, based on Fisher's linear discriminant and produces well separated classes in a low-dimensional subspace, even under severe variations in lighting and facial expressions.

Proceedings Article

Experiments with a new boosting algorithm

Yoav Freund, +1 more

TL;DR: This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.

Experiment with a new boosting algorithm

Y. Freund

Related Papers (5)

"Hello! My name is... Buffy" - Automatic Naming of Characters in TV Video

Robust Real-Time Face Detection

Person spotting: video shot retrieval for face sets

Names and faces in the news

Distinctive Image Features from Scale-Invariant Keypoints