Book Chapter
Multimodal Person Identification in Movies
Jeroen Vendrig,Marcel Worring +1 more
- pp 175-185
TLDR
Quantitative results show that WhoIsWho is successful in helping annotators identify movie characters, and employment of a user model enables evaluation of interactivity in WhoIsWho.
Abstract:
An important task in movie annotation is determining which characters appear in a shot. Character identification is based on available information sources from various modalities. Fully automatic character identification is not feasible because the modalities are not semantically synchronized. As manual annotation is too time-consuming, an interactive tool assisting the annotator is needed. We propose the WhoIsWho function for our interactive i-Notation system. WhoIsWho relates visual content to names extracted from movie scripts, working in both directions. We present an extensive evaluation of character identification on six hours of movies. Employment of a user model enables evaluation of interactivity in WhoIsWho. Quantitative results show that WhoIsWho is successful in helping annotators identify movie characters.
Citations
Book Chapter
Challenges of Image and Video Retrieval
TL;DR: The most frequently used image and video retrieval systems are typically oriented around text searches where manual annotation was already performed, which indicates that images and videos in large digital collections are being searched for through text searches.
Proceedings Article
Multi-modal Person Identification in a Smart Environment
TL;DR: Experimental results obtained on the CLEAR 2007 evaluation corpus show that CRCM-based modality weighting improves the correct identification rates significantly, and the cumulative ratio of correct matches (CRCM) and distance-to-second-closest (DT2ND) measures are introduced.
Book Chapter
ISL person identification systems in the CLEAR evaluations
Hazim Kemal Ekenel,Qin Jin +1 more
TL;DR: Three person identification systems that have been developed for the CLEAR evaluations are presented: two are based on single modalities (audio and video), whereas the third system uses both of these modalities.
Journal Article
Interactive adaptive movie annotation
Jeroen Vendrig,Marcel Worring +1 more
TL;DR: In this paper, the authors present an interactive and adaptive i-Notation system, which describes actors' names, automatically processes multimodal information sources, and deals with available sources' varying quality.
Book Chapter
ISL Person Identification Systems in the CLEAR 2007 Evaluations
TL;DR: The experimental results show that the face recognition system significantly outperforms the speaker identification system on the short-duration test segments, and that combining the individual systems improves performance further.
References
Journal Article
Name-It: naming and detecting faces in news videos
TL;DR: Name-It, a system that associates faces and names in news videos, takes a multimodal video analysis approach: face sequence extraction and similarity evaluation from videos, name extraction from transcripts, and video-caption recognition.
Journal Article
Constructing table-of-content for videos
TL;DR: This paper presents an effective semantic-level ToC construction technique based on intelligent unsupervised clustering that has the characteristics of better modeling the time locality and scene structure.
Journal Article
Systematic evaluation of logical story unit segmentation
Jeroen Vendrig,Marcel Worring +1 more
TL;DR: This paper presents a systematic evaluation of the mutual dependencies of segmentation methods and their performances, and introduces a method that measures the quality of a segmentation method and its economic impact rather than the number of errors.
Journal Article
Learning to recognize speech by watching television
P.J. Jang,Alexander G. Hauptmann +1 more
TL;DR: This work describes its approach to collecting almost unlimited amounts of accurately transcribed speech data, which serves as training data for the acoustic model component of most high-accuracy speaker-independent speech-recognition systems.
Journal Article
Tools for Browsing a TV Situation Comedy Based on Content Specific Attributes
TL;DR: An evaluation of the learning performance shows that a combination of low-level color signal features outperforms several other combinations of signal features in learning character labels in an episode of the TV situation comedy, Seinfeld.