Multimodal Video Indexing: A Review of the State-of-the-art

doi:10.1023/B:MTAP.0000046380.27575.A5

Journal Article

Cees G. M. Snoek, +1 more

- 01 Jan 2005 -

Multimedia Tools and Applications

- Vol. 25, Iss: 1, pp 5-35

TLDR

A unifying multimodal framework is put forward that views a video document from the perspective of its author. The framework forms the guiding principle for identifying index types for which automatic methods are found in the literature.

Abstract:

Efficient and effective handling of video documents depends on the availability of indexes. Manual indexing is unfeasible for large video collections. In this paper we survey several methods aiming at automating this time- and resource-consuming process. Good reviews on single-modality video indexing have appeared in the literature. Effective indexing, however, requires a multimodal approach in which either the most appropriate modality is selected or the different modalities are used in a collaborative fashion. Therefore, instead of separately treating the different information sources involved and their specific algorithms, we focus on the similarities and differences between the modalities. To that end we put forward a unifying and multimodal framework, which views a video document from the perspective of its author. This framework forms the guiding principle for identifying index types, for which automatic methods are found in the literature. It furthermore forms the basis for categorizing these different methods.

Citations

Journal Article

Image retrieval: Ideas, influences, and trends of the new age

Ritendra Datta, +3 more

- 08 May 2008 -

ACM Computing Surveys

TL;DR: Almost 300 key theoretical and empirical contributions of the current decade related to image retrieval and automatic image annotation are surveyed, the spawning of related subfields is discussed, and the adaptation of existing image retrieval techniques to build systems that can be useful in the real world is considered.

Journal Article

Multimodal Machine Learning: A Survey and Taxonomy

Tadas Baltrusaitis, +2 more

- 01 Feb 2019 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This paper surveys the recent advances in multimodal machine learning itself and presents them in a common taxonomy to enable researchers to better understand the state of the field and identify directions for future research.

Proceedings Article

A new approach to cross-modal multimedia retrieval

Nikhil Rasiwasia, +6 more

TL;DR: Accounting for cross-modal correlations and for semantic abstraction are both shown to improve retrieval accuracy, and the resulting approaches are shown to outperform state-of-the-art image retrieval systems on a unimodal retrieval task.

Journal Article

Multimodal fusion for multimedia analysis: a survey

Pradeep K. Atrey, +3 more

- 01 Nov 2010 -

Multimedia Systems

TL;DR: This survey aims at providing multimedia researchers with a state-of-the-art overview of fusion strategies, which are used for combining multiple modalities in order to accomplish various multimedia analysis tasks.

Journal Article

A Survey on Visual Content-Based Video Indexing and Retrieval

Weiming Hu, +4 more

TL;DR: Methods for video structure analysis, including shot boundary detection, key frame extraction and scene segmentation, extraction of features including static key frame features, object features and motion features, video data mining, video annotation, and video retrieval including query interfaces are analyzed.

References

Journal Article

A tutorial on hidden Markov models and selected applications in speech recognition

Lawrence R. Rabiner

TL;DR: In this paper, the authors provide an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and give practical details on methods of implementation of the theory along with a description of selected applications of HMMs to distinct problems in speech recognition.

Book

Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference

Judea Pearl

TL;DR: Probabilistic Reasoning in Intelligent Systems is a complete and accessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty, and provides a coherent explication of probability as a language for reasoning with partial belief.

Journal Article

Indexing by Latent Semantic Analysis

Scott Deerwester, +4 more

- 01 Sep 1990 -

Journal of the Association for Informati...

TL;DR: A new method for automatic indexing and retrieval is described that takes advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries.

Journal Article

Eigenfaces vs. Fisherfaces: recognition using class specific linear projection

Peter N. Belhumeur, +2 more

- 01 Jul 1997 -

IEEE Transactions on Pattern Analysis an...

TL;DR: A face recognition algorithm which is insensitive to large variation in lighting direction and facial expression is developed, based on Fisher's linear discriminant and produces well separated classes in a low-dimensional subspace, even under severe variations in lighting and facial expressions.

Book

Foundations of Statistical Natural Language Processing

Christopher D. Manning, +1 more

TL;DR: This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear and provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations.
