Proceedings Article•DOI•

Using Multimedia Ontology for Generating Conceptual Annotations and Hyperlinks in Video Collections

TL;DR: This work presents a novel approach for defining video domain concepts in an ontology using properties that can be observed from the media, and proposes a Bayesian network as the reasoning mechanism for inference tasks in the presence of uncertainty.
Abstract: To enable seamless integration of video information on the semantic web, the knowledge of a video domain must be formally specified in an ontology. We present a novel approach for defining video domain concepts in an ontology using properties that can be observed from the media. We use the ontology-specified knowledge to recognize concepts relevant to a video scene, both by making observations for the media properties of concepts and by making inferences from other ontological concept definitions and relations. For this purpose we introduce new language constructs to OWL (Web Ontology Language), which are used to specify the inherently uncertain nature of media observations. The new constructs also allow additional semantics concerned with the association of media properties with concepts. We propose the use of a Bayesian network as the reasoning mechanism for inference tasks in the presence of uncertainty. The video is annotated with the relevant concepts defined in the ontology. These conceptual annotations are used to create hyperlinks in the video collection.
Citations
Proceedings Article•DOI•
24 Sep 2007
TL;DR: This paper presents a comprehensive representation scheme for video semantic ontology that covers all three components: it leverages LSCOM to construct the concept lexicon and describes concept properties as weights of different modalities, obtained manually or by a data-driven approach.
Abstract: Recent research has discovered that leveraging ontology is an effective way to facilitate semantic video concept detection. As an explicit knowledge representation, a formal ontology definition usually consists of a lexicon, properties, and relations. In this paper, we present a comprehensive representation scheme for video semantic ontology in which all three components are well studied. Specifically, we leverage LSCOM to construct the concept lexicon, describe concept properties as the weights of different modalities, which are obtained manually or by a data-driven approach, and model two types of concept relations (i.e., pairwise concept correlation and hierarchical relation). In contrast with most existing ontologies, which focus on only one or two components for domain-specific videos, the proposed ontology is more comprehensive and general. To validate the effectiveness of this ontology, we further apply it to video concept detection. The experiments on the TRECVID 2005 corpus have demonstrated a superior performance compared to existing key approaches to video concept detection.

56 citations

Proceedings Article•DOI•
30 Oct 2008
TL;DR: This work uses MOWL, a multimedia extension of Web Ontology Language (OWL) which is capable of describing domain concepts in terms of their media properties and of capturing the inherent uncertainties involved.
Abstract: In this work, we offer an approach to combine standard multimedia analysis techniques with knowledge drawn from conceptual metadata provided by domain experts of a specialized scholarly domain, to learn a domain-specific multimedia ontology from a set of annotated examples. A standard Bayesian network learning algorithm that learns structure and parameters of a Bayesian network is extended to include media observables in the learning. An expert group provides domain knowledge to construct a basic ontology of the domain as well as to annotate a set of training videos. These annotations help derive the associations between high-level semantic concepts of the domain and low-level MPEG-7 based features representing audio-visual content of the videos. We construct a more robust and refined version of this ontology by learning from this set of conceptually annotated videos. To encode this knowledge, we use MOWL, a multimedia extension of Web Ontology Language (OWL) which is capable of describing domain concepts in terms of their media properties and of capturing the inherent uncertainties involved. We use the ontology-specified knowledge to recognize concepts relevant to a video and to annotate fresh additions to the video database with relevant concepts in the ontology. These conceptual annotations are used to create hyperlinks in the video collection, to provide an effective video browsing interface to the user.

20 citations

Proceedings Article•DOI•
31 Oct 2008
TL;DR: A method for formally defining 3D spatio-temporal relations between elementary media objects using a set of fuzzy membership functions to support soft decision making for multimedia data interpretation and to provide graded ranking.
Abstract: Complex media events are often characterized by spatio-temporal relations between their constituent media objects. A multimedia query language should support specification of such relations for semantic retrieval. In this paper, we propose a method for formally defining 3D spatio-temporal relations between elementary media objects. To support soft decision making for multimedia data interpretation and to provide graded ranking, we define these relations using a set of fuzzy membership functions. It is possible to define fuzzy 3D extensions of Allen's relations as well as arbitrary new relations using our method. This method can be incorporated with upcoming multimedia query languages, such as MP7QF.

20 citations


Cites methods from "Using Multimedia Ontology for Gener..."

  • ...[5] proposes a method to compute the belief value of a concept based on the belief values of the media objects detected and the belief value of a spatio-temporal relation between them....

  • ...This belief value can be used for semantic interpretation of events in a media stream as in [5]....

  • ...This belief value can be used for semantic interpretation of the media [5]....

Journal Article•DOI•
TL;DR: A content-directed ontology reasoning approach to produce meaningful sports video summarisation and a sports video descriptive language (SVDL) based on the proposed ontology that can facilitate the metadata acquisition of video and the improvement of query performance.
Abstract: As digital sports video becomes increasingly pervasive, semantic video summary becomes one of the important components for the next generation of multimedia applications. Ontology is a feasible way to mine the semantic information from the video stream. However, current ontology-based methods have not concentrated on the effectiveness and soundness of semantic reasoning. Here, the authors propose a content-directed ontology reasoning approach to produce meaningful sports video summarisation. The proposed ontology can facilitate the metadata acquisition of video and the improvement of query performance. It also provides a flexible way to query the sports video database, which cannot be achieved by simple keyword search. For annotating, describing and managing the sports video content, we propose a sports video descriptive language (SVDL) based on the proposed ontology. Moreover, the semantically meaningful sports video abstraction is produced by a reasoning engine which is based on an extension of the Tableau algorithm. Meanwhile, the soundness and completeness of the reasoning algorithm can be solidly proved. Subjective assessment experimental results reveal the reliability and efficiency of the proposed scheme.

18 citations

Journal Article•DOI•
TL;DR: This paper presents a framework for multimodal analysis of multilingual news telecasts, which can be augmented with tools and techniques for specific news analytics tasks and focuses on a set of techniques for automatic indexing of the news stories based on keywords spotted in speech as well as on the visuals of contemporary and domain interest.
Abstract: The problems associated with automatic analysis of news telecasts are more severe in a country like India, where there are many national and regional language channels, besides English. In this paper, we present a framework for multimodal analysis of multilingual news telecasts, which can be augmented with tools and techniques for specific news analytics tasks. Further, we focus on a set of techniques for automatic indexing of the news stories based on keywords spotted in speech as well as on the visuals of contemporary and domain interest. English keywords are derived from RSS feed and converted to Indian language equivalents for detection in speech and on ticker texts. Restricting the keyword list to a manageable number results in drastic improvement in indexing performance. We present illustrative examples and detailed experimental results to substantiate our claim.

14 citations


Cites background from "Using Multimedia Ontology for Gener..."

  • ...Finally, Section 6 concludes the paper and provides direction for future work....

References
Journal Article•DOI•
TL;DR: This work presents a high-level overview of the MPEG-7 standard, discussing the scope, basic terminology, and potential applications, and compares the relationship with other standards to highlight its capabilities.
Abstract: MPEG-7, formally known as the Multimedia Content Description Interface, includes standardized tools (descriptors, description schemes, and language) enabling structural, detailed descriptions of audio-visual information at different granularity levels (region, image, video segment, collection) and in different areas (content description, management, organization, navigation, and user interaction). It aims to support and facilitate a wide range of applications, such as media portals, content broadcasting, and ubiquitous multimedia. We present a high-level overview of the MPEG-7 standard. We first discuss the scope, basic terminology, and potential applications. Next, we discuss the constituent components. Then, we compare the relationship with other standards to highlight its capabilities.

734 citations


"Using Multimedia Ontology for Gener..." refers background in this paper

  • ...The MPEG-7 standard [11] has been defined for this purpose....

Book•
19 Jun 2012
TL;DR: This text is a reprint of the seminal 1989 book Probabilistic Reasoning in Expert Systems: Theory and Algorithms, which helped create the field now called Bayesian networks, and provides an insightful comparison of the two most prominent approaches to probability.
Abstract: This text is a reprint of the seminal 1989 book Probabilistic Reasoning in Expert Systems: Theory and Algorithms, which helped serve to create the field we now call Bayesian networks. It introduces the properties of Bayesian networks (called causal networks in the text), discusses algorithms for doing inference in Bayesian networks, covers abductive inference, and provides an introduction to decision analysis. Furthermore, it compares rule-based expert systems to ones based on Bayesian networks, and it introduces the frequentist and Bayesian approaches to probability. Finally, it provides a critique of the maximum entropy formalism. Probabilistic Reasoning in Expert Systems was written from the perspective of a mathematician with the emphasis being on the development of theorems and algorithms. Every effort was made to make the material accessible. There are ample examples throughout the text. This text is important reading for anyone interested in both the fundamentals of Bayesian networks and in the history of how they came to be. It also provides an insightful comparison of the two most prominent approaches to probability.

687 citations

Proceedings Article•DOI•
05 Jan 2004
TL;DR: This work proposes to incorporate Bayesian networks (BN), a widely used graphic model for knowledge representation under uncertainty and OWL, the de facto industry standard ontology language recommended by W3C to support uncertain ontology representation and ontology reasoning and mapping.
Abstract: To support uncertain ontology representation and ontology reasoning and mapping, we propose to incorporate Bayesian networks (BN), a widely used graphic model for knowledge representation under uncertainty and OWL, the de facto industry standard ontology language recommended by W3C. First, OWL is augmented to allow additional probabilistic markups, so probabilities can be attached with individual concepts and properties in an OWL ontology. Secondly, a set of translation rules is defined to convert this probabilistically annotated OWL ontology into the directed acyclic graph (DAG) of a BN. Finally, the BN is completed by constructing conditional probability tables (CPT) for each node in the DAG. Our probabilistic extension to OWL is consistent with OWL semantics, and the translated BN is associated with a joint probability distribution over the application domain. General Bayesian network inference procedures (e.g., belief propagation or junction tree) can be used to compute P(C|e): the degree of the overlap or inclusion between a concept C and a concept represented by a description e. We also provide a similarity measure that can be used to find the most similar concept that a given description belongs to.

262 citations

Journal Article•DOI•
Jane Hunter•
TL;DR: The ABC model's ability to mediate and integrate between multimedia metadata vocabularies is evaluated by illustrating how it can provide the foundation to facilitate semantic interoperability between MPEG-7, MPEG-21 and other domain-specific metadata vocabularies.
Abstract: A core ontology is one of the key building blocks necessary to enable the scalable assimilation of information from diverse multimedia sources. A complete and extensible ontology that expresses the basic concepts that are common across a variety of domains and media types and that can provide the basis for specialization into domain-specific concepts and vocabularies, is essential for well-defined mappings between domain-specific knowledge representations (i.e., metadata vocabularies) and the subsequent building of a variety of services such as cross-domain searching, tracking, browsing, data mining and knowledge acquisition. As more and more communities develop metadata application profiles which combine terms from multiple vocabularies (e.g., Dublin Core, MPEG-7, MPEG-21, CIDOC/CRM, FGDC, IMS), a core ontology will provide a common understanding of the basic entities and relationships, which is essential for semantic interoperability and the development of additional services based on deductive inferencing. In this paper, we first propose such a core ontology (the ABC model) which was developed in response to a need to integrate information from multiple genres of multimedia content within digital libraries and archives. Although the MPEG-21 RDD was influenced by the ABC model and is based on a model extremely similar to ABC, we believe that it is important to define a separate and domain-independent top-level extensible ontology for scenarios in which either MPEG-21 is irrelevant or to enable the attachment of ontologies from communities external to MPEG, for example, the museum domain (CIDOC/CRM) or the biomedical domain (ON9.3). We evaluate the ABC model's ability to mediate and integrate between multimedia metadata vocabularies by illustrating how it can provide the foundation to facilitate semantic interoperability between MPEG-7, MPEG-21 and other domain-specific metadata vocabularies. 
By expressing the semantics of both MPEG-7 and MPEG-21 metadata terms in RDF Schema/DAML+OIL [and eventually the Web Ontology Language (OWL)] and attaching the MPEG-7 and MPEG-21 class and property hierarchies to the appropriate top-level classes and properties of the ABC model, we have defined a single distributed machine-understandable ontology. The resulting ontology provides semantic knowledge which is nonexistent within declarative XML schemas or XML-encoded metadata descriptions. Finally, in order to illustrate how such an ontology will contribute to the interoperability of data and services across the entire multimedia content delivery chain, we describe a number of valuable services which have been developed or could potentially be developed using the resulting merged ontologies.

157 citations


"Using Multimedia Ontology for Gener..." refers background in this paper

  • ...Several researchers [8, 16, 17] have used domain ontologies to interpret the metadata....

Journal Article•DOI•
TL;DR: This paper proposes a novel framework that makes some advances toward solving the challenging problems of semantic gap, semantic video concept modeling, semantic video classification, and concept-oriented video database indexing and access.
Abstract: Digital video now plays an important role in medical education, health care, telemedicine and other medical applications. Several content-based video retrieval (CBVR) systems have been proposed in the past, but they still suffer from the following challenging problems: semantic gap, semantic video concept modeling, semantic video classification, and concept-oriented video database indexing and access. In this paper, we propose a novel framework to make some advances toward the final goal to solve these problems. Specifically, the framework includes: 1) a semantic-sensitive video content representation framework by using principal video shots to enhance the quality of features; 2) semantic video concept interpretation by using flexible mixture model to bridge the semantic gap; 3) a novel semantic video-classifier training framework by integrating feature selection, parameter estimation, and model selection seamlessly in a single algorithm; and 4) a concept-oriented video database organization technique through a certain domain-dependent concept hierarchy to enable semantic-sensitive video retrieval and browsing.

147 citations


"Using Multimedia Ontology for Gener..." refers methods in this paper

  • ...In contrast, we explicitly model the semantic relationships which exist between the concepts in the structured content domains of videos....

  • ...Fan et al. [2] model semantic concepts in the medical domain using Gaussian mixture models....
