Book Chapter

Using Concept Recognition to Annotate a Video Collection

15 Dec 2009, pp 507-512
TL;DR: A scheme based on an ontological framework that recognizes concepts in multimedia data, giving content-based access to a closed, domain-specific multimedia collection and an effective video browsing interface for the user.
Abstract: In this paper, we propose a scheme based on an ontological framework to recognize concepts in multimedia data, in order to provide effective content-based access to a closed, domain-specific multimedia collection. The ontology for the domain is constructed from the high-level knowledge held by domain experts, and is further refined by learning from multimedia data that they have annotated. MOWL, a multimedia extension to OWL, is used to encode the concept-to-media-feature associations in the ontology, as well as the uncertainties linked with the observation of perceptual multimedia data. Media feature classifiers help recognize low-level concepts in the videos, but the novelty of our work lies in the discovery of high-level concepts in video content using the power of ontological relations between the concepts. This framework is used to provide rich, conceptual annotations to the video database, which can further be used to create hyperlinks in the video collection and so provide an effective video browsing interface to the user.

Citations
Journal Article
TL;DR: The efficacy of the ontology-based approach is demonstrated by constructing an ontology for the cultural heritage domain of Indian classical dance, and a browsing application is developed for semantic access to the heritage collection of Indian dance videos.
Abstract: Preservation of intangible cultural heritage, such as music and dance, requires encoding of background knowledge together with digitized records of the performances. We present an ontology-based approach for designing a cultural heritage repository for that purpose. Since dance and music are recorded in multimedia format, we use Multimedia Web Ontology Language (MOWL) to encode the domain knowledge. We propose an architectural framework that includes a method to construct the ontology with a labeled set of training data and use of the ontology to automatically annotate new instances of digital heritage artifacts. The annotations enable creation of a semantic navigation environment in a cultural heritage repository. We have demonstrated the efficacy of our approach by constructing an ontology for the cultural heritage domain of Indian classical dance, and have developed a browsing application for semantic access to the heritage collection of Indian dance videos.

66 citations


Cites background from "Using Concept Recognition to Annota..."

  • ...One of the key ingredients in our architecture is a cultural heritage ontology [Mallik and Chaudhury 2009] encoded in a novel multimedia ontology representation [Ghosh et al. 2007]....

Proceedings Article
03 Sep 2012
TL;DR: A computational model for representing BN dance steps is proposed, named the SMart system (SMart stands for System Modelled art), together with a detailed formulation of a dance position vector comprising thirty explicitly identified attributes.
Abstract: BharataNatyam (BN), like any other Indian classical dance, comprises a sequence of possible and legitimate dance steps. It is estimated that using the main body parts alone, namely head, neck, hand and leg, more than 5 lakh (500,000) dance steps can be generated for a single beat. Choreographers and even dancers usually repeat their favorite dance steps or the conventional casual dance steps taught by their teacher while performing for multiple beats. As a result, several valid and many other significant non-traditional dance steps remain unexplored. Hence, we propose an auto enumeration followed by auto classification of significant BN dance steps that can be used in dance performance and choreography. In short, we try to transform sheer art into a System Modelled art, i.e. 'Art to SMart'. The foremost and most challenging task is to have a computational model that represents different BN dance poses. In this paper, we propose a computational model to represent BN dance steps and present a detailed formulation of a dance position vector that comprises thirty explicitly identified attributes to capture and represent all variations of a BN dance step. We have named it the SMart system for modelling BN steps, where SMart stands for System Modelled art. We also demonstrate sample dance steps and their corresponding representations with appropriate dance step images.

14 citations


Cites background from "Using Concept Recognition to Annota..."

  • ...RELATED WORK The work carried out till date for dance can be broadly classified under the following main heads – Animation[17],[18], Ontology based construction of Indian Classical Dance repository[1][2], Motion capture[6][7][11][14][19], extracting dance’s semantic content [4][5][12][15]....

Posted Content
TL;DR: This paper is an attempt to review research work reported in the literature, categorize and group significant research work completed in a span of 1967–2020 in the field of automating dance, and identify six major categories corresponding to the use of computers in dance automation.
Abstract: Dance is an art and when technology meets this kind of art, it's a novel attempt in itself. Several researchers have attempted to automate several aspects of dance, right from dance notation to choreography. Furthermore, we have encountered several applications of dance automation like e-learning, heritage preservation, etc. Despite several attempts by researchers for more than two decades in various styles of dance all round the world, we found a review paper that portrays the research status in this area dating to 1990 [politis1990computers]. Hence, we decided to prepare a comprehensive review article that showcases several aspects of dance automation. This paper is an attempt to review research work reported in the literature, and to categorize and group all research work completed so far in the field of automating dance. We have explicitly identified six major categories corresponding to the use of computers in dance automation, namely dance representation, dance capturing, dance semantics, dance generation, dance processing approaches and applications of dance automation systems. We classified several research papers under these categories according to their research approach and functionality. With the help of the proposed categories and subcategories, one can easily determine the state of research and the new avenues left for exploration in the field of dance automation.

7 citations


Cites background or methods from "Using Concept Recognition to Annota..."

  • ...[34] offered a robust ground for several multimedia search, retrieval and browsing applications....

  • ...[34] using the power of MOWL has provided an effective video browsing interface to the user through a Bayesian Network for Indian Classical Dance such as BharataNatyam, Odissi, Kutchipudi and Kathak including music performances like Hindustani and Carnatic music....

Journal Article
TL;DR: The paper observes that many researchers have attempted to automate several aspects of dance, from dance notation to choreography, and describes the meeting of technology and this art as a novel attempt in itself.
Abstract: Dance is an art and when technology meets this kind of art, it is a novel attempt in itself. Many researchers have attempted to automate several aspects of dance, right from dance notation to chore...

5 citations

01 Jan 2015
TL;DR: A review of the ontology-based label extraction based on the input data, the utilized technique, and the type of utilized ontology is presented and the relative advantages and disadvantages of each category are determined.
Abstract: Ontology-based label extraction is extensively used to interpret the semantics found in image and video data. Particularly, ontology-based label extraction is one of the main steps in object class recognition, image annotation, and image disambiguation. These applications have important roles in the field of image analysis, and as such, a number of variations of the ontology-based label extraction used in these applications have been reported in the literature. These variations involve ontology development and utilization, and can affect the applicability (e.g., domain- and application-dependency) as well as the accuracy of the output. Unfortunately, the variability aspect of this variation has neither been established nor tracked. Thus, the variations were not configured. A review of the ontology-based label extraction based on the input data, the utilized technique, and the type of utilized ontology is presented in this paper. The ontology-based label extraction is categorized based on two aspects, namely, the type of input data and the type of ontology used. These two aspects determine the type of the label extraction technique to be used. As a result, the relative advantages and disadvantages of each category are determined. The gaps and future research directions in this field are also highlighted.

3 citations


Cites methods from "Using Concept Recognition to Annota..."

  • ...In label extraction, Mallik and Chaudhary [28] proposed an annotation method, which uses the MOWL specification to create a domain ontology....

References
Journal Article
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

46,906 citations

Proceedings Article
30 Jul 1999
TL;DR: Probabilistic Latent Semantic Analysis is based on a mixture decomposition derived from a latent class model, which gives it a solid foundation in statistics; to avoid overfitting, the model is fitted by tempered EM, a widely applicable generalization of maximum likelihood model fitting.
Abstract: Probabilistic Latent Semantic Analysis is a novel statistical technique for the analysis of two-mode and co-occurrence data, which has applications in information retrieval and filtering, natural language processing, machine learning from text, and in related areas. Compared to standard Latent Semantic Analysis which stems from linear algebra and performs a Singular Value Decomposition of co-occurrence tables, the proposed method is based on a mixture decomposition derived from a latent class model. This results in a more principled approach which has a solid foundation in statistics. In order to avoid overfitting, we propose a widely applicable generalization of maximum likelihood model fitting by tempered EM. Our approach yields substantial and consistent improvements over Latent Semantic Analysis in a number of experiments.

2,306 citations


"Using Concept Recognition to Annota..." refers methods in this paper

  • ...• The process automatically learns the probability distributions of the spatio-temporal words and intermediate topics for detecting action categories using pLSA technique [5]....
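As a rough illustration of the pLSA technique named in the excerpt above, the following is a minimal sketch of plain EM fitting on a small count matrix. It assumes that rows stand for video clips, columns for spatio-temporal words, and latent classes for intermediate topics; the function names, the toy counts, and the use of plain rather than tempered EM are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def normalize(row):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(row)
    if total == 0:
        return [1.0 / len(row)] * len(row)  # fall back to a uniform distribution
    return [x / total for x in row]


def log_likelihood(counts, p_w_z, p_z_d):
    """Log-likelihood of the counts under P(w|d) = sum_z P(w|z) P(z|d)."""
    ll = 0.0
    for d, row in enumerate(counts):
        for w, n in enumerate(row):
            if n:
                p = sum(p_w_z[z][w] * p_z_d[d][z] for z in range(len(p_w_z)))
                ll += n * math.log(p)
    return ll


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA with plain EM; returns P(w|z), P(z|d) and the likelihood trace."""
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random positive initialization of both conditional distributions.
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    history = []
    for _ in range(n_iter):
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior over topics for the pair (d, w).
                post = normalize([p_w_z[z][w] * p_z_d[d][z]
                                  for z in range(n_topics)])
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    new_w_z[z][w] += n * post[z]
                    new_z_d[d][z] += n * post[z]
        # M-step: renormalize the expected counts.
        p_w_z = [normalize(row) for row in new_w_z]
        p_z_d = [normalize(row) for row in new_z_d]
        history.append(log_likelihood(counts, p_w_z, p_z_d))
    return p_w_z, p_z_d, history


# Toy data (hypothetical): 4 clips, 4 spatio-temporal words.
counts = [[4, 3, 0, 0], [5, 2, 0, 1], [0, 0, 3, 4], [0, 1, 4, 3]]
p_w_z, p_z_d, history = plsa(counts, n_topics=2, n_iter=30, seed=1)
```

Each row of `p_z_d` can be read as a clip's mixture over topics, and `history` should never decrease, since plain EM does not lower the likelihood from one iteration to the next.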

Posted Content
TL;DR: Probabilistic Latent Semantic Analysis (PLSA) is a statistical technique for the analysis of two-mode and co-occurrence data, with applications in information retrieval and filtering, natural language processing, machine learning from text, and related areas.
Abstract: Probabilistic Latent Semantic Analysis is a novel statistical technique for the analysis of two-mode and co-occurrence data, which has applications in information retrieval and filtering, natural language processing, machine learning from text, and in related areas. Compared to standard Latent Semantic Analysis which stems from linear algebra and performs a Singular Value Decomposition of co-occurrence tables, the proposed method is based on a mixture decomposition derived from a latent class model. This results in a more principled approach which has a solid foundation in statistics. In order to avoid overfitting, we propose a widely applicable generalization of maximum likelihood model fitting by tempered EM. Our approach yields substantial and consistent improvements over Latent Semantic Analysis in a number of experiments.

2,233 citations

Journal Article
TL;DR: This work systematically studies the problem of event recognition in unconstrained news video sequences by adopting a discriminative kernel-based method, in which video clip similarity plays an important role, and develops temporally aligned pyramid matching (TAPM) for measuring video similarity.
Abstract: In this work, we systematically study the problem of event recognition in unconstrained news video sequences. We adopt the discriminative kernel-based method for which video clip similarity plays an important role. First, we represent a video clip as a bag of orderless descriptors extracted from all of the constituent frames and apply the earth mover's distance (EMD) to integrate similarities among frames from two clips. Observing that a video clip is usually comprised of multiple subclips corresponding to event evolution over time, we further build a multilevel temporal pyramid. At each pyramid level, we integrate the information from different subclips with Integer-value-constrained EMD to explicitly align the subclips. By fusing the information from the different pyramid levels, we develop temporally aligned pyramid matching (TAPM) for measuring video similarity. We conduct comprehensive experiments on the TRECVID 2005 corpus, which contains more than 6,800 clips. Our experiments demonstrate that (1) the TAPM multilevel method clearly outperforms single-level EMD (SLEMD) and (2) SLEMD outperforms keyframe and multiframe-based detection methods by a large margin. In addition, we conduct in-depth investigation of various aspects of the proposed techniques such as weight selection in SLEMD, sensitivity to temporal clustering, the effect of temporal alignment, and possible approaches for speedup. Extensive analysis of the results also reveals intuitive interpretation of video event recognition through video subclip alignment at different levels.

141 citations


"Using Concept Recognition to Annota..." refers background in this paper

  • ...In [2], the authors have systematically studied the problem of event recognition in unconstrained news video sequences, by adopting the discriminative kernel-based method....

Book Chapter
01 Jan 2007
TL;DR: A new Bayesian Network based probabilistic reasoning framework with M-OWL for semantic interpretation of multimedia data and a new model for ontology integration, based on the similarity of the concepts in the media domain are proposed.
Abstract: An ontology designed for multimedia applications should enable integration of the conceptual and media spaces. We present M-OWL, a new ontology language, that supports this capability. M-OWL supports explicit definition of media properties for the concepts. The language has been defined as an extension of OWL, the standard ontology language for the web. We have proposed a new Bayesian Network based probabilistic reasoning framework with M-OWL for semantic interpretation of multimedia data. We have also proposed a new model for ontology integration, based on the similarity of the concepts in the media domain. It can be used to integrate several multimedia and traditional ontologies.

33 citations


"Using Concept Recognition to Annota..." refers methods in this paper

  • ...MOWL supports probabilistic reasoning with Bayesian Networks in contrast to crisp Description logic based reasoning with traditional ontology languages [4]....
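To make the contrast with crisp reasoning concrete, here is a minimal sketch of probabilistic inference in the simplest possible Bayesian network: one concept node whose children are observable media features, assumed conditionally independent given the concept. The feature names and all probabilities are hypothetical, and this is not MOWL syntax or the authors' reasoning engine; it only shows how uncertain feature detections can raise or lower belief in a concept instead of giving a yes/no answer.

```python
from math import prod


def concept_posterior(prior, cpts, observed):
    """Posterior P(concept present | observed media features).

    prior    -- P(concept present) before looking at the media
    cpts     -- {feature: (P(seen | concept), P(seen | no concept))}
    observed -- {feature: True/False} outputs of media feature classifiers
    """
    # Likelihood of the observations if the concept is present.
    like_present = prod(
        cpts[f][0] if seen else 1.0 - cpts[f][0] for f, seen in observed.items()
    )
    # Likelihood of the same observations if the concept is absent.
    like_absent = prod(
        cpts[f][1] if seen else 1.0 - cpts[f][1] for f, seen in observed.items()
    )
    joint_present = prior * like_present
    joint_absent = (1.0 - prior) * like_absent
    # Bayes' rule: normalize over the two states of the concept node.
    return joint_present / (joint_present + joint_absent)


# Hypothetical conditional probability tables, for illustration only.
cpts = {
    "costume_detected": (0.9, 0.2),
    "rhythm_detected": (0.8, 0.3),
}
both_seen = concept_posterior(
    0.1, cpts, {"costume_detected": True, "rhythm_detected": True}
)
one_seen = concept_posterior(
    0.1, cpts, {"costume_detected": True, "rhythm_detected": False}
)
```

With these made-up numbers, observing both features lifts the belief in the concept well above its prior of 0.1, while a missing feature pulls it back down, which is the graded behaviour that a crisp Description Logic reasoner cannot express.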
