Proceedings ArticleDOI

Ontology driven contextual tagging of multimedia data

TL;DR: This paper proposes a model for tagging multimedia data on the basis of contextual meaning; the approach has practical applicability in that whenever a new video is uploaded to a media-sharing site, context and content information is attached to the video automatically.
Abstract: Tagging plays a significant role in exhibiting multi-modal information and in helping people find multimedia resources. Public events such as protests and demonstrations are often consequences of an outbreak of public outrage resulting from prolonged exploitation and harassment. This outrage can be seen in news footage, blogs, text news and other web data, so aggregating this variety of data from heterogeneous sources is a prerequisite step for tagging multimedia data with appropriate content. Since content has no meaning without a context, a video should be tagged with its relevant context and content information to assist users in multimedia retrieval. This paper proposes a model for tagging multimedia data on the basis of contextual meaning. Since context is knowledge-based, it has to be guided and learned through an ontology, which helps fragmented information to be represented in a more meaningful way. Our tagging approach is novel and has practical applicability in that whenever a new video is uploaded to a media-sharing site, the context and content information is attached to the video automatically, providing relatively complete information associated with it.
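
As a rough illustration of the ontology-guided tagging the abstract describes, the Python sketch below attaches content and context tags to a video resource as RDF triples using rdflib. The namespace, property names, and tag values are hypothetical stand-ins; the paper's actual ontology is not reproduced here.

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF

    # Hypothetical tagging vocabulary; the paper's ontology is not shown here.
    EX = Namespace("http://example.org/tagging#")

    def tag_video(graph, video_uri, content_tags, context_tags):
        """Attach content tags (what is in the video) and context tags
        (aggregated from news, blogs and other web data) to a video resource."""
        video = URIRef(video_uri)
        graph.add((video, RDF.type, EX.Video))
        for tag in content_tags:
            graph.add((video, EX.hasContentTag, Literal(tag)))
        for tag in context_tags:
            graph.add((video, EX.hasContextTag, Literal(tag)))
        return graph

    g = tag_video(Graph(), "http://example.org/videos/42",
                  content_tags=["crowd", "banners"],
                  context_tags=["protest", "fuel price hike"])
    print(g.serialize(format="turtle"))

In a full system, the context tags would come from ontology-guided aggregation over the heterogeneous sources the abstract lists, rather than being passed in by hand.
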
Citations
Patent
06 Sep 2016
TL;DR: In this article, the authors present methods, program products, and systems for filtering content returned by a search tool by associating an indication that content fulfills a first request with that request, with the fulfilling content, and with other metadata tied to the indication.
Abstract: Embodiments of the present invention provide methods, program products, and systems to filter content returned by a search tool by associating an indication that content fulfills a first request with the first request, the content that fulfills that request for information, and other metadata associated with the indication. Embodiments can then add the associated first request, content, and metadata to a database. In response to receiving a second request that is related to the first, embodiments identify the previously added content and any additional content that fulfills the second request, compile a list of both, and reorder the compiled list based, at least in part, on metadata associated with the added content, the additional content identified from the database, and metadata stored in a customizable user profile.
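
The following toy, in-memory Python sketch makes the claimed workflow concrete; every class, method, and field name here is illustrative, not taken from the patent.

    from dataclasses import dataclass

    @dataclass
    class Record:
        request: str
        content: str
        metadata: dict

    class ContentFilter:
        def __init__(self, user_profile=None):
            self.db = []                      # stands in for the patent's database
            self.profile = user_profile or {}

        def record_fulfilment(self, request, content, metadata):
            # Associate the first request, the fulfilling content, and its
            # metadata, then add the association to the database.
            self.db.append(Record(request, content, metadata))

        def search(self, second_request, additional_content=()):
            # Identify previously added content related to the new request
            # (crude word-overlap relatedness, purely for illustration) ...
            words = set(second_request.split())
            related = [r for r in self.db if words & set(r.request.split())]
            results = [(r.content, r.metadata) for r in related]
            results += [(c, {}) for c in additional_content]
            # ... then reorder the compiled list using stored metadata plus
            # the customizable user profile.
            preferred = self.profile.get("preferred_source")
            results.sort(key=lambda cm: cm[1].get("source") == preferred,
                         reverse=True)
            return [content for content, _ in results]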

4 citations

Proceedings ArticleDOI
22 Mar 2017
TL;DR: A semantic web-based framework is proposed for automatic topic identification from video tutorials, identifying the concepts in an e-learning resource and their associated, semantically relevant resources to help learners study in a more focused way.
Abstract: As more and more learners opt for online learning, the e-learning industry is working to improve the learning experience of online users by providing relevant content and plenty of additional references. Since online learners mostly prefer video tutorials, identifying the major topics and subtopics covered in a video tutorial is a big challenge. Recently, the semantic web has received a lot of attention for efficient knowledge sharing and interoperability over the web. In this paper, we propose a semantic web-based framework for automatic topic identification from video tutorials in order to identify the concepts and their associated, semantically relevant resources. Our framework identifies relevant topics using disambiguation in the e-learning resource, which helps learners in more focused study.
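
The abstract does not detail the disambiguation step, but a minimal, self-contained sketch conveys the idea: a Lesk-style overlap heuristic over a tiny hand-made sense inventory, which a real framework would replace with semantic-web lookups against an ontology or a knowledge base such as DBpedia.

    # Hypothetical sense inventory; a real framework would query an ontology.
    SENSES = {
        "java": {
            "Java (programming language)": {"programming", "class", "jvm", "compile"},
            "Java (island)": {"indonesia", "island", "coffee", "sea"},
        },
    }

    def disambiguate(term, transcript_words):
        """Pick the sense whose description overlaps the transcript most."""
        context = set(transcript_words)
        scores = {sense: len(words & context)
                  for sense, words in SENSES[term].items()}
        return max(scores, key=scores.get)

    transcript = "this tutorial covers java class design and the jvm".split()
    print(disambiguate("java", transcript))  # -> "Java (programming language)"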

2 citations


Cites background from "Ontology driven contextual tagging ..."

  • ...Through their research, Ali Pesaranghader, Norwati Mustapha, and Ahmad Pesaranghader [23] explained how semantic similarity, relatedness and sense of topic are important for topic-specific web crawling....

    [...]

  • ...[23] Nisha Pahal, Santanu Chaudhury, Brejesh Lall, “Ontology driven contextual tagging of multimedia data”....

    [...]

  • ...Nisha Pahal, Santanu Chaudhury, Brejesh Lall [23] proposed a model for tagging of multimedia data on the basis of contextual meaning....

    [...]

References
BookDOI
01 Jan 1983
Research and Development in Information Retrieval (no abstract available).

1,298 citations

Proceedings ArticleDOI
27 Jun 2004
TL;DR: This work shows how to do both automatic image annotation and retrieval (using one-word queries) from images and videos with a multiple Bernoulli relevance model, which significantly outperforms previously reported results on the task of image and video annotation.
Abstract: Retrieving images in response to textual queries requires some knowledge of the semantics of the picture. Here, we show how we can do both automatic image annotation and retrieval (using one-word queries) from images and videos using a multiple Bernoulli relevance model. The model assumes that a training set of images or videos along with keyword annotations is provided. Multiple keywords are provided for an image, and the specific correspondence between a keyword and an image is not provided. Each image is partitioned into a set of rectangular regions, and a real-valued feature vector is computed over these regions. The relevance model is a joint probability distribution of the word annotations and the image feature vectors and is computed using the training set. The word probabilities are estimated using a multiple Bernoulli model and the image feature probabilities using a non-parametric kernel density estimate. The model is then used to annotate images in a test set. We show experiments on both images from a standard Corel data set and a set of video key frames from NIST's Video TRECVID. Comparative experiments show that the model performs better than a model based on estimating word probabilities using the popular multinomial distribution. The results also show that our model significantly outperforms previously reported results on the task of image and video annotation.
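
In notation of my own (reconstructed from the abstract's description, not copied from the paper), the joint estimate has roughly this form: for a test item with region features f_1, ..., f_n and a candidate annotation set w, the relevance model mixes over items J in the training set T,

    P(w, f_1, ..., f_n) = \sum_{J \in T} P(J) \prod_{i=1}^{n} P(f_i | J)
                          \prod_{v \in w} P(v | J) \prod_{v \notin w} (1 - P(v | J)),

where each P(v | J) is a (smoothed) Bernoulli estimate of whether word v annotates J and each P(f_i | J) is a non-parametric kernel density over J's region feature vectors. Annotation then amounts to ranking words by the probability the model assigns them given the test image's features.
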

815 citations

Journal ArticleDOI
TL;DR: The large-scale concept ontology for multimedia (LSCOM) is the first of its kind designed to simultaneously optimize utility to facilitate end-user access, cover a large semantic space, make automated extraction feasible, and increase observability in diverse broadcast news video data sets.
Abstract: As increasingly powerful techniques emerge for machine tagging multimedia content, it becomes ever more important to standardize the underlying vocabularies. Doing so provides interoperability and lets the multimedia community focus ongoing research on a well-defined set of semantics. This paper describes a collaborative effort of multimedia researchers, library scientists, and end users to develop a large standardized taxonomy for describing broadcast news video. The large-scale concept ontology for multimedia (LSCOM) is the first of its kind designed to simultaneously optimize utility to facilitate end-user access, cover a large semantic space, make automated extraction feasible, and increase observability in diverse broadcast news video data sets.

644 citations


"Ontology driven contextual tagging ..." refers methods in this paper

  • ...Naphade and Smith [6] provided a survey on the video tagging algorithms applied to the TRECVID high-level feature extraction task....

    [...]

  • ...[6] Milind Naphade, John R. Smith, Jelena Tesic, Shih-Fu Chang, Winston Hsu, Lyndon Kennedy, Alexander Hauptmann, and Jon Curtis, “Large-scale concept ontology for multimedia,” IEEE MultiMedia, vol. 13, no. 3, pp. 86–91, 2006....

    [...]

  • ...REFERENCES [1] Ching-Yung Lin, Belle L. Tseng, and John R. Smith, “VideoAnnEx: IBM MPEG-7 annotation tool for multimedia indexing and concept learning,” in IEEE International Conference on Multimedia and Expo, 2003....

    [...]

Proceedings ArticleDOI
29 Sep 2007
TL;DR: A third paradigm is proposed that simultaneously classifies concepts and models the correlations between them in a single step using a novel Correlative Multi-Label (CML) framework; it is compared with state-of-the-art approaches from the first and second paradigms on the widely used TRECVID data set.
Abstract: Automatically annotating concepts for video is a key to semantic-level video browsing, search and navigation. The research on this topic evolved through two paradigms. The first paradigm used binary classification to detect each individual concept in a concept set. It achieved only limited success, as it did not model the inherent correlation between concepts, e.g., urban and building. The second paradigm added a second step on top of the individual concept detectors to fuse multiple concepts. However, its performance varies because the errors incurred in the first detection step can propagate to the second fusion step and therefore degrade the overall performance. To address the above issues, we propose a third paradigm which simultaneously classifies concepts and models correlations between them in a single step by using a novel Correlative Multi-Label (CML) framework. We compare the performance between our proposed approach and the state-of-the-art approaches in the first and second paradigms on the widely used TRECVID data set. We report superior performance from the proposed approach.
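
A toy analogue of the third paradigm's single-step idea is to score every label vector jointly, with pairwise terms rewarding correlated concepts (e.g., urban and building). The weights below are fixed by hand for illustration; in the CML framework they would be learned from data.

    import itertools
    import numpy as np

    def joint_predict(x, w_unary, w_pair):
        """Return the label vector y in {0,1}^K maximizing the joint score
        sum_k y_k (w_unary[k] . x) + sum_{k<l} w_pair[k, l] y_k y_l."""
        K = w_unary.shape[0]
        best_y, best_score = None, -np.inf
        for bits in itertools.product([0, 1], repeat=K):  # brute force: fine for small K
            y = np.array(bits)
            score = y @ (w_unary @ x) + 0.5 * y @ w_pair @ y
            if score > best_score:
                best_y, best_score = y, score
        return best_y

    rng = np.random.default_rng(0)
    x = rng.normal(size=5)                       # toy feature vector
    w_unary = rng.normal(size=(3, 5))            # per-concept weights
    w_pair = np.array([[0.0, 2.0, -1.0],         # symmetric, zero diagonal;
                       [2.0, 0.0,  0.0],         # concepts 0 and 1 correlate
                       [-1.0, 0.0, 0.0]])
    print(joint_predict(x, w_unary, w_pair))

Because w_pair is symmetric with a zero diagonal, the 0.5 factor counts each concept pair exactly once.
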

490 citations

Journal ArticleDOI
TL;DR: This work proposes a novel neighborhood similarity measure, which explores the local sample and label distributions and shows that the neighborhood similarity between two samples simultaneously takes into account three characteristics: their distance, the distribution difference of the surrounding samples, and the distribution difference of the surrounding labels.
Abstract: In the past few years, video annotation has benefited a lot from the progress of machine learning techniques. Recently, graph-based semi-supervised learning has gained much attention in this domain. However, as a crucial factor of these algorithms, the estimation of pairwise similarity has not been sufficiently studied. Generally, the similarity of two samples is estimated based on the Euclidean distance between them. But we will show that the similarity between two samples is not merely related to their distance but also related to the distribution of surrounding samples and labels. It is shown that the traditional distance-based similarity measure may lead to high classification error rates even on several simple datasets. To address this issue, we propose a novel neighborhood similarity measure, which explores the local sample and label distributions. We show that the neighborhood similarity between two samples simultaneously takes into account three characteristics: 1) their distance; 2) the distribution difference of the surrounding samples; and 3) the distribution difference of surrounding labels. Extensive experiments have demonstrated the superiority of neighborhood similarity over the existing distance-based similarity.
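
A rough numpy sketch of a similarity combining the three listed characteristics is below; the paper's actual measure differs in its details, and the equal weighting and Gaussian form here are arbitrary choices for illustration.

    import numpy as np

    def knn_indices(X, i, k):
        """Indices of the k nearest neighbors of sample i (excluding i)."""
        d = np.linalg.norm(X - X[i], axis=1)
        return np.argsort(d)[1:k + 1]

    def neighborhood_similarity(X, labels, i, j, k=3, sigma=1.0):
        dist = np.linalg.norm(X[i] - X[j])                          # 1) their distance
        ni, nj = knn_indices(X, i, k), knn_indices(X, j, k)
        sample_gap = np.linalg.norm(X[ni].mean(0) - X[nj].mean(0))  # 2) surrounding samples
        classes = np.unique(labels)
        hi = np.array([(labels[ni] == c).mean() for c in classes])
        hj = np.array([(labels[nj] == c).mean() for c in classes])
        label_gap = np.abs(hi - hj).sum()                           # 3) surrounding labels
        return np.exp(-(dist + sample_gap + label_gap) / sigma ** 2)

    X = np.array([[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [5, 5.1]], float)
    labels = np.array([0, 0, 0, 1, 1, 1])
    print(neighborhood_similarity(X, labels, 0, 1))   # near, same neighborhood: high
    print(neighborhood_similarity(X, labels, 0, 3))   # far, different neighborhood: low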

284 citations


Additional excerpts

  • ...[4] provided an efficient video annotation scheme, in which large-scale unlabeled data, multiple modalities, multiple distance functions, and video temporal consistency could be simultaneously tackled in a unified manner....

    [...]