Author

S. Vakkalanka

Bio: S. Vakkalanka is an academic researcher from Indian Institute of Technology Madras. The author has contributed to research in the topics Random subspace method and Smacker video, has an h-index of 2, and has co-authored 2 publications receiving 34 citations.

Papers
Proceedings ArticleDOI
14 Nov 2005
TL;DR: The efficacy of the performance-based fusion method is demonstrated by applying it to classification of short video clips into six popular TV broadcast genres, namely cartoon, commercial, news, cricket, football, and tennis.
Abstract: In this paper, we investigate the problem of video classification into predefined genres, by combining the evidence from multiple classifiers. It is well known in the pattern recognition community that the accuracy of classification obtained by combining decisions made by independent classifiers can be substantially higher than the accuracy of the individual classifiers. The conventional method for combining individual classifiers weighs each classifier equally (sum or vote rule fusion). In this paper, we study a method that estimates the performances of the individual classifiers and combines them by weighing each classifier according to its estimated performance. We demonstrate the efficacy of the performance-based fusion method by applying it to classification of short video clips (20 seconds) into six popular TV broadcast genres, namely cartoon, commercial, news, cricket, football, and tennis. The individual classifiers are trained using different spatial and temporal features derived from the video sequences, and two different classifier methodologies, namely hidden Markov models (HMMs) and support vector machines (SVMs). The experiments were carried out on more than 3 hours of video data. A classification rate of 93.12% for all six classes and 97.14% for the sports category alone has been achieved, which is significantly higher than the performance of the individual classifiers.

44 citations
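The fusion idea in the abstract above can be illustrated with a minimal sketch: each classifier's class-score vector is weighted by that classifier's estimated accuracy (e.g. measured on a held-out validation set) before summing, instead of the equal-weight sum rule. The function name, the scores, and the accuracies below are hypothetical illustrations, not the paper's actual data or code.

```python
import numpy as np

def performance_weighted_fusion(scores, accuracies):
    """Combine per-classifier class scores, weighting each classifier
    by its estimated accuracy (a simplified sketch of performance-based
    fusion; the paper's exact weighting scheme may differ)."""
    scores = np.asarray(scores, dtype=float)      # shape: (n_classifiers, n_classes)
    weights = np.asarray(accuracies, dtype=float)
    weights = weights / weights.sum()             # normalise to sum to 1
    fused = weights @ scores                      # weighted sum rule
    return int(np.argmax(fused))                  # predicted class index

# Two hypothetical classifiers scoring three genres; the second,
# more accurate classifier dominates the fused decision.
scores = [[0.6, 0.3, 0.1],
          [0.1, 0.2, 0.7]]
print(performance_weighted_fusion(scores, [0.55, 0.90]))  # → 2
```

With equal accuracies this reduces to the conventional sum rule; the benefit appears when classifier reliabilities differ, since a weak classifier can no longer outvote a strong one.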

Proceedings ArticleDOI
14 Nov 2005
TL;DR: This paper presents a system for automatic news video indexing, browsing and retrieval that employs visual clues to effectively parse the news video into story units and to generate a visual-table-of-contents and indexes for supporting browsing and retrieval.
Abstract: This paper presents a system for automatic news video indexing, browsing and retrieval. The system employs visual clues to effectively parse the news video into story units, to generate visual-table-of-contents and indexes for supporting browsing and retrieval. The efficiency of the system has been tested on several video sequences.

4 citations


Cited by
Journal ArticleDOI
TL;DR: A comprehensive review of the state of the art in video browsing and retrieval systems, with special emphasis on interfaces and applications; more than 40 interfaces are reviewed and classified into three groups: applications that use video-player-like interaction, video retrieval applications, and browsing solutions based on video surrogates.
Abstract: We present a comprehensive review of the state of the art in video browsing and retrieval systems, with special emphasis on interfaces and applications. There has been a significant increase in activity (e.g., storage, retrieval, and sharing) employing video data in the past decade, both for personal and professional use. The ever-growing amount of video content available for human consumption and the inherent characteristics of video data (which, if presented in its raw format, is rather unwieldy and costly) have become driving forces for the development of more effective solutions to present video contents and allow rich user interaction. As a result, there are many contemporary research efforts toward developing better video browsing solutions, which we summarize. We review more than 40 different video browsing and retrieval interfaces and classify them into three groups: applications that use video-player-like interaction, video retrieval applications, and browsing solutions based on video surrogates. For each category, we present a summary of existing work, highlight the technical aspects of each solution, and compare them against each other.

84 citations

Journal ArticleDOI
TL;DR: This article proposes a methodology for classifying the genre of television programmes: features from four informative sources are used to train a parallel neural network system able to distinguish between seven video genres, reaching a classification accuracy of 95%.
Abstract: Improvements in digital technology have made possible the production and distribution of huge quantities of digital multimedia data. Tools for high-level multimedia documentation are becoming indispensable to efficiently access and retrieve desired content from such data. In this context, automatic genre classification provides a simple and effective solution to describe multimedia contents in a structured and well understandable way. We propose in this article a methodology for classifying the genre of television programmes. Features are extracted from four informative sources, which include visual-perceptual information (colour, texture and motion), structural information (shot length, shot distribution, shot rhythm, shot clusters duration and saturation), cognitive information (face properties, such as number, positions and dimensions) and aural information (transcribed text, sound characteristics). These features are used for training a parallel neural network system able to distinguish between seven video genres: football, cartoons, music, weather forecast, newscast, talk show and commercials. Experiments conducted on more than 100 h of audiovisual material confirm the effectiveness of the proposed method, which reaches a classification accuracy rate of 95%.

60 citations

Journal ArticleDOI
TL;DR: The objective of video data mining, one of the core problem areas of the data-mining research community, is to discover and describe interesting patterns in huge amounts of video data.
Abstract: Data mining is a process of extracting previously unknown knowledge and detecting interesting patterns from a massive set of data. Thanks to the extensive use of information technology and the recent developments in multimedia systems, the amount of multimedia data available to users has increased exponentially. Video is an example of multimedia data, as it contains several kinds of data such as text, images, metadata, and visual and audio streams. It is widely used in many major applications like security and surveillance, entertainment, medicine, education programs and sports. The objective of video data mining is to discover and describe interesting patterns from the huge amount of video data, and it is one of the core problem areas of the data-mining research community. Compared to the mining of other types of data, video data mining is still in its infancy, and many challenging research problems remain. Beginning with an overview of the video data-mining literature, this paper concludes with the applications of video mining.

51 citations

Journal ArticleDOI
TL;DR: This work extracts domain knowledge about sport events recorded by multiple users by classifying the sport type into soccer, American football, basketball, tennis, ice-hockey, or volleyball, using a multi-user and multimodal approach.
Abstract: The recent proliferation of mobile video content has emphasized the need for applications such as automatic organization and automatic editing of videos. These applications could greatly benefit from domain knowledge about the content. However, extracting semantic information from mobile videos is a challenging task, due to their unconstrained nature. We extract domain knowledge about sport events recorded by multiple users, by classifying the sport type into soccer, American football, basketball, tennis, ice-hockey, or volleyball. We adopt a multi-user and multimodal approach, where each user simultaneously captures audio-visual content and auxiliary sensor data (from magnetometers and accelerometers). Firstly, each modality is separately analyzed; then, analysis results are fused for obtaining the sport type. The auxiliary sensor data is used for extracting more discriminative spatio-temporal visual features and efficient camera motion features. The contribution of each modality to the fusion process is adapted according to the quality of the input data. We performed extensive experiments on data collected at public sport events, showing the merits of using different combinations of modalities and fusion methods. The results indicate that analyzing multimodal and multi-user data, coupled with adaptive fusion, improves classification accuracies in most tested cases, up to 95.45%.

31 citations
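The adaptive fusion described above can be sketched as a late-fusion step in which each modality's class scores are weighted by an estimate of that modality's input quality, so degraded modalities contribute less. This is a simplified illustration under that assumption; the modality names, score values, and quality estimates below are hypothetical, and the paper's actual adaptation scheme may be more involved.

```python
import numpy as np

SPORTS = ["soccer", "american_football", "basketball",
          "tennis", "ice_hockey", "volleyball"]

def adaptive_late_fusion(modality_scores, quality):
    """Late fusion over modalities (audio, visual, sensor, ...):
    each modality's class-score vector is scaled by a per-modality
    quality estimate before summing, so unreliable inputs are
    down-weighted. A sketch, not the paper's exact method."""
    fused = np.zeros(len(SPORTS))
    total = sum(quality.values())
    for name, scores in modality_scores.items():
        fused += (quality[name] / total) * np.asarray(scores, dtype=float)
    return SPORTS[int(np.argmax(fused))]

# Hypothetical per-modality scores for one recorded event:
scores = {
    "audio":  [0.1, 0.1, 0.5, 0.1, 0.1, 0.1],
    "visual": [0.6, 0.1, 0.1, 0.1, 0.05, 0.05],
    "sensor": [0.4, 0.1, 0.2, 0.1, 0.1, 0.1],
}
quality = {"audio": 0.3, "visual": 0.9, "sensor": 0.6}  # visual judged most reliable
print(adaptive_late_fusion(scores, quality))  # → soccer
```

If the quality estimates instead favoured the audio channel, the same scores could yield a different sport, which is exactly the adaptivity the abstract describes.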

Proceedings ArticleDOI
13 Dec 2007
TL;DR: This paper inspects the problem of automatic video classification using static and dynamic features, with a hidden Markov model (HMM) as the classifier, and demonstrates the efficiency of the system by applying it to a broad range of video data.
Abstract: Automatic classification of video content is receiving increased attention in multimedia information processing. This paper inspects the problem of automatic video classification using static and dynamic features. Five genres, namely cartoon, sports, commercials, news and TV serial, are studied for assessment. The approach exploits edge information and color histograms as static features and motion information as the dynamic feature, with a hidden Markov model (HMM) as the classifier. An individual HMM is constructed for each feature, and the per-feature results are combined to decide the output genre. The efficiency of the system is demonstrated on a broad range of video data: 3 hours of video are used for training and a further 1 hour as the test set. An overall classification accuracy of 95.6% is accomplished.

25 citations
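The per-feature HMM combination described in the abstract above can be sketched as follows: each feature stream (edge, color histogram, motion) has its own bank of genre HMMs, and, assuming the feature streams are conditionally independent given the genre, the per-feature log-likelihoods add. The genre names match the abstract; the log-likelihood values and function name below are hypothetical placeholders for scores that real trained HMMs would produce.

```python
import math

GENRES = ["cartoon", "sports", "commercial", "news", "tv_serial"]
FEATURES = ["edge", "color_hist", "motion"]

def classify(loglik):
    """loglik[feature][genre] holds log P(observations | that genre's
    HMM for that feature). Summing over features and taking the argmax
    gives the combined genre decision (a sketch of one plausible
    combination rule, not necessarily the paper's exact one)."""
    best_genre, best_score = None, -math.inf
    for g in GENRES:
        score = sum(loglik[f][g] for f in FEATURES)
        if score > best_score:
            best_genre, best_score = g, score
    return best_genre

# Hypothetical log-likelihoods from three feature-specific HMM banks:
loglik = {
    "edge":       {"cartoon": -120.0, "sports": -140.0, "commercial": -150.0,
                   "news": -135.0, "tv_serial": -145.0},
    "color_hist": {"cartoon": -90.0, "sports": -130.0, "commercial": -110.0,
                   "news": -125.0, "tv_serial": -120.0},
    "motion":     {"cartoon": -100.0, "sports": -95.0, "commercial": -105.0,
                   "news": -110.0, "tv_serial": -108.0},
}
print(classify(loglik))  # → cartoon (total -310, the highest combined score)
```

Working in log space keeps the products of many small emission probabilities numerically stable, which is why HMM scores are conventionally combined by addition rather than multiplication.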