Proceedings ArticleDOI

Combining multiple evidence for video classification

TL;DR: The efficacy of the performance-based fusion method is demonstrated by applying it to classification of short video clips into six popular TV broadcast genres, namely cartoon, commercial, news, cricket, football, and tennis.
Abstract: In this paper, we investigate the problem of video classification into predefined genres by combining the evidence from multiple classifiers. It is well known in the pattern recognition community that the accuracy of classification obtained by combining decisions made by independent classifiers can be substantially higher than the accuracy of the individual classifiers. The conventional method for combining individual classifiers weighs each classifier equally (sum or vote rule fusion). In this paper, we study a method that estimates the performances of the individual classifiers and combines the individual classifiers by weighting them according to their estimated performance. We demonstrate the efficacy of the performance-based fusion method by applying it to classification of short video clips (20 seconds) into six popular TV broadcast genres, namely cartoon, commercial, news, cricket, football, and tennis. The individual classifiers are trained using different spatial and temporal features derived from the video sequences, and two different classifier methodologies, namely hidden Markov models (HMMs) and support vector machines (SVMs). The experiments were carried out on more than 3 hours of video data. A classification rate of 93.12% for all the six classes and 97.14% for the sports category alone has been achieved, which is significantly higher than the performance of the individual classifiers.
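As a concrete illustration of the performance-based fusion idea, here is a minimal Python sketch (hypothetical names; the paper's exact weighting formula is not reproduced) that fuses per-classifier class scores by a weighted sum, with each weight proportional to that classifier's estimated accuracy on held-out data. Setting all weights equal recovers the conventional sum-rule fusion mentioned above.

import numpy as np

GENRES = ["cartoon", "commercial", "news", "cricket", "football", "tennis"]

def performance_weighted_fusion(class_scores, estimated_accuracies):
    # class_scores: (n_classifiers, n_classes) array of per-classifier scores (e.g. posteriors)
    # estimated_accuracies: (n_classifiers,) estimated accuracy of each classifier
    scores = np.asarray(class_scores, dtype=float)
    weights = np.asarray(estimated_accuracies, dtype=float)
    weights = weights / weights.sum()      # normalise weights to sum to 1
    fused = weights @ scores               # performance-weighted sum of score vectors
    return GENRES[int(np.argmax(fused))]

# Equal weights reduce this to the usual unweighted sum rule:
# performance_weighted_fusion(scores, np.ones(len(scores)))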
Citations
Journal ArticleDOI
TL;DR: This article proposes a methodology for classifying the genre of television programmes, in which features from four informative sources are used to train a parallel neural network system able to distinguish between seven video genres, reaching a classification accuracy rate of 95%.
Abstract: Improvements in digital technology have made possible the production and distribution of huge quantities of digital multimedia data. Tools for high-level multimedia documentation are becoming indispensable to efficiently access and retrieve desired content from such data. In this context, automatic genre classification provides a simple and effective solution to describe multimedia contents in a structured and well understandable way. We propose in this article a methodology for classifying the genre of television programmes. Features are extracted from four informative sources, which include visual-perceptual information (colour, texture and motion), structural information (shot length, shot distribution, shot rhythm, shot clusters duration and saturation), cognitive information (face properties, such as number, positions and dimensions) and aural information (transcribed text, sound characteristics). These features are used for training a parallel neural network system able to distinguish between seven video genres: football, cartoons, music, weather forecast, newscast, talk show and commercials. Experiments conducted on more than 100 h of audiovisual material confirm the effectiveness of the proposed method, which reaches a classification accuracy rate of 95%.
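As a rough sketch of how the structural stream mentioned above could be derived, the Python fragment below (hypothetical; not taken from the article) computes simple shot-based statistics such as shot lengths and a crude shot-rhythm proxy from a list of shot-boundary timestamps.

import numpy as np

def structural_features(shot_boundaries_sec, clip_duration_sec):
    # shot_boundaries_sec: sorted timestamps (seconds) at which a new shot starts
    bounds = np.asarray([0.0] + list(shot_boundaries_sec) + [clip_duration_sec])
    shot_lengths = np.diff(bounds)                        # duration of every shot
    return {
        "num_shots": len(shot_lengths),
        "mean_shot_length": float(shot_lengths.mean()),
        "std_shot_length": float(shot_lengths.std()),     # crude shot-rhythm proxy
        "shots_per_minute": 60.0 * len(shot_lengths) / clip_duration_sec,
    }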

60 citations

Journal ArticleDOI
TL;DR: The objective of video data mining is to discover and describe interesting patterns from the huge amount of video data, as it is one of the core problem areas of the data-mining research community.
Abstract: Data mining is a process of extracting previously unknown knowledge and detecting interesting patterns from a massive set of data. Thanks to the extensive use of information technology and the recent developments in multimedia systems, the amount of multimedia data available to users has increased exponentially. Video is an example of multimedia data, as it contains several kinds of data such as text, image, meta-data, visual and audio. It is widely used in many major potential applications like security and surveillance, entertainment, medicine, education programs and sports. The objective of video data mining is to discover and describe interesting patterns from the huge amount of video data, as it is one of the core problem areas of the data-mining research community. Compared to the mining of other types of data, video data mining is still in its infancy. Many challenging research problems remain in video mining. Beginning with an overview of the video data-mining literature, this paper concludes with the applications of video mining.

51 citations


Cites background from "Combining multiple evidence for video classification"

  • ...A feature is defined as a descriptive parameter extracted from an image or a video stream [92]....


Journal ArticleDOI
TL;DR: This work extracts domain knowledge about sport events recorded by multiple users by classifying the sport type into soccer, American football, basketball, tennis, ice-hockey, or volleyball, using a multi-user and multimodal approach.
Abstract: The recent proliferation of mobile video content has emphasized the need for applications such as automatic organization and automatic editing of videos. These applications could greatly benefit from domain knowledge about the content. However, extracting semantic information from mobile videos is a challenging task, due to their unconstrained nature. We extract domain knowledge about sport events recorded by multiple users, by classifying the sport type into soccer, American football, basketball, tennis, ice-hockey, or volleyball. We adopt a multi-user and multimodal approach, where each user simultaneously captures audio-visual content and auxiliary sensor data (from magnetometers and accelerometers). Firstly, each modality is separately analyzed; then, analysis results are fused for obtaining the sport type. The auxiliary sensor data is used for extracting more discriminative spatio-temporal visual features and efficient camera motion features. The contribution of each modality to the fusion process is adapted according to the quality of the input data. We performed extensive experiments on data collected at public sport events, showing the merits of using different combinations of modalities and fusion methods. The results indicate that analyzing multimodal and multi-user data, coupled with adaptive fusion, improves classification accuracies in most tested cases, up to 95.45%.
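A minimal sketch of such quality-adaptive late fusion, with invented names (the paper's actual quality measures and weighting scheme are not reproduced here): each modality contributes its class-score vector, scaled by a per-recording quality estimate in [0, 1].

import numpy as np

SPORTS = ["soccer", "american_football", "basketball", "tennis", "ice_hockey", "volleyball"]

def quality_adaptive_fusion(modality_scores, modality_quality):
    # modality_scores: dict mapping modality name -> (n_classes,) score vector
    # modality_quality: dict mapping modality name -> quality estimate in [0, 1]
    fused = np.zeros(len(SPORTS))
    total_weight = 0.0
    for name, scores in modality_scores.items():
        q = float(modality_quality.get(name, 0.0))        # low-quality input gets little say
        fused += q * np.asarray(scores, dtype=float)
        total_weight += q
    fused /= max(total_weight, 1e-9)                      # guard against all-zero quality
    return SPORTS[int(np.argmax(fused))]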

31 citations


Cites background from "Combining multiple evidence for video classification"

  • ...In [13], the authors successfully applied late fusion for discriminating among six TV broadcast genres and sub-genres: cartoon, commercial, news, cricket, football, and tennis....


Proceedings ArticleDOI
13 Dec 2007
TL;DR: This paper investigates the problem of automatic video classification using static and dynamic features, with a hidden Markov model (HMM) as the classifier, and demonstrates the efficiency of the system by applying it to a broad range of video data.
Abstract: Automatic classification of video content is receiving increased attention in multimedia information processing. This paper investigates the problem of automatic video classification using static and dynamic features. Five different genres, namely cartoon, sports, commercials, news and TV serial, are studied for assessment. The approach exploits edge information and color histogram as static features and motion information as the dynamic feature, with a hidden Markov model (HMM) as the classifier. The results are evaluated by constructing an individual HMM for each of the features, and the outputs are finally combined to determine the genre. The method demonstrates the efficiency of the system by applying it on a broad range of video data: 3 hours of video is used for training purpose and a further 1 hour of video as test set. An overall classification accuracy of 95.6% is accomplished.
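One plausible way to realise the per-feature HMM combination described above, sketched in Python (assumption: hmmlearn-style models whose score() method returns a log-likelihood; the authors' exact configuration is not reproduced), is to sum the log-likelihoods of a clip's feature streams under each genre's models and pick the best-scoring genre.

# models[genre][feature] is a trained HMM exposing score(obs) -> log-likelihood,
# e.g. an hmmlearn.hmm.GaussianHMM fitted on that genre's training sequences.
def classify_clip(models, observations):
    # observations: dict mapping feature name ("edge", "color_hist", "motion")
    #               to a (T, d) array of per-frame feature vectors
    best_genre, best_loglik = None, float("-inf")
    for genre, feature_models in models.items():
        loglik = sum(feature_models[f].score(obs) for f, obs in observations.items())
        if loglik > best_loglik:
            best_genre, best_loglik = genre, loglik
    return best_genre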

25 citations


Cites methods from "Combining multiple evidence for video classification"

  • ...The method demonstrates the efficiency of the system by applying it on a broad range of video data: 3 hours of video is used for training purpose and a further 1 hour of video as test set....


Book ChapterDOI
01 Jan 2010
TL;DR: Experimental results show good performance of the scheme for detecting video scenes of a given sport discipline in TV sports news, using the Automatic Video Indexer.
Abstract: Similarly to text, video is hierarchically structured. The analogies between text and video structures are discussed. Then a juxtaposition of two indexing processes is presented, i.e. text and video indexing based on content analysis of their structural units. Several frameworks for automatic detection and categorisation of video shots and scenes reporting sport events of a given discipline in TV sports news have already been proposed. It has been observed that many sport videos, such as archery, diving, soccer, and tennis, have repetitive structure patterns. In tests performed using the Automatic Video Indexer (AVI), shots and then scenes have been detected in the tested TV news videos. Experimental results show good performance of the scheme for detecting video scenes of a given sport discipline in TV sports news. The Automatic Video Indexer is a research project investigating tools and techniques of automatic video indexing for retrieval systems.

24 citations

References
Journal ArticleDOI
TL;DR: A common theoretical framework for combining classifiers which use distinct pattern representations is developed and it is shown that many existing schemes can be considered as special cases of compound classification where all the pattern representations are used jointly to make a decision.
Abstract: We develop a common theoretical framework for combining classifiers which use distinct pattern representations, and show that many existing schemes can be considered as special cases of compound classification where all the pattern representations are used jointly to make a decision. An experimental comparison of various classifier combination schemes demonstrates that the combination rule developed under the most restrictive assumptions, the sum rule, outperforms other classifier combination schemes. A sensitivity analysis of the various schemes to estimation errors is carried out to show that this finding can be justified theoretically.
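For reference, the sum rule from this framework is usually stated as follows (standard formulation, written here in LaTeX; notation: R classifiers observing distinct representations x_1, ..., x_R of a pattern Z, and m classes):

\text{assign } Z \rightarrow \omega_j \quad \text{if} \quad
(1-R)\,P(\omega_j) + \sum_{i=1}^{R} P(\omega_j \mid x_i)
= \max_{k=1,\dots,m}\Big[(1-R)\,P(\omega_k) + \sum_{i=1}^{R} P(\omega_k \mid x_i)\Big].

With equal class priors this reduces to choosing the class with the largest average posterior across the R classifiers, i.e. the equal-weight fusion that the performance-based weighting of the main paper generalises.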

5,670 citations

Journal ArticleDOI
TL;DR: The purpose of this tutorial paper is to give an introduction to the theory of Markov models, and to illustrate how they have been applied to problems in speech recognition.
Abstract: The basic theory of Markov chains has been known to mathematicians and engineers for close to 80 years, but it is only in the past decade that it has been applied explicitly to problems in speech processing. One of the major reasons why speech models, based on Markov chains, have not been developed until recently was the lack of a method for optimizing the parameters of the Markov model to match observed signal patterns. Such a method was proposed in the late 1960's and was immediately applied to speech processing in several research institutions. Continued refinements in the theory and implementation of Markov modelling techniques have greatly enhanced the method, leading to a wide range of applications of these models. It is the purpose of this tutorial paper to give an introduction to the theory of Markov models, and to illustrate how they have been applied to problems in speech recognition.
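As a small illustration of the models the tutorial covers, the Python sketch below (illustrative code, not taken from the tutorial) evaluates the log-likelihood of a discrete observation sequence under an HMM with the scaled forward algorithm.

import numpy as np

def forward_loglik(pi, A, B, obs):
    # pi: (N,) initial state probabilities; A: (N, N) state transition matrix
    # B: (N, M) emission probabilities; obs: list of observation symbol indices
    alpha = pi * B[:, obs[0]]                  # initialisation
    log_prob = 0.0
    for t in range(1, len(obs)):
        alpha = (alpha @ A) * B[:, obs[t]]     # induction step
        scale = alpha.sum()                    # rescale to avoid numerical underflow
        alpha /= scale
        log_prob += np.log(scale)
    return log_prob + np.log(alpha.sum())      # log P(obs | model)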

4,546 citations

Journal ArticleDOI
01 May 1992
TL;DR: On applying these methods to combine several classifiers for recognizing totally unconstrained handwritten numerals, the experimental results show that the performance of individual classifiers can be improved significantly.
Abstract: Possible solutions to the problem of combining classifiers can be divided into three categories according to the levels of information available from the various classifiers. Four approaches based on different methodologies are proposed for solving this problem. One is suitable for combining individual classifiers such as Bayesian, k-nearest-neighbor, and various distance classifiers. The other three could be used for combining any kind of individual classifiers. On applying these methods to combine several classifiers for recognizing totally unconstrained handwritten numerals, the experimental results show that the performance of individual classifiers can be improved significantly. For example, on the US zipcode database, 98.9% recognition with 0.90% substitution and 0.2% rejection can be obtained, as well as high reliability with 95% recognition, 0% substitution, and 5% rejection.

2,389 citations

Proceedings ArticleDOI
01 Feb 1997
TL;DR: A histogram-based method for comparing images that incorporates spatial information is described, and it is shown that CCVs can give superior results to color histograms for image retrieval.
Abstract: Color histograms are used to compare images in many applications. Their advantages are efficiency and insensitivity to small changes in camera viewpoint. However, color histograms lack spatial information, so images with very different appearances can have similar histograms. For example, a picture of fall foliage might contain a large number of scattered red pixels; this could have a similar color histogram to a picture with a single large red object. We describe a histogram-based method for comparing images that incorporates spatial information. We classify each pixel in a given color bucket as either coherent or incoherent, based on whether or not it is part of a large similarly-colored region. A color coherence vector (CCV) stores the number of coherent versus incoherent pixels with each color. By separating coherent pixels from incoherent pixels, CCVs provide finer distinctions than color histograms. CCVs can be computed at over 5 images per second on a standard workstation. A database with 15,000 images can be queried for the images with the most similar CCVs in under 2 seconds. We show that CCVs can give superior results to color histograms for image retrieval.
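A compact sketch of the color coherence vector computation described above (assumptions: scipy is available, a simple uniform colour quantisation is used, and the original paper's blurring step and exact bucketing are not reproduced):

import numpy as np
from scipy import ndimage

def color_coherence_vector(image, buckets_per_channel=4, tau_fraction=0.01):
    # image: (H, W, 3) uint8 RGB array
    h, w, _ = image.shape
    tau = int(tau_fraction * h * w)                # region-size threshold for "coherent"
    step = 256 // buckets_per_channel
    q = (image // step).astype(int)                # per-channel uniform quantisation
    buckets = (q[..., 0] * buckets_per_channel + q[..., 1]) * buckets_per_channel + q[..., 2]
    n_buckets = buckets_per_channel ** 3
    coherent = np.zeros(n_buckets, dtype=int)
    incoherent = np.zeros(n_buckets, dtype=int)
    for b in range(n_buckets):
        mask = buckets == b
        if not mask.any():
            continue
        labels, n_regions = ndimage.label(mask)                 # connected components
        sizes = ndimage.sum(mask, labels, range(1, n_regions + 1))
        coherent[b] = int(sizes[sizes >= tau].sum())            # pixels in large regions
        incoherent[b] = int(sizes[sizes < tau].sum())           # pixels in small regions
    return coherent, incoherent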

931 citations

01 Jan 2000
Research report EPFL-REPORT-82604 (keywords: learning). URL: http://publications.idiap.ch/downloads/reports/2000/rr00-17.pdf

904 citations