Combining multiple evidence for video classification
TL;DR: The efficacy of the performance-based fusion method is demonstrated by applying it to the classification of short video clips into six popular TV broadcast genres, namely cartoon, commercial, news, cricket, football, and tennis.
Abstract: In this paper, we investigate the problem of classifying video into predefined genres by combining the evidence from multiple classifiers. It is well known in the pattern recognition community that the accuracy of classification obtained by combining decisions made by independent classifiers can be substantially higher than the accuracy of the individual classifiers. The conventional method for combining individual classifiers weights each classifier equally (sum or vote rule fusion). In this paper, we study a method that estimates the performance of each individual classifier and combines the classifiers by weighting them according to their estimated performance. We demonstrate the efficacy of the performance-based fusion method by applying it to the classification of short video clips (20 seconds) into six popular TV broadcast genres, namely cartoon, commercial, news, cricket, football, and tennis. The individual classifiers are trained using different spatial and temporal features derived from the video sequences, and two different classifier methodologies, namely hidden Markov models (HMMs) and support vector machines (SVMs). The experiments were carried out on more than 3 hours of video data. A classification rate of 93.12% across all six classes and 97.14% for the sports category alone was achieved, which is significantly higher than the performance of the individual classifiers.
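The contrast between equal-weight (sum rule) fusion and performance-based fusion can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so this example assumes each classifier's weight is its accuracy on held-out clips, normalized to sum to one; the function names `estimate_weights` and `fuse` and the score arrays are hypothetical, not taken from the paper.

```python
import numpy as np

GENRES = ["cartoon", "commercial", "news", "cricket", "football", "tennis"]


def estimate_weights(val_scores, val_labels):
    """Estimate one weight per classifier from held-out clips.

    val_scores: list of (n_clips, n_classes) score arrays, one per classifier.
    val_labels: (n_clips,) array of true class indices.
    Assumption: weight = validation accuracy, normalized to sum to 1.
    """
    acc = np.array([(s.argmax(axis=1) == val_labels).mean() for s in val_scores])
    return acc / acc.sum()


def fuse(scores, weights=None):
    """Combine per-classifier scores and return the predicted class per clip.

    With weights=None every classifier counts equally (sum rule fusion);
    otherwise each classifier's scores are scaled by its weight first.
    """
    stacked = np.stack(scores)  # (n_classifiers, n_clips, n_classes)
    if weights is None:
        weights = np.full(len(scores), 1.0 / len(scores))
    fused = np.tensordot(weights, stacked, axes=1)  # (n_clips, n_classes)
    return fused.argmax(axis=1)
```

Calling `fuse(scores)` gives the equal-weight baseline, while `fuse(scores, estimate_weights(val_scores, val_labels))` lets a classifier that was more accurate on held-out clips contribute more to the final decision. Indexing `GENRES` with the returned class indices maps them back to genre names.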
Citations
Cites background from "Combining multiple evidence for vid..."
...A feature is defined as a descriptive parameter extracted from an image or a video stream [92]....
Cites methods from "Combining multiple evidence for vid..."
...The method demonstrates the efficiency of the system by applying it on a broad range of video data: 3 hours of video is used for training purpose and a further 1 hour of video as test set....
Cites background from "Combining multiple evidence for vid..."
...In [13], the authors successfully applied late fusion for discriminating among six TV broadcast genres and sub-genres: cartoon, commercial, news, cricket, football, and tennis....