Proceedings ArticleDOI

Combining multiple evidence for video classification

TL;DR: The efficacy of the performance-based fusion method is demonstrated by applying it to the classification of short video clips into six popular TV broadcast genres, namely cartoon, commercial, news, cricket, football, and tennis.
Abstract: In this paper, we investigate the problem of video classification into predefined genres by combining the evidence from multiple classifiers. It is well known in the pattern recognition community that the accuracy of classification obtained by combining decisions made by independent classifiers can be substantially higher than the accuracy of the individual classifiers. The conventional method for combining individual classifiers weighs each classifier equally (sum or vote rule fusion). In this paper, we study a method that estimates the performances of the individual classifiers and combines them by weighting each classifier according to its estimated performance. We demonstrate the efficacy of the performance-based fusion method by applying it to the classification of short video clips (20 seconds) into six popular TV broadcast genres, namely cartoon, commercial, news, cricket, football, and tennis. The individual classifiers are trained using different spatial and temporal features derived from the video sequences, and two different classifier methodologies, namely hidden Markov models (HMMs) and support vector machines (SVMs). The experiments were carried out on more than 3 hours of video data. A classification rate of 93.12% for all six classes and 97.14% for the sports category alone has been achieved, which is significantly higher than the performance of the individual classifiers.
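The abstract does not spell out the fusion rule, but the idea of weighting each classifier by its estimated performance can be sketched as a weighted sum over per-classifier class posteriors. This is a minimal illustrative sketch, not the authors' exact scheme; the function name and the use of validation accuracy as the weight are assumptions.

```python
import numpy as np

def performance_weighted_fusion(posteriors, accuracies):
    """Combine per-classifier class posteriors, weighting each classifier
    by its estimated (e.g. validation-set) accuracy.

    posteriors: (n_classifiers, n_classes) class probabilities
    accuracies: (n_classifiers,) estimated performance per classifier
    Returns the index of the winning class.
    """
    w = np.asarray(accuracies, dtype=float)
    w = w / w.sum()                       # normalise the weights
    fused = w @ np.asarray(posteriors)    # weighted sum over classifiers
    return int(np.argmax(fused))

# Two classifiers disagree; the far more accurate one dominates the fusion,
# whereas an equal-weight (sum rule) combination would pick the other class.
p = [[0.6, 0.4], [0.3, 0.7]]
print(performance_weighted_fusion(p, [0.95, 0.4]))  # class 0
print(performance_weighted_fusion(p, [0.5, 0.5]))   # class 1 (equal weights)
```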
Citations
Journal ArticleDOI
TL;DR: It is found that BICC outperforms other visual features, such as edge, motion and histogram, that are commonly used for video classification; PCA is used to reduce redundancy while exploiting the correlations between the feature elements.
Abstract: Appropriate organization of video databases is essential for pertinent indexing and retrieval of visual information. This paper proposes a new feature called block intensity comparison code (BICC) for video classification and retrieval. The block intensity comparison code represents the average intensity differences between blocks of a frame. The extracted feature is further processed using principal component analysis (PCA) to reduce redundancy while exploiting the correlations between the feature elements. The temporal nature of video is modeled by a hidden Markov model (HMM) with BICC as the feature. It is found that BICC outperforms other visual features, such as edge, motion and histogram, which are commonly used for video classification.
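The abstract describes BICC only as "average block intensity differences between blocks of a frame". A hedged sketch under that reading: average each block's intensity, then record the differences between horizontally adjacent blocks. The function name, block size, and the choice of horizontal neighbours are assumptions, not the paper's exact definition.

```python
import numpy as np

def bicc(frame, block=8):
    """Hypothetical sketch of a block intensity comparison code:
    compute each block's mean intensity, then emit the signed
    differences between horizontally adjacent blocks."""
    h, w = frame.shape
    gh, gw = h // block, w // block
    # reshape into a (rows, block, cols, block) grid and average each block
    means = frame[:gh * block, :gw * block] \
        .reshape(gh, block, gw, block).mean(axis=(1, 3))
    return (means[:, 1:] - means[:, :-1]).ravel()

frame = np.zeros((16, 16))
frame[:, 8:] = 255.0          # left half dark, right half bright
print(bicc(frame, block=8))   # one 255.0 difference per block row
```

In practice the resulting vector would then be decorrelated with PCA, as the abstract describes, before being fed to the HMM.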

22 citations

Journal ArticleDOI
TL;DR: In the new indexing procedures implemented in the AVI (the Automatic Video Indexer), knowledge of the structure of video shots and scenes made it possible to significantly reduce the number of frames analysed in a given process.
Abstract: Text and video seem to be totally different digital media. Nevertheless, like text, video is also hierarchically structured. Knowledge of the structure of textual as well as video data allows for more efficient indexing. The analogies between text and video structures are observed and discussed. Next, two indexing processes, text indexing and video indexing based on content analysis of their structural units, are juxtaposed. Then, different possible strategies for video indexing are considered. Several frameworks for automatic detection and content-based categorisation of video shots and scenes have already been proposed in the scientific literature. The experiments mainly concern the classification of TV news, the categorisation of sport events in TV sports news, the detection of people or special events, etc. It has been observed that many sport videos have repetitive structure patterns. In the new indexing procedures implemented in the AVI, the Automatic Video Indexer, knowledge of the structure of video shots and scenes made it possible to significantly reduce the number of frames analysed in a given process.

19 citations


Cites background from "Combining multiple evidence for vid..."

  • In Vakkalanka et al. (2005), it has been shown that weighting individual classifiers according to their estimated performance gives better results in automatic classification.


Book ChapterDOI
28 Jun 2010
TL;DR: Experimental results show good performance of the proposed scheme in detecting scenes of a given sport discipline in TV sports news; special software called AVI, the Automatic Video Indexer, has been used to detect shots and then scenes in the tested TV news videos.
Abstract: A large amount of digital video data is stored in local or network visual retrieval systems. New technology advances in multimedia information processing as well as in network transmission have made video data publicly and relatively easily available. Users need adequate tools to locate their desired video or video segments quickly and efficiently, for example in Internet video collections, TV show archives, video-on-demand systems, personal video archives offered by many public Internet services, etc. Detection of scenes in TV videos is difficult because the diversity of effects used in video editing puts up a barrier to constructing an appropriate model. A framework for automatic recognition and classification of scenes reporting sport events of a given discipline in TV sports news has been proposed. Experimental results show good performance of the proposed scheme in detecting scenes of a given sport discipline in TV sports news. In the tests, special software called AVI, the Automatic Video Indexer, was used to detect shots and then scenes in the tested TV news videos.

18 citations


Cites background from "Combining multiple evidence for vid..."

  • It has also been shown [10] that weighting individual classifiers according to their estimated performance gives better results in automatic classification.


Proceedings ArticleDOI
13 Dec 2007
TL;DR: Four schemes for extracting useful features from ECG signals for use with artificial neural networks are compared, and two statistical techniques, linear discriminant analysis and tree clustering, are also evaluated.
Abstract: Many cardiac problems are visible as distortions in the electrocardiogram (ECG). Since abnormal heart beats can occur randomly, it becomes very tedious and time-consuming to analyze, say, a 24-hour ECG signal, as it may contain hundreds of thousands of heart beats. Hence it is desirable to automate the entire process of heart beat classification and, preferably, diagnose it accurately. In this paper the authors focus on various schemes for extracting useful features from ECG signals for use with artificial neural networks. Arrhythmia is one such type of abnormality detectable from an ECG signal. The three classes of ECG signals are Normal, Fusion and Premature Ventricular Contraction (PVC). The task of an ANN-based system is to correctly identify the three classes, most importantly the PVC type, this being a fatal cardiac condition. The Discrete Fourier Transform, Principal Component Analysis, Discrete Wavelet Transform and Discrete Cosine Transform are the four schemes discussed and compared in this paper. For comparison, statistical techniques such as linear discriminant analysis and tree clustering are also evaluated.

18 citations

Proceedings ArticleDOI
24 Apr 2008
TL;DR: A robust scene classification and segmentation approach based on support vector machine (SVM) is presented, which extracts both audio and visual features and analyzes their inter-relations to identify and classify video scenes.
Abstract: Video scene classification and segmentation are fundamental steps for multimedia retrieval, indexing and browsing. In this paper, a robust scene classification and segmentation approach based on support vector machines (SVM) is presented, which extracts both audio and visual features and analyzes their inter-relations to identify and classify video scenes. Our system works on content from a diverse range of genres by allowing sets of features to be combined and compared automatically without the use of thresholds. Using the temporal behaviors of the different scene classes, the SVM classifier can effectively classify presegmented video clips into one of the predefined scene classes. After identifying scene classes, the scene change boundary can be easily detected. The experimental results show that the proposed system not only improves precision and recall, but also performs better than other classification systems using decision trees (DT), K-nearest neighbor (K-NN) and neural networks (NN).

12 citations


Cites background from "Combining multiple evidence for vid..."

  • For more detailed descriptions, please refer to [10].


References
Journal ArticleDOI
TL;DR: A common theoretical framework for combining classifiers which use distinct pattern representations is developed and it is shown that many existing schemes can be considered as special cases of compound classification where all the pattern representations are used jointly to make a decision.
Abstract: We develop a common theoretical framework for combining classifiers which use distinct pattern representations and show that many existing schemes can be considered as special cases of compound classification where all the pattern representations are used jointly to make a decision. An experimental comparison of various classifier combination schemes demonstrates that the combination rule developed under the most restrictive assumptions, the sum rule, outperforms other classifier combination schemes. A sensitivity analysis of the various schemes to estimation errors is carried out to show that this finding can be justified theoretically.
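The sum rule the abstract refers to averages per-classifier posteriors, while the product rule (derived under weaker assumptions) multiplies them. A minimal sketch of both, with an example of why the sum rule is more robust to a single badly estimated posterior; the function names are assumptions.

```python
import numpy as np

def sum_rule(posteriors):
    """Sum rule: average per-classifier class posteriors and pick
    the class with the largest mean."""
    return int(np.argmax(np.mean(posteriors, axis=0)))

def product_rule(posteriors):
    """Product rule: multiply posteriors; a single classifier assigning
    a near-zero probability can veto a class outright."""
    return int(np.argmax(np.prod(posteriors, axis=0)))

# Two classifiers favour class 0, but the third gives it a near-zero
# probability. The product rule is vetoed by that estimation error,
# while the sum rule still picks class 0.
p = [[0.8, 0.2], [0.7, 0.3], [0.01, 0.99]]
print(sum_rule(p), product_rule(p))  # 0 1
```

This sensitivity to estimation errors is exactly what the paper's sensitivity analysis examines.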

5,670 citations

Journal ArticleDOI
TL;DR: The purpose of this tutorial paper is to give an introduction to the theory of Markov models, and to illustrate how they have been applied to problems in speech recognition.
Abstract: The basic theory of Markov chains has been known to mathematicians and engineers for close to 80 years, but it is only in the past decade that it has been applied explicitly to problems in speech processing. One of the major reasons why speech models, based on Markov chains, have not been developed until recently was the lack of a method for optimizing the parameters of the Markov model to match observed signal patterns. Such a method was proposed in the late 1960's and was immediately applied to speech processing in several research institutions. Continued refinements in the theory and implementation of Markov modelling techniques have greatly enhanced the method, leading to a wide range of applications of these models. It is the purpose of this tutorial paper to give an introduction to the theory of Markov models, and to illustrate how they have been applied to problems in speech recognition.

4,546 citations

Journal ArticleDOI
01 May 1992
TL;DR: On applying these methods to combine several classifiers for recognizing totally unconstrained handwritten numerals, the experimental results show that the performance of individual classifiers can be improved significantly.
Abstract: Possible solutions to the problem of combining classifiers can be divided into three categories according to the levels of information available from the various classifiers. Four approaches based on different methodologies are proposed for solving this problem. One is suitable for combining individual classifiers such as Bayesian, k-nearest-neighbor, and various distance classifiers. The other three could be used for combining any kind of individual classifiers. On applying these methods to combine several classifiers for recognizing totally unconstrained handwritten numerals, the experimental results show that the performance of individual classifiers can be improved significantly. For example, on the US zipcode database, 98.9% recognition with 0.90% substitution and 0.2% rejection can be obtained, as well as high reliability with 95% recognition, 0% substitution, and 5% rejection.
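The simplest of the combination levels this abstract describes is the abstract level, where only each classifier's output label is available. A hedged sketch of voting with rejection, in the spirit of (but not identical to) the rules studied in the paper; the threshold and function name are assumptions.

```python
from collections import Counter

def majority_vote(labels, threshold=0.5):
    """Abstract-level combination by voting: accept the top label only
    if it wins more than `threshold` of the votes, otherwise reject
    (return None). A hypothetical sketch, not the paper's exact rule."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) > threshold else None

print(majority_vote(['3', '3', '8']))  # '3'
print(majority_vote(['3', '8', '5']))  # None (rejected)
```

Rejection is what produces the recognition/substitution/rejection trade-off reported in the abstract: raising the threshold rejects more samples but substitutes (misclassifies) fewer.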

2,389 citations

Proceedings ArticleDOI
01 Feb 1997
TL;DR: It is shown that CCVs, a histogram-based comparison method that incorporates spatial information, can give superior results to color histograms for image retrieval.
Abstract: Color histograms are used to compare images in many applications. Their advantages are efficiency and insensitivity to small changes in camera viewpoint. However, color histograms lack spatial information, so images with very different appearances can have similar histograms. For example, a picture of fall foliage might contain a large number of scattered red pixels; this could have a similar color histogram to a picture with a single large red object. We describe a histogram-based method for comparing images that incorporates spatial information. We classify each pixel in a given color bucket as either coherent or incoherent, based on whether or not it is part of a large similarly-colored region. A color coherence vector (CCV) stores the number of coherent versus incoherent pixels for each color. By separating coherent pixels from incoherent pixels, CCVs provide finer distinctions than color histograms. CCVs can be computed at over 5 images per second on a standard workstation. A database with 15,000 images can be queried for the images with the most similar CCVs in under 2 seconds. We show that CCVs can give superior results to color histograms for image retrieval.
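The coherent/incoherent split described above can be sketched directly: quantise pixels into buckets, find 4-connected same-bucket regions, and count pixels in regions at or above a size threshold as coherent. This is a minimal greyscale sketch assuming a small bucket count and threshold; the paper works on colour images and its exact parameters differ.

```python
import numpy as np
from collections import deque

def ccv(img, n_buckets=4, tau=4):
    """Hedged sketch of a colour coherence vector on a greyscale image.
    Returns {bucket: (coherent_pixels, incoherent_pixels)}."""
    img = np.asarray(img)
    buckets = (img.astype(int) * n_buckets) // 256
    seen = np.zeros(img.shape, dtype=bool)
    out = {b: [0, 0] for b in range(n_buckets)}
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            if seen[i, j]:
                continue
            b = buckets[i, j]
            # flood-fill the 4-connected region of this bucket
            size, q = 0, deque([(i, j)])
            seen[i, j] = True
            while q:
                y, x = q.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny, nx]
                            and buckets[ny, nx] == b):
                        seen[ny, nx] = True
                        q.append((ny, nx))
            # regions of at least tau pixels count as coherent
            out[b][0 if size >= tau else 1] += size
    return {b: tuple(v) for b, v in out.items()}

img = np.zeros((4, 4), dtype=np.uint8)
img[0, 0] = 255       # one isolated bright pixel  -> incoherent
img[2:, 2:] = 255     # one 2x2 bright region      -> coherent
print(ccv(img))
```

Two images with identical histograms but different region structure, like the foliage example above, yield different CCVs under this split.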

931 citations

01 Jan 2000
Abstract: Keywords: learning. Reference: EPFL-REPORT-82604. URL: http://publications.idiap.ch/downloads/reports/2000/rr00-17.pdf. Record created on 2006-03-10, modified on 2017-05-10.

904 citations