Journal ArticleDOI

Automatic Video Classification: A Survey of the Literature

01 May 2008-Vol. 38, Iss: 3, pp 416-430
TL;DR: This paper surveys the video classification literature and finds that features are drawn from three modalities - text, audio, and visual - and that a large variety of combinations of features and classification methods have been explored.
Abstract: There is much video available today. To help viewers find video of interest, work has begun on methods of automatic video classification. In this paper, we survey the video classification literature. We find that features are drawn from three modalities - text, audio, and visual - and that a large variety of combinations of features and classification methods have been explored. We describe the general features chosen and summarize the research in this area. We conclude with ideas for further research.
Citations
Journal ArticleDOI
01 Nov 2011
TL;DR: Methods for video structure analysis, including shot boundary detection, key frame extraction and scene segmentation, extraction of features including static key frame features, object features and motion features, video data mining, video annotation, and video retrieval including query interfaces are analyzed.
Abstract: Video indexing and retrieval have a wide spectrum of promising applications, motivating the interest of researchers worldwide. This paper offers a tutorial and an overview of the landscape of general strategies in visual content-based video indexing and retrieval, focusing on methods for video structure analysis, including shot boundary detection, key frame extraction and scene segmentation, extraction of features including static key frame features, object features and motion features, video data mining, video annotation, video retrieval including query interfaces, similarity measure and relevance feedback, and video browsing. Finally, we analyze future research directions.

606 citations

Journal ArticleDOI
01 Nov 2012
TL;DR: This review presents an overview of recent research approaches on the topic of anomaly detection in automated surveillance, covering a wide range of domains, employing a vast array of techniques.
Abstract: As surveillance becomes ubiquitous, the amount of data to be processed grows along with the demand for manpower to interpret the data. A key goal of surveillance is to detect behaviors that can be considered anomalous. As a result, an extensive body of research in automated surveillance has been developed, often with the goal of automatic detection of anomalies. Research into anomaly detection in automated surveillance covers a wide range of domains, employing a vast array of techniques. This review presents an overview of recent research approaches on the topic of anomaly detection in automated surveillance. The reviewed studies are analyzed across five aspects: surveillance target, anomaly definitions and assumptions, types of sensors used and the feature extraction processes, learning methods, and modeling algorithms.

195 citations


Additional excerpts

  • ...The related topic of categorizing the genre of produced video was covered in a review by Brezeale and Cook [2] in 2008, while the more focused topic of understanding specific events in video data was addressed in a review by Lavee et al. [3] in 2009....

Journal ArticleDOI
01 Jun 2013
TL;DR: While the existing solutions vary, common key modules are identified and detailed descriptions along with some insights for each are provided, including extraction and representation of low-level features across different modalities, classification strategies, fusion techniques, etc.
Abstract: The goal of high-level event recognition is to automatically detect complex high-level events in a given video sequence. This is a difficult task especially when videos are captured under unconstrained conditions by non-professionals. Such videos depicting complex events have limited quality control, and therefore, may include severe camera motion, poor lighting, heavy background clutter, and occlusion. However, due to the fast growing popularity of such videos, especially on the Web, solutions to this problem are in high demand and have attracted great interest from researchers. In this paper, we review current technologies for complex event recognition in unconstrained videos. While the existing solutions vary, we identify common key modules and provide detailed descriptions along with some insights for each of them, including extraction and representation of low-level features across different modalities, classification strategies, fusion techniques, etc. Publicly available benchmark datasets, performance metrics, and related research forums are also described. Finally, we discuss promising directions for future research.

192 citations

Journal ArticleDOI
TL;DR: This paper focuses on the video content analysis techniques applied in sportscasts over the past decade from the perspectives of fundamentals and general review, a content hierarchical model, and trends and challenges.
Abstract: Sports data analysis is becoming increasingly large scale, diversified, and shared, but difficulty persists in rapidly accessing the most crucial information. Previous surveys have focused on the methodologies of sports video analysis from the spatiotemporal viewpoint instead of a content-based viewpoint, and few of these studies have considered semantics. This paper develops a deeper interpretation of content-aware sports video analysis by examining the insight offered by research into the structure of content under different scenarios. On the basis of this insight, we provide an overview of the themes particularly relevant to the research on content-aware systems for broadcast sports. Specifically, we focus on the video content analysis techniques applied in sportscasts over the past decade from the perspectives of fundamentals and general review, a content hierarchical model, and trends and challenges. Content-aware analysis methods are discussed with respect to object-, event-, and context-oriented groups. In each group, the gap between sensation and content excitement must be bridged using proper strategies. In this regard, a content-aware approach is required to determine user demands. Finally, this paper summarizes the future trends and challenges for sports video analysis. We believe that our findings can advance the field of research on content-aware video analysis for broadcast sports.

179 citations


Cites background from "Automatic Video Classification: A S..."

  • ...Brezeale and Cook [23] reviewed studies on video...


Journal ArticleDOI
TL;DR: A new content-based recommender system that encompasses a technique to automatically analyze video contents and to extract a set of representative stylistic features grounded on existing approaches of Applied Media Theory, to improve the accuracy of recommendations.
Abstract: This paper investigates the use of automatically extracted visual features of videos in the context of recommender systems and brings some novel contributions in the domain of video recommendations. We propose a new content-based recommender system that encompasses a technique to automatically analyze video contents and to extract a set of representative stylistic features (lighting, color, and motion) grounded on existing approaches of Applied Media Theory. The evaluation of the proposed recommendations, assessed w.r.t. relevance metrics (e.g., recall) and compared with existing content-based recommender systems that exploit explicit features such as movie genre, shows that our technique leads to more accurate recommendations. Our proposed technique achieves better results not only when visual features are extracted from full-length videos, but also when the feature extraction technique operates on movie trailers, pinpointing that our approach is effective also when full-length videos are not available or when there are performance requirements. Our recommender can be used in combination with more traditional content-based recommendation techniques that exploit explicit content features associated to video files, to improve the accuracy of recommendations. Our recommender can also be used alone, to address the problem originated from video files that have no meta-data, a typical situation on popular movie-sharing websites (e.g., YouTube) where hundreds of millions of hours of video are uploaded by users every day and may contain no associated information. As they lack explicit content, these items cannot be considered for recommendation purposes by conventional content-based techniques even when they could be relevant for the user.

175 citations


Cites background from "Automatic Video Classification: A S..."

  • ...Hu et al. [36] and Brezeale and Cook [39] provide comprehensive surveys on the relevant state of the art related to video content analysis and classification, and discuss a large body of low-level features (visual, auditory or textual) that can be considered for these purposes....

References
Book
Christopher M. Bishop1
17 Aug 2006
TL;DR: Probability Distributions, Linear Models for Regression, Linear Models for Classification, Neural Networks, Graphical Models, Mixture Models and EM, Sampling Methods, Continuous Latent Variables, and Sequential Data are studied.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

22,840 citations

Journal ArticleDOI
TL;DR: This book covers a broad range of topics for regular factorial designs and presents all of the material in a very mathematical fashion, and will surely become an invaluable resource for researchers and graduate students doing research in the design of factorial experiments.
Abstract: (2007). Pattern Recognition and Machine Learning. Technometrics: Vol. 49, No. 3, pp. 366-366.

18,802 citations


"Automatic Video Classification: A S..." refers methods in this paper

  • ...Classification is performed with a GMM because of its popularity in speaker recognition....

  • ...One solution to this problem is to use a linear combination of Gaussian distributions, known as a GMM [11]....

  • ...A GMM is used for classification of a linear combination of the conditional probabilities of the audio and visual features....

  • ...Classification is performed using a GMM. Roach et al. [76] perform classification using only dynamics, that is, object and camera motion....
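The excerpts above all describe the same scheme: fit one Gaussian mixture model per class, then assign a clip to the class whose mixture gives its feature vector the highest likelihood. A minimal sketch with scikit-learn; the two genre labels, the 8-dimensional features, and the synthetic data are illustrative assumptions, not taken from the survey:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic 8-D audio/visual feature vectors for two hypothetical genres.
news_feats = rng.normal(loc=0.0, scale=1.0, size=(200, 8))
sports_feats = rng.normal(loc=3.0, scale=1.0, size=(200, 8))

# Fit one mixture of Gaussians per class (a "GMM classifier").
gmm_news = GaussianMixture(n_components=2, random_state=0).fit(news_feats)
gmm_sports = GaussianMixture(n_components=2, random_state=0).fit(sports_feats)

def classify(x):
    # score() returns the average log-likelihood under each class's mixture;
    # pick the class whose mixture explains the feature vector best.
    scores = [gmm_news.score(x.reshape(1, -1)),
              gmm_sports.score(x.reshape(1, -1))]
    return ["news", "sports"][int(np.argmax(scores))]

print(classify(np.full(8, 3.0)))  # → sports
```

The same structure extends to any number of classes and to the audio/visual feature combinations the excerpts mention; only the per-class training sets change.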

Journal ArticleDOI
TL;DR: In this paper, a method for finding the optical flow pattern is presented which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image, and an iterative implementation is shown which successfully computes the optical flow for a number of synthetic image sequences.
Abstract: Optical flow cannot be computed locally, since only one independent measurement is available from the image sequence at a point, while the flow velocity has two components. A second constraint is needed. A method for finding the optical flow pattern is presented which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image. An iterative implementation is shown which successfully computes the optical flow for a number of synthetic image sequences. The algorithm is robust in that it can handle image sequences that are quantized rather coarsely in space and time. It is also insensitive to quantization of brightness levels and additive noise. Examples are included where the assumption of smoothness is violated at singular points or along lines in the image.

10,727 citations
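The abstract above describes the classic Horn-Schunck scheme: since the brightness-constancy constraint alone is underdetermined (one equation, two flow components), a smoothness term is added and the flow is computed iteratively by alternating a neighbourhood average with a data-driven correction. A compact NumPy sketch; the grid size, alpha, and iteration count are illustrative choices:

```python
import numpy as np

def horn_schunck(im1, im2, alpha=1.0, n_iter=100):
    """Dense optical flow between two grayscale frames via the
    Horn-Schunck smoothness constraint (iterative Jacobi scheme)."""
    im1 = im1.astype(float)
    im2 = im2.astype(float)
    # Brightness derivatives: central differences on the averaged frame,
    # simple frame difference in time.
    Iy, Ix = np.gradient(0.5 * (im1 + im2))   # d/drow, d/dcol
    It = im2 - im1
    u = np.zeros_like(im1)                    # horizontal flow
    v = np.zeros_like(im1)                    # vertical flow

    def neighbour_avg(f):
        # Horn-Schunck weighted average over the 8-neighbourhood
        # (weights sum to 1, centre pixel excluded).
        return ((np.roll(f, 1, 0) + np.roll(f, -1, 0) +
                 np.roll(f, 1, 1) + np.roll(f, -1, 1)) / 6.0 +
                (np.roll(np.roll(f, 1, 0), 1, 1) +
                 np.roll(np.roll(f, 1, 0), -1, 1) +
                 np.roll(np.roll(f, -1, 0), 1, 1) +
                 np.roll(np.roll(f, -1, 0), -1, 1)) / 12.0)

    for _ in range(n_iter):
        u_bar, v_bar = neighbour_avg(u), neighbour_avg(v)
        # Update that trades off brightness constancy against smoothness.
        d = (Ix * u_bar + Iy * v_bar + It) / (alpha**2 + Ix**2 + Iy**2)
        u = u_bar - Ix * d
        v = v_bar - Iy * d
    return u, v

# A horizontal ramp shifted one pixel to the right: true flow is u = +1, v = 0.
im1 = np.tile(np.arange(32, dtype=float), (32, 1))
im2 = np.roll(im1, 1, axis=1)
u, v = horn_schunck(im1, im2)
print(round(u[8:24, 8:24].mean(), 1))  # interior estimate close to 1.0
```

As the abstract notes, the smoothness assumption breaks down at motion discontinuities (here, at the wrap-around seam), which is why only the interior of the flow field is inspected.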

Book
18 May 2001
TL;DR: Independent component analysis is presented as a statistical generative model: a proper probabilistic formulation of the ideas underpinning sparse coding, which can be interpreted as providing a Bayesian prior.
Abstract: In this chapter, we discuss a statistical generative model called independent component analysis. It is basically a proper probabilistic formulation of the ideas underpinning sparse coding. It shows how sparse coding can be interpreted as providing a Bayesian prior, and answers some questions which were not properly answered in the sparse coding framework.

8,333 citations

Proceedings ArticleDOI
12 Nov 1981
TL;DR: In this article, a method for finding the optical flow pattern is presented which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image, and an iterative implementation is shown which successfully computes the optical flow for a number of synthetic image sequences.
Abstract: Optical flow cannot be computed locally, since only one independent measurement is available from the image sequence at a point, while the flow velocity has two components. A second constraint is needed. A method for finding the optical flow pattern is presented which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image. An iterative implementation is shown which successfully computes the optical flow for a number of synthetic image sequences. The algorithm is robust in that it can handle image sequences that are quantized rather coarsely in space and time. It is also insensitive to quantization of brightness levels and additive noise. Examples are included where the assumption of smoothness is violated at singular points or along lines in the image.

8,078 citations