Journal ArticleDOI

Performance evaluation of shot boundary detection metrics in the presence of object and camera motion

TL;DR: An algorithm based on the dual-tree complex wavelet transform is proposed for shot boundary detection in the presence of motion; a performance comparison with traditional metrics validates its effectiveness in terms of improved Recall, Precision, and F1 score.
Abstract: Partitioning a video into shots is an important step for video indexing. We present the performance of various traditional metrics that are generally used to detect shot boundaries. In this paper, we evaluate shot boundary detection metrics such as the likelihood ratio and the color ratio histogram in the Red, Green, Blue (RGB) and Hue, Saturation, Value (HSV) color spaces for three different action and thriller movies. These movies consist of a large number of frames with object and camera motion. The pixel difference and Chi-square shot boundary detection metrics in the Luma and Chrominance Components (YUV) color space have been tested for five different movies. The results were evaluated in terms of Recall, Precision, and F1 measure for all these movies. It has been observed that these results are affected by the disturbance due to motion in consecutive frames. The false positives and missed detections of shot boundaries in all the tested metrics are due to fast camera and object motion. An algorithm has...
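The Recall, Precision, and F1 measures named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `evaluate_boundaries`, the frame-index representation of boundaries, and the `tolerance` parameter are all assumptions introduced here.

```python
def evaluate_boundaries(detected, ground_truth, tolerance=0):
    """Return (recall, precision, f1) for detected shot-boundary frame indices.

    A detection counts as a true positive when it lies within `tolerance`
    frames of a ground-truth boundary that has not been matched yet.
    (`tolerance` is a hypothetical convenience, not taken from the paper.)
    """
    detected = sorted(set(detected))
    unmatched = set(ground_truth)
    n_truth = len(unmatched)
    true_positives = 0
    for d in detected:
        candidates = [g for g in unmatched if abs(g - d) <= tolerance]
        if candidates:
            # Consume the nearest ground-truth boundary so it matches only once.
            unmatched.remove(min(candidates, key=lambda g: abs(g - d)))
            true_positives += 1
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / n_truth if n_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, precision, f1


recall, precision, f1 = evaluate_boundaries([10, 50, 90, 120], [10, 50, 100])
```

In this example, false positives (frames 90 and 120) lower Precision and the missed boundary (frame 100) lowers Recall, which is how the motion-induced errors described in the abstract would show up in the scores.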
Citations
Proceedings ArticleDOI
01 Oct 2015
TL;DR: In this paper, a new automatic cut detection method is proposed based on the local binary pattern feature, and its efficacy is validated against a few existing popular approaches.
Abstract: Due to the growth of internet and multimedia technology, there is a tremendous need for an efficient method of automatic video retrieval and storage. Automatic annotation is one of the key solutions for the automatic retrieval or search of particular scene-based frames or videos from a huge video database. Video segmentation based on shot boundary detection (cut detection) is the fundamental step for content-based text annotation or video data analysis. In this paper, a new automatic cut detection method is proposed based on the local binary pattern feature. The proposed algorithm is tested with six test videos and its efficacy is validated against a few existing popular approaches.

11 citations

Proceedings ArticleDOI
01 Dec 2015
TL;DR: This paper addresses the problem of segmenting non-edited videos by classifying boundary and non-boundary frames, using a block-based center-symmetric local binary pattern feature vector to detect shot boundaries.
Abstract: Due to the popularity of multimedia technology and the digital world, thousands of videos are accessed through the internet in seconds. Most of the videos available on the internet for public access are non-edited videos. Efficient searching and storage need an efficient method of annotation. Automatic cut detection is the first stage of any automatic annotation process. In this paper we address the problem of segmenting non-edited videos by classifying boundary and non-boundary frames. The efficiency of intensity-based cut detection methods decreases as the intensity of the scene varies. The local binary pattern is a texture feature that captures strong spatial correlation among neighboring pixels and is also invariant to light variation. Therefore, in the proposed method, a block-based center-symmetric local binary pattern feature vector is used to detect shot boundaries in a video. The Euclidean distance between the feature vectors of consecutive frames is chosen as the similarity measure, which is compared with a threshold value to detect hard cuts in a non-edited video. The proposed algorithm is tested on seven test videos and its efficacy is validated against a few existing popular approaches.
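The pipeline described in this abstract (block-based center-symmetric LBP features, Euclidean distance between consecutive frames, threshold test) might look roughly like the sketch below. The grid size, the comparison threshold `t`, the histogram normalization, and all function names are assumptions for illustration; the abstract does not specify them.

```python
import math

# The four centre-symmetric neighbour pairs of the 8-neighbourhood, as (dy, dx).
PAIRS = [((0, 1), (0, -1)), ((1, 1), (-1, -1)), ((1, 0), (-1, 0)), ((1, -1), (-1, 1))]


def cs_lbp_histogram(block, t=0):
    """Normalized 16-bin CS-LBP histogram of a grayscale block (list of rows)."""
    hist = [0] * 16
    for y in range(1, len(block) - 1):
        for x in range(1, len(block[0]) - 1):
            code = 0
            for bit, ((dy1, dx1), (dy2, dx2)) in enumerate(PAIRS):
                # One bit per pair: set when the pair's difference exceeds t.
                if block[y + dy1][x + dx1] - block[y + dy2][x + dx2] > t:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]


def frame_feature(frame, blocks=2):
    """Concatenate CS-LBP histograms over a blocks x blocks grid (assumed layout)."""
    bh, bw = len(frame) // blocks, len(frame[0]) // blocks
    feature = []
    for by in range(blocks):
        for bx in range(blocks):
            rows = frame[by * bh:(by + 1) * bh]
            block = [row[bx * bw:(bx + 1) * bw] for row in rows]
            feature.extend(cs_lbp_histogram(block))
    return feature


def detect_cuts(frames, threshold):
    """Indices i where the distance between frames i-1 and i exceeds threshold."""
    feats = [frame_feature(f) for f in frames]
    return [i for i in range(1, len(feats))
            if math.dist(feats[i - 1], feats[i]) > threshold]
```

Because each code depends only on differences between neighbouring pixels, adding a constant brightness offset to a frame leaves its feature vector unchanged, which is the light-invariance property the abstract relies on.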

9 citations

Journal ArticleDOI
TL;DR: This paper compares some of the popular shot boundary detection techniques in uncompressed domain and discusses the merits and demerits of each of the techniques.
Abstract: Nowadays there is a tremendous amount of video available on the internet. Entertainment, news, and sports videos are accessed by users to fulfill their different needs. Everyday systems also produce huge amounts of video, for example surveillance systems, shopping malls, and home videos. These videos need to be accessed for different purposes. Current research topics on video include video abstraction or summarization, video classification, video annotation, and content-based video retrieval. In nearly all these applications one needs to identify the shots and key frames in a video that correctly and briefly indicate its contents. This paper compares some of the popular shot boundary detection techniques in the uncompressed domain. The merits and demerits of each technique are also discussed, along with some experiments.

8 citations

Proceedings ArticleDOI
01 Dec 2012
TL;DR: A novel automatic thresholding method for the detection of abrupt shot transition (AST), based on statistics of the pixel difference value of the consecutive frames, from a given video sequence is proposed.
Abstract: In this article, we propose a novel automatic thresholding method for the detection of abrupt shot transition (AST), based on statistics of the pixel difference value of the consecutive frames, from a given video sequence. An outlier removal and a false alarm elimination scheme are also introduced to counteract the disturbances from illumination variations and from object and camera movements. Experimental results and comparisons with state-of-the-art SBD schemes show the effectiveness of the proposed method and its superior average accuracy.
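One common way to derive a threshold from the statistics of consecutive-frame pixel differences is the mean plus a multiple of the standard deviation. The sketch below uses that rule as a stand-in; it is not the paper's method, and the multiplier `k`, the function names, and the omission of the outlier-removal and false-alarm steps are all simplifying assumptions.

```python
import statistics


def mean_abs_pixel_difference(frame_a, frame_b):
    """Mean absolute difference between two equally sized grayscale frames."""
    total, count = 0, 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def detect_abrupt_transitions(frames, k=3.0):
    """Return (cut_indices, threshold) using threshold = mean + k * std.

    Index i in the result means a cut between frames i-1 and i.
    The mean + k*std rule and k=3.0 are assumptions for illustration.
    """
    diffs = [mean_abs_pixel_difference(frames[i - 1], frames[i])
             for i in range(1, len(frames))]
    if not diffs:
        return [], 0.0
    threshold = statistics.mean(diffs) + k * statistics.pstdev(diffs)
    cuts = [i for i, d in enumerate(diffs, start=1) if d > threshold]
    return cuts, threshold
```

Since the threshold adapts to each sequence, a video with heavy motion throughout raises the threshold automatically, at the cost of possibly missing weak cuts.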

5 citations

Journal ArticleDOI
TL;DR: In this paper, a computer-assisted video retrieval technique that retrieves visually similar videos stored in repositories is introduced and compared with standard state-of-the-art techniques using real-time datasets.
Abstract: Despite enormous efforts devoted by the research community, effectively and precisely performing video matching and retrieval among heterogeneous videos from large-scale video repositories remains a complex and challenging task. In order to address this challenge, a content-based video retrieval technique is required, which can exploit the visual content of the videos for effective retrieval from video repositories. In our proposed method, we introduce a computer-assisted video retrieval technique which can retrieve the visually similar videos stored in the repositories. To accomplish this task, video summarization based on motion vectors is employed to select keyframes based on similar segments. To estimate the video content, salient foreground extraction is executed, and matching based on the spatial pyramid is employed for matching the keyframe features of the query video with videos in the repositories. The contribution of the former process has two major sections for superior saliency map generation. Firstly, it heuristically integrates the regional property, contrast, and foreground descriptors together. Secondly, it introduces a new feature vector to characterize the foreground as an object descriptor, while the latter process is an extension of the orderless bag-of-features representation, which has significant performance with respect to scene categorization. The video retrieval performance is compared with standard state-of-the-art techniques using real-time datasets. Experimental and usability studies provide satisfactory results for video retrieval based on evaluation metrics such as video sampling error, fidelity, precision, and recall.

4 citations


Cites background from "Performance evaluation of shot boun..."

  • ...Chi-square distance [36] between two histogram features are shown in (7), with r number of histogram bins....

    [...]
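Equation (7) of the citing paper is not reproduced in this excerpt, so the sketch below uses one widely used form of the Chi-square histogram distance, summing (h1 - h2)^2 / (h1 + h2) over the r bins. The exact normalization in the cited equation may differ; the function name and the `eps` guard are assumptions.

```python
def chi_square_distance(h1, h2, eps=1e-12):
    """Chi-square distance between two histograms with the same number of bins.

    Bins where both histograms are (near) empty are skipped to avoid
    division by zero; that guard is a choice made for this sketch.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > eps)


same = chi_square_distance([0.5, 0.5], [0.5, 0.5])
disjoint = chi_square_distance([1.0, 0.0], [0.0, 1.0])
```

Identical histograms give a distance of zero, and the value grows as mass moves between bins, so a large distance between consecutive frame histograms suggests a shot boundary.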

References
Journal ArticleDOI
TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, and it is compared with both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.
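As a rough illustration of the structural similarity index, the sketch below evaluates the SSIM expression once over a whole pair of frames. The published index is computed over local windows and then averaged, so this single-window version is a deliberate simplification; the function name `global_ssim` and the use of population statistics are assumptions, while the constants 0.01 and 0.03 follow the commonly quoted defaults.

```python
def global_ssim(frame_a, frame_b, dynamic_range=255.0):
    """Single-window SSIM between two equally sized grayscale frames.

    Simplification: statistics are taken over the whole frame rather than
    over sliding local windows.
    """
    xs = [p for row in frame_a for p in row]
    ys = [p for row in frame_b for p in row]
    if not xs or len(xs) != len(ys):
        raise ValueError("frames must be non-empty and the same size")
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mu_x) * (x - mu_x) for x in xs) / n
    var_y = sum((y - mu_y) * (y - mu_y) for y in ys) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Applied to consecutive video frames, a sharp drop in this score is one possible cue for a shot boundary, although that use is an extrapolation beyond the abstract above.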

40,609 citations

Book
01 Jan 1995
TL;DR: This text intentionally omits theories of machine vision that do not have sufficient practical applications at the time, and basic concepts are introduced with only essential mathematical elements.
Abstract: This text is intended to provide a balanced introduction to machine vision. Basic concepts are introduced with only essential mathematical elements. The details needed to implement and use vision algorithms in practical applications are provided, and engineering aspects of techniques are emphasized. This text intentionally omits theories of machine vision that do not have sufficient practical applications at the time.

2,365 citations

Journal ArticleDOI
TL;DR: A twin-comparison approach has been developed to solve the problem of detecting transitions implemented by special effects, and a motion analysis algorithm is applied to determine whether an actual transition has occurred.
Abstract: Partitioning a video source into meaningful segments is an important step for video indexing. We present a comprehensive study of a partitioning system that detects segment boundaries. The system is based on a set of difference metrics and it measures the content changes between video frames. A twin-comparison approach has been developed to solve the problem of detecting transitions implemented by special effects. To eliminate the false interpretation of camera movements as transitions, a motion analysis algorithm is applied to determine whether an actual transition has occurred. A technique for determining the threshold for a difference metric and a multi-pass approach to improve the computation speed and accuracy have also been developed.
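The twin-comparison idea is usually described as using two thresholds: a high one that flags abrupt cuts directly, and a low one that marks the possible start of a gradual transition whose accumulated difference is then checked against the high threshold. The sketch below is a simplified reading of that idea, not the system from the abstract: it accumulates consecutive-frame differences rather than comparing against the start frame, and the names `t_high` and `t_low` are assumptions.

```python
def twin_comparison(diffs, t_high, t_low):
    """Classify transitions from consecutive-frame differences.

    diffs[i] is the difference between frame i and frame i + 1.
    Returns (cuts, graduals): cut indices, and (start, end) index pairs
    for runs whose accumulated difference reaches t_high.
    """
    cuts, graduals = [], []
    start, accumulated = None, 0.0
    for i, d in enumerate(diffs):
        if d >= t_high:
            cuts.append(i)  # a single large jump is an abrupt cut
            start, accumulated = None, 0.0
        elif d >= t_low:
            if start is None:
                start = i  # candidate start of a gradual transition
            accumulated += d
        else:
            if start is not None and accumulated >= t_high:
                graduals.append((start, i - 1))
            start, accumulated = None, 0.0
    if start is not None and accumulated >= t_high:
        graduals.append((start, len(diffs) - 1))
    return cuts, graduals


cuts, graduals = twin_comparison([1, 1, 50, 1, 8, 9, 10, 9, 1, 6, 1], 30, 5)
```

A short run of moderate differences that never accumulates to the high threshold is discarded, which is how this scheme avoids reporting brief motion as a transition.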

1,360 citations

Journal ArticleDOI
TL;DR: The dual–tree CWT is proposed as a solution to the complex wavelet transform problem, yielding a transform with attractive properties for a range of signal and image processing applications, including motion estimation, denoising, texture analysis and synthesis, and object segmentation.
Abstract: We first review how wavelets may be used for multi–resolution image processing, describing the filter–bank implementation of the discrete wavelet transform (DWT) and how it may be extended via separable filtering for processing images and other multi–dimensional signals. We then show that the condition for inversion of the DWT (perfect reconstruction) forces many commonly used wavelets to be similar in shape, and that this shape produces severe shift dependence (variation of DWT coefficient energy at any given scale with shift of the input signal). It is also shown that separable filtering with the DWT prevents the transform from providing directionally selective filters for diagonal image features. Complex wavelets can provide both shift invariance and good directional selectivity, with only modest increases in signal redundancy and computation load. However, development of a complex wavelet transform (CWT) with perfect reconstruction and good filter characteristics has proved difficult until recently. We now propose the dual–tree CWT as a solution to this problem, yielding a transform with attractive properties for a range of signal and image processing applications, including motion estimation, denoising, texture analysis and synthesis, and object segmentation.

859 citations

Journal ArticleDOI
TL;DR: This paper presents a comparison of several shot boundary detection and classification techniques and their variations including histograms, discrete cosine transform, motion vector, and block matching methods.
Abstract: Many algorithms have been proposed for detecting video shot boundaries and classifying shot and shot transition types. Few published studies compare available algorithms, and those that do have looked at a limited range of test material. This paper presents a comparison of several shot boundary detection and classification techniques and their variations including histograms, discrete cosine transform, motion vector, and block matching methods. The performance and ease of selecting good thresholds for these algorithms are evaluated based on a wide variety of video sequences with a good mix of transition types. Threshold selection requires a trade-off between recall and precision that must be guided by the target application. © 1996 SPIE and IS&T.

634 citations


"Performance evaluation of shot boun..." refers background in this paper

  • ...Boreczky and Rowe [2] have presented a comparison of several shot boundary detection and classification techniques and their variations including histograms, edge tracking, discrete cosine transform, motion vector, and block matching methods....

    [...]