Journal ArticleDOI

Automatic partitioning of full-motion video

03 Jan 1993-Multimedia Systems (Springer-Verlag)-Vol. 1, Iss: 1, pp 10-28
TL;DR: A twin-comparison approach has been developed to solve the problem of detecting transitions implemented by special effects, and a motion analysis algorithm is applied to determine whether an actual transition has occurred.
Abstract: Partitioning a video source into meaningful segments is an important step for video indexing. We present a comprehensive study of a partitioning system that detects segment boundaries. The system is based on a set of difference metrics and it measures the content changes between video frames. A twin-comparison approach has been developed to solve the problem of detecting transitions implemented by special effects. To eliminate the false interpretation of camera movements as transitions, a motion analysis algorithm is applied to determine whether an actual transition has occurred. A technique for determining the threshold for a difference metric and a multi-pass approach to improve the computation speed and accuracy have also been developed.
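The twin-comparison idea can be illustrated with a short sketch (Python here, not part of the original article): a high threshold on the frame-to-frame difference flags camera breaks, while a lower threshold marks a candidate start of a gradual transition, whose accumulated difference against the candidate frame is then tested against the high threshold. The grey-level histogram metric and the threshold values below are placeholders, not the paper's actual parameters.

```python
import numpy as np

def hist_diff(a, b, bins=64):
    """Normalized sum of absolute differences between grey-level histograms of two frames."""
    ha, _ = np.histogram(a, bins=bins, range=(0, 255))
    hb, _ = np.histogram(b, bins=bins, range=(0, 255))
    return np.abs(ha - hb).sum() / a.size

def twin_comparison(frames, t_break=0.4, t_start=0.1):
    """Detect cuts and gradual transitions with two thresholds.

    t_break: difference large enough to declare a camera break (cut).
    t_start: smaller difference that may mark the start of a gradual transition.
    Returns a list of (kind, start_frame, end_frame) tuples.
    """
    boundaries = []
    candidate = None  # index where a potential gradual transition began
    for i in range(1, len(frames)):
        d = hist_diff(frames[i - 1], frames[i])
        if d >= t_break:
            boundaries.append(("cut", i - 1, i))
            candidate = None
        elif d >= t_start:
            if candidate is None:
                candidate = i - 1  # remember where the gradual change started
            # accumulated difference between the candidate start and the current frame
            if hist_diff(frames[candidate], frames[i]) >= t_break:
                boundaries.append(("gradual", candidate, i))
                candidate = None
        else:
            candidate = None  # consecutive differences dropped back to normal
    return boundaries
```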
Citations
Journal ArticleDOI
TL;DR: The audio analysis, search, and classification engine described here reduces sounds to perceptual and acoustical features, which lets users search or retrieve sounds by any one feature or a combination of them, by specifying previously learned classes based on these features.
Abstract: Many audio and multimedia applications would benefit from the ability to classify and search for audio based on its characteristics. The audio analysis, search, and classification engine described here reduces sounds to perceptual and acoustical features. This lets users search or retrieve sounds by any one feature or a combination of them, by specifying previously learned classes based on these features, or by selecting or entering reference sounds and asking the engine to retrieve similar or dissimilar sounds.
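A minimal sketch of feature-based audio retrieval in this spirit, assuming a toy feature set (RMS loudness, zero-crossing rate, spectral centroid) and Euclidean nearest-neighbour matching; the engine's actual perceptual features and distance measure are not reproduced here.

```python
import numpy as np

def acoustic_features(signal, sr=16000):
    """Reduce a mono signal to a small illustrative feature vector:
    loudness (RMS), zero-crossing rate, and spectral centroid."""
    rms = np.sqrt(np.mean(signal ** 2))
    zcr = np.mean(np.abs(np.diff(np.sign(signal)))) / 2
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    centroid = (freqs * spectrum).sum() / (spectrum.sum() + 1e-12)
    return np.array([rms, zcr, centroid])

def retrieve_similar(query, database, k=5):
    """Return indices of the k database sounds whose features are closest to the query."""
    q = acoustic_features(query)
    dists = [np.linalg.norm(q - acoustic_features(s)) for s in database]
    return np.argsort(dists)[:k]
```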

1,147 citations

Proceedings ArticleDOI
01 Feb 1997
TL;DR: It is shown that CCVs, a histogram-based representation that incorporates spatial information, can give superior results to color histograms for image retrieval.
Abstract: Color histograms are used to compare images in many applications. Their advantages are efficiency, and insensitivity to small changes in camera viewpoint. However, color histograms lack spatial information, so images with very different appearances can have similar histograms. For example, a picture of fall foliage might contain a large number of scattered red pixels; this could have a similar color histogram to a picture with a single large red object. We describe a histogram-based method for comparing images that incorporates spatial information. We classify each pixel in a given color bucket as either coherent or incoherent, based on whether or not it is part of a large similarly-colored region. A color coherence vector (CCV) stores the number of coherent versus incoherent pixels with each color. By separating coherent pixels from incoherent pixels, CCV’s provide finer distinctions than color histograms. CCV’s can be computed at over 5 images per second on a standard workstation. A database with 15,000 images can be queried for the images with the most similar CCV’s in under 2 seconds. We show that CCV’s can give superior results to color histograms for image retrieval.
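A rough sketch of computing a CCV, assuming uniform RGB quantization, scipy's connected-component labeling, and a region-size threshold of 1% of the image area; the paper's exact quantization, blurring step, and threshold may differ.

```python
import numpy as np
from scipy.ndimage import label

def color_coherence_vector(image, n_buckets=64, tau=0.01):
    """Compute a color coherence vector for an RGB uint8 image.

    Each pixel is assigned to a quantized color bucket; within each bucket,
    pixels belonging to a connected region larger than tau * image area are
    'coherent', the rest 'incoherent'.  Returns an (n_buckets, 2) array of
    (coherent, incoherent) counts."""
    h, w, _ = image.shape
    # crude uniform quantization of RGB (assumes n_buckets is a perfect cube)
    per_channel = round(n_buckets ** (1 / 3))
    q = (image // (256 // per_channel)).astype(int)
    buckets = q[..., 0] * per_channel ** 2 + q[..., 1] * per_channel + q[..., 2]
    min_region = tau * h * w
    ccv = np.zeros((per_channel ** 3, 2), dtype=int)
    for b in np.unique(buckets):
        regions, n = label(buckets == b)
        for r in range(1, n + 1):
            size = (regions == r).sum()
            if size >= min_region:
                ccv[b, 0] += size   # coherent pixels
            else:
                ccv[b, 1] += size   # incoherent pixels
    return ccv
```

Two images can then be compared by the L1 distance between their CCVs, penalizing mismatches in coherent and incoherent counts separately.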

931 citations

Journal ArticleDOI
TL;DR: Experimental results show that the proposed rapid scene analysis algorithms are fast and effective in detecting abrupt scene changes, gradual transitions including fade-ins and fade-outs, flashlight scenes and in deriving intrashot variations.
Abstract: Several rapid scene analysis algorithms for detecting scene changes and flashlight scenes directly on compressed video are proposed. These algorithms operate on the DC sequence which can be readily extracted from video compressed using Motion JPEG or MPEG without full-frame decompression. The DC images occupy only a small fraction of the original data size while retaining most of the essential "global" information. Operating on these images offers a significant computation saving. Experimental results show that the proposed algorithms are fast and effective in detecting abrupt scene changes, gradual transitions including fade-ins and fade-outs, flashlight scenes and in deriving intrashot variations.
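A simplified sketch of the idea, assuming the DC images are approximated by 8x8 block means (in compressed video they come directly from the DC coefficients without full decompression) and that abrupt changes are flagged by a global threshold on the mean absolute difference of successive DC images; the paper's detectors for gradual transitions and flashlight scenes are more involved.

```python
import numpy as np

def dc_image(frame, block=8):
    """Approximate the DC image: the mean of each 8x8 block, which is what
    the DC coefficient of a DCT-coded block encodes."""
    h, w = frame.shape
    h, w = h - h % block, w - w % block
    return frame[:h, :w].reshape(h // block, block, w // block, block).mean(axis=(1, 3))

def abrupt_changes(frames, threshold=30.0):
    """Flag frame indices where the mean absolute difference of DC images jumps."""
    dcs = [dc_image(f.astype(float)) for f in frames]
    return [i for i in range(1, len(dcs))
            if np.abs(dcs[i] - dcs[i - 1]).mean() > threshold]
```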

893 citations

Journal ArticleDOI
TL;DR: This paper presents a comparison of several shot boundary detection and classification techniques and their variations including histograms, discrete cosine transform, motion vector, and block matching methods.
Abstract: Many algorithms have been proposed for detecting video shot boundaries and classifying shot and shot transition types. Few published studies compare available algorithms, and those that do have looked at a limited range of test material. This paper presents a comparison of several shot boundary detection and classification techniques and their variations, including histograms, discrete cosine transform, motion vector, and block matching methods. The performance and ease of selecting good thresholds for these algorithms are evaluated based on a wide variety of video sequences with a good mix of transition types. Threshold selection requires a trade-off between recall and precision that must be guided by the target application. © 1996 SPIE and IS&T.
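The recall/precision trade-off mentioned above can be made concrete with a small sketch using hypothetical detector scores and ground-truth boundaries; lowering the threshold catches more true boundaries at the cost of false alarms, while raising it does the opposite.

```python
def precision_recall(scores, truth, threshold):
    """Evaluate a shot-boundary detector that declares a boundary wherever its
    per-frame difference score exceeds `threshold`.

    scores: per-frame difference values; truth: set of frame indices that are
    real boundaries (hypothetical ground truth, for illustration only)."""
    detected = {i for i, s in enumerate(scores) if s > threshold}
    tp = len(detected & truth)
    precision = tp / len(detected) if detected else 1.0
    recall = tp / len(truth) if truth else 1.0
    return precision, recall

# Sweeping the threshold makes the trade-off explicit: a low threshold favours
# recall (few missed boundaries, more false alarms), a high one favours precision.
scores = [0.05, 0.62, 0.08, 0.31, 0.07, 0.88, 0.12]
truth = {1, 3}
for t in (0.1, 0.3, 0.6):
    print(t, precision_recall(scores, truth, t))
```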

634 citations

Proceedings ArticleDOI
01 Dec 2002
TL;DR: A generic framework of video summarization based on the modeling of viewer's attention is presented, which takes advantage of computational attention models and eliminates the needs of complex heuristic rules inVideo summarization.
Abstract: Automatic generation of video summarization is one of the key techniques in video management and browsing. In this paper, we present a generic framework of video summarization based on the modeling of viewer's attention. Without fully semantic understanding of video content, this framework takes advantage of computational attention models and eliminates the need for complex heuristic rules in video summarization. A set of methods for extracting audio-visual attention model features is proposed and presented. The experimental evaluations indicate that the computational attention based approach is an effective alternative to video semantic analysis for video summarization.
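A toy sketch of the attention-based selection step, assuming per-frame visual (e.g. motion magnitude) and aural (e.g. energy) cues fused by a linear combination and fixed-length summary segments; the weights, features, and fusion scheme are placeholders rather than the paper's actual attention models.

```python
import numpy as np

def attention_curve(motion, audio_energy, w_visual=0.6, w_audio=0.4):
    """Fuse per-frame visual and aural attention cues into one curve;
    the linear weights are illustrative placeholders."""
    def norm(x):
        x = np.asarray(x, float)
        return (x - x.min()) / (x.max() - x.min() + 1e-12)
    return w_visual * norm(motion) + w_audio * norm(audio_energy)

def summarize(curve, n_segments=3, seg_len=30):
    """Pick the n non-overlapping fixed-length segments with highest mean attention."""
    chosen = []
    scores = [(curve[s:s + seg_len].mean(), s)
              for s in range(0, len(curve) - seg_len + 1)]
    for score, start in sorted(scores, reverse=True):
        if all(abs(start - c) >= seg_len for c in chosen):
            chosen.append(start)
        if len(chosen) == n_segments:
            break
    return sorted(chosen)  # start frames of the selected summary segments
```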

602 citations

References
Journal ArticleDOI
TL;DR: In this paper, a method for finding the optical flow pattern is presented which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image, and an iterative implementation is shown which successfully computes the optical flow for a number of synthetic image sequences.

10,727 citations

Proceedings ArticleDOI
12 Nov 1981
TL;DR: In this article, a method for finding the optical flow pattern is presented which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image, and an iterative implementation is shown which successfully computes the optical flow for a number of synthetic image sequences.
Abstract: Optical flow cannot be computed locally, since only one independent measurement is available from the image sequence at a point, while the flow velocity has two components. A second constraint is needed. A method for finding the optical flow pattern is presented which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image. An iterative implementation is shown which successfully computes the optical flow for a number of synthetic image sequences. The algorithm is robust in that it can handle image sequences that are quantized rather coarsely in space and time. It is also insensitive to quantization of brightness levels and additive noise. Examples are included where the assumption of smoothness is violated at singular points or along lines in the image.
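A minimal Horn-Schunck sketch, using crude finite-difference derivatives and the standard neighbourhood-average update that follows from the Euler-Lagrange equations; the kernel choices, smoothness weight alpha, and iteration count are illustrative.

```python
import numpy as np
from scipy.ndimage import convolve

def horn_schunck(im1, im2, alpha=1.0, n_iter=100):
    """Minimal Horn-Schunck optical flow between two grey-level frames."""
    im1 = im1.astype(float); im2 = im2.astype(float)
    # crude spatial and temporal derivatives averaged over both frames
    kx = np.array([[-1, 1], [-1, 1]]) * 0.25
    ky = np.array([[-1, -1], [1, 1]]) * 0.25
    Ix = convolve(im1, kx) + convolve(im2, kx)
    Iy = convolve(im1, ky) + convolve(im2, ky)
    It = convolve(im2 - im1, np.ones((2, 2)) * 0.25)
    # neighbourhood-averaging kernel used to enforce smoothness
    avg = np.array([[1/12, 1/6, 1/12], [1/6, 0, 1/6], [1/12, 1/6, 1/12]])
    u = np.zeros_like(im1); v = np.zeros_like(im1)
    for _ in range(n_iter):
        u_bar = convolve(u, avg); v_bar = convolve(v, avg)
        # closed-form update from the Euler-Lagrange equations
        d = (Ix * u_bar + Iy * v_bar + It) / (alpha ** 2 + Ix ** 2 + Iy ** 2)
        u = u_bar - Ix * d
        v = v_bar - Iy * d
    return u, v  # horizontal and vertical flow components per pixel
```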

8,078 citations

Journal ArticleDOI
TL;DR: Design of the MPEG algorithm presents a difficult challenge: quality requirements demand high compression that cannot be achieved with intraframe coding alone, while the algorithm's random-access requirement is best satisfied with pure intraframe coding.
Abstract: The Moving Picture Experts Group (MPEG) standard addresses compression of video signals at approximately 1.5M-bits. MPEG is a generic standard and is independent of any particular applications. Applications of compressed video on digital storage media include asymmetric applications such as electronic publishing, games and entertainment. Symmetric applications of digital video include video mail, video conferencing, videotelephone and production of electronic publishing. Design of the MPEG algorithm presents a difficult challenge since quality requirements demand high compression that cannot be achieved with only intraframe coding. The algorithm’s random access requirement, however, is best satisfied with pure intraframe coding. MPEG uses predictive and interpolative coding techniques to answer this challenge. Extensive details are presented.
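A toy illustration of that compromise, generating a group-of-pictures layout in which periodic I frames provide random-access points while P (predicted) and B (bidirectionally interpolated) frames carry most of the compression gain; the interval values are placeholders, not MPEG-mandated ones.

```python
def gop_pattern(n_frames, i_interval=12, b_between=2):
    """Illustrative MPEG group-of-pictures layout: I frames are coded without
    reference to other frames (random access), P frames are predicted from the
    previous I/P frame, and B frames are interpolated from the surrounding pair."""
    types = []
    for i in range(n_frames):
        if i % i_interval == 0:
            types.append("I")
        elif (i % i_interval) % (b_between + 1) == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)

print(gop_pattern(24))  # IBBPBBPBBPBBIBBPBBPBBPBB
```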

2,447 citations

Book
01 Jan 1979
TL;DR: Since 1979, David Bordwell and Kristin Thompson's Film Art has been the best-selling and most widely respected introduction to the analysis of cinema, taking a skills-centered approach supported by examples from many periods and countries.
Abstract: Film is an art form with a language and an aesthetic all its own. Since 1979, David Bordwell and Kristin Thompson's Film Art has been the best-selling and most widely respected introduction to the analysis of cinema. Taking a skills-centered approach supported by examples from many periods and countries, the authors help students develop a core set of analytical skills that will enrich their understanding of any film, in any genre. In-depth examples deepen students' appreciation for how creative choices by filmmakers affect what viewers experience and how they respond.

1,561 citations

Journal ArticleDOI
TL;DR: This paper describes a hierarchical computational framework for the determination of dense displacement fields from a pair of images, and an algorithm consistent with that framework, based on a scale-based separation of the image intensity information and the process of measuring motion.
Abstract: The robust measurement of visual motion from digitized image sequences has been an important but difficult problem in computer vision. This paper describes a hierarchical computational framework for the determination of dense displacement fields from a pair of images, and an algorithm consistent with that framework. Our framework is based on the separation of the image intensity information, as well as the process of measuring motion, according to scale. The large-scale intensity information is first used to obtain rough estimates of image motion, which are then refined by using intensity information at smaller scales. The estimates are in the form of displacement (or velocity) vectors for pixels and are accompanied by a direction-dependent confidence measure. A smoothness constraint is employed to propagate the measurements with high confidence to their neighboring areas where the confidences are low. At all levels, the computations are pixel-parallel, uniform across the image, and based on information from a small neighborhood of a pixel. For our algorithm, the local displacement vectors are determined by minimizing the sum-of-squared differences (SSD) of intensities, the confidence measures are derived from the shape of the SSD surface, and the smoothness constraint is cast in the form of energy minimization. Results of applying our algorithm to pairs of real images are included. In addition to our own ...
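A small sketch of the SSD-matching core, with a crude confidence measure taken from how much the best SSD minimum beats the second best; the paper derives confidence from the shape of the SSD surface and embeds this matching in a coarse-to-fine hierarchy, which is omitted here.

```python
import numpy as np

def ssd_displacement(patch, search, step=1):
    """Exhaustive SSD matching of `patch` inside the larger `search` window.

    Returns the best (dy, dx) offset and an illustrative confidence value:
    the relative gap between the best and second-best SSD minima, so a sharp,
    unambiguous minimum yields a confidence close to 1."""
    ph, pw = patch.shape
    ssd = np.array([[((search[y:y + ph, x:x + pw] - patch) ** 2).sum()
                     for x in range(0, search.shape[1] - pw + 1, step)]
                    for y in range(0, search.shape[0] - ph + 1, step)])
    flat = np.sort(ssd, axis=None)
    confidence = (flat[1] - flat[0]) / (flat[1] + 1e-12)
    y, x = np.unravel_index(np.argmin(ssd), ssd.shape)
    return (y * step, x * step), confidence
```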

1,175 citations


"Automatic partitioning of full-moti..." refers methods in this paper

  • ...Computing such a resolution of motion vectors is very time consuming, requiring either iterative refinement of a gradient-based algorithm (Horn and Schunck 1981) or the construction of a hierarchical framework of cross-correlation (Anandan 1989)....
