Journal Article

Video Shot Characterization Using Principles of Perceptual Prominence and Perceptual Grouping in Spatio–Temporal Domain

TL;DR: A computational model for analyzing a video shot based on a novel principle of perceptual prominence that captures the key aspects of mise-en-scene required for characterizing a video scene.
Abstract: We present a novel approach for applying perceptual grouping principles to the spatio-temporal domain of video. Our perceptual grouping scheme, applied on blobs, makes use of a specified spatio-temporal coherence model. The grouping scheme identifies the blob cliques or perceptual clusters in the scene. We propose a computational model for analyzing a video shot based on a novel principle of perceptual prominence. The principle of perceptual prominence captures the key aspects of mise-en-scene required for characterizing a video scene.
Citations
Journal Article
TL;DR: A novel approach is proposed for perceptual grouping and localization of ill-defined curvilinear structures, which iteratively refines them by gradually shifting from an exploratory to an exploitative mode. Compared to prior methods on synthetic and annotated real data, it shows high precision rates.
Abstract: In this paper, a novel approach is proposed for perceptual grouping and localization of ill-defined curvilinear structures. Our approach builds upon the tensor voting and the iterative voting frameworks. Its efficacy lies in iterative refinements of curvilinear structures by gradually shifting from an exploratory to an exploitative mode. Such a mode shifting is achieved by reducing the aperture of the tensor voting fields, which is shown to improve curve grouping and inference by enhancing the concentration of the votes over promising, salient structures. The proposed technique is validated on delineating adherens junctions that are imaged through fluorescence microscopy. However, the method is also applicable for screening other organisms based on characteristics of their cell wall structures. Adherens junctions maintain tissue structural integrity and cell-cell interactions. Visually, they exhibit fibrous patterns that may be diffused, heterogeneous in fluorescence intensity, or punctate and frequently perceptual. Besides the application to real data, the proposed method is compared to prior methods on synthetic and annotated real data, showing high precision rates.

29 citations


Cites background from "Video Shot Characterization Using P..."

  • ...From when it was initially conceived by the Gestalt psychologists [2] to now, perceptual grouping has evolved from the passive observation of human behavior to its inclusion in a wide range of computer vision applications [3]–[6]....

    [...]

References
Proceedings Article
01 Jun 1988
TL;DR: Hierarchical Scene Structures (HSS) represent the perceptual structures in segmented images that arise from inter-region relationships between the image segments. The prototype system for HSS creation works with a subset of three of these relationships: proximity, similarity, and continuation.
Abstract: Hierarchical Scene Structures (HSS) can be used to represent the perceptual structures found in segmented images which arise from inter-region relationships between the image segments. These relationships include continuation of boundaries, proximity, enclosure, and similarity of shape, intensity, size, and directionality. Our prototype system for HSS creation works with a subset of three of these relationships: proximity, similarity, and continuation. Incorporating this perceptual organizing information into a structured, symbolic scene representation strongly increases the available information about the scene. This information can be used to improve pattern recognition, and extensions may address problems in 3-D modeling, multi-temporal correlations, and multisensor information fusion.

9 citations


"Video Shot Characterization Using P..." refers background in this paper

  • ...Moreover, the specific type of a scene may dictate the perceptual expectations and the emergent groupings which could influence the dominant association for a cluster [36]....

    [...]

Proceedings Article
21 Nov 1995
TL;DR: This work exploits the similarity of feature attributes between frames to track 2D structures and suggests that if the authors overlay the corresponding organizations from different frames onto one frame then they will get a highly organized pattern which exhibits the Gestalt relationships of similarity, parallelism, and proximity.
Abstract: We investigate the role of perceptual organization in tracking 2D structures in long image sequences. Heretofore, the role of perceptual organization in computer vision has mainly been in static 2D image analysis. The role of perceptual organization in 2D motion sequence analysis has been minimal. We exploit the similarity of feature attributes between frames to track 2D structures. The incremental variation of attributes of a structure with time can be assumed to be small. This is a manifestation of the small motion assumption between frames. Even the change in location of an organization from one frame to the other can be taken to be small. This suggests that if we overlay the corresponding organizations from different frames onto one frame then we will get a highly organized pattern which exhibits the Gestalt relationships of similarity, parallelism, and proximity.

8 citations


"Video Shot Characterization Using P..." refers background in this paper

  • ...Sarkar [7] has applied temporal coherence to track geometrical structures like rectangles, quadrilaterals, ellipses, ribbons, etc., optimally over all the frames by exploiting the spatial and temporal coherence of the feature over time....

    [...]

  • ...Unlike [7] where the objective is to track the 2-D primitives, we identify meaningful spatio–temporal component regions as groups of spatio–temporal primitives....

    [...]

Journal Article
TL;DR: This work proposes a novel clustering strategy, tailored towards the specific requirements of clustering in video data, that takes care of many of the problems with traditional clustering schemes applied to the heterogeneous feature space of video.

7 citations


"Video Shot Characterization Using P..." refers methods in this paper

  • ...However, changes in the blob boundaries resulting from slight variations in the DSCT segmentation from one stack to another may also lead to inconsistent small shifts in the blob centroid....

    [...]

  • ...The DSCT algorithm correctly models all the homogeneous color regions as space–time blob tracks....

    [...]

  • ...We use a hierarchical clustering methodology referred to as the decoupled semantics clustering tree (DSCT) [2] which is tailored specifically to the requirements of clustering in video data....

    [...]

  • ...The failure cases, which had the foreground cluster merged with the background, or with another adjacent foreground cluster, mostly resulted because of coarseness in the DSCT segmentation, or articulated parts, or the particular settings of scenes (so that distinct characteristics of foreground did not emerge)....

    [...]

  • ...The DSCT scheme identifies homogeneous color regions in space–time and models them as tracks of 2-...

    [...]

Book Chapter
01 Nov 1998
TL;DR: This work describes the use of a representation, called a body plan, to segment and to recognize people and animals in complex environments, and relates it to previous work on finding clothing by marking folds and then assembling groups of folds.
Abstract: We describe the use of a representation, called a body plan, to segment and to recognize people and animals in complex environments. The representation is an organized collection of grouping hints obtained from a combination of constraints on color and texture and constraints on geometric properties such as the structure of individual parts and the relationships between parts. The approach is illustrated with two examples of programs that successfully use body plans for recognition: one example involves determining whether a picture contains a scantily clad human, using a body plan built by hand; the other involves determining whether a picture contains a horse, using a body plan learned from image data. In both cases, the system demonstrates excellent performance on large, uncontrolled test sets and very large and diverse control sets. The mechanism of recognition by assembly is very general; we describe previous work on finding clothing by marking folds and then assembling groups of folds.

6 citations