Journal ArticleDOI

Video Shot Characterization Using Principles of Perceptual Prominence and Perceptual Grouping in Spatio–Temporal Domain

TL;DR: A computational model for analyzing a video shot based on a novel principle of perceptual prominence that captures the key aspects of mise-en-scene required for characterizing a video scene.
Abstract: We present a novel approach for applying perceptual grouping principles to the spatio-temporal domain of video. Our perceptual grouping scheme, applied on blobs, makes use of a specified spatio-temporal coherence model. The grouping scheme identifies the blob cliques or perceptual clusters in the scene. We propose a computational model for analyzing a video shot based on a novel principle of perceptual prominence. The principle of perceptual prominence captures the key aspects of mise-en-scene required for characterizing a video scene.
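The abstract above gives no implementation detail, so the following is only a minimal sketch of the kind of blob grouping it describes: blobs are linked when a stand-in spatio-temporal coherence test holds, and maximal cliques of the resulting graph are taken as candidate perceptual clusters. The Blob fields, the coherence thresholds, and the use of networkx are illustrative assumptions, not the paper's actual coherence model.

```python
# Illustrative sketch only: the paper's spatio-temporal coherence model is not
# specified here, so the threshold test below is a hypothetical stand-in.
from dataclasses import dataclass
from itertools import combinations
import networkx as nx

@dataclass
class Blob:
    x: float      # centroid column
    y: float      # centroid row
    frame: int    # frame index within the shot
    label: int    # blob identifier

def coherent(a: Blob, b: Blob, max_dist: float = 40.0, max_gap: int = 2) -> bool:
    """Stand-in coherence test: blobs must be close in space and
    appear in nearby frames."""
    spatial = ((a.x - b.x) ** 2 + (a.y - b.y) ** 2) ** 0.5 <= max_dist
    temporal = abs(a.frame - b.frame) <= max_gap
    return spatial and temporal

def perceptual_clusters(blobs: list[Blob]) -> list[list[Blob]]:
    """Build a coherence graph over blobs and return its maximal cliques,
    i.e. candidate perceptual clusters."""
    g = nx.Graph()
    g.add_nodes_from(range(len(blobs)))
    for i, j in combinations(range(len(blobs)), 2):
        if coherent(blobs[i], blobs[j]):
            g.add_edge(i, j)
    return [[blobs[i] for i in clique] for clique in nx.find_cliques(g)]
```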
Citations
Journal ArticleDOI
TL;DR: A novel approach is proposed for perceptual grouping and localization of ill-defined curvilinear structures by gradually shifting from an exploratory to an exploitative mode; compared with prior methods on synthetic and annotated real data, it shows high precision rates.
Abstract: In this paper, a novel approach is proposed for perceptual grouping and localization of ill-defined curvilinear structures. Our approach builds upon the tensor voting and the iterative voting frameworks. Its efficacy lies in iterative refinements of curvilinear structures by gradually shifting from an exploratory to an exploitative mode. Such a mode shifting is achieved by reducing the aperture of the tensor voting fields, which is shown to improve curve grouping and inference by enhancing the concentration of the votes over promising, salient structures. The proposed technique is validated on delineating adherens junctions that are imaged through fluorescence microscopy. However, the method is also applicable for screening other organisms based on characteristics of their cell wall structures. Adherens junctions maintain tissue structural integrity and cell-cell interactions. Visually, they exhibit fibrous patterns that may be diffused, heterogeneous in fluorescence intensity, or punctate and frequently perceptual. Besides the application to real data, the proposed method is compared to prior methods on synthetic and annotated real data, showing high precision rates.
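As a rough illustration of the exploratory-to-exploitative shift described above, the sketch below runs a fixed schedule of shrinking apertures; the actual tensor voting step is replaced by a simple Gaussian aggregation (a stand-in, not the paper's voting fields) so the example stays short and runnable.

```python
# Sketch of the iterative aperture-reduction loop described above. The tensor
# voting step is replaced by Gaussian aggregation purely to keep the example
# runnable; the schedule of shrinking sigmas is the point being illustrated.
import numpy as np
from scipy.ndimage import gaussian_filter

def iterative_refinement(evidence: np.ndarray,
                         sigmas=(8.0, 4.0, 2.0),
                         keep_fraction: float = 0.2) -> np.ndarray:
    """Start exploratory (large aperture), end exploitative (small aperture),
    keeping only the most salient responses between iterations."""
    saliency = evidence.astype(float)
    for sigma in sigmas:                                  # aperture shrinks each pass
        votes = gaussian_filter(saliency, sigma=sigma)
        cutoff = np.quantile(votes, 1.0 - keep_fraction)
        saliency = np.where(votes >= cutoff, votes, 0.0)  # concentrate the votes
    return saliency
```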

29 citations


Cites background from "Video Shot Characterization Using P..."

  • ...From when it was initially conceived by the Gestalt psychologists [2] to now, perceptual grouping has evolved from the passive observation of human behavior to its inclusion in a wide range of computer vision applications [3]–[6]....

    [...]

References
Book
01 Jan 1990

476 citations


"Video Shot Characterization Using P..." refers background or methods in this paper

  • ...As a result of instantiation of virtual evidence nodes, belief propagation [20] takes place in the Bayesian network, and the nodes A, B, C, D, and S compute the grouping saliency....

    [...]

  • ...The evaluated association measure is used to instantiate a virtual evidence node [20] which contributes probability values in favor and against the saliency of the grouping....

    [...]
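The two excerpts above describe how an evaluated association measure enters the Bayesian network as a virtual evidence node contributing probability for and against a grouping's saliency. A minimal sketch of that idea for a single binary saliency variable is given below; the mapping from the association measure to a likelihood pair is a hypothetical choice, and the full network propagation of [20] is not reproduced.

```python
# Minimal sketch of how a virtual evidence node can update belief in a binary
# grouping-saliency variable S. The association-to-likelihood mapping below is
# a hypothetical choice, not the paper's.
import numpy as np

def posterior_saliency(prior_salient: float, association: float) -> float:
    """association in [0, 1]: evidence for (high) or against (low) the grouping.
    The virtual evidence node contributes the likelihood pair
    P(evidence | S=salient), P(evidence | S=not salient)."""
    prior = np.array([prior_salient, 1.0 - prior_salient])
    likelihood = np.array([association, 1.0 - association])  # virtual evidence
    joint = prior * likelihood
    return float(joint[0] / joint.sum())

# A strong association measure raises belief in the grouping's saliency.
print(posterior_saliency(prior_salient=0.5, association=0.8))  # -> 0.8
```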

Journal ArticleDOI
Shih-Fu Chang, William Chen, Horace J. Meng, Hari Sundaram, Di Zhong
TL;DR: The resulting system, called VideoQ, is the first on-line video search engine supporting automatic object-based indexing and spatiotemporal queries, and performs well, with the user being able to retrieve complex video clips such as those of skiers and baseball players with ease.
Abstract: The rapidity with which digital information, particularly video, is being generated has necessitated the development of tools for efficient search of these media. Content-based visual queries have been primarily focused on still image retrieval. In this paper, we propose a novel, interactive system on the Web, based on the visual paradigm, with spatiotemporal attributes playing a key role in video retrieval. We have developed innovative algorithms for automated video object segmentation and tracking, and use real-time video editing techniques while responding to user queries. The resulting system, called VideoQ, is the first on-line video search engine supporting automatic object-based indexing and spatiotemporal queries. The system performs well, with the user being able to retrieve complex video clips such as those of skiers and baseball players with ease.
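VideoQ itself is not reproduced here; the sketch below only illustrates, under assumed inputs, one ingredient of an object-based spatiotemporal query: ranking tracked video objects by how closely their motion trajectories match a user-sketched trajectory.

```python
# Toy sketch of spatiotemporal matching: rank indexed object trajectories by
# distance to a query trajectory. Trajectories are (n, 2) arrays of (x, y);
# the resampling length and distance measure are illustrative assumptions.
import numpy as np

def trajectory_distance(query: np.ndarray, candidate: np.ndarray, n: int = 16) -> float:
    """Mean point-wise distance after resampling both trajectories to n points."""
    t = np.linspace(0.0, 1.0, n)
    q = np.stack([np.interp(t, np.linspace(0, 1, len(query)), query[:, d]) for d in range(2)], axis=1)
    c = np.stack([np.interp(t, np.linspace(0, 1, len(candidate)), candidate[:, d]) for d in range(2)], axis=1)
    return float(np.linalg.norm(q - c, axis=1).mean())

def rank_objects(query_traj: np.ndarray, indexed: dict) -> list:
    """indexed maps object id -> trajectory; return ids sorted by similarity."""
    return sorted(indexed, key=lambda k: trajectory_distance(query_traj, indexed[k]))
```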

431 citations


"Video Shot Characterization Using P..." refers background in this paper

  • ...Object-centric descriptions facilitate an object-oriented search [29], [30]....

    [...]

Journal ArticleDOI

386 citations


"Video Shot Characterization Using P..." refers background in this paper

  • ...The researchers in the “Gestalt” school of psychophysics [12]–[14] have identified certain relationships, like continuation of boundaries, proximity, adjacency [3], enclosures, and similarity of shape, size, intensity, and directionality, which play the key role in the formation of “good” structures in the spatial domain....

    [...]

Proceedings ArticleDOI
18 Jun 2003
TL;DR: A tracker that can track moving people in long sequences without manual initialization is described, and it is shown that the tracking algorithm can be interpreted as a loopy inference procedure on an underlying Bayes net.
Abstract: We describe a tracker that can track moving people in long sequences without manual initialization. Moving people are modeled with the assumption that, while configuration can vary quite substantially from frame to frame, appearance does not. This leads to an algorithm that firstly builds a model of the appearance of the body of each individual by clustering candidate body segments, and then uses this model to find all individuals in each frame. Unusually, the tracker does not rely on a model of human dynamics to identify possible instances of people; such models are unreliable, because human motion is fast and large accelerations are common. We show our tracking algorithm can be interpreted as a loopy inference procedure on an underlying Bayes net. Experiments on video of real scenes demonstrate that this tracker can (a) count distinct individuals; (b) identify and track them; (c) recover when it loses track, for example, if individuals are occluded or briefly leave the view; (d) identify the configuration of the body largely correctly; and (e) is not dependent on particular models of human motion.
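As a rough sketch of the two-phase idea summarized above (build per-person appearance models by clustering candidate body segments across the sequence, then use those models to label each frame), the code below uses k-means over assumed appearance descriptors; the body model, segment detection, and the loopy-inference interpretation are omitted.

```python
# Two-phase sketch in the spirit of the approach described: cluster segment
# appearances pooled over the sequence, then label each frame's candidate
# segments by nearest appearance model. Feature extraction is assumed given.
import numpy as np
from sklearn.cluster import KMeans

def build_appearance_models(segment_features: np.ndarray, n_people: int) -> KMeans:
    """segment_features: (n_segments, d) descriptors pooled across all frames."""
    return KMeans(n_clusters=n_people, n_init=10).fit(segment_features)

def label_frame(models: KMeans, frame_features: np.ndarray) -> np.ndarray:
    """Assign each candidate segment in one frame to the closest
    appearance model, i.e. to one individual."""
    return models.predict(frame_features)
```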

320 citations


"Video Shot Characterization Using P..." refers methods in this paper

  • ...In [23], the authors manually construct the body model as a fully deformable connected kinematic model....

    [...]

DissertationDOI
01 Jan 2000
TL;DR: A detailed computational model of basic pattern vision in humans and its modulation by top-down attention is presented, able to quantitatively account for all observations by assuming that attention strengthens the non-linear cortical interactions among visual neurons.
Abstract: When we observe our visual environment, we do not perceive all its components as being equally interesting. Some objects automatically and effortlessly “pop-out” from their surroundings, that is, they draw our visual attention, in a “bottom-up” manner, towards them. In a first approximation, focal visual attention acts as a rapidly shiftable “spotlight,” which allows only the selected information to reach higher levels of processing and representation. Most models of the bottom-up control of attention are based on the concept of a saliency map, that is, an explicit two-dimensional map that encodes the conspicuity of objects in the visual environment. Competition among neurons in this map gives rise to a single winning location that corresponds to the next attended target. Inhibiting this location automatically allows the system to attend to the next most salient location. A first body of work in this thesis describes a detailed computer implementation of such a scheme, focusing on the problem of combining information across modalities, here orientation, intensity and color information, in a purely stimulus-driven manner. The model is applied to common psychophysical stimuli as well as to very demanding visual search tasks. Its successful performance is used to address the extent to which the primate visual system carries out visual search via one or more such saliency maps and how this can be tested. We next address the question of what happens once our attention is focused onto a restricted part of our visual field. There is mounting experimental evidence that attention is far more sophisticated than a simple feed-forward spatially-selective filtering process. Indeed, visual processing appears to be significantly different inside the attentional spotlight than outside. That is, in addition to its properties as a feed-forward information processing and transmission bottleneck, focal visual attention feeds back and locally modulates, in a “top-down” manner, the visual processing and representation of selected objects. The second body of work presented in this thesis is concerned with a detailed computational model of basic pattern vision in humans and its modulation by top-down attention. We start by acquiring a complete dataset of five different simple psychophysical experiments, including discriminations of contrast, orientation and spatial frequency of simple pattern stimuli by human observers. This experimental dataset places strict constraints on our model of early pattern vision. The model, however, is eventually able to reproduce the entire dataset while assuming plausible neurobiological components. The model is further applied to existing psychophysical data which demonstrates how top-down attention alters performance in these simple psychophysical discrimination experiments. Our model is able to quantitatively account for all observations by assuming that attention strengthens the non-linear cortical interactions among visual neurons.
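The saliency-map mechanism summarized above (competition for a single winning location, followed by inhibition of return) can be sketched as follows; the feature extraction and normalization of the actual model are omitted, and the inhibition kernel here is an assumed Gaussian.

```python
# Toy sketch of the winner-take-all / inhibition-of-return loop over a given
# saliency map; feature-map construction is omitted and the Gaussian
# inhibition kernel is an illustrative assumption.
import numpy as np
from scipy.ndimage import gaussian_filter

def attend(saliency: np.ndarray, n_fixations: int = 3, ior_sigma: float = 5.0):
    """Return successive attended locations: pick the global maximum
    (winner-take-all), then suppress its neighborhood (inhibition of return)."""
    s = saliency.astype(float).copy()
    fixations = []
    for _ in range(n_fixations):
        winner = np.unravel_index(np.argmax(s), s.shape)   # winner-take-all
        fixations.append(winner)
        mask = np.zeros_like(s)
        mask[winner] = 1.0
        bump = gaussian_filter(mask, sigma=ior_sigma)      # inhibition kernel
        s = np.clip(s - s.max() * bump / bump.max(), 0.0, None)
    return fixations
```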

308 citations


"Video Shot Characterization Using P..." refers background in this paper

  • ...However, we should clearly distinguish between perceptual prominence and the visual saliency term which is associated with the modeling of visual attention in primates [32]....

    [...]