Journal ArticleDOI

Video Shot Characterization Using Principles of Perceptual Prominence and Perceptual Grouping in Spatio–Temporal Domain

TL;DR: A computational model for analyzing a video shot based on a novel principle of perceptual prominence that captures the key aspects of mise-en-scene required for characterizing a video scene.
Abstract: We present a novel approach for applying perceptual grouping principles to the spatio-temporal domain of video. Our perceptual grouping scheme, applied on blobs, makes use of a specified spatio-temporal coherence model. The grouping scheme identifies the blob cliques or perceptual clusters in the scene. We propose a computational model for analyzing a video shot based on a novel principle of perceptual prominence. The principle of perceptual prominence captures the key aspects of mise-en-scene required for characterizing a video scene.
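The abstract above gives no implementation detail, so the following is only a minimal sketch of the kind of blob grouping it describes: blobs are linked when a stand-in spatio-temporal coherence test holds, and maximal cliques of the resulting graph are taken as candidate perceptual clusters. The Blob fields, the coherence thresholds, and the use of networkx are illustrative assumptions, not the paper's actual coherence model.

```python
# Illustrative sketch only: the paper's spatio-temporal coherence model is not
# specified here, so the threshold test below is a hypothetical stand-in.
from dataclasses import dataclass
from itertools import combinations
import networkx as nx

@dataclass
class Blob:
    x: float      # centroid column
    y: float      # centroid row
    frame: int    # frame index within the shot
    label: int    # blob identifier

def coherent(a: Blob, b: Blob, max_dist: float = 40.0, max_gap: int = 2) -> bool:
    """Stand-in coherence test: blobs must be close in space and
    appear in nearby frames."""
    spatial = ((a.x - b.x) ** 2 + (a.y - b.y) ** 2) ** 0.5 <= max_dist
    temporal = abs(a.frame - b.frame) <= max_gap
    return spatial and temporal

def perceptual_clusters(blobs: list[Blob]) -> list[list[Blob]]:
    """Build a coherence graph over blobs and return its maximal cliques,
    i.e. candidate perceptual clusters."""
    g = nx.Graph()
    g.add_nodes_from(range(len(blobs)))
    for i, j in combinations(range(len(blobs)), 2):
        if coherent(blobs[i], blobs[j]):
            g.add_edge(i, j)
    return [[blobs[i] for i in clique] for clique in nx.find_cliques(g)]
```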
Citations
Journal ArticleDOI
TL;DR: A novel approach is proposed for perceptual grouping and localization of ill-defined curvilinear structures by gradually shifting from an exploratory to an exploitative mode; compared with prior methods on synthetic and annotated real data, it shows high precision rates.
Abstract: In this paper, a novel approach is proposed for perceptual grouping and localization of ill-defined curvilinear structures. Our approach builds upon the tensor voting and the iterative voting frameworks. Its efficacy lies in iterative refinements of curvilinear structures by gradually shifting from an exploratory to an exploitative mode. Such a mode shifting is achieved by reducing the aperture of the tensor voting fields, which is shown to improve curve grouping and inference by enhancing the concentration of the votes over promising, salient structures. The proposed technique is validated on delineating adherens junctions that are imaged through fluorescence microscopy. However, the method is also applicable for screening other organisms based on characteristics of their cell wall structures. Adherens junctions maintain tissue structural integrity and cell-cell interactions. Visually, they exhibit fibrous patterns that may be diffused, heterogeneous in fluorescence intensity, or punctate and frequently perceptual. Besides the application to real data, the proposed method is compared to prior methods on synthetic and annotated real data, showing high precision rates.
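As a rough illustration of the exploratory-to-exploitative shift described above, the sketch below runs a fixed schedule of shrinking apertures; the actual tensor voting step is replaced by a simple Gaussian aggregation (a stand-in, not the paper's voting fields) so the example stays short and runnable.

```python
# Sketch of the iterative aperture-reduction loop described above. The tensor
# voting step is replaced by Gaussian aggregation purely to keep the example
# runnable; the schedule of shrinking sigmas is the point being illustrated.
import numpy as np
from scipy.ndimage import gaussian_filter

def iterative_refinement(evidence: np.ndarray,
                         sigmas=(8.0, 4.0, 2.0),
                         keep_fraction: float = 0.2) -> np.ndarray:
    """Start exploratory (large aperture), end exploitative (small aperture),
    keeping only the most salient responses between iterations."""
    saliency = evidence.astype(float)
    for sigma in sigmas:                                  # aperture shrinks each pass
        votes = gaussian_filter(saliency, sigma=sigma)
        cutoff = np.quantile(votes, 1.0 - keep_fraction)
        saliency = np.where(votes >= cutoff, votes, 0.0)  # concentrate the votes
    return saliency
```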

29 citations


Cites background from "Video Shot Characterization Using P..."

  • ...From when it was initially conceived by the Gestalt psychologists [2] to now, perceptual grouping has evolved from the passive observation of human behavior to its inclusion in a wide range of computer vision applications [3]–[6]....

    [...]

References
Book
01 Jan 1990

476 citations


"Video Shot Characterization Using P..." refers background or methods in this paper

  • ...As a result of instantiation of virtual evidence nodes, belief propagation [20] takes place in the Bayesian network, and the nodes A, B, C, D, and S compute the grouping saliency....

    [...]

  • ...The evaluated association measure is used to instantiate a virtual evidence node [20] which contributes probability values in favor and against the saliency of the grouping....

    [...]
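The two excerpts above describe how an evaluated association measure enters the Bayesian network as a virtual evidence node contributing probability for and against a grouping's saliency. A minimal sketch of that idea for a single binary saliency variable is given below; the mapping from the association measure to a likelihood pair is a hypothetical choice, and the full network propagation of [20] is not reproduced.

```python
# Minimal sketch of how a virtual evidence node can update belief in a binary
# grouping-saliency variable S. The association-to-likelihood mapping below is
# a hypothetical choice, not the paper's.
import numpy as np

def posterior_saliency(prior_salient: float, association: float) -> float:
    """association in [0, 1]: evidence for (high) or against (low) the grouping.
    The virtual evidence node contributes the likelihood pair
    P(evidence | S=salient), P(evidence | S=not salient)."""
    prior = np.array([prior_salient, 1.0 - prior_salient])
    likelihood = np.array([association, 1.0 - association])  # virtual evidence
    joint = prior * likelihood
    return float(joint[0] / joint.sum())

# A strong association measure raises belief in the grouping's saliency.
print(posterior_saliency(prior_salient=0.5, association=0.8))  # -> 0.8
```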

Journal ArticleDOI
Shih-Fu Chang, William Chen, Horace J. Meng, Hari Sundaram, Di Zhong
TL;DR: The resulting system, called VideoQ, is the first on-line video search engine supporting automatic object-based indexing and spatiotemporal queries, and performs well, with the user being able to retrieve complex video clips such as those of skiers and baseball players with ease.
Abstract: The rapidity with which digital information, particularly video, is being generated has necessitated the development of tools for efficient search of these media. Content-based visual queries have been primarily focused on still image retrieval. In this paper, we propose a novel, interactive system on the Web, based on the visual paradigm, with spatiotemporal attributes playing a key role in video retrieval. We have developed innovative algorithms for automated video object segmentation and tracking, and use real-time video editing techniques while responding to user queries. The resulting system, called VideoQ, is the first on-line video search engine supporting automatic object-based indexing and spatiotemporal queries. The system performs well, with the user being able to retrieve complex video clips such as those of skiers and baseball players with ease.
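VideoQ itself is not reproduced here; the sketch below only illustrates, under assumed inputs, one ingredient of an object-based spatiotemporal query: ranking tracked video objects by how closely their motion trajectories match a user-sketched trajectory.

```python
# Toy sketch of spatiotemporal matching: rank indexed object trajectories by
# distance to a query trajectory. Trajectories are (n, 2) arrays of (x, y);
# the resampling length and distance measure are illustrative assumptions.
import numpy as np

def trajectory_distance(query: np.ndarray, candidate: np.ndarray, n: int = 16) -> float:
    """Mean point-wise distance after resampling both trajectories to n points."""
    t = np.linspace(0.0, 1.0, n)
    q = np.stack([np.interp(t, np.linspace(0, 1, len(query)), query[:, d]) for d in range(2)], axis=1)
    c = np.stack([np.interp(t, np.linspace(0, 1, len(candidate)), candidate[:, d]) for d in range(2)], axis=1)
    return float(np.linalg.norm(q - c, axis=1).mean())

def rank_objects(query_traj: np.ndarray, indexed: dict) -> list:
    """indexed maps object id -> trajectory; return ids sorted by similarity."""
    return sorted(indexed, key=lambda k: trajectory_distance(query_traj, indexed[k]))
```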

431 citations


"Video Shot Characterization Using P..." refers background in this paper

  • ...Object-centric descriptions facilitate an object-oriented search [29], [30]....

    [...]

Journal ArticleDOI

386 citations


"Video Shot Characterization Using P..." refers background in this paper

  • ...The researchers in the “Gestalt” school of psychophysics [12]–[14] have identified certain relationships, like continuation of boundaries, proximity, adjacency [3], enclosures, and similarity of shape, size, intensity, and directionality, which play the key role in the formation of “good” structures in the spatial domain....

    [...]

Proceedings ArticleDOI
18 Jun 2003
TL;DR: A tracker that can track moving people in long sequences without manual initialization is described, and it is shown that the tracking algorithm can be interpreted as a loopy inference procedure on an underlying Bayes net.
Abstract: We describe a tracker that can track moving people in long sequences without manual initialization. Moving people are modeled with the assumption that, while configuration can vary quite substantially from frame to frame, appearance does not. This leads to an algorithm that firstly builds a model of the appearance of the body of each individual by clustering candidate body segments, and then uses this model to find all individuals in each frame. Unusually, the tracker does not rely on a model of human dynamics to identify possible instances of people; such models are unreliable, because human motion is fast and large accelerations are common. We show our tracking algorithm can be interpreted as a loopy inference procedure on an underlying Bayes net. Experiments on video of real scenes demonstrate that this tracker can (a) count distinct individuals; (b) identify and track them; (c) recover when it loses track, for example, if individuals are occluded or briefly leave the view; (d) identify the configuration of the body largely correctly; and (e) is not dependent on particular models of human motion.
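As a rough sketch of the two-phase idea summarized above (build per-person appearance models by clustering candidate body segments across the sequence, then use those models to label each frame), the code below uses k-means over assumed appearance descriptors; the body model, segment detection, and the loopy-inference interpretation are omitted.

```python
# Two-phase sketch in the spirit of the approach described: cluster segment
# appearances pooled over the sequence, then label each frame's candidate
# segments by nearest appearance model. Feature extraction is assumed given.
import numpy as np
from sklearn.cluster import KMeans

def build_appearance_models(segment_features: np.ndarray, n_people: int) -> KMeans:
    """segment_features: (n_segments, d) descriptors pooled across all frames."""
    return KMeans(n_clusters=n_people, n_init=10).fit(segment_features)

def label_frame(models: KMeans, frame_features: np.ndarray) -> np.ndarray:
    """Assign each candidate segment in one frame to the closest
    appearance model, i.e. to one individual."""
    return models.predict(frame_features)
```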

320 citations


"Video Shot Characterization Using P..." refers methods in this paper

  • ...In [23], the authors manually construct the body model as a fully deformable connected kinematic model....

    [...]

DissertationDOI
01 Jan 2000
TL;DR: A detailed computational model of basic pattern vision in humans and its modulation by top-down attention is presented, able to quantitatively account for all observations by assuming that attention strengthens the non-linear cortical interactions among visual neurons.
Abstract: When we observe our visual environment, we do not perceive all its components as being equally interesting. Some objects automatically and effortlessly “pop-out” from their surroundings, that is, they draw our visual attention, in a “bottom-up” manner, towards them. In a first approximation, focal visual attention acts as a rapidly shiftable “spotlight,” which allows only the selected information to reach higher levels of processing and representation. Most models of the bottom-up control of attention are based on the concept of a saliency map, that is, an explicit two-dimensional map that encodes the conspicuity of objects in the visual environment. Competition among neurons in this map gives rise to a single winning location that corresponds to the next attended target. Inhibiting this location automatically allows the system to attend to the next most salient location. A first body of work in this thesis describes a detailed computer implementation of such a scheme, focusing on the problem of combining information across modalities, here orientation, intensity and color information, in a purely stimulus-driven manner. The model is applied to common psychophysical stimuli as well as to very demanding visual search tasks. Its successful performance is used to address the extent to which the primate visual system carries out visual search via one or more such saliency maps and how this can be tested. We next address the question of what happens once our attention is focused onto a restricted part of our visual field. There is mounting experimental evidence that attention is far more sophisticated than a simple feed-forward spatially-selective filtering process. Indeed, visual processing appears to be significantly different inside the attentional spotlight than outside. That is, in addition to its properties as a feed-forward information processing and transmission bottleneck, focal visual attention feeds back and locally modulates, in a “top-down” manner, the visual processing and representation of selected objects. The second body of work presented in this thesis is concerned with a detailed computational model of basic pattern vision in humans and its modulation by top-down attention. We start by acquiring a complete dataset of five different simple psychophysical experiments, including discriminations of contrast, orientation and spatial frequency of simple pattern stimuli by human observers. This experimental dataset places strict constraints on our model of early pattern vision. The model, however, is eventually able to reproduce the entire dataset while assuming plausible neurobiological components. The model is further applied to existing psychophysical data which demonstrates how top-down attention alters performance in these simple psychophysical discrimination experiments. Our model is able to quantitatively account for all observations by assuming that attention strengthens the non-linear cortical interactions among visual neurons.
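The saliency-map mechanism summarized above (competition for a single winning location, followed by inhibition of return) can be sketched as follows; the feature extraction and normalization of the actual model are omitted, and the inhibition kernel here is an assumed Gaussian.

```python
# Toy sketch of the winner-take-all / inhibition-of-return loop over a given
# saliency map; feature-map construction is omitted and the Gaussian
# inhibition kernel is an illustrative assumption.
import numpy as np
from scipy.ndimage import gaussian_filter

def attend(saliency: np.ndarray, n_fixations: int = 3, ior_sigma: float = 5.0):
    """Return successive attended locations: pick the global maximum
    (winner-take-all), then suppress its neighborhood (inhibition of return)."""
    s = saliency.astype(float).copy()
    fixations = []
    for _ in range(n_fixations):
        winner = np.unravel_index(np.argmax(s), s.shape)   # winner-take-all
        fixations.append(winner)
        mask = np.zeros_like(s)
        mask[winner] = 1.0
        bump = gaussian_filter(mask, sigma=ior_sigma)      # inhibition kernel
        s = np.clip(s - s.max() * bump / bump.max(), 0.0, None)
    return fixations
```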

308 citations


"Video Shot Characterization Using P..." refers background in this paper

  • ...However, we should clearly distinguish between perceptual prominence and the visual saliency term which is associated with the modeling of visual attention in primates [32]....

    [...]