This paper presents a coherent computational approach to modeling bottom-up visual attention, based mainly on the current understanding of human visual system (HVS) behavior, which includes contrast sensitivity functions, perceptual decomposition, visual masking, and center-surround interactions.
Abstract:
Visual attention is a mechanism that filters out redundant visual information and detects the most relevant parts of our visual field. Automatic determination of the most visually relevant areas would be useful in many applications such as image and video coding, watermarking, video browsing, and quality assessment. Many research groups are currently investigating computational modeling of the visual attention system. The first published computational models were based on some basic and well-understood human visual system (HVS) properties. These models feature a single perceptual layer that simulates only one aspect of the visual system. More recent models integrate complex features of the HVS and simulate a hierarchical perceptual representation of the visual input. The bottom-up mechanism is the feature most commonly found in modern models. This mechanism refers to involuntary attention (i.e., salient spatial visual features that effortlessly or involuntarily attract our attention). This paper presents a coherent computational approach to the modeling of bottom-up visual attention. The model is based mainly on the current understanding of HVS behavior. Contrast sensitivity functions, perceptual decomposition, visual masking, and center-surround interactions are some of the features implemented in this model. The performance of the algorithm is assessed using natural images and experimental measurements from an eye-tracking system. Two well-known and appropriate metrics (correlation coefficient and Kullback-Leibler divergence) are used to validate the model. A further metric is also defined. Finally, the results from this model are compared to those from a reference bottom-up model.
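As an illustration of the two validation metrics named in the abstract, the sketch below compares a predicted saliency map with a fixation density map. It is a minimal example and not the authors' evaluation code: the function names, the nested-list map representation, the direction of the divergence (eye-tracking map as the reference distribution), and the `eps` smoothing term are all assumptions made here.

```python
import math

def flatten(saliency_map):
    """Turn a 2-D map (list of rows) into a flat list of floats."""
    return [float(v) for row in saliency_map for v in row]

def correlation_coefficient(predicted, reference):
    """Pearson linear correlation between two maps of equal size."""
    p, r = flatten(predicted), flatten(reference)
    n = len(p)
    mean_p, mean_r = sum(p) / n, sum(r) / n
    cov = sum((a - mean_p) * (b - mean_r) for a, b in zip(p, r))
    var_p = sum((a - mean_p) ** 2 for a in p)
    var_r = sum((b - mean_r) ** 2 for b in r)
    if var_p == 0 or var_r == 0:
        return 0.0  # a constant map carries no linear relationship
    return cov / math.sqrt(var_p * var_r)

def kl_divergence(reference, predicted, eps=1e-12):
    """KL divergence D(reference || predicted) after normalizing each
    map to sum to 1. eps (an assumption) avoids division by zero."""
    r, p = flatten(reference), flatten(predicted)
    sum_r, sum_p = sum(r), sum(p)
    r = [v / sum_r for v in r]
    p = [v / sum_p for v in p]
    return sum(a * math.log((a + eps) / (b + eps))
               for a, b in zip(r, p) if a > 0)

# Hypothetical 2x2 maps, for illustration only.
human_map = [[0, 1], [2, 5]]
model_map = [[5, 2], [1, 0]]
cc = correlation_coefficient(model_map, human_map)
kl = kl_divergence(human_map, model_map)
```

A correlation coefficient near 1 and a divergence near 0 both indicate close agreement between the two maps. In this toy case the maps are mirror images of each other, so the correlation is negative and the divergence is large.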
TL;DR: A set of novel features, including multiscale contrast, center-surround histogram, and color spatial distribution, are proposed to describe a salient object locally, regionally, and globally.
TL;DR: A taxonomy of nearly 65 models of attention provides a critical comparison of approaches, their capabilities, and shortcomings, and addresses several challenging issues with models, including biological plausibility of the computations, correlation with eye movement datasets, bottom-up and top-down dissociation, and constructing meaningful performance measures.
TL;DR: A new type of saliency is proposed—context-aware saliency—which aims at detecting the image regions that represent the scene, and a detection algorithm is presented which is based on four principles observed in the psychological literature.
TL;DR: A set of novel features including multi-scale contrast, center-surround histogram, and color spatial distribution are proposed to describe a salient object locally, regionally, and globally for salient object detection.
TL;DR: The authors discuss the multiplicity of the consciousness of self in the form of the stream of thought, and the perception of space in the human brain, which is the basis for our work.
TL;DR: A new hypothesis about the role of focused attention is proposed, which offers a new set of criteria for distinguishing separable from integral features and a new rationale for predicting which tasks will show attention limits and which will not.
TL;DR: In this article, a visual attention system inspired by the behavior and the neuronal architecture of the early primate visual system is presented, where multiscale image features are combined into a single topographical saliency map.
TL;DR: A technique for image encoding in which local operators of many scales but identical shape serve as the basis functions, which tends to enhance salient image features and is well suited for many image analysis tasks as well as for image compression.
TL;DR: This study addresses the question of how simple networks of neuron-like elements can account for a variety of phenomena associated with this shift of selective visual attention and suggests a possible role for the extensive back-projection from the visual cortex to the LGN.
Q1. What is the visibility threshold for a visual cell?
Accurate nonlinear models simulating the behavior of visual cells are used to calculate the visibility threshold associated with each value of each component.
Q2. How much is the signal sent from the photoreceptors?
For instance, the signal stemming from the photoreceptors is assumed to be compressed by a factor of about 130:1, before it is transmitted to the visual cortex.
Q3. What is the purpose of a saccade?
The purpose of this type of eye movement, which occurs up to three times per second, is to direct a small part of the visual field into the fovea for closer inspection.
Q4. What is the effect of the saccade on the visual system?
Studies by Kapadia et al. [19], [20] show that the cell’s response can be greatly enhanced by the presentation of co-aligned, co-oriented stimuli in the neighborhood, and that it increases with the number of appropriate stimuli placed outside the CRF.
Q5. What is the first information reduction in the retina?
The first information reduction occurs in the retina, where the photoreceptors only process the wavelengths of visible light.
Q6. What is the role of the cortical cell in the RF?
In addition, recent studies [15], [16], [17], [18], [19], [20] have shown that a cortical cell’s response can be influenced by stimuli outside its classical RF.
Q7. What is the main purpose of the paper?
The results are summarized and some conclusions are drawn in Section 6. The HVS acts as a passive selector, acknowledging some stimuli but rejecting others.
Q8. What is the role of the center-surround organization in the definition of CSF?
This center-surround organization is responsible for their high sensitivity to contrast and to spatial frequency, which leads to the definition of the Contrast Sensitivity Function (CSF).
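To illustrate why a center-surround organization yields a band-pass sensitivity to spatial frequency, the sketch below uses a textbook difference-of-Gaussians (DoG) receptive field. This is not the CSF model used in the paper: the DoG form, the function names, and the two sigma values are assumptions chosen only to show the shape of the response.

```python
import math

def gaussian_response(f, sigma):
    """Frequency response of a unit-area Gaussian with standard
    deviation sigma: exp(-2 * pi^2 * sigma^2 * f^2)."""
    return math.exp(-2.0 * (math.pi * sigma * f) ** 2)

def dog_response(f, sigma_center=1.0, sigma_surround=3.0):
    """Response of an excitatory narrow center minus an inhibitory
    wide surround (sigma values are illustrative assumptions)."""
    return gaussian_response(f, sigma_center) - gaussian_response(f, sigma_surround)

# Sample the response from 0 to 1 cycle per unit length.
freqs = [i * 0.01 for i in range(101)]
response = [dog_response(f) for f in freqs]
peak_index = max(range(len(response)), key=response.__getitem__)
peak_frequency = freqs[peak_index]
```

The response is zero for a uniform field (f = 0), rises to a maximum at an intermediate spatial frequency, and falls off again at high frequencies, which is the general band-pass shape associated with a contrast sensitivity function.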