Topic

Human visual system model

About: Human visual system model is a research topic. Over the lifetime, 8,697 publications have been published within this topic, receiving 259,440 citations.


Papers
Journal ArticleDOI
TL;DR: Multivariate decoding of magnetoencephalography data is used to characterize the neural underpinnings of attentional selection in natural scenes with high temporal precision and shows that brain activity quickly tracks the presence of objects in scenes, but crucially only for those objects that were immediately relevant for the participant.
Abstract: The human visual system can only represent a small subset of the many objects present in cluttered scenes at any given time, such that objects compete for representation. Despite these processing limitations, the detection of object categories in cluttered natural scenes is remarkably rapid. How does the brain efficiently select goal-relevant objects from cluttered scenes? In the present study, we used multivariate decoding of magnetoencephalography (MEG) data to track the neural representation of within-scene objects as a function of top-down attentional set. Participants detected categorical targets (cars or people) in natural scenes. The presence of these categories within a scene was decoded from MEG sensor patterns by training linear classifiers on differentiating cars and people in isolation and testing these classifiers on scenes containing one of the two categories. The presence of a specific category in a scene could be reliably decoded from MEG response patterns as early as 160 ms, despite substantial scene clutter and variation in the visual appearance of each category. Strikingly, we find that these early categorical representations fully depend on the match between visual input and top-down attentional set: only objects that matched the current attentional set were processed to the category level within the first 200 ms after scene onset. A sensor-space searchlight analysis revealed that this early attention bias was localized to lateral occipitotemporal cortex, reflecting top-down modulation of visual processing. These results show that attention quickly resolves competition between objects in cluttered natural scenes, allowing for the rapid neural representation of goal-relevant objects. SIGNIFICANCE STATEMENT: Efficient attentional selection is crucial in many everyday situations. For example, when driving a car, we need to quickly detect obstacles, such as pedestrians crossing the street, while ignoring irrelevant objects. How can humans efficiently perform such tasks, given the multitude of objects contained in real-world scenes? Here we used multivariate decoding of magnetoencephalography data to characterize the neural underpinnings of attentional selection in natural scenes with high temporal precision. We show that brain activity quickly tracks the presence of objects in scenes, but crucially only for those objects that were immediately relevant for the participant. These results provide evidence for fast and efficient attentional selection that mediates the rapid detection of goal-relevant objects in real-world environments.

91 citations
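The cross-decoding procedure in the abstract above lends itself to a compact illustration. Below is a minimal sketch, assuming synthetic data and hypothetical array dimensions (trials × sensors × time points); the classifier choice and preprocessing are illustrative stand-ins, not the authors' exact pipeline.

```python
# Sketch of time-resolved cross-decoding: train a linear classifier on MEG
# sensor patterns for isolated objects (cars vs. people), test on patterns
# for cluttered scenes, independently at each time point.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

n_trials, n_sensors, n_times = 200, 306, 120  # hypothetical dimensions

rng = np.random.default_rng(0)
X_isolated = rng.standard_normal((n_trials, n_sensors, n_times))  # train set
y_isolated = rng.integers(0, 2, n_trials)              # 0 = car, 1 = person
X_scenes = rng.standard_normal((n_trials, n_sensors, n_times))    # test set
y_scenes = rng.integers(0, 2, n_trials)        # category present in scene

accuracy = np.empty(n_times)
for t in range(n_times):
    clf = make_pipeline(StandardScaler(), LinearSVC())
    clf.fit(X_isolated[:, :, t], y_isolated)    # train on isolated objects
    accuracy[t] = clf.score(X_scenes[:, :, t], y_scenes)   # test on scenes

# Time points where accuracy exceeds chance (0.5) indicate when category
# information becomes decodable; the paper reports ~160 ms after scene onset.
```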

Journal ArticleDOI
TL;DR: Experimental results have demonstrated that the proposed steganographic scheme can achieve statistical security without degrading the image quality or the embedding capacity.
Abstract: Most state-of-the-art binary image steganographic techniques consider only the flipping distortion according to the human visual system, which leaves them insecure when attacked by steganalyzers. In this paper, a binary image steganographic scheme that aims to minimize the embedding distortion on the texture is presented. We first extract the complement, rotation, and mirroring-invariant local texture patterns (crmiLTPs) from the binary image. The weighted sum of crmiLTP changes when flipping one pixel is then employed to measure the flipping distortion corresponding to that pixel. By testing on both simple binary images and the constructed image data set, we show that the proposed measurement describes the distortions of both visual quality and statistics well. Based on the proposed measurement, a practical steganographic scheme is developed. The steganographic scheme generates the cover vector by dividing the scrambled image into superpixels. Thereafter, the syndrome-trellis code is employed to minimize the designed embedding distortion. Experimental results have demonstrated that the proposed steganographic scheme can achieve statistical security without degrading the image quality or the embedding capacity.

91 citations
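To make the flipping-distortion idea concrete, here is a simplified sketch: the cost of flipping one binary pixel is taken as the weighted count of local 3×3 texture-pattern classes it changes, with a crude complement/rotation/mirroring-invariant encoding standing in for the paper's crmiLTP construction. The helper names and the uniform default weights are hypothetical.

```python
# Simplified flipping-distortion measure for binary image steganography.
import numpy as np

def canonical_pattern(patch3x3):
    """Map a 3x3 binary patch to a class invariant to complement,
    rotation, and mirroring, by taking the minimum binary encoding."""
    variants = []
    for p in (patch3x3, 1 - patch3x3):            # complement
        for k in range(4):                        # four rotations
            r = np.rot90(p, k)
            for q in (r, np.fliplr(r)):           # mirroring
                variants.append(int("".join(map(str, q.flatten())), 2))
    return min(variants)

def flip_distortion(img, y, x, weight):
    """Weighted count of 3x3 pattern classes changed by flipping pixel (y, x)."""
    flipped = img.copy()
    flipped[y, x] ^= 1
    cost = 0.0
    for dy in range(-1, 2):                       # every 3x3 window that
        for dx in range(-1, 2):                   # contains the flipped pixel
            cy, cx = y + dy, x + dx
            if 1 <= cy < img.shape[0] - 1 and 1 <= cx < img.shape[1] - 1:
                before = canonical_pattern(img[cy-1:cy+2, cx-1:cx+2])
                after = canonical_pattern(flipped[cy-1:cy+2, cx-1:cx+2])
                if before != after:
                    cost += weight.get(before, 1.0)  # hypothetical weights
    return cost

img = (np.random.default_rng(1).random((16, 16)) > 0.5).astype(int)
print(flip_distortion(img, 5, 5, weight={}))      # uniform weights here
```

Low-cost pixels are then the preferred flipping sites, and a syndrome-trellis code embeds the message while minimizing the total accumulated cost.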

Journal ArticleDOI
TL;DR: This work proposes a method to segment the object of interest by finding the “optimal” closed contour around the fixation point in the polar space, avoiding the perennial problem of scale in the Cartesian space.
Abstract: Attention is an integral part of the human visual system and has been widely studied in the visual attention literature. The human eyes fixate at important locations in the scene, and every fixation point lies inside a particular region of arbitrary shape and size, which can either be an entire object or a part of it. Using that fixation point as an identification marker on the object, we propose a method to segment the object of interest by finding the “optimal” closed contour around the fixation point in the polar space, avoiding the perennial problem of scale in the Cartesian space. The proposed segmentation process is carried out in two separate steps: First, all visual cues are combined to generate the probabilistic boundary edge map of the scene; second, in this edge map, the “optimal” closed contour around a given fixation point is found. Having two separate steps also makes it possible to establish a simple feedback between the mid-level cue (regions) and the low-level visual cues (edges). In fact, we propose a segmentation refinement process based on such a feedback process. Finally, our experiments show the promise of the proposed method as an automatic segmentation framework for a general purpose visual system.

91 citations
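The polar-space contour search in the abstract above can be sketched with a simple dynamic program: resample the probabilistic boundary edge map on a polar grid centred at the fixation point, then find a minimum-cost path that crosses every angle once. The ±1 radius smoothness constraint and the omitted closure check below are simplifications of the paper's optimization, and the edge map is a random stand-in.

```python
# Sketch: fixation-based segmentation as a shortest path in polar space.
import numpy as np

def polar_contour(edge_map, fy, fx, n_angles=180, n_radii=60):
    H, W = edge_map.shape
    angles = np.linspace(0, 2 * np.pi, n_angles, endpoint=False)
    radii = np.linspace(1, min(H, W) // 2 - 1, n_radii)
    # Sample the edge map on a polar grid around the fixation (fy, fx).
    ys = np.clip((fy + np.outer(radii, np.sin(angles))).astype(int), 0, H - 1)
    xs = np.clip((fx + np.outer(radii, np.cos(angles))).astype(int), 0, W - 1)
    cost = 1.0 - edge_map[ys, xs]     # low cost where boundary probability is high

    # Dynamic programming over angles; radius index may change by at most 1.
    acc = cost.copy()
    for a in range(1, n_angles):
        prev = acc[:, a - 1]
        up = np.concatenate(([np.inf], prev[:-1]))     # from radius index - 1
        down = np.concatenate((prev[1:], [np.inf]))    # from radius index + 1
        acc[:, a] += np.minimum(prev, np.minimum(up, down))

    # Backtrack from the cheapest final radius (the closure constraint that
    # forces the contour to meet itself is omitted here for brevity).
    path = np.empty(n_angles, dtype=int)
    path[-1] = int(np.argmin(acc[:, -1]))
    for a in range(n_angles - 2, -1, -1):
        r = path[a + 1]
        cands = [max(r - 1, 0), r, min(r + 1, n_radii - 1)]
        path[a] = min(cands, key=lambda c: acc[c, a])
    return radii[path], angles        # closed contour in polar coordinates

edge_map = np.random.default_rng(2).random((128, 128))  # stand-in edge map
r, theta = polar_contour(edge_map, 64, 64)
```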

Book ChapterDOI
Yin Li, Yue Zhou, Junchi Yan, Zhibin Niu, Jie Yang
23 Sep 2009
TL;DR: A novel visual saliency detection method, conditional saliency for both image and video, is proposed; it approximates the conditional entropy by the lossy coding length of multivariate Gaussian data and yields robust, reliable, feature-invariant saliency.
Abstract: Guided by attention, the human visual system is able to locate objects of interest in complex scenes. In this paper, we propose a novel visual saliency detection method - the conditional saliency for both image and video. Inspired by biological vision, the definition of visual saliency follows a strictly local approach. Given the surrounding area, the saliency is defined as the minimum uncertainty of the local region, namely the minimum conditional entropy, when perceptual distortion is considered. To simplify the problem, we approximate the conditional entropy by the lossy coding length of multivariate Gaussian data. The final saliency map is accumulated pixel by pixel and further segmented to detect the proto-objects. Experiments are conducted on both images and video, and the results indicate robust and reliable feature-invariant saliency.

91 citations
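The coding-length approximation above can be illustrated directly. The sketch below, assuming zero-mean feature vectors and the standard Gaussian lossy coding-length formula L(X) = ((n + d) / 2) · log₂ det(I + d/(ε²n) · XXᵀ), estimates the conditional saliency of a center patch as the extra bits needed to code it once its surround is already coded; the feature representation and the distortion level ε are illustrative choices.

```python
# Sketch: conditional saliency via lossy coding length of Gaussian data.
import numpy as np

def coding_length(X, eps=0.1):
    """Lossy coding length of the columns of X (d x n) up to distortion eps,
    for zero-mean data: L(X) = ((n + d)/2) * log2 det(I + d/(eps^2 n) X X^T)."""
    d, n = X.shape
    gram = X @ X.T
    _, logdet = np.linalg.slogdet(np.eye(d) + d / (eps**2 * n) * gram)
    return 0.5 * (n + d) * logdet / np.log(2)

def conditional_saliency(center, surround, eps=0.1):
    """Extra bits needed to code the center patch once the surround is coded:
    an approximation of the conditional entropy H(center | surround)."""
    both = np.hstack([surround, center])
    return coding_length(both, eps) - coding_length(surround, eps)

rng = np.random.default_rng(3)
surround = rng.standard_normal((9, 40))  # 40 surround feature vectors in R^9
center = rng.standard_normal((9, 5))     # 5 center feature vectors
print(conditional_saliency(center, surround))
```

A center that is well predicted by its surround adds few bits and gets low saliency; a center that breaks the surround's statistics is expensive to code and stands out.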

Journal ArticleDOI
TL;DR: It is shown that below this just-noticeable asymmetry threshold, where subtle artifacts start to appear, symmetric coding performs better than asymmetric coding in terms of perceived 3D video quality, and that the choice between asymmetric and symmetric coding depends on PSNR; hence, on the available total bitrate.
Abstract: It is well known that the human visual system can perceive high frequencies in 3D stereo video, even if that information is present in only one of the views. Therefore, the best perceived 3D stereo video quality may be achieved by asymmetric coding where the reference and auxiliary (right and left) views are coded at unequal PSNR. However, the questions of what is the best level of asymmetry in order to maximize the perceived quality and whether asymmetry should be achieved by spatial resolution reduction or PSNR (quality) reduction have been open issues. We conducted extensive subjective tests, which indicate that if the reference view is encoded at sufficiently high quality and the auxiliary view is encoded at a lower quality but above a certain PSNR threshold, then the degradation in 3D video quality is unnoticeable. Since asymmetric coding by PSNR reduction gives finer control over achievable PSNR values than spatial resolution reduction, asymmetry by PSNR reduction allows us to encode at a point closer to this just-noticeable asymmetry PSNR threshold and is hence preferred over the spatial resolution reduction method. Subjective tests also indicate that below this just-noticeable asymmetry threshold, where subtle artifacts start to appear, symmetric coding performs better than asymmetric coding in terms of perceived 3D video quality. Therefore, we show that the choice between asymmetric and symmetric coding depends on PSNR; hence, on the available total bitrate. This paper also proposes a novel asymmetric scalable stereo video coding framework to enable adaptive stereoscopic video streaming taking full advantage of these observations and subjective test results.

91 citations
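For reference, the two quantities the paper reasons about, per-view PSNR and a just-noticeable asymmetry threshold, can be sketched as follows. The threshold value and decision rule below are hypothetical placeholders; the paper derives the actual threshold from subjective tests.

```python
# Sketch: per-view PSNR and a just-noticeable asymmetry decision rule.
import numpy as np

def psnr(reference, coded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two 8-bit frames."""
    mse = np.mean((reference.astype(float) - coded.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak**2 / mse)

JND_ASYMMETRY_PSNR = 31.0  # hypothetical threshold in dB, not from the paper

def use_asymmetric_coding(ref_psnr, aux_psnr):
    """Prefer asymmetric coding only while the lower-quality (auxiliary) view
    stays above the just-noticeable threshold; otherwise code symmetrically."""
    return min(ref_psnr, aux_psnr) >= JND_ASYMMETRY_PSNR

rng = np.random.default_rng(4)
frame = rng.integers(0, 256, (64, 64))
noisy = np.clip(frame + rng.normal(0, 4, frame.shape), 0, 255)
print(psnr(frame, noisy), use_asymmetric_coding(40.0, 32.0))
```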


Network Information
Related Topics (5)
Feature (computer vision): 128.2K papers, 1.7M citations, 89% related
Feature extraction: 111.8K papers, 2.1M citations, 86% related
Image segmentation: 79.6K papers, 1.8M citations, 86% related
Image processing: 229.9K papers, 3.5M citations, 85% related
Convolutional neural network: 74.7K papers, 2M citations, 84% related
Performance
Metrics
No. of papers in the topic in previous years
Year: Papers
2023: 49
2022: 94
2021: 279
2020: 311
2019: 351
2018: 348