scispace - formally typeset

Human visual system model

About: Human visual system model is a research topic. Over its lifetime, 8,697 publications have been published within this topic, receiving 259,440 citations.


Papers
Journal ArticleDOI
TL;DR: A detailed instantiation, in the form of a computational cognitive model, of a comprehensive theory of human visual processing known as “active vision” is described, built using the Executive Process-Interactive Control cognitive architecture.
Abstract: Human visual search plays an important role in many human–computer interaction (HCI) tasks. Better models of visual search are needed not just to predict overall performance outcomes, such as whether people will be able to find the information needed to complete an HCI task, but to understand the many human processes that interact in visual search, which will in turn inform the detailed design of better user interfaces. This article describes a detailed instantiation, in the form of a computational cognitive model, of a comprehensive theory of human visual processing known as "active vision" (Findlay & Gilchrist, 2003). The computational model is built using the Executive Process-Interactive Control cognitive architecture. Eye-tracking data from three experiments inform the development and validation of the model. The modeling asks (and at least partially answers) the four questions of active vision: (a) What can be perceived in a fixation? (b) When do the eyes move? (c) Where do the eyes move? (d) What information is integrated between eye movements? Answers include: (a) Items nearer the point of gaze are more likely to be perceived, and the visual features of objects are sometimes misidentified. (b) The eyes move after the fixated visual stimulus has been processed (i.e., has entered working memory). (c) The eyes tend to go to nearby objects. (d) Only the coarse spatial information of what has been fixated is likely maintained between fixations. The model developed to answer these questions has both scientific and practical value in that it gives HCI researchers and practitioners a better understanding of how people visually interact with computers, and provides a theoretical foundation for predictive analysis tools that can predict aspects of that interaction.

48 citations

Journal ArticleDOI
TL;DR: It is found that CNNs encode category information independently from shape, peaking at the final fully connected layer in all tested CNN architectures, much like the human visual system.
Abstract: Deep Convolutional Neural Networks (CNNs) are gaining traction as the benchmark model of visual object recognition, with performance now surpassing humans. While CNNs can accurately assign one image to potentially thousands of categories, network performance could be the result of layers that are tuned to represent the visual shape of objects, rather than object category, since both are often confounded in natural images. Using two stimulus sets that explicitly dissociate shape from category, we correlate these two types of information with each layer of multiple CNNs. We also compare CNN output with fMRI activation along the human visual ventral stream by correlating artificial with neural representations. We find that CNNs encode category information independently from shape, peaking at the final fully connected layer in all tested CNN architectures. Comparing CNNs with fMRI brain data, early visual cortex (V1) and early layers of CNNs encode shape information. Anterior ventral temporal cortex encodes category information, which correlates best with the final layer of CNNs. The interaction between shape and category that is found along the human visual ventral pathway is echoed in multiple deep networks. Our results suggest CNNs represent category information independently from shape, much like the human visual system.
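The layer-to-brain comparison this abstract describes (correlating artificial with neural representations) can be illustrated with a minimal representational similarity analysis. The data here are random stand-ins, and the array sizes and variable names are hypothetical, not taken from the paper:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Hypothetical data: responses to 20 stimuli from one CNN layer
# (flattened to feature vectors) and from one fMRI region (voxel patterns).
cnn_layer = rng.normal(size=(20, 512))   # 20 stimuli x 512 units
fmri_roi  = rng.normal(size=(20, 300))   # 20 stimuli x 300 voxels

# Representational dissimilarity: 1 - Pearson correlation between each
# pair of stimulus representations (pdist's 'correlation' metric returns
# the condensed upper triangle of the dissimilarity matrix).
rdm_cnn  = pdist(cnn_layer, metric="correlation")
rdm_fmri = pdist(fmri_roi, metric="correlation")

# Compare the two representational geometries with Spearman's rho.
rho, p = spearmanr(rdm_cnn, rdm_fmri)
print(f"layer-ROI representational similarity: rho = {rho:.3f}")
```

With real data, repeating this per layer and per region yields the layer-by-region correlation profile the abstract reports (e.g., the final fully connected layer correlating best with anterior ventral temporal cortex).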

48 citations

Journal ArticleDOI
TL;DR: A novel reduced-reference (RR) quality metric integrating bottom-up and top-down strategies is proposed, which stems from the recently revealed free energy principle, which holds that the human visual system seeks to comprehend an input image via uncertainty removal.
Abstract: In image/video systems, contrast adjustment, which aims to enhance visual quality, is an important research topic. Yet very little effort has been devoted to visual quality assessment for contrast adjustment. To tackle this issue, this paper proposes a novel reduced-reference (RR) quality metric that integrates bottom-up and top-down strategies. The former stems from the recently revealed free energy principle, which holds that the human visual system seeks to comprehend an input image via uncertainty removal, while the latter uses the symmetric Kullback–Leibler divergence to compare the histogram of the contrast-altered image with that of the pristine image. The bottom-up and top-down strategies are finally combined to derive the RR contrast-altered image quality measure. A comparison with numerous existing IQA models is carried out on five contrast-related databases/subsets in CID2013, CCID2014, CSIQ, TID2008, and TID2013, and the experimental results validate the superiority of the proposed technique.
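The top-down strategy's histogram comparison can be sketched as follows. This is a minimal illustration, not the paper's implementation: the synthetic images, the bin count, and the function name are assumptions:

```python
import numpy as np

def symmetric_kl(p, q, eps=1e-12):
    """Symmetric Kullback-Leibler divergence between two discrete
    distributions: KL(p||q) + KL(q||p). Inputs may be raw histogram
    counts; they are normalized here."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

# Hypothetical 8-bit images: a pristine image and a contrast-stretched copy.
rng = np.random.default_rng(1)
pristine = rng.integers(0, 256, size=(64, 64))
altered  = np.clip((pristine - 128) * 1.5 + 128, 0, 255)

# Compare the two luminance histograms.
h_ref, _ = np.histogram(pristine, bins=256, range=(0, 256))
h_alt, _ = np.histogram(altered, bins=256, range=(0, 256))

print("symmetric KL divergence:", symmetric_kl(h_ref, h_alt))
```

The divergence is zero for identical histograms and grows as the contrast alteration reshapes the intensity distribution, which is what makes it usable as a reduced-reference signal (only the reference histogram needs to be transmitted, not the full image).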

48 citations

Journal ArticleDOI
TL;DR: A novel perception-based hybrid model for video quality assessment that simulates the HVS perception process by adaptively combining noticeable distortion and blurring artifacts using an enhanced nonlinear model and exploits the orientation selectivity and shift invariance properties of the dual-tree complex wavelet transform.
Abstract: It is known that the human visual system (HVS) employs independent processes (distortion detection and artifact perception, also often referred to as near-threshold and suprathreshold distortion perception) to assess video quality at various distortion levels. Visual masking effects also play an important role in video distortion perception, especially within spatial and temporal textures. In this paper, a novel perception-based hybrid model for video quality assessment is presented. This simulates the HVS perception process by adaptively combining noticeable distortion and blurring artifacts using an enhanced nonlinear model. Noticeable distortion is defined by thresholding absolute differences using spatial and temporal tolerance maps that characterize texture masking effects, and this makes a significant contribution to quality assessment when the quality of the distorted video is similar to that of the original video. Characterization of blurring artifacts, estimated by computing high-frequency energy variations and weighted with motion speed, is found to further improve metric performance. This is especially true for low-quality cases. All stages of our model exploit the orientation selectivity and shift invariance properties of the dual-tree complex wavelet transform. This not only helps to improve performance but also offers the potential for new low-complexity in-loop applications. Our approach is evaluated on both the Video Quality Experts Group (VQEG) full reference television Phase I and the Laboratory for Image and Video Engineering (LIVE) video databases. The resulting overall performance is superior to existing metrics, exhibiting statistically better or equivalent performance with significantly lower complexity.
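A toy version of the noticeable-distortion stage might look like the following. Note the simplifying assumption: the paper derives spatial and temporal tolerance maps from texture-masking models, whereas a constant per-pixel threshold stands in here, and all data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical luminance frames: a reference and a distorted version.
ref  = rng.uniform(0, 255, size=(32, 32))
dist = ref + rng.normal(0, 4, size=ref.shape)

# Toy tolerance map: a just-noticeable-difference threshold per pixel.
# (In the paper these maps vary spatially/temporally with texture masking;
# a constant value is an illustrative placeholder.)
tolerance = np.full(ref.shape, 6.0)

# Noticeable distortion: the part of the absolute difference that exceeds
# the tolerance; sub-threshold differences are assumed invisible.
diff = np.abs(dist - ref)
noticeable = np.where(diff > tolerance, diff - tolerance, 0.0)

print("fraction of noticeable pixels:", float((noticeable > 0).mean()))
```

The resulting map is what a quality model would then pool (and, per the abstract, combine nonlinearly with a blur measure) into a single score.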

48 citations

ReportDOI
01 Jun 2000
TL;DR: A geometric model and a computational method for segmentation of images with missing boundaries are presented, together with an algorithm that builds the missing information from a given reference point of view, using the available information as boundary data.
Abstract: We present a geometric model and a computational method for segmentation of images with missing boundaries. In many situations, the human visual system fills in missing gaps in edges and boundaries, building and completing information that is not present. Boundary completion presents a considerable challenge in computer vision, since most algorithms attempt to exploit existing data. A large body of work concerns completion models, which postulate how to construct missing data; these models are often trained and specific to particular images. In this paper, we take an alternative perspective: we consider a reference point within an image as given, and then develop an algorithm that builds the missing information on the basis of that point of view, using the available information as boundary data. Starting from this point of view, a surface is constructed. It is then evolved with the mean curvature flow in the metric induced by the image until a piecewise constant solution is reached. We test the computational model on modal completion, amodal completion, texture, photo, and medical images. We extend the geometric model and the algorithm to 3D in order to extract shapes from low signal-to-noise-ratio medical volumes. Results in 3D echocardiography and 3D fetal echography are presented.
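The surface evolution the abstract describes can be sketched as an explicit level-set mean curvature flow iteration. This is a bare sketch under stated assumptions: it evolves in the plain Euclidean metric rather than the image-induced metric the paper uses, and the grid size, time step, and initial surface are illustrative:

```python
import numpy as np

def curvature_step(u, dt=0.1, eps=1e-8):
    """One explicit step of level-set mean curvature flow,
    u_t = kappa * |grad u|, where kappa is the curvature of the
    level sets of u, written out in finite differences."""
    uy, ux = np.gradient(u)        # first derivatives (axis 0, axis 1)
    uyy, uyx = np.gradient(uy)     # second derivatives of uy
    uxy, uxx = np.gradient(ux)     # second derivatives of ux
    num = uxx * uy**2 - 2 * ux * uy * uxy + uyy * ux**2
    den = (ux**2 + uy**2) ** 1.5 + eps
    grad_norm = np.sqrt(ux**2 + uy**2)
    return u + dt * (num / den) * grad_norm

# Hypothetical initial surface: signed distance to a circle, plus noise.
n = 64
y, x = np.mgrid[0:n, 0:n]
u0 = np.sqrt((x - n / 2) ** 2 + (y - n / 2) ** 2) - n / 4
u0 += np.random.default_rng(3).normal(0, 0.5, size=u0.shape)

u = u0.copy()
for _ in range(50):
    u = curvature_step(u)

# Curvature flow smooths the level sets: the discrete Laplacian energy
# of the surface should drop as the noise is diffused away.
lap = lambda v: (v[:-2, 1:-1] + v[2:, 1:-1] + v[1:-1, :-2]
                 + v[1:-1, 2:] - 4 * v[1:-1, 1:-1])
rough_before = float(np.mean(lap(u0) ** 2))
rough_after = float(np.mean(lap(u) ** 2))
print(f"roughness before/after: {rough_before:.3f} / {rough_after:.3f}")
```

In the paper's setting, the flow additionally runs in a metric weighted by the image, so the evolving surface slows and stops along existing edge data while bridging the gaps.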

48 citations


Network Information

Related Topics (5)
- Feature (computer vision): 128.2K papers, 1.7M citations (89% related)
- Feature extraction: 111.8K papers, 2.1M citations (86% related)
- Image segmentation: 79.6K papers, 1.8M citations (86% related)
- Image processing: 229.9K papers, 3.5M citations (85% related)
- Convolutional neural network: 74.7K papers, 2M citations (84% related)
Performance

Metrics: no. of papers in the topic in previous years

Year   Papers
2023   49
2022   94
2021   279
2020   311
2019   351
2018   348