Topic

Human visual system model

About: Human visual system model is a research topic. Over the lifetime, 8,697 publications have been published within this topic, receiving 259,440 citations.


Papers
Posted Content
TL;DR: Inspired by the human visual system, low-level motion-based grouping cues are used to learn an effective visual representation that significantly outperforms previous unsupervised approaches across multiple settings, especially when training data for the target task is scarce.
Abstract: This paper presents a novel yet intuitive approach to unsupervised feature learning. Inspired by the human visual system, we explore whether low-level motion-based grouping cues can be used to learn an effective visual representation. Specifically, we use unsupervised motion-based segmentation on videos to obtain segments, which we use as 'pseudo ground truth' to train a convolutional network to segment objects from a single frame. Given the extensive evidence that motion plays a key role in the development of the human visual system, we hope that this straightforward approach to unsupervised learning will be more effective than cleverly designed 'pretext' tasks studied in the literature. Indeed, our extensive experiments show that this is the case. When used for transfer learning on object detection, our representation significantly outperforms previous unsupervised approaches across multiple settings, especially when training data for the target task is scarce.
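The core idea above, using motion as free supervision, can be illustrated with a toy sketch. The frame-differencing segmenter and the threshold below are stand-ins, not the paper's actual motion segmentation method; the point is only how a motion mask becomes a pseudo ground-truth label for a single-frame segmentation network.

```python
import numpy as np

def motion_pseudo_labels(frame_a, frame_b, threshold=20):
    """Crude motion-based segmentation: mark pixels whose intensity
    changes between consecutive frames as 'moving' (1), rest as
    background (0). A real system would use a proper motion
    segmenter; this only illustrates the pseudo-labeling idea."""
    diff = np.abs(frame_b.astype(np.int32) - frame_a.astype(np.int32))
    return (diff > threshold).astype(np.uint8)

# Toy video: a bright 3x3 square moves one pixel to the right.
frame_a = np.zeros((8, 8), dtype=np.uint8)
frame_b = np.zeros((8, 8), dtype=np.uint8)
frame_a[2:5, 2:5] = 255
frame_b[2:5, 3:6] = 255

mask = motion_pseudo_labels(frame_a, frame_b)
# 'mask' is the pseudo ground truth; a convolutional network would be
# trained to predict it from frame_b alone, with no human labels.
print(mask.sum())  # pixels flagged as moving
```

In the paper's pipeline, such masks are computed over many unlabeled videos, and the resulting (frame, mask) pairs train the network whose features later transfer to object detection.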

95 citations

Journal ArticleDOI
TL;DR: A new 3D saliency prediction model that accounts for diverse low-level luminance, chrominance, motion, and depth attributes of 3D videos as well as high-level classifications of scenes by type is described.
Abstract: We describe a new 3D saliency prediction model that accounts for diverse low-level luminance, chrominance, motion, and depth attributes of 3D videos as well as high-level classifications of scenes by type. The model also accounts for perceptual factors, such as the nonuniform resolution of the human eye, stereoscopic limits imposed by Panum's fusional area, and the predicted degree of (dis)comfort felt when viewing the 3D video. The high-level analysis involves classification of each 3D video scene by type with regard to estimated camera motion and the motions of objects in the videos. Decisions regarding the relative saliency of objects or regions are supported by data obtained through a series of eye-tracking experiments. The algorithm developed from the model elements operates by finding and segmenting salient 3D space-time regions in a video, then calculating the saliency strength of each segment using measured attributes of motion, disparity, texture, and the predicted degree of visual discomfort experienced. The saliency energy of both segmented objects and frames is weighted using models of human foveation and Panum's fusional area, yielding a single predictor of 3D saliency.
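The final weighting step described above, combining per-attribute maps and then attenuating by a foveation model, can be sketched as follows. The attribute names, weights, and Gaussian acuity falloff are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def foveated_saliency(maps, weights, fixation, sigma=10.0):
    """Combine normalized low-level attribute maps into one saliency
    map, then weight it by a simple foveation model: a Gaussian
    falloff of visual acuity away from the fixation point."""
    h, w = next(iter(maps.values())).shape
    sal = np.zeros((h, w))
    for name, m in maps.items():
        rng = m.max() - m.min()
        norm = (m - m.min()) / rng if rng > 0 else np.zeros_like(m)
        sal += weights[name] * norm
    ys, xs = np.mgrid[0:h, 0:w]
    fy, fx = fixation
    fovea = np.exp(-((ys - fy) ** 2 + (xs - fx) ** 2) / (2 * sigma ** 2))
    return sal * fovea

# Hypothetical attribute maps; real ones would come from the video.
gen = np.random.default_rng(0)
maps = {"luminance": gen.random((32, 32)),
        "motion": gen.random((32, 32)),
        "depth": gen.random((32, 32))}
weights = {"luminance": 0.3, "motion": 0.5, "depth": 0.2}
sal = foveated_saliency(maps, weights, fixation=(16, 16))
```

The multiplicative foveation term captures the "nonuniform resolution of the human eye" mentioned in the abstract: identical low-level contrast counts for less the further it falls from the fixation point.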

95 citations

Book
01 Dec 1997
TL;DR: Computer Vision and Image Processing brings together the theory of computer imaging with the tools needed for practical research and development, and is a solid introduction for anyone who uses computer imaging.
Abstract: From the Publisher: True computer imaging for engineers! Digital signal processing has long been the domain of electrical engineers, while the manipulation of image data has been handled by computer scientists. The convergence of these two specialties in the field of Computer Vision and Image Processing (CVIP) is the subject of this pragmatic book, written from an applications perspective and accompanied by its own educational and development software environment, CVIPtools. Illustrated with hundreds of examples, Computer Vision and Image Processing brings together the theory of computer imaging with the tools needed for practical research and development. The first part presents a system model for each of the major application areas of CVIP, relating each specific algorithm to the overall process of applications development. The areas covered are image analysis, image restoration, image enhancement, and image compression. The second half focuses on the use of the CVIPtools environment, software developed by the author and included on the accompanying CD-ROM. These advanced chapters discuss software features and applications, the CVIPtools software development environment, and library descriptions and function prototypes. CVIPtools is a GUI-based application with an extended Tcl shell; it is ANSI-C compatible and runs on most flavors of UNIX and Windows NT/95. To get the most out of the book, a basic background in mathematics and computers is necessary. Knowledge of the C programming language will enhance the usefulness of the algorithms, and an understanding of signal and system theory helps in mastering transforms and compression.
Engineers, programmers, graphics specialists, multimedia developers, and medical imaging professionals will all find Computer Vision and Image Processing a solid introduction to computer imaging.

95 citations

Journal ArticleDOI
TL;DR: It is argued that attentional selection of pertinent information is heavily influenced by the stimuli that were important for behaviour in the immediate past.
Abstract: Many lines of evidence show that the human visual system does not simply passively register whatever appears in the visual field. The visual system seems to preferentially “choose” stimuli according to what is most relevant for the task at hand, a process called attentional selection. Given the large amount of information in any given visual scene, and well-documented capacity limitations for the representation of visual stimuli, such a strategy seems only reasonable. Consistent with this, human observers are surprisingly insensitive to large changes in their visual environment when they attend to something else in the visual scene. Here I argue that attentional selection of pertinent information is heavily influenced by the stimuli most recently viewed that were important for behaviour. I will describe recent evidence for the existence of a powerful memory system, not under any form of voluntary control, which aids observers in orienting quickly and effectively to behaviourally relevant stimuli in the vi...

95 citations

Proceedings ArticleDOI
05 Nov 2012
TL;DR: This work presents a new visible tagging solution for active displays that allows a rolling-shutter camera to robustly detect active tags from a relatively large distance; intelligent binary coding encodes digital positioning, enabling applications such as large-screen interaction.
Abstract: We show a new visible tagging solution for active displays which allows a rolling-shutter camera to detect active tags from a relatively large distance in a robust manner. Current planar markers are visually obtrusive for the human viewer. In order for them to be read from afar and embed more information, they must be shown larger, thus occupying valuable physical space on the design. We present a new active visual tag which utilizes all dimensions of color, time, and space while remaining unobtrusive to the human eye and decodable using a 15 fps rolling-shutter camera. The design exploits the flicker-fusion frequency threshold of the human visual system, which, owing to metamerism, cannot resolve metamer pairs alternating beyond 120 Hz. Yet, concurrently, it is decodable using a 15 fps rolling-shutter camera due to the effective line-scan speed of 15×400 lines per second. We show that an off-the-shelf rolling-shutter camera can resolve the metamers flickering on a television from a distance of over 4 meters. We use intelligent binary coding to encode digital positioning and show potential applications such as large screen interaction. We analyze the use of codes for locking and tracking encoded targets. We also analyze the constraints and performance of the sampling system, and discuss several plausible application scenarios.
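The line-scan arithmetic quoted in the abstract explains why a slow camera can see a flicker the eye cannot. Using only the figures given there (15 fps, 400 scan lines per frame, 120 Hz metamer flicker), a quick sketch:

```python
# Why a 15 fps rolling-shutter camera decodes a 120 Hz flicker:
# each frame is captured line by line, so the effective temporal
# sampling rate is the line-scan rate, not the frame rate.
fps = 15
lines_per_frame = 400
flicker_hz = 120

line_scan_rate = fps * lines_per_frame         # lines sampled per second
lines_per_cycle = line_scan_rate / flicker_hz  # scan lines per flicker period

print(line_scan_rate)   # 6000 lines per second
print(lines_per_cycle)  # 50 lines per cycle, i.e. ~25-line bands per metamer state
```

Because 6000 lines/s far exceeds the 120 Hz alternation, each captured frame contains visible horizontal bands encoding the tag, while a human observer, integrating above the flicker-fusion threshold, perceives only the fused metamer color.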

95 citations


Network Information
Related Topics (5)
Feature (computer vision): 128.2K papers, 1.7M citations (89% related)
Feature extraction: 111.8K papers, 2.1M citations (86% related)
Image segmentation: 79.6K papers, 1.8M citations (86% related)
Image processing: 229.9K papers, 3.5M citations (85% related)
Convolutional neural network: 74.7K papers, 2M citations (84% related)
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  49
2022  94
2021  279
2020  311
2019  351
2018  348