
Showing papers on "Human visual system model published in 1990"


Proceedings ArticleDOI
04 Nov 1990
TL;DR: A texture segmentation algorithm inspired by the multichannel filtering theory for visual information processing in the early stages of the human visual system is presented; its output is consistent with preattentive texture discrimination by human observers.
Abstract: A texture segmentation algorithm inspired by the multichannel filtering theory for visual information processing in the early stages of the human visual system is presented. The channels are characterized by a bank of Gabor filters that nearly uniformly covers the spatial-frequency domain. A systematic filter selection scheme based on reconstruction of the input image from the filtered images is proposed. Texture features are obtained by subjecting each (selected) filtered image to a nonlinear transformation and computing a measure of energy in a window around each pixel. An unsupervised square-error clustering algorithm is then used to integrate the feature images and produce a segmentation. A simple procedure to incorporate spatial adjacency information in the clustering process is proposed. Experiments on images with natural textures as well as artificial textures with identical second- and third-order statistics are reported. The algorithm's output is consistent with preattentive texture discrimination by human observers.
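The pipeline described above (Gabor bank, nonlinear transformation, windowed energy) can be sketched in a few lines. The kernel parameters, the tanh gain, and the box window below are illustrative assumptions, not the paper's exact choices:

```python
import numpy as np

def conv2(img, ker):
    """Centered circular 2-D convolution via FFT (a sketch; production
    code would pad the image to avoid wrap-around effects)."""
    H, W = img.shape
    out = np.fft.irfft2(np.fft.rfft2(img) * np.fft.rfft2(ker, s=(H, W)), s=(H, W))
    half = ker.shape[0] // 2
    return np.roll(out, (-half, -half), axis=(0, 1))

def gabor_kernel(freq, theta, sigma=4.0, size=31):
    """Even-symmetric Gabor: Gaussian envelope times a cosine grating at
    radial frequency `freq` (cycles/pixel) and orientation `theta`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * xr)
    return g - g.mean()            # zero DC so flat regions give no response

def texture_feature(img, freq, theta, alpha=0.25, win=9):
    """One feature image: filter, squash with tanh, then average the
    magnitude ('energy') in a win x win window around each pixel."""
    r = np.tanh(alpha * conv2(img, gabor_kernel(freq, theta)))
    box = np.ones((win, win)) / win**2
    return conv2(np.abs(r), box)
```

Stacking one such feature image per selected filter and running a square-error clustering such as k-means over the per-pixel feature vectors would complete the segmentation step.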

426 citations


Journal ArticleDOI
TL;DR: The Hermite transform, a scheme for the local processing of visual information, is introduced as an analysis/resynthesis system for image coding; it is demonstrated to be in better agreement with human visual modeling than Gabor expansions.
Abstract: The author introduces a scheme for the local processing of visual information, called the Hermite transform. The problem is addressed from the point of view of image coding, and therefore the scheme is presented as an analysis/resynthesis system. The objectives of the present work, however, are not restricted to coding. The analysis part is designed so that it can also serve applications in the area of computer vision. Indeed, derivatives of Gaussians, which have found widespread application in feature detection over the past few years, play a central role in the Hermite analysis. It is also argued that the proposed processing scheme is in close agreement with current insight into the image processing that is carried out by the human visual system. In particular, it is demonstrated that the Hermite transform is in better agreement with human visual modeling than Gabor expansions.

318 citations


Journal ArticleDOI
09 Feb 1990-Science
TL;DR: This work has shown that visual search also has access to another level of representation, one that describes properties of the corresponding three-dimensional scene; these include three-dimensionality and the direction of lighting, but not viewing direction.
Abstract: The task of visual search is to determine as rapidly as possible whether a target item is present or absent in a display. Rapidly detected items are thought to contain features that correspond to primitive elements in the human visual system. In previous theories, it has been assumed that visual search is based on simple two-dimensional features in the image. However, visual search also has access to another level of representation, one that describes properties in the corresponding three-dimensional scene. Among these properties are three-dimensionality and the direction of lighting, but not viewing direction. These findings imply that the parallel processes of early vision are much more sophisticated than previously assumed.

282 citations


Journal ArticleDOI
TL;DR: VIVA is a proposed visual language for image processing that serves as an effective teaching tool for students of image processing and takes account of several secondary goals, including the completion of a software platform for research in human/image interaction and the establishment of a presentation medium for image-processing algorithms.
Abstract: Visual languages have been developed to help new programmers express algorithms easily. They also help to make experienced programmers more productive by simplifying the organization of a program through the use of visual representations. However, visual languages have not reached their full potential because of several problems including the following: difficulty of producing visual representations for the more abstract computing constructs; the lack of adequate computing power to update the visual representations in response to user actions; the immaturity of the subfield of visual programming and need for additional breakthroughs and standardization of existing mechanisms. Visualization of Vision Algorithms (VIVA) is a proposed visual language for image processing. Its main purpose is to serve as an effective teaching tool for students of image processing. Its design also takes account of several secondary goals, including the completion of a software platform for research in human/image interaction, the creation of a vehicle for studying algorithms and architectures for parallel image processing, and the establishment of a presentation medium for image-processing algorithms.

208 citations


Proceedings ArticleDOI
04 Dec 1990
TL;DR: A model is described for image segmentation that tries to capture the low-level depth reconstruction exhibited in early human vision, giving an important role to edge terminations, which gives rise to a family of optimal contours, called nonlinear splines, that minimize length and the square of curvature.
Abstract: A model is described for image segmentation that tries to capture the low-level depth reconstruction exhibited in early human vision, giving an important role to edge terminations. The problem is to find a decomposition of the domain D of an image that has a minimum of disrupted edges (junctions of edges, crack tips, corners, and cusps) by creating suitable continuations for the disrupted edges behind occluding regions. The result is a decomposition of D into overlapping regions R₁ ∪ ⋯ ∪ Rₙ ordered by occlusion, which is called the 2.1-D sketch. Expressed as a minimization problem, the model gives rise to a family of optimal contours, called nonlinear splines, that minimize length and the square of curvature. These are essential in the construction of the 2.1-D sketch of an image, as the continuations of disrupted edges. An algorithm is described that constructs the 2.1-D sketch of an image, and gives results for several example images. The algorithm yields the same interpretations of optical illusions as the human visual system.
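A standard way to write the variational problem behind such "nonlinear splines" is an elastica-type functional that penalizes both length and squared curvature along a candidate continuation curve; the weights α and β below are generic placeholders, not values from the paper:

```latex
% Energy of a continuation curve \gamma of length L, parameterized by
% arclength s, with curvature \kappa(s); \alpha, \beta > 0 trade off
% length against bending.
E[\gamma] \;=\; \int_{0}^{L} \bigl(\alpha + \beta\,\kappa(s)^{2}\bigr)\, ds
```

Minimizers of this energy serve as the occluded-edge continuations from which the 2.1-D sketch is assembled.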

173 citations


Journal ArticleDOI
TL;DR: A progressive image transmission scheme which combines transform coding with the human visual system (HVS) model is developed and results in perceptually higher quality images compared to the unweighted scheme.
Abstract: A progressive image transmission scheme which combines transform coding with the human visual system (HVS) model is developed. The adaptive transform coding of W.H. Chen and C.H. Smith (1977) is utilized to classify an image into four equally populated subblocks based on their AC energies. The modulation transfer function (MTF) of the HVS model is obtained experimentally, based on processing a number of test images. A simplified technique for incorporating the MTF into the discrete cosine transform (DCT) domain is utilized. In the hierarchical image buildup, the image is first reconstructed from the DC coefficients of all subblocks. Further transmission hierarchy of transform coefficients and consequent image buildup are dependent on their HVS weighted variances. The HVS weighted reconstructed images are compared to the ones without any weighting at several stages. The HVS weighted progressive image transmission results in perceptually higher quality images compared to the unweighted scheme.
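The paper obtains its MTF experimentally; as an illustrative stand-in, one can use the classic analytic HVS model of Mannos and Sakrison (1974) and evaluate it at the radial frequency of each DCT coefficient. The viewing-geometry constant below (Nyquist mapped to 32 cycles/degree) is an assumption:

```python
import numpy as np

def mannos_sakrison_mtf(f):
    """Classic analytic HVS MTF (Mannos & Sakrison, 1974), f in
    cycles/degree; a stand-in for the paper's measured MTF."""
    return 2.6 * (0.0192 + 0.114 * f) * np.exp(-(0.114 * f) ** 1.1)

def dct_hvs_weights(n=8, cpd_at_nyquist=32.0):
    """Per-coefficient weights for an n x n DCT block: map each (u, v)
    coefficient to a radial spatial frequency, then evaluate the MTF."""
    u = np.arange(n) / (2.0 * n)                 # cycles/pixel of each basis
    f_pix = np.hypot(u[:, None], u[None, :])     # radial freq, cycles/pixel
    f_deg = f_pix * 2.0 * cpd_at_nyquist         # convert to cycles/degree
    return mannos_sakrison_mtf(f_deg)
```

Ranking coefficients by (weight × variance), highest first, then gives a transmission order analogous to the paper's HVS-weighted hierarchy.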

146 citations


Journal ArticleDOI
TL;DR: It is found that when the gratings within the plaid are of different contrast, the perceived direction is not predicted by the intersection of constraints rule, and a revised model, which incorporates a contrast-dependent weighting of perceived grating speed as observed for 1-D patterns, can quantitatively predict most of the results.

130 citations


Proceedings ArticleDOI
16 Jun 1990
TL;DR: An analysis is presented which makes it possible to compare directly the space complexity of different sensor designs in the complex logarithmic family and rough estimates can be obtained of the parameters necessary to duplicate the field width/resolution performance of the human visual system.
Abstract: A space-variant sensor design based on the conformal mapping of the half disk, w = log(z + a) with real a > 0, which characterizes the anatomical structure of the primate and human visual systems is discussed. There are three relevant parameters: the circumferential index κ, defined as the number of pixels around the periphery of the sensor, the visual field radius R (of the half-disk to be mapped), and the map parameter a, which displaces the logarithm's singularity at the origin out of the domain of the mapping. It is shown that the log sensor requires O(κ² log(R/a)) pixels. An analysis is presented which makes it possible to compare directly the space complexity of different sensor designs in the complex logarithmic family. In particular, rough estimates can be obtained of the parameters necessary to duplicate the field width/resolution performance of the human visual system.
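The scaling claim can be checked with a back-of-the-envelope count: tile the mapped region in the w-plane with square pixels whose side is set by the κ pixels along the mapped periphery. The constant factor here is a rough assumption; the paper derives the exact expression:

```python
import math

def logmap_pixel_count(kappa, R, a):
    """Rough pixel count for a w = log(z + a) half-disk sensor.

    In the w-plane the half disk maps to (roughly) a rectangle of
    height pi (the angular range) and width log((R + a) / a) (the
    radial extent). With square pixels of side pi / kappa, the count
    is ~ (kappa**2 / pi) * log((R + a) / a) = O(kappa^2 log(R / a)).
    """
    rings = (kappa / math.pi) * math.log((R + a) / a)
    return kappa * rings
```

Note the favorable trade-off: pixel count grows quadratically with peripheral resolution κ but only logarithmically with field radius R.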

121 citations


Journal ArticleDOI
TL;DR: Effectiveness of both still and live video images, especially for user browsing and interaction, is shown through use of 'MediaBENCH' (hyperMedia Basic ENvironment for Computer and Human interactions), a basic prototype multimedia database system.
Abstract: This paper clarifies the importance of content-oriented visual user interfaces using video icons for visual database systems. Effectiveness of both still and live video images, especially for user browsing and interaction, is shown through use of 'MediaBENCH' (hyperMedia Basic ENvironment for Computer and Human interactions), a basic prototype multimedia database system. Various methods of handling video data on the MediaBENCH are introduced and discussed to show how video data can be manipulated on visual database systems which deal with spatial and temporal factors. A visual interface using video icons is well suited to video editing systems, video mail systems, and other electronic video document systems. Brief application profiles are shown for guidance.

111 citations


Journal ArticleDOI
TL;DR: In this article, a perceptual component architecture for digital video partitions the image stream into signal components in a manner analogous to that used in the human visual system, consisting of achromatic and opponent color channels, divided into static and motion channels, further divided into bands of particular spatial frequency and orientation.
Abstract: A perceptual-components architecture for digital video partitions the image stream into signal components in a manner analogous to that used in the human visual system. These components consist of achromatic and opponent color channels, divided into static and motion channels, further divided into bands of particular spatial frequency and orientation. Bits are allocated to an individual band in accord with visual sensitivity to that band and in accord with the properties of visual masking. This architecture is argued to have desirable features such as efficiency, error tolerance, scalability, device independence, and extensibility.

82 citations


Journal ArticleDOI
TL;DR: A new approach to the aperture problem is presented, using an adaptive neural network model that accommodates its structure to long-term statistics of visual motion, but also simultaneously uses its acquired structure to assimilate, disambiguate, and represent visual motion events in real-time.

Journal Article
TL;DR: In this article, a hierarchical, multiscale network that uses feature arrays with strong lateral inhibitory connections is proposed to generate responses consistent with a range of data reported in the psychological literature, and with neurophysiological characteristics of primate vision.
Abstract: Selective visual attention serializes the processing of stimulus data to make efficient use of limited processing resources in the human visual system. This paper describes a connectionist network that exhibits a variety of attentional phenomena reported by Treisman, Wolford, Duncan, and others. As demonstrated in several simulations, a hierarchical, multiscale network that uses feature arrays with strong lateral inhibitory connections provides responses in agreement with a number of prominent behaviors associated with visual attention. The overall network design is consistent with a range of data reported in the psychological literature, and with neurophysiological characteristics of primate vision.


Journal ArticleDOI
TL;DR: A general adaptive restoration algorithm is derived on the basis of a set theoretic regularization technique and can be used for any type of linear distortion and constraint operators and can also be used to restore signals other than images.

Journal ArticleDOI
TL;DR: A computational framework arising out of computer vision research is used to organize and interpret human and primate neurophysiology and neuropsychology to extend the understanding of vision through object representation and recognition.
Abstract: Significant progress has been made in understanding vision by combining computational and neuroscientific constraints. However, for the most part these integrative approaches have been limited to low-level visual processing. Recent advances in our understanding of high-level vision in the two separate disciplines warrant an attempt to relate and integrate these results to extend our understanding of vision through object representation and recognition. This paper is an attempt to contribute to this goal, by using a computational framework arising out of computer vision research to organize and interpret human and primate neurophysiology and neuropsychology.

Proceedings ArticleDOI
01 Sep 1990
TL;DR: An image coding scheme based on the properties of the early stages of the human visual system is presented; it achieves satisfactory image quality, and its results can be seen as an empirical confirmation of the suitability of vector quantization in subband coding.
Abstract: We present an image coding scheme based on the properties of the early stages of the human visual system. The image signal is decomposed via even- and odd-symmetric, frequency- and orientation-selective band-pass filters in analogy to the quadrature phase simple cell pairs in the visual cortex. The resulting analytic signal is transformed into a local amplitude and local phase representation in order to achieve a better match to its signal statistics. Both intra-filter dependencies of the analytic signal and inter-filter dependencies between different orientation filters are exploited by a suitable vector quantization scheme. Inter-orientation filter dependencies are demonstrated by means of a statistical evaluation of the multidimensional probability density function. The results can be seen as an empirical confirmation of the suitability of vector quantization in subband coding. Instead of generating a code book by use of a conventional design algorithm, we suggest a feature-specific partitioning of the multidimensional signal space matched to the properties of human vision. Using this coding scheme, satisfactory image quality can be obtained at about 0.78 bit/pixel.
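The conversion from an even/odd quadrature filter pair to the local amplitude and local phase representation is simply the polar decomposition of the analytic signal; a minimal sketch:

```python
import numpy as np

def local_amplitude_phase(even, odd):
    """Polar decomposition of a quadrature pair's responses:
    amplitude = sqrt(even^2 + odd^2), phase = atan2(odd, even)."""
    amp = np.hypot(even, odd)
    phase = np.arctan2(odd, even)
    return amp, phase
```

The appeal for coding is statistical: amplitude is smooth and non-negative while phase carries the fine structure, so each can be quantized on its own terms.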

Proceedings ArticleDOI
03 Apr 1990
TL;DR: A texture analysis approach superior to previous ones in such aspects as classification/segmentation performance and applicability is presented, based on a widely adopted human visual model which hypothesizes that the HVS processes input pictorial signals through a set of parallel and quasi-independent mechanisms or channels.
Abstract: A texture analysis approach superior to previous ones in such aspects as classification/segmentation performance and applicability is presented. It is based on a widely adopted human visual model which hypothesizes that the human visual system (HVS) processes input pictorial signals through a set of parallel and quasi-independent mechanisms or channels. This model is referred to as the multichannel spatial filtering model (MSFM). The core of the MSFM presently applied is the recently formulated cortical channel model (CCM), which attempts to model the process of texture feature extraction in each individual channel in the MSFM. With these models, successful algorithms for both texture classification and segmentation (texture edge detection) have been developed. The algorithm for texture feature extraction and classification is compared with the conventional benchmark, i.e., the gray-level cooccurrence matrix approach, and proves to be superior in many aspects. The algorithm for texture edge detection is tested under a variety of textured images, and good segmentation results are obtained.

Patent
30 Apr 1990
TL;DR: In this article, a perceptual metric, based on the properties of the sub-band filters, quantizer error distribution, and properties of the human visual system, is determined that provides the maximum amount of coding noise that may be introduced to each pixel in every sub-band without causing perceptible degradation of the coded image.
Abstract: An image-coding system reduces image data redundancies and perceptual irrelevancies through progressive sub-band coding. The image is separated into plurality of sub-bands ((0,0) to (3,3)). From this sub-band information, a perceptual metric, based on the properties of the sub-band filters, quantizer error distribution, and properties of the human visual system, is determined (28) which provides the maximum amount of coding noise that may be introduced to each pixel in every sub-band without causing perceptible degradation of the coded image. This perceptual metric is used to adjust the quantizer (25) used in encoding each sub-band signal. In addition, redundancy in the output of the quantizer is reduced using a multidimensional Huffman compression scheme (27).

Journal ArticleDOI
TL;DR: An approach based on data visualization and visual reasoning is described to transform the data objects and present sample data objects in a visual space and incorporates a semantic model of the database for fuzzy visual query translation.
Abstract: When a database increases in size, retrieving the data becomes a major problem. An approach based on data visualization and visual reasoning is described. The main idea is to transform the data objects and present sample data objects in a visual space. The user can use a visual language to incrementally formulate the information retrieval request in the visual space. A prototype system is described with the following features: (1) it is built on top of the SIL-ICON visual language compiler and therefore can be customized for different application domains; (2) it supports a fuzzy icon grammar to define reasonable visual sentences; (3) it incorporates a semantic model of the database for fuzzy visual query translation; and (4) it incorporates a VisualNet which stores the knowledge learned by the system in its interaction with the user so that the VisualReasoner can adapt its behavior.

Proceedings ArticleDOI
23 Oct 1990
TL;DR: The authors discuss visual information processing issues relevant to the research, methodology and data analyses used to develop the classification system, results of the empirical study, and possible directions for future research.
Abstract: An exploratory effort to classify visual representations into homogeneous clusters is discussed. The authors collected hierarchical sorting data from twelve subjects. Five principal groups of visual representations emerged from a cluster analysis of sorting data: graphs and tables, maps, diagrams, networks, and icons. Two dimensions appear to distinguish these clusters: the amount of spatial information and cognitive processing effort. The authors discuss visual information processing issues relevant to the research, methodology and data analyses used to develop the classification system, results of the empirical study, and possible directions for future research.

Journal ArticleDOI
Louis D. Silverstein, John H. Krantz, Frank E. Gomer, Yei-Yu Yeh, Robert W. Monty
TL;DR: A color matrix display (CMD) image-simulation system is described, together with a series of psychophysical image-quality experiments investigating basic visual parameters of sampled displays; the results are interpreted within the context of human visual system channel structure and elucidate the role of applied-vision studies in advanced display systems design.
Abstract: Color matrix displays (CMD’s) constitute a new generation of high-resolution, two-dimensional sampled visual display devices. We describe a CMD image-simulation system and a series of psychophysical image-quality experiments to investigate basic visual parameters of sampled displays. For binary CMD’s, a RGBG pixel mosaic was found to produce imaging superior overall to that of the more-conventional RGB mosaics. For CMD’s with gray-scale capability, a discrete approximation of a Gaussian band-limiting line-spread function dramatically improved CMD image quality for all pixel mosaics. Across the six image colors tested, image quality for band-limited CMD images approached asymptotic levels at approximately 3000 pixels deg⁻² and eight gray levels, regardless of the pixel mosaic. The results are interpreted within the context of contemporary views of human visual system channel structure and elucidate the role of applied-vision studies in advanced display systems design.

Proceedings ArticleDOI
30 Jan 1990
TL;DR: The proposed approach to stereo image coding takes advantage of the singleness of vision property of the human visual system and shows that a stereo image pair, in which one of the images is low-pass filtered and subsampled, is perceived as a sharp 3-D image.
Abstract: The proposed approach to stereo image coding takes advantage of the singleness of vision property of the human visual system. Experiments show that a stereo image pair, in which one of the images is low-pass filtered and subsampled, is perceived as a sharp 3-D image. The depth information is perceived due to the stereopsis effect, and the sharpness is maintained due to the details in the non-filtered image. A methodology for the evaluation of the compression effects on the 3-D perception of stereo images is presented. It is based on measurements of response-time and accuracy of human subjects performing simple 3-D perception tasks.
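A minimal sketch of the mixed-resolution idea: keep one view of the stereo pair at full resolution and represent the other by a blurred, subsampled version. A box filter stands in here for whatever low-pass filter the experiments used; at factor 2 the pair's raw sample count drops from 2N to 1.25N, about a 37.5% saving:

```python
import numpy as np

def lowpass_subsample(img, factor=2):
    """Low-pass (box average) and subsample one view of a stereo pair;
    the other view would be kept at full resolution."""
    H, W = img.shape
    H2, W2 = H - H % factor, W - W % factor          # crop to a multiple
    blocks = img[:H2, :W2].reshape(H2 // factor, factor, W2 // factor, factor)
    return blocks.mean(axis=(1, 3))                  # per-block average
```

Stereopsis supplies depth from the pair while the unfiltered view preserves perceived sharpness, which is what makes the asymmetric bit allocation viable.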

Journal ArticleDOI
TL;DR: An approach based upon data visualization and visual reasoning is suggested to transform the data objects and present sample data objects in a visual space so that the user can incrementally formulate the information retrieval request in the visual space.
Abstract: When the database grows larger and larger, the user no longer knows what is in the database, nor what should be retrieved. How to get at the data becomes a central problem for very large databases. We suggest an approach based upon data visualization and visual reasoning. The idea is to transform the data objects and present sample data objects in a visual space. The user can then incrementally formulate the information retrieval request in the visual space. By combining data visualization, visual query, visual examples and visual clues, we hope to come up with better ways for formulating and modifying a user's query. A prototype system using the Visual Language Compiler and the VisualNet is then described.

Journal ArticleDOI
Lucia M. Vaina
01 Apr 1990-Synthese
TL;DR: Clinical evidence for the existence of two visual systems in man, one specialized for spatial vision and the other for object vision, and the computational hypothesis that these two systems consist of several visual modules are presented.
Abstract: In this paper we focus on the modularity of visual functions in the human visual cortex, that is, the specific problems that the visual system must solve in order to achieve recognition of objects and visual space. The computational theory of early visual functions is briefly reviewed and is then used as a basis for suggesting computational constraints on the higher-level visual computations. The remainder of the paper presents neurological evidence for the existence of two visual systems in man, one specialized for spatial vision and the other for object vision. We show further clinical evidence for the computational hypothesis that these two systems consist of several visual modules, some of which can be isolated on the basis of specific visual deficits which occur after lesions to selected areas in the visually responsive brain. We will provide examples of visual modules which solve information processing tasks that are mediated by specific anatomic areas. We will show that the clinical data from behavioral studies of monkeys (Ungerleider and Mishkin 1984) supports the distinction between two visual systems in monkeys, the ‘what’ system, involved in object vision, and the ‘where’ system, involved in spatial vision.

Book ChapterDOI
01 Jan 1990
TL;DR: This chapter focuses on neural computers for foveating vision systems, and it becomes apparent that the visual system is spatially inhomogeneous in that only a small area near the center of the retina, the so-called fovea, affords acute vision, and that the rate of both spatial sampling and processing decreases toward the periphery.
Abstract: Publisher Summary This chapter focuses on neural computers for foveating vision systems. Most currently-available schemes of, and systems for, image acquisition and processing for the purpose of visual communication and/or high-level computer vision, sample and process the visual data in a uniform manner. Such systems do not adapt automatically to the specific structure of the visual environment and/or the task to be accomplished. Consequently, such vision systems do not allocate efficiently the available acquisition and computational resources. Analysis of cell density, neural circuitry, receptive field physiology, and the rules that govern the retinotopic mapping, reveal some basic principles of organization, and the functional architecture of the early stages of the visual pathway. In particular, it becomes apparent that the visual system is spatially inhomogeneous in that only a small area near the center of the retina, the so-called fovea, affords acute vision, and that the rate of both spatial sampling and processing decreases toward the periphery.

Proceedings ArticleDOI
03 Apr 1990
TL;DR: A new segmentation-based image coding technique which performs segmentation based on roughness of textural regions and on properties of the human visual system (HVS) is presented.
Abstract: A new segmentation-based image coding technique which performs segmentation based on roughness of textural regions and on properties of the human visual system (HVS) is presented. The image is segmented into texturally homogeneous regions with respect to the degree of roughness as perceived by the HVS. The fractal dimension is used to measure the roughness of the textural regions. The segmentation is accomplished by thresholding the fractal dimension so that textural regions are classified into several classes. Three texture classes are chosen: perceived constant intensity, smooth texture, and rough texture. An image coding system with high compression and good image quality is achieved by developing an efficient coding technique for each texture class.
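Fractal dimension as a roughness measure can be estimated, for instance, by box counting over a binary texture/edge mask; the paper's own estimator may differ, and the thresholds that split the result into the three perceptual classes would be tuned empirically:

```python
import numpy as np

def box_counting_dimension(mask, sizes=(1, 2, 4, 8)):
    """Estimate the box-counting dimension of a binary mask: count the
    occupied s x s boxes at several scales s, then fit the slope of
    log N(s) versus log(1/s)."""
    counts = []
    for s in sizes:
        H, W = mask.shape
        h, w = H - H % s, W - W % s                  # crop to a multiple of s
        blocks = mask[:h, :w].reshape(h // s, s, w // s, s)
        counts.append((blocks.max(axis=(1, 3)) > 0).sum())
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
    return slope
```

Thresholding the per-region estimate (near 1: smooth contours; near 2: space-filling roughness) then yields a class label that selects the coder for that region.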

Journal ArticleDOI
TL;DR: A directional decomposition based sequence coding technique is presented, in which spatial lowpass and highpass components are analyzed and coded separately, and a simple law for sharing the available bits between these components is stated and analytically proved.
Abstract: Second generation image coding techniques, which use information about the human visual system to reach high compression ratios, have proven very successful when applied to single images. These methods can also be applied to image sequences. A directional decomposition based sequence coding technique is presented, in which spatial lowpass and highpass components are analyzed and coded separately. A simple law for sharing the available bits between these components is stated and analytically proved by using a minimum cost/resolution optimality criterion. The detection of directional elements is carried out by using both linear and nonlinear (median) filtering. The coding is based on near optimal estimators which retain only the innovation part of information, and is well suited for differential pulse code modulation. The results of applying this method to a typical sequence are shown. The estimated compression ratio is approximately 320 : 1 (0.025 bits per pixel), allowing a transmission rate of about 41 kbit/second. The resulting image quality is reasonably good.

Book ChapterDOI
01 Mar 1990
TL;DR: In this paper, the authors consider the problem where the 3D motion of an object corresponding to a known 3D model is to be tracked using only the motion of 2D features in the stream of images.
Abstract: A major issue in computer vision is the interpretation of three-dimensional (3D) motion of moving objects from a continuous stream of two-dimensional (2D) images. In this paper we consider the problem where the 3D motion of an object corresponding to a known 3D model is to be tracked using only the motion of 2D features in the stream of images. Two general solution paradigms for this problem are characterized: (1) motion-searching, which hypothesizes and tests 3D motion parameters, and (2) motion-calculating, which uses back-projection to directly estimate 3D motion from image-feature motion. Two new algorithms for computing 3D motion based on these two paradigms are presented. One of the major novel aspects of both algorithms is their use of the assumption that the input image stream is spatiotemporally dense. This constraint is psychologically plausible since it is also used by the short-range motion processes in the human visual system.

01 Jan 1990
TL;DR: A texture segmentation algorithm inspired by the multi-channel filtering theory for visual information processing in the early stages of the human visual system is presented, and a simple procedure to incorporate spatial adjacency information in the clustering process is proposed.
Abstract: We present a texture segmentation algorithm inspired by the multi-channel filtering theory for visual information processing in the early stages of the human visual system. The channels are characterized by a bank of Gabor filters that nearly uniformly covers the spatial-frequency domain. We propose a systematic filter selection scheme which is based on reconstruction of the input image from the filtered images. Texture features are obtained by subjecting each (selected) filtered image to a nonlinear transformation and computing a measure of “energy” in a window around each pixel. An unsupervised square-error clustering algorithm is then used to integrate the feature images and produce a segmentation. A simple procedure to incorporate spatial adjacency information in the clustering process is also proposed. We report experiments on images with natural textures as well as artificial textures with identical 2nd- and 3rd-order statistics.

Journal ArticleDOI
01 Feb 1990
TL;DR: An approach is presented for the automation of important aspects of human visual inspection in quality control that applies an algorithm for optimizing convolution masks to distinguish between acceptable and unacceptable images.
Abstract: An approach is presented for the automation of important aspects of human visual inspection in quality control. Pattern recognition and digital image processing are used to detect and classify defects in full gray-scale images of complex mechanical assemblies. The method simulates the processes of adaptation, fixation, and feature extraction in the human visual system. It applies an algorithm for optimizing convolution masks to distinguish between acceptable and unacceptable images. As a numerical example, the technique is used to detect a number of defects in X-ray images of complex mechanical assemblies.