
Showing papers on "Human visual system model published in 1985"


Journal ArticleDOI
TL;DR: MIRAGE, a theory for the primitive coding of the (1D) spatial distribution of luminance changes by the human visual system, is developed from a theoretical examination of the practical problems associated with the characterization of such changes.

351 citations


Journal ArticleDOI
01 Apr 1985
TL;DR: An overview of the theory of sampling and reconstruction of multidimensional signals is presented, including the role of the camera and display apertures and the human visual system, along with a class of nonlinear interpolation algorithms which adapt to the motion in the scene.
Abstract: Sampling is a fundamental operation in all image communication systems. A time-varying image, which is a function of three independent variables, must be sampled in at least two dimensions for transmission over a one-dimensional analog communication channel, and in three dimensions for digital processing and transmission. At the receiver, the sampled image must be interpolated to reconstruct a continuous function of space and time. In imagery destined for human viewing, the visual system forms an integral part of the reconstruction process. This paper presents an overview of the theory of sampling and reconstruction of multidimensional signals. The concept of sampling structures based on lattices is introduced. The important problem of conversion between different sampling structures is also treated. This theory is then applied to the sampling of time-varying imagery, including the role of the camera and display apertures, and the human visual system. Finally, a class of nonlinear interpolation algorithms which adapt to the motion in the scene is presented.
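
As a rough illustration of the motion-adaptive interpolation idea, the sketch below (in Python, with an assumed per-pixel frame-difference detector and threshold; none of this comes from the paper itself) reconstructs a missing frame by temporal averaging in static regions and falls back to a spatial estimate where motion is detected:

import numpy as np

def motion_adaptive_interpolate(prev_frame, next_frame, motion_thresh=10.0):
    """Reconstruct a missing frame between two known frames.

    Static regions are filled by temporal averaging; regions flagged as
    moving fall back to a spatial estimate (here, simply the nearer frame),
    which is the kind of nonlinear, motion-adaptive switching the paper
    describes.  The detector, threshold and fallback are illustrative only.
    """
    prev_f = prev_frame.astype(np.float64)
    next_f = next_frame.astype(np.float64)

    frame_diff = np.abs(next_f - prev_f)      # crude per-pixel motion detector
    moving = frame_diff > motion_thresh       # motion flag

    temporal = 0.5 * (prev_f + next_f)        # good where nothing moves
    spatial = prev_f                          # fallback where motion is detected

    return np.where(moving, spatial, temporal)

# Usage: two frames of the same size, with a simulated moving patch.
prev_frame = np.random.randint(0, 256, (240, 320))
next_frame = prev_frame.copy()
next_frame[100:140, 150:200] += 40
middle = motion_adaptive_interpolate(prev_frame, next_frame)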

301 citations


Journal ArticleDOI
N. Nill1
TL;DR: A new analytical solution, taking the form of a straightforward multiplicative weighting function, is developed which is readily applicable to image compression and quality assessment in conjunction with a visual model and the image cosine transform.
Abstract: Utilizing a cosine transform in image compression has several recognized performance benefits, resulting in the ability to attain large compression ratios with small quality loss. Also, incorporation of a model of the human visual system into an image compression or quality assessment technique intuitively should (and has often proven to) improve performance. Clearly, then, it should prove highly beneficial to combine the image cosine transform with a visual model. In the past, combining these two has been hindered by a fundamental problem resulting from the scene alteration that is necessary for proper cosine transform utilization. A new analytical solution to this problem, taking the form of a straightforward multiplicative weighting function, is developed in this paper. This solution is readily applicable to image compression and quality assessment in conjunction with a visual model and the image cosine transform. In the development, relevant aspects of a human visual system model are discussed, and a refined version of the mean square error quality assessment measure is given which should increase this measure's utility.
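
A minimal sketch of the general idea of combining the image cosine transform with a visual weighting: the csf_weight curve below is an invented band-pass placeholder, not the multiplicative function derived in the paper, and weighted_mse merely stands in for the refined mean square error measure the paper proposes.

import numpy as np
from scipy.fft import dctn   # assumes SciPy is available

def csf_weight(shape, peak=0.2):
    """Hypothetical contrast-sensitivity-style weighting over DCT frequencies:
    a band-pass bump that equals 1 at normalized radial frequency r == peak.
    A stand-in for the paper's analytically derived weighting function."""
    u = np.arange(shape[0])[:, None] / shape[0]
    v = np.arange(shape[1])[None, :] / shape[1]
    r = np.sqrt(u**2 + v**2)
    return (r / peak) * np.exp(1.0 - r / peak)

def weighted_mse(reference, test):
    """Mean square error computed on visually weighted DCT coefficients."""
    w = csf_weight(reference.shape)
    ref_dct = dctn(reference.astype(np.float64), norm='ortho')
    tst_dct = dctn(test.astype(np.float64), norm='ortho')
    return float(np.mean((w * (ref_dct - tst_dct)) ** 2))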

297 citations


Journal ArticleDOI
TL;DR: It was concluded that the neural processing of information along the visual pathways of the two species is generally similar and that the monkey is an excellent model of the human visual system.
Abstract: Data for three fundamental psychophysical functions (spatial modulation sensitivity, temporal modulation sensitivity, and increment-threshold spectral sensitivity) were compared for groups of 12 rhesus monkeys and 12 human subjects. It was found that there are important, nontrivial differences between the data for monkeys and humans, but that many of the differences could be accounted for by structural or passive differences in the visual systems. Therefore, it was concluded that the neural processing of information along the visual pathways of the two species is generally similar and that the monkey is an excellent model of the human visual system.

86 citations


Proceedings ArticleDOI
01 Dec 1985
TL;DR: Different types of nonstationary constrained iterative image restoration algorithms are introduced which incorporate properties of the response of the human visual system and can be used for any type of linear constraint and distortion operators.
Abstract: This paper introduces different types of nonstationary constrained iterative image restoration algorithms. The adaptivity of the algorithm is introduced by the constraint operator which incorporates properties of the response of the human visual system. The properties of the visual system are represented by noise masking and visibility functions. A new way of computing the masking function is also introduced. The proposed algorithms are general and can be used for any type of linear constraint and distortion operators. The algorithms can also be used to restore signals different than images, since the constraint operator can be interpreted as adapting to the local signal activity.
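
The flavour of such an adaptive constrained iteration can be sketched as follows; the visibility function 1/(1 + theta*variance), the Gaussian blur standing in for the distortion operator, and all parameter values are assumptions for illustration, not the paper's definitions.

import numpy as np
from scipy.ndimage import convolve, uniform_filter, gaussian_filter

def local_visibility(image, theta=0.01, size=7):
    """Noise-visibility weight: near 1 in flat regions, where noise is easily
    seen, and near 0 in busy regions, where spatial detail masks the noise."""
    mean = uniform_filter(image, size)
    var = np.maximum(uniform_filter(image**2, size) - mean**2, 0.0)
    return 1.0 / (1.0 + theta * var)

def adaptive_restore(blurred, blur_sigma=2.0, beta=1.0, alpha=0.05, iters=50):
    """Minimal sketch of a nonstationary constrained iteration: a gradient
    step on the data term plus a Laplacian smoothing step that is applied
    mainly where the visibility weight says noise would be seen."""
    lap = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=np.float64)
    x = blurred.astype(np.float64).copy()
    v = local_visibility(x)
    for _ in range(iters):
        residual = blurred - gaussian_filter(x, blur_sigma)   # y - Hx
        data_step = gaussian_filter(residual, blur_sigma)     # H^T(y - Hx), H symmetric
        smooth_step = v * convolve(x, lap)                     # visibility-weighted diffusion
        x = x + beta * (data_step + alpha * smooth_step)
    return x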

42 citations


Journal ArticleDOI
TL;DR: A method of representing image data sets in the form of naturally occurring variables in a realistic apparently three-dimensional scene is presented, which relies on techniques for the modeling of surfaces and surface reflectance to render the synthesised scenes realistically.
Abstract: Superimposition of two image data sets allows the spatial distribution of one to be directly related to that of the other. If the two data sets have different spatial structures, the composite image is generally confusing and difficult to interpret. A method of representing image data sets in the form of naturally occurring variables in a realistic apparently three-dimensional scene is presented. One data set is represented by the topography of a surface, depicted by shaded-relief methods, while another is represented by the color of the surface, or by the color of an overlaid transparency. Presentation in this form exploits the normal scene decomposition abilities of the human visual system, allowing intuitive appreciation and separation of the scene, and hence data set, variables. The method relies on techniques for the modeling of surfaces and surface reflectance to render the synthesised scenes realistically.
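
A small sketch of the display method, assuming matplotlib and two synthetic co-registered data sets: one drives shaded-relief topography, the other is draped over it as colour.

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LightSource, Normalize

# Two co-registered data sets on the same grid (synthetic stand-ins).
y, x = np.mgrid[0:256, 0:256]
elevation_data = np.sin(x / 30.0) * np.cos(y / 40.0)   # shown as surface topography
overlay_data = np.hypot(x - 128, y - 128)               # shown as surface colour

# Shaded relief depicts one variable; the colour draped over it depicts the other.
ls = LightSource(azdeg=315, altdeg=45)
colours = plt.cm.viridis(Normalize()(overlay_data))[..., :3]
rgb = ls.shade_rgb(colours, elevation_data, blend_mode='overlay', vert_exag=10)

plt.imshow(rgb)
plt.axis('off')
plt.show()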

33 citations


Journal ArticleDOI
Jacob Beck1
TL;DR: The different tactics employed by human and machine vision systems in judging transparency are compared; the tendency for the human visual system to see a simple organization leads to the perception of transparency even when the intensity pattern indicates transparency to be physically impossible.
Abstract: The different tactics employed by human and machine vision systems in judging transparency are compared. Instead of luminance or reflectance (relative luminance), the human visual system uses lightness, a nonlinear function of reflectance, to estimate transparency. The representation of intensity information in terms of lightness restricts the operations that can be applied, and does not permit solving the equations describing the occurrence of transparency. Instead, the human visual system uses algorithms based on simple order and magnitude relations. One consequence of the human visual system not using a mathematically correct procedure is the occurrence of nonveridical perceptions of transparency. A second consequence is that the human visual system is not able to make accurate judgments of the degree of transparency. Figural cues are also important in the human perception of transparency. The tendency for the human visual system to see a simple organization leads to the perception of transparency even when the intensity pattern indicates transparency to be physically impossible. In contrast, given the luminances or reflectances, a machine vision system can apply the relevant equations for additive and subtractive color mixture to give veridical and quantitatively correct judgments of transparency.
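
For the machine-vision side, the additive (Metelli-style) mixture equations can be solved directly when luminances are available. The sketch below assumes the classic bipartite display, with two background luminances a and b seen directly and as p and q through the layer; the function name and validity test are illustrative.

def solve_transparency(a, b, p, q):
    """Additive (Metelli-style) model of transparency.

    a, b : luminances of two background regions seen directly
    p, q : luminances of the same regions seen through the overlying layer
    Solves  p = alpha*a + (1 - alpha)*t  and  q = alpha*b + (1 - alpha)*t
    for the layer transmittance alpha and layer luminance t; a machine
    vision system with access to luminances can apply this directly.
    """
    alpha = (p - q) / (a - b)
    t = (p - alpha * a) / (1.0 - alpha)
    physically_possible = 0.0 <= alpha <= 1.0 and t >= 0.0
    return alpha, t, physically_possible

# Example: a layer with alpha = 0.5 and luminance t = 20 is recovered exactly.
print(solve_transparency(a=100.0, b=40.0, p=60.0, q=30.0))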

22 citations


Journal ArticleDOI
TL;DR: A new combined encoding scheme is introduced for the luminance and chrominance signals based on the technique of vector quantization, taking into account the properties of the human visual system.
Abstract: A demand for color picture transmission is growing rapidly for various integrated services. For transmission, color pictures must be encoded, but it is widely understood that this can be most conveniently done in the form of luminance and chrominance signals, since the properties of the human visual system can be exploited to achieve considerable redundancy reduction. For encoding these component signals, it is most efficient if they can be jointly encoded. In this paper, a new combined encoding scheme is introduced for the luminance and chrominance signals based on the technique of vector quantization, taking into account the properties of the human visual system. However, these encoded signals are converted into primary RGB signals for display, and the corresponding increase of the encoding noise has not been discussed in the literature. In this paper, an encoding process is considered as a part of the total system between a camera and a display. Two different adaptive quantization schemes are incorporated in the encoding, and it will be shown that a good picture quality can be obtained with 2.0-2.5 bits/pel.
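
A rough sketch of joint luminance/chrominance vector quantization: blocks of luminance plus block-averaged chrominance form the vectors, and a plain k-means codebook training stands in for the paper's design procedure (the actual scheme also weights components by visual importance and uses adaptive quantization).

import numpy as np

def ycc_block_vectors(y, cb, cr, block=4):
    """Form joint vectors: a 4x4 luminance block plus the mean Cb and Cr over
    that block (a crude form of chrominance subsampling)."""
    h, w = y.shape
    vecs = []
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            vecs.append(np.concatenate([
                y[i:i+block, j:j+block].ravel(),
                [cb[i:i+block, j:j+block].mean(), cr[i:i+block, j:j+block].mean()],
            ]))
    return np.asarray(vecs, dtype=np.float64)

def train_codebook(vectors, codebook_size=256, iters=20, seed=0):
    """Generic k-means codebook training, a stand-in for the paper's design
    procedure."""
    rng = np.random.default_rng(seed)
    codebook = vectors[rng.choice(len(vectors), codebook_size, replace=False)]
    for _ in range(iters):
        d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for k in range(codebook_size):
            members = vectors[labels == k]
            if len(members):
                codebook[k] = members.mean(axis=0)
    return codebook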

15 citations


Dissertation
01 Jan 1985
TL;DR: This thesis introduces what is called soft or statistical constraints, thus making possible the direct incorporation of statistical information about the noise and the image into the algorithm, and develops constrained iterative restoration algorithms which directly incorporate deterministic and statistical knowledge about the original signal and noise into the restoration.
Abstract: This thesis considers the problem of image restoration, i.e., the problem of recovering the input image to a space-invariant or space-varying linear system from its noisy-blurred output. Iterative algorithms have been chosen for solving the restoration problem due to their advantages over non-iterative and recursive techniques. We introduce what we call soft or statistical constraints, thus making possible the direct incorporation of statistical information about the noise and the image into the algorithm. The first issue which is addressed in this thesis concerns the development of constrained iterative restoration algorithms which directly incorporate deterministic and statistical knowledge about the original signal and noise into the restoration. We call these algorithms regularized iterative restoration algorithms. In the regularized algorithms, deterministic knowledge about the original signal is again incorporated with the use of linear or nonlinear hard constraints as before, and also with the use of linear constraints which do not have the properties of projection operators, as hard constraints do. Information about the noise in the data is incorporated into the restoration through a regularization parameter (alpha) which controls the noise amplification in the restored signal. The regularization parameter and the linear non-hard constraint are combined into a new constraint, which we call soft or statistical constraint. Two new types of soft constraint operators are also developed. The second issue which is addressed concerns the development of adaptive regularized iterative restoration algorithms. The soft constraint operator controls the trade-off between sharpness and noise amplification in the restored image. Since the same amount of noise is perceived differently by the human observer in different regions of the image, the response of the human visual system is incorporated into the soft constraint operator. Different ways for adapting the soft constraint are proposed. The local variance is found to be a useful measure of the spatial detail in the image. Experimental results obtained by application of our nonadaptive or adaptive regularized iterative algorithms to simulated and photographically blurred images are shown in the thesis. We conclude that our algorithms perform better than stochastic optimum restoration filters in the range of signal-to-noise ratios 3 to 20 decibels. (Abstract shortened with permission of author.)
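
One concrete piece of this, sketched below, is a spatially varying weight for the soft constraint computed from the local variance; the 1/(1 + theta*variance) form and the parameter values are assumptions, not the thesis's exact operator.

import numpy as np
from scipy.ndimage import uniform_filter

def soft_constraint_weights(image, window=7, theta=0.02):
    """Spatially varying weight for a 'soft' smoothness constraint.

    Local variance serves as a measure of spatial detail: flat regions (low
    variance) get a strong smoothing weight because noise is visible there,
    while detailed regions get a weak one because detail masks the noise.
    """
    img = image.astype(np.float64)
    mean = uniform_filter(img, window)
    var = np.maximum(uniform_filter(img**2, window) - mean**2, 0.0)
    return 1.0 / (1.0 + theta * var)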

15 citations


Journal ArticleDOI
01 Oct 1985-Displays
TL;DR: In this article, the authors found that contrast of low-spatial-frequency information is more important for enhancing apparent sharpness than is high spatial resolution in a moving image, which is understandable when considering the performance characteristics of the human visual system.

12 citations


01 Jan 1985
TL;DR: The smart sensor problem is studied, new concepts are developed, new algorithms for implementing an intelligent enormous matrix inversion are proposed, and research areas for further exploration are discussed.
Abstract: To design lightweight smart sensor systems which are capable of outputting motion-invariant features useful for automatic pattern recognition systems, we must turn to the simultaneous image processing and feature extraction capability of the human visual system (HVS) to enable operation in real time, on a mobile platform, and in a "natural environment." This dissertation studies the smart sensor problem, develops new concepts, supported by simulation, and discusses research areas for further exploration. An n² parallel data throughput architecture implemented through a hardwired algotecture which accomplishes, without computation, an equivalent logarithmic coordinate mapping is presented. The algotecture mapping provides, at the sensor level, the ability to convert scales and rotations in the input plane into shifts in the algotecture mapped space. The resulting invariant leading edge is shown to possess an intensity preserving property for arbitrary variations of image size. The sensitivity of the algotecture to center mismatch is discussed in terms of the difference between coordinate and functional transformation methods. A mathematical link is established between the lateral subtractive inhibition (SI) and multiple spatial filtering (MSF) mechanisms of the HVS, which coexist and function simultaneously. The feature extraction filter in visual neurophysiology, known as the novelty filter, is identified as the first feedback term of an iterative expansion of the sensory mapping point spread function. The use of the algotecture space combined with the image plane MSF approach is explored for detection and classification using template cross-correlation methods of recognition. The concept of using a three spatial frequency band model, based upon HVS physiological and psychophysical data, for an intra-class, inter-class, and membership-identification classification scheme is introduced. This concept is extended to represent the feature vector entries for not only each spatial frequency band but also each image view angle in the recognition library. A new algorithm for implementing an intelligent enormous matrix inversion is proposed. Such an inverse problem exists in the solution of the negative feedback equation for SI and has a form that lends itself easily to parallel processing. The solution provides a means to solve the inversion even though one of the partitioned submatrices is singular. Procedures are given to construct a partition tree which is analogous to quadtree partitioning methods in image processing. Finally, a simple matrix inversion example is worked out to demonstrate how the algorithm works.
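
The scale/rotation-to-shift property of a logarithmic coordinate mapping can be illustrated with a simple software resampling (the dissertation realizes it without computation, in a hardwired sensor architecture); the grid sizes and nearest-neighbour sampling below are illustrative choices.

import numpy as np

def log_polar_map(image, n_rho=64, n_theta=64, r_min=1.0):
    """Resample an image onto a log-polar grid about its centre.

    Scaling the input about the centre shifts the output along the rho axis;
    rotating it shifts the output along the theta axis, which is the
    scale/rotation-to-shift property exploited here.
    """
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cy, cx)
    rho = np.linspace(np.log(r_min), np.log(r_max), n_rho)
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    rr = np.exp(rho)[:, None]                       # one radius per output row
    yy = np.clip(np.round(cy + rr * np.sin(theta)), 0, h - 1).astype(int)
    xx = np.clip(np.round(cx + rr * np.cos(theta)), 0, w - 1).astype(int)
    return image[yy, xx]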

Journal ArticleDOI
TL;DR: The experiments were aimed at mechanisms by which spatial frequency information could be transferred from one part of the visual field to another, suggesting mechanisms within the visual system mathematically best described by a local spectrum analysis.

Proceedings ArticleDOI
26 Apr 1985
TL;DR: It is found that the reconstructed images in this case are sharper and exhibit much less of the staircase effect usually found in matrix quantizers that do not include a reasonably accurate model to reflect the human visual perception process.
Abstract: This paper discusses the design of a matrix quantizer incorporating the human visual model for image encoding. It is well known that in image processing the squared-error distortion measure is not the most suitable yardstick for evaluating the subjective quality of reconstructed images. Since in almost all image data compression systems the human observer is the ultimate destination, one may take into account some parameters of the human visual system in computing the distance measure used in the design of an encoding algorithm. The matrix quantizer design and encoding algorithms have been modified to include a reasonably accurate model reflecting the human visual perception process. It is found that the reconstructed images in this case are sharper and exhibit much less of the staircase effect usually found in matrix quantizers that do not include such a model. The results were compared at rates of 1 and 0.56 bits per pixel (bpp) with those of straight matrix quantizers of similar rate and size.

Book ChapterDOI
01 Jan 1985
TL;DR: This chapter discusses the visual perception of an intelligent system with limited bandwidth, an active process that uses past experience and expectations to aid in its construction of a plausible visual world from seemingly ambiguous sense-data.
Abstract: This chapter discusses the visual perception of an intelligent system with limited bandwidth. Visual stimuli, be they objects in the real world or patterns on a cathode ray tube, can be described in terms of their size, shape, orientation, color, and movement. In many of these dimensions, the human visual system acts like a filter, responding to some parts of the dimension while remaining oblivious to others. An appreciation of these limitations is important because it guards against requiring the visual system to perform beyond its physical constraints, and because it enables an economy of representation when constructing visual displays: one does not need to generate those parts of a pattern that will be invisible to the visual system. Many of the constraints on vision are established in the early stages of visual processing and can be traced to aspects of the anatomy and physiology of the peripheral visual system. Other constraints appear at a cognitive level. At this higher level, visual processing appears to be an active process that uses past experience and expectations to aid in its construction of a plausible visual world from seemingly ambiguous sense-data. The rigidity of the physical limitations of the peripheral visual system makes recommendations on the physical dimensions of visual stimuli possible. Whether the use of color, highlighted words, or flashing symbols will help or hinder the user is often impossible to determine outside the particular task being considered. Fortunately, these decisions can be made after reasonably simple experiments have been carried out.

01 Jan 1985
TL;DR: Clinical and anatomical evidence for parallel processing in the human visual system is presented and the need for overcoming the constraints of thinking that vision is the same as seeing is emphasized.
Abstract: There is growing evidence for parallel processing of visual information. Visual information, spatially or temporally distinct, is transmitted to various regions of the brain. This paper presents clinical and anatomical evidence for parallel processing in the human visual system. The neuro-ophthalmologist often has psychophysical evidence for the separation of visual functions. Our own investigations have demonstrated that brightness sense and other visual functions may be impaired out of proportion to visual acuity in diseases of the optic nerve. Classes of retinal ganglion cells have been morphologically and physiologically described in several experimental animals. No such classification of retinal ganglion cell types has been made in man. However, psychophysical and retinal electrophysiological human studies suggest the segregation of human retinal ganglion cells into classes which subserve different functions. A new staining method (PPD) has made it possible to directly study the visual pathways in man. With this method, we have documented several previously undescribed human visual pathways to different brain visual nuclei: the lateral geniculate nucleus, the pretectum, the superior colliculus, the pulvinar, and three nuclei of the hypothalamus (SCN, PVN, SON). We have also developed a method which permits the accurate and rapid measurement of human retinal ganglion cell axon diameters through the optic nerve and through the fascicles of optic fibers entering several of these recently described visual nuclei. There is evidence for three size classes of axons which differentially distribute to the visual nuclei. These studies emphasize the need for overcoming the constraints of thinking that vision is the same as seeing.(ABSTRACT TRUNCATED AT 250 WORDS)

Journal ArticleDOI
TL;DR: A modified version of Stevens' model for perceiving structure in complex patterns is described; it has a fast implementation on parallel 'pyramid' hardware and was found to behave quite similarly to Stevens' model.

Book ChapterDOI
01 Jan 1985
TL;DR: A computational theory of retinal filtering is presented; results suggest that spatial-frequency filtering is performed in the human visual system, and neurophysiological and psychophysical data suggest that segmentation is the next stage after retinal processing.
Abstract: The visual system is considered as an information processing system whose information processing task may consist in the localization and recognition of objects in the 3-dimensional physical world. After some definitions concerning terms like physical world and its projection, image, feature, and segmentation, the processing in the first stages of the visual system (low-level vision) is discussed. A computational theory of retinal filtering is presented and related to the anatomy and physiology of the retina as well as to psychophysical results suggesting that a spatial-frequency filtering is performed in the human visual system. Furthermore, neurophysiological and psychophysical data suggest that segmentation is the next stage after retinal processing. Here surface elements are extracted which follow from motion, texture, or the depth of objects (stereopsis) relative to the observer. Some results of our simulation of low-level vision (retina, primary visual cortex) are presented for natural images. In particular, these results are related to the extraction of contour points and edges for the reconstruction of the original gray values and the description of forms, and to the evaluation of statistical parameters for the description of textures in different spatial-frequency domains.
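
The retinal spatial-frequency filtering discussed here is commonly modelled as centre-surround (difference-of-Gaussians) filtering at several scales; a minimal sketch follows, with assumed sigma values and surround ratio.

import numpy as np
from scipy.ndimage import gaussian_filter

def retinal_channels(image, center_sigmas=(1.0, 2.0, 4.0), surround_ratio=1.6):
    """Difference-of-Gaussians (centre-surround) filtering at several scales,
    a standard stand-in for retinal spatial-frequency filtering."""
    img = image.astype(np.float64)
    return [gaussian_filter(img, s) - gaussian_filter(img, surround_ratio * s)
            for s in center_sigmas]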

Proceedings ArticleDOI
11 Jul 1985
TL;DR: A theory for how possible objects, called "blobs", will be represented in an image is presented, some measures of importance for these candidate objects are explored, and an example algorithm is given that can quickly generate a list of possible object locations for the identification computation.
Abstract: A vision machine must locate possible objects before it can identify them. In simple images, where the objects and illumination are known, locating possible objects can be part of, and secondary to, identification. In complex, natural images it is more efficient to use a quick and simple method to locate possible objects first, and then to selectively identify them. This parallels the strategy normally used by the human visual system. We present a theory for how possible objects, called "blobs", will be represented in an image, and explore some measures of importance for these candidate objects. An example algorithm based on this theory can quickly generate a list of possible object locations for the identification computation. We discuss the implementation of this example algorithm on a standard image processing system.
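
A quick-and-simple candidate-generation pass of this kind might look like the following sketch; the smoothing, the mean-plus-k-standard-deviations threshold, and the use of area as the importance measure are illustrative assumptions, not the paper's definitions.

import numpy as np
from scipy.ndimage import gaussian_filter, label, find_objects

def find_blobs(image, sigma=3.0, k=1.0):
    """Propose candidate object locations before any identification step.

    Smooth, threshold, and label connected regions; each region's area and
    bounding box are returned, largest first, as a crude importance ranking.
    """
    smooth = gaussian_filter(image.astype(np.float64), sigma)
    mask = smooth > smooth.mean() + k * smooth.std()
    labels, n = label(mask)
    blobs = []
    for idx, box in enumerate(find_objects(labels), start=1):
        area = int((labels[box] == idx).sum())
        blobs.append((area, box))
    return sorted(blobs, key=lambda blob: blob[0], reverse=True)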

Journal ArticleDOI
TL;DR: In this article, an algorithm for sharpening blurred as well as noisy images is proposed; it is implemented on a set of images, and the performance of the present technique is compared with that of other conventional spatial-domain techniques.

Proceedings ArticleDOI
16 Sep 1985
TL;DR: The physiologically based HVS model is incorporated with the popular two-dimensional discrete cosine transform (DCT) coding to encode digitized images of diagnostic human tissue sections acquired by scanning transmission electron microscopy (STEM).
Abstract: Data compression will reduce storage device problems and cut down the time required for transmission of high resolution medical images. In order to improve subjective image quality, various mathematical models of the human visual system (HVS) have been proposed for image compression applications. Most of these models are based upon data from psychophysical experiments on human vision. Since some nonlinear characteristics are involved in the human visual system, the psychophysically based model may not reflect the real HVS. In this paper, the physiologically based HVS model is incorporated with the popular two-dimensional discrete cosine transform (DCT) coding to encode digitized images of diagnostic human tissue sections acquired by scanning transmission electron microscopy (STEM). Simulation results are compared at 1 and 0.5 bit/pixel between systems with and without the HVS model. Superior subjective image quality was observed from the system with the HVS model. A study of coding scheme mismatch was also conducted to evaluate the robustness of DCT coding for medical image applications. Due to the strong similarity between the statistics of images, simulation results showed little degradation from coding scheme mismatch.


Proceedings Article
18 Aug 1985
TL;DR: In this article, it is shown that it is frequently possible to structure the problem as that of recovering depth from a stereo pair consisting of a conventional perspective image (the original image) and an orthographic image (the virtual image).
Abstract: A single 2-D image is an ambiguous representation of the 3-D world: many different scenes could have produced the same image, yet the human visual system is extremely successful at recovering a qualitatively correct depth model from this type of representation. Workers in the field of computational vision have devised many distinct schemes that attempt to duplicate this ability of human vision; these schemes are collectively called "shape from ..." methods (e.g., shape from shading, shape from texture, shape from contour). In this paper we argue that the distinct assumptions employed by each of these different schemes must be equivalent to providing a second (virtual) image of the original scene, and that all of these different approaches can be translated into a conventional stereo formalism. In particular, we show that it is frequently possible to structure the problem as that of recovering depth from a stereo pair consisting of a conventional perspective image (the original image) and an orthographic image (the virtual image). We provide a new algorithm of the form required to accomplish this type of stereo reconstruction task.