
Showing papers on "Human visual system model" published in 1993


Journal ArticleDOI
TL;DR: This paper considers the task of detection of a weak signal in a noisy image and suggests the Hotelling model with channels as a useful model observer for the purpose of assessing and optimizing image quality with respect to simple detection tasks.
Abstract: Image quality can be defined objectively in terms of the performance of some "observer" (either a human or a mathematical model) for some task of practical interest. If the end user of the image will be a human, model observers are used to predict the task performance of the human, as measured by psychophysical studies, and hence to serve as the basis for optimization of image quality. In this paper, we consider the task of detection of a weak signal in a noisy image. The mathematical observers considered include the ideal Bayesian, the nonprewhitening matched filter, a model based on linear-discriminant analysis and referred to as the Hotelling observer, and the Hotelling and Bayesian observers modified to account for the spatial-frequency-selective channels in the human visual system. The theory behind these observer models is briefly reviewed, and several psychophysical studies relating to the choice among them are summarized. Only the Hotelling model with channels is mathematically tractable in all cases considered here and capable of accounting for all of these data. This model requires no adjustment of parameters to fit the data and is relatively insensitive to the details of the channel mechanism. We therefore suggest it as a useful model observer for the purpose of assessing and optimizing image quality with respect to simple detection tasks.

465 citations
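
The channelized Hotelling computation described above is compact enough to sketch numerically: the image is projected onto a few channel templates, and a linear discriminant is applied to the channel outputs, giving SNR^2 = (U s)^T (U K U^T)^{-1} (U s). In the minimal sketch below, the difference-of-Gaussians channel profiles, their octave spacing, and the white-noise covariance are illustrative assumptions, not the paper's exact choices.

```python
import numpy as np

def dog_channels(n_pixels, n_channels=4):
    """Illustrative radial difference-of-Gaussians channel templates
    (the paper's exact channel mechanism may differ)."""
    x = np.arange(n_pixels) - n_pixels / 2
    xx, yy = np.meshgrid(x, x)
    r = np.hypot(xx, yy)
    channels = []
    for j in range(n_channels):
        s = 2.0 ** j  # octave-spaced channel widths (assumed)
        profile = np.exp(-r**2 / (2 * (2 * s) ** 2)) - np.exp(-r**2 / (2 * s**2))
        channels.append(profile.ravel())
    return np.array(channels)  # shape: (n_channels, n_pixels**2)

def channelized_hotelling_snr(signal, noise_cov, channels):
    """Detectability of the Hotelling observer operating on channel
    outputs: SNR^2 = (U s)^T (U K U^T)^{-1} (U s)."""
    U = channels
    s_c = U @ signal.ravel()      # channelized signal
    K_c = U @ noise_cov @ U.T     # channelized noise covariance
    return float(np.sqrt(s_c @ np.linalg.solve(K_c, s_c)))

# Toy example: 16x16 Gaussian signal in white noise (assumption)
n = 16
x = np.arange(n) - n / 2
xx, yy = np.meshgrid(x, x)
signal = np.exp(-(xx**2 + yy**2) / 8.0)
noise_cov = np.eye(n * n)
print("channelized Hotelling SNR:", channelized_hotelling_snr(signal, noise_cov, dog_channels(n)))
```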


Journal ArticleDOI
TL;DR: This work investigates the quantification of depth and size perception of virtual objects relative to real objects in combined real and virtual environments, and preliminary experimental results on the perceived depth of spatially nonoverlapping real and virtual objects are presented.
Abstract: With the rapid advance of real-time computer graphics, head-mounted displays (HMDs) have become popular tools for 3D visualization. One of the most promising and challenging future uses of HMDs, however, is in applications where virtual environments enhance rather than replace real environments. In such applications, a virtual image is superimposed on a real image. The unique problem raised by this superimposition is the difficulty that the human visual system may have in integrating information from these two environments. As a starting point to studying the problem of information integration in see-through environments, we investigate the quantification of depth and size perception of virtual objects relative to real objects in combined real and virtual environments. This starting point leads directly to the important issue of system calibration, which must be completed before perceived depths and sizes are measured. Finally, preliminary experimental results on the perceived depth of spatially nonoverlapping real and virtual objects are presented.

192 citations


Patent
12 Aug 1993
TL;DR: In this paper, a support provided with a machine-detectable security element, particularly a copying-security element, is described; the security element comprises a first image perceptible to the human visual system that incorporates a second image which is substantially imperceptible to it.
Abstract: A support provided with a machine-detectable security element, particularly a copying-security element. The security element comprises a first image perceptible to the human visual system, said first image incorporating a second image that is substantially imperceptible to the human visual system.

145 citations


Book
01 Apr 1993
TL;DR: This book examines the current status of what is known (and not known) about human vision, how human observers interpret visual data, and how to present such data to facilitate their interpretation and use.
Abstract: From the Publisher: This book examines the current status of what is known (and not known) about human vision, how human observers interpret visual data, and how to present such data to facilitate their interpretation and use. Written by experts who are able to cross disciplinary boundaries, the book provides an educational pathway through several models of human vision; describes how the visual response is analyzed and quantified; presents current theories of how the human visual response is interpreted; discusses the cognitive responses of human observers; and examines such applications as space exploration, manufacturing, surveillance, earth and air sciences, and medicine. The book is intended for everyone with an undergraduate-level background in science or engineering with an interest in visual science.

128 citations


Journal ArticleDOI
TL;DR: Psychophysical and neurophysiological experiments have shown that the primate visual system's normally veridical interpretation of moving patterns is attained through the use of image segmentation cues unrelated to motion per se; these findings challenge notions of modularity.
Abstract: The problem of processing visual motion is underconstrained: many possible real-world motions are compatible with any given dynamic retinal image. Recent psychophysical and neurophysiological experiments have shown that the primate visual system's normally veridical interpretation of moving patterns is attained through utilization of image segmentation cues unrelated to motion per se. These findings challenge notions of modularity in which it is assumed that the processing of specific scene properties, such as motion, can be studied in isolation from other visual processes. We discuss the implications of these findings with regard to both experimental and computational approaches to the study of visual motion.

86 citations


Journal ArticleDOI
William S. Cleveland
TL;DR: A model has been developed to provide a framework for the study of visual decoding; it includes a specification of the visual operations that are employed to carry out pattern perception and table look-up.
Abstract: A method of statistical graphics consists of two parts: a selection of statistical information to be displayed and a selection of a visual display method to encode the information. Some display methods lead to efficient, accurate visual decoding of encoded information, and others lead to inefficient, inaccurate decoding. It is only through rigorous studies of visual decoding that informed judgments can be made about how to choose display methods. A model has been developed to provide a framework for the study of visual decoding. The model consists of three parts: (1) a two-way classification of information on displays—quantitative-scale, quantitative-physical, categorical-scale, and categorical-physical; (2) a division of the visual processing of graphical displays into pattern perception and table look-up; (3) a specification of visual operations that are employed to carry out pattern perception and table look-up. Display methods are assessed by studying the visual operations to which they lead....

77 citations


Journal ArticleDOI
TL;DR: This article addresses the problem of reliably measuring the absolute three-dimensional position of objects in an unknown and cluttered scene by using several range recovery techniques together, so that they cooperate in visual behaviors similar to those exhibited by the human visual system.
Abstract: This article addresses the problem of reliably measuring the absolute three-dimensional position of objects in an unknown and cluttered scene. It circumvents the limitations of a single sensor or single algorithm by using several range recovery techniques together, so that they cooperate in visual behaviors similar to those exhibited by the human visual system. Implemented visual behaviors include (i) aperture adjustment to vary depth of field and contrast, (ii) focus ranging followed by fixation, (iii) stereo ranging followed by focus ranging, and (iv) focus ranging followed by disparity prediction followed by focus ranging. The main contribution is a demonstration that two particular visual ranging processes—focusing and stereo—can cooperate to improve measurement reliability. The results of 75 experiments processing close to 3000 different object points lying at distances between 1 and 3 meters demonstrate that the computed range values are highly reliable.

70 citations
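
The stereo half of the focus/stereo cooperation rests on the standard pinhole disparity-to-depth relation Z = f * B / d. A minimal sketch, with illustrative focal length and baseline values rather than the article's actual rig parameters:

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole stereo with parallel optical axes: Z = f * B / d.
    Larger disparity means a closer object."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# Illustrative numbers: 700 px focal length, 12 cm baseline
print(depth_from_disparity(disparity_px=42.0, focal_px=700.0, baseline_m=0.12))  # -> 2.0 m
```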


Proceedings ArticleDOI
01 Apr 1993
TL;DR: In this paper, the authors survey and classify the criteria for the evaluation of monochrome image quality, noting that the most common objective criterion, the mean square error (MSE), correlates poorly with the viewer's response.
Abstract: Although a variety of techniques are available today for gray-scale image compression, a complete evaluation of these techniques cannot be made as there is no single reliable objective criterion for measuring the error in compressed images. The traditional subjective criteria are burdensome, and usually inaccurate or inconsistent. On the other hand, the mean square error (MSE), the most common objective criterion, does not correlate well with the viewer's response. It is now understood that in order to have a reliable quality measure, a representative model of the complex human visual system is required. In this paper, we survey and give a classification of the criteria for the evaluation of monochrome image quality.

66 citations
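
For reference, the MSE criticized above is trivial to compute, which is much of its appeal; the sketch below also includes PSNR, its logarithmic restatement. Two images with equal MSE can differ widely in perceived quality, which is what motivates the HVS-based criteria the paper surveys.

```python
import numpy as np

def mse(original, compressed):
    """Mean square error: the common objective criterion discussed above."""
    err = original.astype(np.float64) - compressed.astype(np.float64)
    return float(np.mean(err ** 2))

def psnr(original, compressed, peak=255.0):
    """Peak signal-to-noise ratio, a logarithmic restatement of MSE."""
    m = mse(original, compressed)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)
```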


Book
31 Dec 1993
TL;DR: A Pyramid Framework for Early Vision describes a multiscale approach to vision, including its theoretical foundations, and a set of pyramid-based modules for image processing, object detection, texture discrimination, contour detection and processing, feature detection and description, and motion detection and tracking.
Abstract: From the Publisher: Biological visual systems employ massively parallel processing to perform real-world visual tasks in real time. A key to this remarkable performance seems to be that biological systems construct representations of their visual image data at multiple scales. A Pyramid Framework for Early Vision describes a multiscale, or 'pyramid', approach to vision, including its theoretical foundations, a set of pyramid-based modules for image processing, object detection, texture discrimination, contour detection and processing, feature detection and description, and motion detection and tracking. It also shows how these modules can be implemented very efficiently on hypercube-connected processor networks. The volume is intended for both students of vision and vision system designers; it provides a general approach to vision systems design as well as a set of robust, efficient vision modules.

62 citations
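
The core pyramid operation is simple: repeatedly blur and subsample, so each level represents the image at a coarser scale. A minimal Gaussian-pyramid sketch (the book's modules are far richer, and hypercube implementations are not modeled here):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(image, levels=4, sigma=1.0):
    """Build a simple Gaussian pyramid: blur, then subsample by 2 at
    each level, giving a multiscale representation of the image."""
    pyramid = [image.astype(np.float64)]
    for _ in range(levels - 1):
        blurred = gaussian_filter(pyramid[-1], sigma)
        pyramid.append(blurred[::2, ::2])
    return pyramid
```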


Journal ArticleDOI
TL;DR: This paper introduces VPISC, a new digital image sequence (video) coding process that possesses significant advantages relative to other technologies; in particular, it is extremely efficient in terms of the computational effort required.
Abstract: Visual pattern image sequence coding (VPISC) is a new digital image sequence (video) coding process that possesses significant advantages relative to other technologies; in particular, it is extremely efficient in terms of the computational effort required. It is designed to exploit properties of the human visual system (HVS), and thus yields high visual fidelity. Visual quality criteria are deliberately chosen over information-theoretic ones on the grounds that, in images intended for human viewing, visual criteria are the most meaningful ones. VPISC yields impressive compression comparable to other recent methods, such as motion-compensated vector quantization. VPISC divides the image sequence into spatiotemporal cubes, which are then independently matched with one of a small, predetermined set of visually meaningful three-dimensional space-time patterns. The pattern set is chosen to conform to specific characteristics of the HVS. Also introduced are two modifications of VPISC: adaptive VPISC (AVPISC) and foveal VPISC (FVPISC). These are spatiotemporally nonuniform implementations that code different portions of the image sequence at different resolutions, according to either a fidelity criterion (for AVPISC) or a foveation criterion (for FVPISC).

50 citations
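
The coding step described above amounts to nearest-pattern matching over spatiotemporal cubes. A toy sketch follows; the random pattern set and cube size are placeholders, whereas the real VPISC patterns are designed around HVS properties.

```python
import numpy as np

def code_vpisc_like(frames, patterns, cube=(4, 4, 4)):
    """Toy VPISC-style coder: split a (T, H, W) sequence into
    spatiotemporal cubes and record the index of the best-matching
    pattern for each cube."""
    T, H, W = frames.shape
    ct, ch, cw = cube
    indices = []
    for t in range(0, T - ct + 1, ct):
        for y in range(0, H - ch + 1, ch):
            for x in range(0, W - cw + 1, cw):
                block = frames[t:t+ct, y:y+ch, x:x+cw]
                block = block - block.mean()  # match shape, not mean level
                errs = [np.sum((block - p) ** 2) for p in patterns]
                indices.append(int(np.argmin(errs)))
    return indices

# Usage with arbitrary stand-in patterns and a random toy sequence
rng = np.random.default_rng(0)
patterns = [p - p.mean() for p in rng.normal(size=(8, 4, 4, 4))]
frames = rng.normal(size=(8, 16, 16))
codes = code_vpisc_like(frames, patterns)
```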


Proceedings ArticleDOI
26 Jul 1993
TL;DR: In this work, two models of fish-eye transform are presented, and the validity of the transformations is demonstrated by fitting the alternative models to a real fish-eye lens.
Abstract: The human visual system can be characterized as a variable-resolution system: foveal information is processed at very high spatial resolution whereas peripheral information is processed at low spatial resolution. Various transforms have been proposed to model spatially varying resolution. Unfortunately, special sensors need to be designed to acquire images according to existing transforms. In this work, two models of fish-eye transform are presented. The validity of the transformations is demonstrated by fitting the alternative models to a real fish-eye lens.
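
A common way to realize such a space-variant resampling in software is a logarithmic radial mapping of the form r' = s * log(1 + lam * r); the sketch below uses that form with arbitrary constants, and is not necessarily either of the paper's two models.

```python
import numpy as np

def fisheye_warp_coords(h, w, s=30.0, lam=0.1):
    """Map output-pixel radii through a logarithmic fish-eye transform
    r' = s * log(1 + lam * r); s and lam are illustrative constants.
    Returns source coordinates for resampling a uniform image, e.g.
    with scipy.ndimage.map_coordinates(image, coords)."""
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy, xx = np.mgrid[0:h, 0:w]
    dy, dx = yy - cy, xx - cx
    r = np.hypot(dx, dy)
    r_warped = s * np.log1p(lam * r)
    scale = np.divide(r_warped, r, out=np.ones_like(r), where=r > 0)
    return cy + dy * scale, cx + dx * scale
```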

Book ChapterDOI
01 Jan 1993
TL;DR: This chapter will not consider motion estimation, but rather look at the ends of motion compensation, i.e., various ways that motion vectors are used in different applications.
Abstract: As Einstein has observed, “mankind’s great problem is a perfection of means but a confusion of ends.” Several chapters in this book present means to estimate motion between two or more image frames in rather sophisticated ways. In this chapter, we will not consider motion estimation, but rather look at the ends of motion compensation, i.e., we will consider various ways that motion vectors are used in different applications.

Proceedings ArticleDOI
06 Aug 1993
TL;DR: A new parallel machine vision architecture is designed that is fully adapted to vision-based control approaches and able to achieve real-time video-rate performance for the servoing loop.
Abstract: The work presented here focuses on the problem of positioning a mechanical structure with respect to a deep underwater bore-hole by using a visual servoing approach. First, we briefly recall a paradigm we have been using as a theoretical framework for implementing visual servoing tasks. Then, we apply this general formalism to our particular case where the visual features are two ellipses perceived in the image which correspond to the projection of two circles bounding the bore-hole in the scene. The last part is devoted to implementation aspects. We show some results which have been obtained both in simulation and on our testbed consisting of a 6 degrees-of-freedom arm with a camera mounted on its end effector. Among implementation aspects, the control loop sampling rate is the most crucial issue. In a vision-based control approach, we have to deal with strong real-time constraints in terms of image processing in order to ensure performances and stability of the closed loop control scheme. To solve this problem, we have designed a new parallel machine vision architecture fully adapted to vision-based control approaches and able to achieve real-time video rate performances for the servoing loop.

Journal ArticleDOI
TL;DR: This article will discuss present compression techniques in terms of scene properties and scene statistics and show how even further compression can be obtained by developing a complete understanding of scene properties and those of visual perception.
Abstract: Digital compression techniques have made impressive progress in recent years. Frequently, a reader of a paper on compression is left with the impression that one is getting something for nothing; however, compression can only be achieved by leaving out unnecessary information about the image. A compression ratio of 20:1 simply means that 95% of the information in the original image has been eliminated. There are only two types of information that can be removed without seeing degradation in image quality: information that can be accurately predicted or information that the human visual system cannot see. This article will discuss present compression techniques in terms of these two factors and show how even further compression can be obtained by developing a complete understanding of scene properties and those of visual perception. From an analysis based on present knowledge of visual perception and scene statistics, compression ratios in excess of 50:1 should be achievable without perceptible degradation.
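
The percentage arithmetic above follows directly from the ratio: at R:1, only 1/R of the original data survives. A two-line check:

```python
def fraction_eliminated(ratio):
    """Fraction of the original data removed at a given compression ratio."""
    return 1.0 - 1.0 / ratio

print(fraction_eliminated(20))  # 0.95 -> the 95% figure quoted above
print(fraction_eliminated(50))  # 0.98 -> the projected 50:1 case
```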

Book ChapterDOI
09 Jun 1993
TL;DR: An efficient implementation, on a conventional computer architecture, of some early visual processing that occurs in the eye: a retinal spatiotemporal recursive filter, based on an extended version of Mead's model, is implemented by means of an original algorithm much faster than FFT.
Abstract: This paper presents an efficient implementation on a conventional computer architecture of some early visual processing that occurs in the eye. A retinal spatiotemporal recursive filter, based on an extended version of Mead's model, has been implemented by means of an original algorithm, much faster than FFT (18 times for a 256 × 256 image). Various realistic aspects have been added, particularly non-homogeneous filtering by the crystalline lens, photoreceptor coupling, chromatic sampling and digital filtering with an irregular spatial sampling. The integration of these unconventional components of early visual organisation into a single coherent system leads to the development of clever digital realisations: among the many possibilities of this simulation tool, three of them are illustrated, concerning space-variant and colour processing. The program, written in the C language, allows synthesis, filtering and the display of colour images with moving objects.
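
The speed advantage of a recursive formulation over FFT-based convolution comes from its O(N) cost per line, independent of the effective kernel width. The first-order forward/backward smoother below illustrates the principle only; the paper's retinal filter is a far more elaborate spatiotemporal design.

```python
import numpy as np

def recursive_smooth_1d(signal, alpha=0.25):
    """First-order recursive (IIR) low-pass filter, run forward and
    backward for zero phase. Cost is O(N) regardless of the effective
    kernel width, which is why recursive formulations can beat
    FFT-based convolution."""
    out = np.empty(len(signal), dtype=np.float64)
    acc = float(signal[0])
    for i, v in enumerate(signal):            # forward pass
        acc += alpha * (v - acc)
        out[i] = acc
    acc = out[-1]
    for i in range(len(signal) - 1, -1, -1):  # backward pass
        acc += alpha * (out[i] - acc)
        out[i] = acc
    return out
```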

Journal ArticleDOI
TL;DR: A novel account of the origins of the patterns is developed in this paper utilizing advances in the neurobiology of information processing and nonlinear dynamics.


Journal ArticleDOI
01 May 1993
TL;DR: Three new models for visual contrast sensitivity are introduced and evaluated using contrast sensitivity function (CSF) data with samples on both narrow and wide frequency ranges, and appear to more closely approximate the underlying sensory mechanisms.
Abstract: Eight models are examined as input-output representations of steady-state vision in humans at moderate to low illumination levels. Three new models for visual contrast sensitivity are introduced and evaluated using contrast sensitivity function (CSF) data with samples on both narrow and wide frequency ranges. Additionally, five variations of previously published models are evaluated using the same data. A nonlinear least squares fitting algorithm produced the optimal parameters for each model. The eight models are compared on the basis of RMS error in their fit to the CSF data. The three new models, based on second-, third-, and fourth-order filter functions, provided the best fit to the data. They appear to more closely approximate the underlying sensory mechanisms, and thus they provide a more useful input-output representation of the overall human visual system.
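
The fitting procedure is ordinary nonlinear least squares of a parametric CSF to sensitivity samples. The sketch below uses the well-known Mannos-Sakrison CSF form and made-up data points as stand-ins; the paper's second- to fourth-order filter models are not reproduced here.

```python
import numpy as np
from scipy.optimize import curve_fit

def csf(f, a, b, c, d):
    """Parametric contrast sensitivity function (Mannos-Sakrison form),
    standing in for the paper's filter-based models."""
    return a * (b + c * f) * np.exp(-(c * f) ** d)

# Hypothetical CSF samples: sensitivity vs. spatial frequency (cyc/deg)
freqs = np.array([0.5, 1, 2, 4, 8, 16, 32], dtype=float)
sens = np.array([40, 110, 220, 260, 170, 60, 8], dtype=float)

params, _ = curve_fit(csf, freqs, sens, p0=[520.0, 0.02, 0.11, 1.1],
                      bounds=(0, np.inf), maxfev=10000)
rms = np.sqrt(np.mean((csf(freqs, *params) - sens) ** 2))
print("fitted parameters:", params, "RMS error:", rms)
```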

Journal ArticleDOI
TL;DR: An automatic thresholding method based on aspects of the human visual system is proposed that preserves edge structure in images: edge thresholds are first determined from human visual perception, and these edge thresholds are then used to find several edge intervals.

Proceedings ArticleDOI
26 Jul 1993
TL;DR: A visual programming approach is described that transfers the operator's manipulation skill directly to the robot by letting it learn from observing the human expert moving in his/her own work space.
Abstract: The paper describes a visual programming approach to transferring the operator's manipulation skill directly to the robot. The objective is to let the robot learn by observing the human expert moving in his/her own work space. High-speed, high-precision acquisition of the location information is expedited by a marker-based visual system mounted on a conic tool surface, assisted by transputer calculation. Possible violations of the robot's physical constraints are checked online using the computer graphics facilities provided in the system, which are also used for subsequent automatic path refinement.

Proceedings ArticleDOI
27 Apr 1993
TL;DR: An adaptive DCT (discrete cosine transform)-based image coding scheme is developed in which a combination of a perceptually motivated image model, entropy-constrained trellis coded quantization (ECTCQ), and perceptual error weighting is used to obtain good subjective performance at low bit rates.
Abstract: An adaptive DCT (discrete cosine transform)-based image coding scheme is developed in which a combination of a perceptually motivated image model, entropy-constrained trellis coded quantization (ECTCQ), and perceptual error weighting is used to obtain good subjective performance at low bit rates. The model is used to decompose the image into strong-edge, slow-intensity-variation, and texture components. The perceptually important strong edges are encoded essentially losslessly. The remaining components are encoded using an adaptive DCT in which the transform coefficients are quantized by ECTCQ. The contrast sensitivity of the human visual system is used for perceptual weighting of the transform coefficients. Objective and subjective results suggest noticeable improvement over JPEG (Joint Photographic Experts Group).
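
Perceptual weighting of DCT coefficients means errors are penalized according to how visible their frequency band is. A minimal sketch with an illustrative CSF-style weight matrix (not the paper's actual weighting, and the ECTCQ quantizer is not modeled):

```python
import numpy as np
from scipy.fft import dctn

def csf_weights(n=8):
    """Illustrative CSF-style weights: mid-band emphasis falling off at
    high spatial frequencies (not the paper's exact weighting)."""
    u = np.arange(n)
    f = np.hypot(*np.meshgrid(u, u))  # radial DCT frequency index
    return (0.2 + f) * np.exp(-0.25 * f)

def perceptually_weighted_error(block, quantized_block):
    """Weighted squared error between the DCT coefficients of a block
    and its quantized reconstruction."""
    w = csf_weights(block.shape[0])
    diff = dctn(block, norm='ortho') - dctn(quantized_block, norm='ortho')
    return float(np.sum((w * diff) ** 2))

# Usage with a crude quantization stand-in
rng = np.random.default_rng(1)
block = rng.uniform(0, 255, (8, 8))
coarse = np.round(block / 32) * 32
print(perceptually_weighted_error(block, coarse))
```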

Book ChapterDOI
01 Jan 1993
TL;DR: This chapter discusses algorithms that convolve the input image with operators of limited bandwidth and search either for zero-crossings or peaks in the output, whether they were developed to simulate how humans may detect lines and edges or as a stage in artificial image processing.
Abstract: It is generally accepted that edge and line detection is an important stage of any visual system, biological or artificial. Many algorithms have been developed, either to simulate how humans may detect lines and edges, or as a stage in artificial image processing (see Hildreth, 1985). Most algorithms convolve the input image with operators of limited bandwidth, and search either for zero-crossings or peaks in the output.
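
A classic instance of the convolve-and-find-zero-crossings scheme is the Marr-Hildreth detector: filter with a Laplacian of Gaussian and mark sign changes. A minimal sketch:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def zero_crossing_edges(image, sigma=2.0):
    """Marr-Hildreth-style edge detection: convolve with a Laplacian of
    Gaussian (a band-limited operator) and mark pixels where the
    response changes sign between neighbors."""
    log = gaussian_laplace(image.astype(np.float64), sigma)
    edges = np.zeros(image.shape, dtype=bool)
    edges[:-1, :] |= (log[:-1, :] * log[1:, :]) < 0   # vertical neighbors
    edges[:, :-1] |= (log[:, :-1] * log[:, 1:]) < 0   # horizontal neighbors
    return edges
```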

Proceedings ArticleDOI
TL;DR: Work in progress at Georgia Tech to develop a model of human pattern perception, visual search, and detection is reviewed and the organization of the model for predicting target acquisition, analyzing target signatures, and specifying low-observable requirements is discussed.
Abstract: Work in progress at Georgia Tech to develop a model of human pattern perception, visual search, and detection is reviewed. The model's algorithms are based on research on low-level visual processes. Recent advances in the field have led to the development of computational models of the image processing performed by the visual system from the cornea to the striate cortex. The model also incorporates recent advances from research on visual search. The organization of the model for predicting target acquisition, analyzing target signatures, and specifying low-observable requirements is discussed.

Proceedings ArticleDOI
08 Sep 1993
TL;DR: In this paper, two experiments for evaluating psychophysical distortion metrics for JPEG-encoded images are described, and the results of these experiments were used to determine the predictive value of a number of computed image distortion metrics.
Abstract: Two experiments for evaluating psychophysical distortion metrics for JPEG-encoded images are described. The first is a threshold experiment, in which subjects determined the bit rate or level of distortion at which distortion was just noticeable. The second is a suprathreshold experiment, in which subjects ranked image blocks according to perceived distortion. The results of these experiments were used to determine the predictive value of a number of computed image distortion metrics. It was found that mean square error is not a good predictor of distortion thresholds or of suprathreshold perceived distortion. Some simple point-wise measures were in good agreement with psychophysical data; other, more computationally intensive metrics involving spatial properties of the human visual system gave mixed results. It was determined that mean intensity, which is not accounted for in the JPEG algorithm, plays a significant role in perceived distortion.

Book ChapterDOI
01 Jan 1993
TL;DR: The interpretation of visual images is characterized by ambiguity, and one approach has been to define the visual capability of an individual in terms of metrics derived from an understanding of how individuals perceive spatial information.
Abstract: The interpretation of visual images is characterized by ambiguity. The ability of an individual to extract information from an image is difficult to quantify. One approach to this challenge has been to define the visual capability of an individual in terms of metrics derived from an understanding of how individuals perceive spatial information.

Proceedings ArticleDOI
08 Sep 1993
TL;DR: Preliminary trials of the meter verify that it can produce quantitative CCIR gradings that match those made by an 'expert' human assessor, and it does so better than other electronic systems that do not incorporate the model of early human vision.
Abstract: We are developing an automatic 'image quality meter' for assessing the degree of impairment of broadcast TV images. The meter incorporates a model of the human visual system derived from psychophysical and neurophysiological studies. Early visual processing is assumed to consist of a set of spatially parallel, largely independent functional modules; but later stages are more heavily resource limited and constrained by limitations on attention and memory capacity. In line with CCIR recommendations, image evaluation can focus either on detection of the impairment itself (typically, superimposed lines or noise, or color dropout) or on assessment of the perceptible quality of the depicted scene. The observer may choose to attend to either aspect. Experimental studies of human subjects suggest that these two processes are largely independent of each other and subject to voluntary control. The meter captures images directly from TV via a CCD camera and digital sampling hardware. Early visual processes are emulated in software as a bank of spatial and temporal filters and higher level processes by a 3-layer neural network. Preliminary trials of the meter verify that it can produce quantitative CCIR gradings that match those made by an 'expert' human assessor, and it does so better than other electronic systems that do not incorporate the model of early human vision.


Proceedings ArticleDOI
14 Sep 1993
TL;DR: A model for image perception is described in which the percept is first constructed from visual primitives and then deconstructed into diagnostic features that are suitable for symbolic manipulation.
Abstract: A model for image perception is described in which the percept is first constructed from visual primitives and then deconstructed into diagnostic features that are suitable for symbolic manipulation. The relationship of the model to the psychophysics of medical imaging is discussed in terms of the relationship between image appearance and the observer response. Some methods for modifying the image to aid visual perception are discussed. They include image enhancement, computer image analysis, and feedback assisted visual search. Some of the perceptual aspects of the search component of the detection-location task are described. Finally, the implications of the model are presented for those who would assist image perception or utilize visual feature extraction.

Proceedings ArticleDOI
TL;DR: A new multimedia integrated switching system (MISS) is presented that uses a fully connected crossbar switch to combine servers and treats a time interval of a few hundred microseconds as the basic unit of data block transfer; it greatly reduces video frame fluctuation and halves the average image transfer delay.
Abstract: Advanced visual information retrieval systems supporting both video and images need to have a flexible system design so that their system configurations can easily be enhanced. It is therefore desirable to separate the features of a central system into three parts: storage servers, communication servers, and a back-end network that combines these. In this architecture, unscheduled arrivals of data blocks at the back-end network cause two problems: unacceptable fluctuation of video frames and overly long delays of image transfer. To solve these problems, we have designed a new multimedia integrated switching system (MISS) that uses a fully connected crossbar switch to combine servers. MISS treats a time interval of a few hundred microseconds (called a 'time-slot') as the basic unit of data block transfer, and allocates appropriate time-slots to all transfer requests in order to simultaneously meet the requirements for each kind of visual information transfer. According to simulation results and estimates based on queuing theory, MISS greatly reduces video frame fluctuation and halves the average image transfer delay. These effects have been confirmed in an experimental visual communication system built around MISS. This system supports JPEG-compressed video and images, and six terminals can simultaneously retrieve visual information through an FDDI network.

Proceedings ArticleDOI
29 Oct 1993
TL;DR: Starting from the well-known MAD full-search procedure, this research aims to obtain a more realistic motion field without adding too much complexity; the improvement is performed inside the algorithm, without any need for post-processing.
Abstract: In video coding it is clearly worthwhile to have a more realistic motion field than what can be obtained by the classical mean absolute difference (MAD) full-search method. Applications can be found in very low bitrate coding, where the number of bits needed to code the motion field is significant compared to the total bitrate; in codecs that exploit features of the human visual system (classification for optimal bit allocation, use of motion masking...); and in object-based coding, where the motion estimation algorithm interacts with the segmentation procedure. Our research started from the well-known MAD full-search procedure, aiming to obtain reasonable results without adding too much complexity. The improvement is performed inside the algorithm, without any need for post-processing. After a more thorough description of these improvements, some results are compared and applications indicated.
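
The baseline the paper starts from, classical MAD full search, is easy to state: for each block of the current frame, exhaustively test every displacement in a search window of the reference frame and keep the one with the lowest mean absolute difference. A minimal sketch (block and window sizes are illustrative; the paper's improvements are not reproduced):

```python
import numpy as np

def mad_full_search(ref, cur, block=16, search=7):
    """Classical MAD full-search block matching: exhaustively test all
    displacements within +/-search pixels and keep the one minimizing
    the mean absolute difference."""
    H, W = cur.shape
    vectors = {}
    for y in range(0, H - block + 1, block):
        for x in range(0, W - block + 1, block):
            tgt = cur[y:y+block, x:x+block].astype(np.float64)
            best, best_mad = (0, 0), np.inf
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy <= H - block and 0 <= xx <= W - block:
                        cand = ref[yy:yy+block, xx:xx+block].astype(np.float64)
                        mad = np.mean(np.abs(tgt - cand))
                        if mad < best_mad:
                            best_mad, best = mad, (dy, dx)
            vectors[(y, x)] = best
    return vectors

# Usage: content shifted by (+2, -3), so the match lies at displacement (-2, +3)
rng = np.random.default_rng(2)
ref = rng.uniform(0, 255, (64, 64))
cur = np.roll(ref, shift=(2, -3), axis=(0, 1))
print(mad_full_search(ref, cur)[(16, 16)])  # -> (-2, 3)
```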