scispace - formally typeset

Showing papers on "Human visual system model published in 1994"


Journal ArticleDOI
TL;DR: This paper reviews the visual search literature and presents a model of human search behavior, a revision of the guided search 2.0 model in which virtually all aspects of the model have been made more explicit and/or revised in light of new data.
Abstract: An important component of routine visual behavior is the ability to find one item in a visual world filled with other, distracting items. This ability to perform visual search has been the subject of a large body of research in the past 15 years. This paper reviews the visual search literature and presents a model of human search behavior. Built upon the work of Neisser, Treisman, Julesz, and others, the model distinguishes between a preattentive, massively parallel stage that processes information about basic visual features (color, motion, various depth cues, etc.) across large portions of the visual field and a subsequent limited-capacity stage that performs other, more complex operations (e.g., face recognition, reading, object identification) over a limited portion of the visual field. The spatial deployment of the limited-capacity process is under attentional control. The heart of the guided search model is the idea that attentional deployment of limited resources is guided by the output of the earlier parallel processes. Guided Search 2.0 (GS2) is a revision of the model in which virtually all aspects of the model have been made more explicit and/or revised in light of new data. The paper is organized into four parts: Part 1 presents the model and the details of its computer simulation. Part 2 reviews the visual search literature on preattentive processing of basic features and shows how the GS2 simulation reproduces those results. Part 3 reviews the literature on the attentional deployment of limited-capacity processes in conjunction and serial searches and shows how the simulation handles those conditions. Finally, Part 4 deals with shortcomings of the model and unresolved issues.
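The model's central idea — attention visits items in decreasing order of a guidance signal summing bottom-up salience and weighted top-down feature matches — can be illustrated with a toy sketch. The weighting scheme and noise-free activations here are illustrative assumptions, not the GS2 simulation itself:

```python
import numpy as np

def guided_search_order(bottom_up, top_down, w=1.0):
    """Toy guidance map: attention visits items in decreasing activation,
    where activation = bottom-up salience + w * top-down feature match."""
    activation = np.asarray(bottom_up, float) + w * np.asarray(top_down, float)
    return np.argsort(-activation)  # indices in the order attention visits them

# four items; item 2 matches the target's features strongly (top-down)
order = guided_search_order(bottom_up=[0.2, 0.5, 0.4, 0.1],
                            top_down=[0.0, 0.1, 0.9, 0.0])
```

With these numbers the target-like item is visited first, even though it is not the most salient bottom-up item — the guidance from the parallel stage redirects the limited-capacity stage.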

3,436 citations


Book
31 Dec 1994
TL;DR: Signal Processing for Computer Vision is the first book to give a unified treatment of representation and filtering of higher order data, such as vectors and tensors in multidimensional space.
Abstract: From the Publisher: Signal Processing for Computer Vision provides a unique and thorough treatment of the signal processing aspects of filters and operators for low level computer vision. Computer Vision has progressed considerably over the years. From methods only applicable to simple images, it has developed to deal with increasingly complex scenes, volumes and time sequences. A substantial part of this book deals with the problem of designing models that can be used for several purposes with computer vision. These partial models have some general properties of invariance generation and generality in model generation. Signal Processing for Computer Vision is the first book to give a unified treatment of representation and filtering of higher order data, such as vectors and tensors in multidimensional space. Included is a systematic organisation for the implementation of complex models in a hierarchical modular structure and novel material on adaptive filtering using tensor data representation. Signal Processing for Computer Vision is intended for final year undergraduate and graduate students as well as engineers and researchers in the field of computer vision and image processing.

796 citations


Journal ArticleDOI
TL;DR: It is suggested that several varieties of second-order motion stimuli may be regarded as equivalent to contrast-modulated images when considered in terms of the effects of local spatiotemporal filtering operations carried out by the human visual system.

201 citations


Journal ArticleDOI
27 Oct 1994-Nature
TL;DR: It is reported that even Kanizsa subjective figures can be detected without focal attention at parallel stages of the human visual system.
Abstract: Subjective figures, seen in the absence of luminance gradients (Fig. 1), provide a phenomenal illusion that can be related to the properties of single cells in the visual cortex, offering a rare bridge between brain function and visual awareness. It remains controversial whether subjective figures arise from intelligent cognitive mechanisms, or from lower-level processes in early vision. The cognitive account implies that the perception of subjective figures may require serial attentive processing, whereas on the low-level account they should arise in parallel at earlier visual stages. Physiological evidence apparently fits a low-level account and indicates that some types of subjective contour may be detected earlier than the conventional Kanizsa type. Here we report that even Kanizsa subjective figures can be detected without focal attention at parallel stages of the human visual system.

162 citations


Journal ArticleDOI
TL;DR: In this paper, a new multivariate filtering operation called the alpha-trimmed vector median is proposed, which completely preserves stationary regions in image sequences, without motion compensation or motion detection.
Abstract: Most current algorithms developed for image sequence filtering require motion information in order to obtain good results both in the still and moving parts of an image sequence. In the present paper, filters which completely preserve stationary regions in image sequences are introduced. In moving regions, the 3D filters inherently reduce to spatial filters and perform well in these areas without any motion compensation or motion detection. A new multivariate filtering operation called the alpha-trimmed vector median is proposed. Guidelines for the determination of optimal 3D median-related structures for color and gray-level image sequence filtering are given. Algorithms based on vector median, extended vector median, alpha-trimmed vector median, and componentwise median operations are developed. Properties of the human visual system are taken into account in the design of filters. The noise attenuation and detail preservation capabilities of the filters are examined. In particular, the impulsive noise attenuation capability of the filters is analyzed theoretically. Simulation results based on real image sequences are given.
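The vector median treats each color pixel as a vector and picks the window sample minimizing the sum of distances to all others; the alpha-trimmed variant averages the few most central samples. A minimal sketch (the window handling and the choice of L2 distance are simplifying assumptions):

```python
import numpy as np

def vector_median(vectors):
    """Vector median: the sample minimizing the sum of L2 distances
    to all other samples in the window."""
    v = np.asarray(vectors, dtype=float)
    d = np.linalg.norm(v[:, None, :] - v[None, :, :], axis=2).sum(axis=1)
    return v[np.argmin(d)]

def alpha_trimmed_vector_median(vectors, alpha=2):
    """Average of the `alpha` samples with smallest aggregate distance
    (alpha=1 reduces to the plain vector median)."""
    v = np.asarray(vectors, dtype=float)
    d = np.linalg.norm(v[:, None, :] - v[None, :, :], axis=2).sum(axis=1)
    return v[np.argsort(d)[:alpha]].mean(axis=0)
```

Applied to a window of RGB pixels containing one impulse (e.g. a magenta outlier among near-gray neighbors), both operations return a value close to the neighbors, which is what gives the filter its impulsive-noise rejection.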

160 citations


Journal ArticleDOI
TL;DR: The results show that the human visual system can indeed exploit symmetry to facilitate object recognition, and support the model for object recognition in which a small number of two-dimensional views are remembered and combined to recognize novel views of the same object.

153 citations


Dissertation
01 Jan 1994
TL;DR: This thesis describes the development, implementation, and analysis of diffraction-specific computation, an approach that considers the reconstruction process rather than the interference process in optical holography, to increase the speed of holographic computation for real-time three-dimensional electro-holographic (holovideo) displays.
Abstract: Diffraction-specific fringe computation is a novel system for the generation of holographic fringe patterns for real-time display. This thesis describes the development, implementation, and analysis of diffraction-specific computation, an approach that considers the reconstruction process rather than the interference process in optical holography. The primary goal is to increase the speed of holographic computation for real-time three-dimensional electro-holographic (holovideo) displays. Diffraction-specific fringe computation is based on the discretization of space and spatial frequency in the fringe pattern. Two holographic fringe encoding techniques are developed from diffraction-specific fringe computation and applied to make most efficient use of hologram channel capacity. A "hogel-vector encoding" technique is based on undersampling the fringe spectra. A "fringelet encoding" technique is designed to increase the speed and simplicity of decoding. The analysis of diffraction-specific computation focuses on the trade-offs between compression ratio, image fidelity, and image depth. The decreased image resolution (increased point spread) that is introduced into holographic images due to encoding is imperceptible to the human visual system under certain conditions. A compression ratio of 16 is achieved (using either encoding method) with an acceptably small loss in image resolution. Total computation time is reduced by a factor of over 100 to less than 7.0 seconds per 36-MB holographic fringe using the fringelet encoding method. Diffraction-specific computation more efficiently matches the information content of holographic fringes to the human visual system. Diffraction-specific holographic encoding allows for "visual-bandwidth holography," i.e., holographic imaging that requires a bandwidth commensurate with the usable visual information contained in an image. 
Diffraction-specific holographic encoding enables the integration of holographic information with other digital media, and is therefore vital to applications of holovideo in the areas of visualization, entertainment, and information, including education, telepresence, medical imaging, interactive design, and scientific visualization.

140 citations


Proceedings ArticleDOI
Eric Saund1, Thomas P. Moran1
02 Nov 1994
TL;DR: By using computer vision techniques to perform covert recognition of visual structure as it emerges during the course of a drawing/editing session, a perceptually supported image editor gives users access to visual objects as they are perceived by the human visual system.
Abstract: The human visual system makes a great deal more of images than the elemental marks on a surface. In the course of viewing, creating, or editing a picture, we actively construct a host of visual structures and relationships as components of sensible interpretations. This paper shows how some of these computational processes can be incorporated into perceptually-supported image editing tools, enabling machines to better engage users at the level of their own percepts. We focus on the domain of freehand sketch editors, such as an electronic whiteboard application for a pen-based computer. By using computer vision techniques to perform covert recognition of visual structure as it emerges during the course of a drawing/editing session, a perceptually supported image editor gives users access to visual objects as they are perceived by the human visual system. We present a flexible image interpretation architecture based on token grouping in a multiscale blackboard data structure. This organization supports multiple perceptual interpretations of line drawing data, domain-specific knowledge bases for interpretable visual structures, and gesture-based selection of visual objects. A system implementing these ideas, called PerSketch, begins to explore a new space of WYPIWYG (What You Perceive Is What You Get) image editing tools.

116 citations


Journal ArticleDOI
TL;DR: In this article, a technique for guiding vergence movements for an active stereo camera system and for calculating dense disparity maps is described in the same theoretical framework based on phase differences in complex Gabor filter responses, modeling receptive field properties in the visual cortex.
Abstract: We present a technique for guiding vergence movements for an active stereo camera system and for calculating dense disparity maps. Both processes are described in the same theoretical framework based on phase differences in complex Gabor filter responses, modeling receptive field properties in the visual cortex. While the camera movements are computed with input images of coarse spatial resolution, the disparity map calculation uses a finer resolution in the scale space. The correspondence problem is solved implicitly by restricting the disparity range around zero disparity (Panum's area in the human visual system). The vergence process is interpreted as a mechanism to minimize global disparity, thereby setting a 3D region of interest for subsequent disparity detection. The disparity map represents smaller local disparities as an important cue for depth perception. Experimental data for the integrated performance of vergence in natural scenes followed by disparity map calculations are presented.
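The phase-difference principle is compact: at a tuning frequency ω, the local phase of a complex Gabor response shifts with image position, so disparity can be read off as d ≈ (φ_L − φ_R)/ω. A 1D sketch (filter size and frequency are illustrative choices, not the paper's parameters):

```python
import numpy as np

def gabor_response(signal, freq, sigma=8.0):
    """Complex Gabor filter response of a 1D signal."""
    x = np.arange(-3 * sigma, 3 * sigma + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2)) * np.exp(1j * freq * x)
    return np.convolve(signal, kernel, mode="same")

def phase_disparity(left, right, freq=0.3):
    """Estimate disparity from the wrapped phase difference of the
    Gabor responses: d ~ (phi_L - phi_R) / freq."""
    rl = gabor_response(left, freq)
    rr = gabor_response(right, freq)
    dphi = np.angle(rl * np.conj(rr))  # phase difference in (-pi, pi]
    return dphi / freq
```

Because the phase difference is wrapped to (−π, π], the recoverable disparity range is limited to about half a filter period — which is exactly why the paper restricts disparities to a band around zero and uses coarse-to-fine scales.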

96 citations


Book ChapterDOI
15 Dec 1994
TL;DR: This work demonstrates that motion recognition can be accomplished using lower-level motion features, without the use of abstract object models or trajectory representations, and presents a novel low-level computational approach for detecting and recognizing temporally repetitive movements, such as those characteristic of walking people, or flying birds.
Abstract: The goal of this thesis is to demonstrate the utility of low-level motion features for the purpose of recognition. Although motion plays an important role in biological recognition tasks, motion recognition, in general, has received little attention in the literature compared to the volume of work on static object recognition. It has been shown that in some cases, motion information alone is sufficient for the human visual system to achieve reliable recognition. Previous attempts at duplicating such capability in machine vision have been based on abstract higher-level models of objects, or have required building intermediate representations such as the trajectories of certain feature points of the object. In this work we demonstrate that motion recognition can be accomplished using lower-level motion features, without the use of abstract object models or trajectory representations. First, we show that certain statistical spatial and temporal features derived from the optic flow field have invariant properties, and can be used to classify regional motion patterns such as ripples on water, fluttering of leaves, and chaotic fluid flow. We then present a novel low-level computational approach for detecting and recognizing temporally repetitive movements, such as those characteristic of walking people, or flying birds, on the basis of the periodic nature of their motion signatures. We demonstrate the techniques on a number of real-world image sequences containing complex non-rigid motion patterns. We also show that the proposed techniques are reliable and efficient by implementing a real-time activity recognition system.

88 citations


Journal ArticleDOI
TL;DR: A visual search method is employed to evaluate potential measures for the perceptual closure of fragmented shapes and shows that while certain intuitive measures are psychophysically inconsistent, a measure based on a sum of squares of the lengths of contour gaps is appropriate for both polygonal and smooth shapes.

Journal ArticleDOI
TL;DR: It is shown that the photoreceptor disarray does not determine the limit to performance for this task; the limit is post-receptoral and can be modelled in terms of a positional uncertainty within the early filters located before the response envelope has been extracted.

Journal ArticleDOI
TL;DR: In this paper, second-order motion patterns were employed in an attempt to establish the principles used for the detection of image motion in the human visual system, and two experiments were conducted to test the effectiveness of the feature-correspondence and intensity-based detection strategies.
Abstract: Motion in the retinal image may occur either in the form of spatiotemporal variations in luminance (first-order motion) or as spatiotemporal variations in characteristics derived from luminance, such as contrast (second-order motion). Second-order motion patterns were employed in an attempt to establish the principles used for the detection of image motion in the human visual system. In principle, one can detect motion at a high level of visual analysis by identifying features of the image and tracking their positions (correspondence-based detection) or at a low level by analysis of spatiotemporal luminance variations without reference to features (intensity-based detection). Prevailing models favor the latter approach, which has been adapted to account for the visibility of second-order motion by postulation of a stage of rectification that precedes motion energy detection [J. Opt. Soc. Am. A 5, 1986 (1988)]. In two experiments it is shown that second-order motion is indeed detected normally by use of the strategy of transformation plus energy detection but that detection can also be achieved by use of the feature-correspondence strategy when the intensity strategy fails. In the first experiment, a stimulus is employed in which opposite directions of motion perception are predicted by the two strategies. It is shown that normally the direction associated with motion energy in the rectified image is seen but that the direction associated with feature motion is seen when the energy system is disabled by the use of an interstimulus interval.(ABSTRACT TRUNCATED AT 250 WORDS)
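The rectify-then-detect account can be demonstrated numerically: a contrast envelope drifting over a static carrier has balanced directional Fourier energy, but squaring the stimulus (a stand-in for the postulated rectification) exposes net energy in the envelope's direction. The quadrant-based opponent-energy measure below is a simplified proxy for a full motion-energy model, not the paper's detector:

```python
import numpy as np

def opponent_energy(stimulus):
    """Net directional Fourier energy over a (t, x) stimulus:
    rightward components (fx * ft < 0) minus leftward ones (fx * ft > 0)."""
    F = np.abs(np.fft.fft2(stimulus)) ** 2
    ft = np.fft.fftfreq(stimulus.shape[0])[:, None]
    fx = np.fft.fftfreq(stimulus.shape[1])[None, :]
    return F[fx * ft < 0].sum() - F[fx * ft > 0].sum()

t = np.arange(64)[:, None]
x = np.arange(64)[None, :]
carrier = np.cos(2 * np.pi * 20 * x / 64)                 # static, high frequency
envelope = 1 + np.cos(2 * np.pi * (4 * x - 4 * t) / 64)   # drifts rightward
s = carrier * envelope  # second-order (contrast-modulated) stimulus
```

In this construction the luminance sidebands drift in opposite directions with equal energy, so the opponent energy of `s` is numerically zero; after squaring, a component at the envelope frequency appears and the opponent energy becomes strongly positive.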


Proceedings ArticleDOI
S. Daly1
13 Nov 1994
TL;DR: The paper describes an algorithm for assessing image fidelity that includes an image-processing model of the human visual system for luminance still imagery; the algorithm's major components model the visual system as three main sensitivity variations.
Abstract: The paper describes an algorithm for the assessment of image fidelity. The algorithm includes an image processing model of the human visual system for luminance still imagery. The major components of the algorithm, which model the visual system as three main sensitivity variations, are described. These address the sensitivity as a function of gray level, as a function of spatial frequency, and as a function of image content. To quantify the performance of the algorithm, specific psychophysical experiments were simulated, and these results are shown.
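As a hedged illustration of the spatial-frequency component alone, one can weight an error spectrum by a contrast sensitivity function, so that differences at frequencies the eye is insensitive to count less. The Mannos-Sakrison CSF below is a standard textbook stand-in, not Daly's model, and `px_per_degree` is an assumed viewing parameter:

```python
import numpy as np

def csf(f):
    """Mannos-Sakrison contrast sensitivity approximation
    (f in cycles per degree); band-pass, peaking at mid frequencies."""
    return 2.6 * (0.0192 + 0.114 * f) * np.exp(-(0.114 * f) ** 1.1)

def csf_weighted_difference(img_a, img_b, px_per_degree=32.0):
    """Toy fidelity metric: RMS of the error spectrum weighted by the CSF."""
    err = np.fft.fft2(np.asarray(img_a, float) - np.asarray(img_b, float))
    fy = np.fft.fftfreq(err.shape[0]) * px_per_degree
    fx = np.fft.fftfreq(err.shape[1]) * px_per_degree
    f = np.hypot(*np.meshgrid(fy, fx, indexing="ij"))
    return np.sqrt(np.mean(np.abs(err * csf(f)) ** 2))
```

The band-pass shape is the key property: mid-frequency errors are penalized most, very low and very high frequency errors least.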

Proceedings ArticleDOI
04 Oct 1994
TL;DR: It is argued that a single visual mechanism called "zooming" addresses these scaling problems and, when suitably augmented, can also support automatic component discovery and intelligent error correction.
Abstract: Visual programming research has largely focused on the issues of visual programming-in-the-small. However, entirely different concerns arise when one is programming-in-the-large. We present a visual software engineering environment that allows users to construct visually programs consisting of hierarchically organized networks of components that process streams of arbitrary objects. We discuss the problems that occur when trying to construct systems consisting of thousands of interconnected components, examine how this environment deals with some of the problems specific to visual programming-in-the-large, and show why our initial solutions failed to scale successfully. Finally, we argue that a single visual mechanism called "zooming" addresses these scaling problems and, when suitably augmented, can also support automatic component discovery and intelligent error correction.

Proceedings Article
01 Jan 1994
TL;DR: A general computational method for recognizing repetitive movements characteristic of walking people, galloping horses, or flying birds in real image sequences is demonstrated using what is essentially template matching in a motion feature space coupled with a technique for detecting and normalizing periodic activities.
Abstract: The recognition of repetitive movements characteristic of walking people, galloping horses, or flying birds is a routine function of the human visual system. It has been demonstrated that humans can recognize such activity solely on the basis of motion information. We demonstrate a general computational method for recognizing such movements in real image sequences using what is essentially template matching in a motion feature space coupled with a technique for detecting and normalizing periodic activities. This contrasts with earlier model-based approaches for recognizing such activities.
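The period-detection step can be sketched as picking the strongest non-DC bin of a motion signature's spectrum; the signature itself (e.g., summed optic-flow magnitude per frame) is assumed given, and the paper's template matching in feature space is omitted:

```python
import numpy as np

def dominant_period(signal):
    """Dominant period of a 1D motion signature, from the strongest
    non-DC bin of its FFT magnitude spectrum."""
    s = np.asarray(signal, dtype=float) - np.mean(signal)
    spectrum = np.abs(np.fft.rfft(s))
    k = np.argmax(spectrum[1:]) + 1  # skip the DC term
    return len(s) / k
```

Once the period is known, the activity can be normalized to a canonical cycle length before matching, which is the role the normalization step plays in the method above.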

Journal ArticleDOI
TL;DR: The design, performance, and application of The Real-time, Intelligently ControLled, Optical Positioning System (TRICLOPS), a multiresolution trinocular camera-pointing system which provides a center wide-angle view camera and two higher-resolution vergence cameras, are described.
Abstract: The design, performance, and application of The Real-time, Intelligently ControLled, Optical Positioning System (TRICLOPS) are described in this article. TRICLOPS is a multiresolution trinocular camera-pointing system which provides a center wide-angle view camera and two higher-resolution vergence cameras. It is a direct-drive system that exhibits dynamic performance comparable to the human visual system. The mechanical design and performance of various active vision systems are discussed and compared to those of TRICLOPS. The multiprocessor control system for TRICLOPS is described. The kinematics of the device are also discussed and calibration methods are given. Finally, as an example of real-time visual control, a problem in visual tracking with TRICLOPS is examined. In this example, TRICLOPS is shown to be capable of tracking a ball moving at 3 m/s, which results in rotational velocities of the vergence cameras in excess of 6 rad/s (344 deg/s).

Proceedings ArticleDOI
15 Oct 1994
TL;DR: This paper introduces a set of techniques for processing video data compressed using JPEG compression at near real-time rates on current generation workstations, and represents those effects where a pixel in the output image is a linear combination of pixels in the input image.
Abstract: This paper introduces a set of techniques for processing video data compressed using JPEG compression at near real-time rates on current generation workstations. Performance is improved over traditional methods by processing video data in compressed form, avoiding compression and decompression and reducing the amount of data processed. An approximation technique called condensation is developed that further reduces the complexity of the operation. The class of operations that are computable using the techniques developed in this paper are called linear, global digital special effects (LGDSEs), and represent those effects where a pixel in the output image is a linear combination of pixels in the input image. Many important video processing problems, including convolution, scaling, rotation, translation, and transcoding, can be expressed as LGDSEs.
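The algebraic fact that makes compressed-domain processing possible: because the DCT is an orthonormal linear transform C, any linear pixel-domain operator A has an exact counterpart C A Cᵀ acting directly on DCT coefficients. A 1D sketch (the 8-point size matches JPEG's blocks; condensation would then approximate the counterpart by dropping small entries, which is not shown here):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix (rows are basis functions)."""
    j = np.arange(n)
    C = np.cos(np.pi * (2 * j[None, :] + 1) * j[:, None] / (2 * n))
    C[0] /= np.sqrt(2)
    return C * np.sqrt(2.0 / n)

C = dct_matrix(8)
A = np.eye(8)[::-1]   # an example linear effect: mirror the samples
A_dct = C @ A @ C.T   # the same effect, expressed in the DCT domain
```

Applying `A_dct` to DCT coefficients gives exactly the coefficients of the pixel-domain result, so decompression and recompression can be skipped.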

01 Jun 1994
TL;DR: In this paper, the authors describe a number of visual illusions of motion in depth in which the motion of an object's cast shadow determines the perceived 3D motion of the object.
Abstract: We describe a number of visual illusions of motion in depth in which the motion of an object's cast shadow determines the perceived 3D motion of the object. The illusory percepts are phenomenally very strong. We analyze the information which cast shadow motion provides for the inference of 3D object motion and experimentally measure human observers' use of this information. The experimental results show that cast shadow information overrides a number of other strong perceptual constraints, including viewers' assumptions of constant object size and a general viewpoint. Moreover, they support the hypothesis that the human visual system incorporates a stationary light source constraint in the perceptual processing of shadow motion. The system imposes the constraint even when image information suggests a moving light source.

Proceedings ArticleDOI
13 Nov 1994
TL;DR: The psychophysical property of the human visual system, that only one high resolution image in a stereo image pair is sufficient for satisfactory depth perception, has been used to further reduce the bit rates in this paper.
Abstract: Stereoscopic sequence compression typically involves the exploitation of the spatial redundancy between the left and right streams to achieve higher compressions than are possible with the independent compression of the two streams. In this paper the psychophysical property of the human visual system, that only one high resolution image in a stereo image pair is sufficient for satisfactory depth perception, has been used to further reduce the bit rates. Thus, one of the streams is independently coded along the lines of the MPEG standards, while the other stream is estimated at a lower resolution from this stream. A multiresolution framework has been adopted to facilitate such an estimation of motion and disparity vectors at different resolutions. Experimental results on typical sequences indicate that the additional stream can be compressed to about one-fifth of a highly compressed independently coded stream, without any significant loss in depth perception or perceived image quality.
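The exploited principle — keep one view at full resolution and transmit the other at reduced resolution — can be sketched directly. Plain subsampling and pixel replication stand in for the paper's MPEG-style coding and multiresolution motion/disparity estimation:

```python
import numpy as np

def mixed_resolution_pair(left, right, factor=2):
    """Keep the left view at full resolution; store the right view
    subsampled by `factor` per axis and reconstruct it by replication."""
    low = right[::factor, ::factor]                        # what gets stored
    recon = np.repeat(np.repeat(low, factor, axis=0), factor, axis=1)
    return left, recon[:right.shape[0], :right.shape[1]], low.size / right.size
```

With `factor=2` the auxiliary view carries a quarter of the samples, and the human visual system's tolerance for one low-resolution view keeps the perceived depth intact.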

Patent
18 Feb 1994
TL;DR: In this paper, an adaptive quantizing method is proposed that modifies the quantization spacing for individual blocks in accordance with properties of the human visual system, such as the motion, complexity, and brightness of a picture.
Abstract: An adaptive quantizing apparatus and method in HDTV according to the invention is capable of modifying the quantization spacing for individual blocks to be processed in accordance with properties of the human visual system, such as the motion, complexity, and brightness of a picture. To this end, a picture portion which is prone to presenting conspicuous undesirable visual effects is tracked down and assigned a modified bit rate, while the bit rate is lowered for a picture portion where such undesirable effects, if they occur, would not be very conspicuous, thereby enhancing the subjective picture quality.
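A minimal sketch of the idea: coarser quantization where block activity masks errors, finer where errors would be conspicuous. The step-size formula is a simplified variant in the style of the MPEG Test Model activity normalization, not the patent's actual rule:

```python
import numpy as np

def adaptive_step(block, base_step=8.0):
    """Busy (high-variance) blocks tolerate coarser quantization;
    flat blocks get a finer step. Ranges from base/2 to 2*base."""
    act = np.var(block)
    return base_step * (2 * act + 1) / (act + 2)

def quantize(block, step):
    """Uniform quantization with the chosen step size."""
    return np.round(np.asarray(block, float) / step) * step
```

A flat block is quantized with half the base step, while a highly textured block approaches twice the base step, shifting bits toward regions where artifacts would be visible.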

Proceedings ArticleDOI
19 Apr 1994
TL;DR: A new partitioning with overlapped blocks in the context of image coding based on iterated transformations systems that shows a very significant improvement of the visual quality of decoded images with no increase of the bitrate request.
Abstract: Memoryless blockwise partitioning induces blockiness artifacts highly disturbing to the human visual system. This paper presents a new partitioning with overlapped blocks in the context of image coding based on iterated transformation systems. Each block of the partition is extended by n pixels. As usual, each cell is expressed as the contractive transformation of another part of the image. During decoding, values of pixels corresponding to overlapped regions are computed as the weighted sum of the different contributions leading to that pixel. This overlapped partitioning is embedded in a quadtree segmentation of the image support. In order to avoid blurring effects in small blocks while maintaining efficiency in bigger ones, the overlapping width n is a function of the block size. Simulations show a very significant improvement in the visual quality of decoded images with no increase in the required bit rate.
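The decode-side blending — overlapped pixels computed as a weighted sum of the contributing blocks — can be sketched in 1D. Linear ramp weights are an assumed choice here; the paper's weighting and quadtree machinery are omitted:

```python
import numpy as np

def overlap_add(blocks, block_size, overlap):
    """Blend overlapping decoded blocks with ramp weights, normalizing
    by the accumulated weight so boundaries are smoothed, not doubled."""
    step = block_size - overlap
    n = step * (len(blocks) - 1) + block_size
    out = np.zeros(n)
    wsum = np.zeros(n)
    w = np.ones(block_size)
    if overlap:
        ramp = np.linspace(0, 1, overlap + 2)[1:-1]  # rises toward the center
        w[:overlap], w[-overlap:] = ramp, ramp[::-1]
    for i, b in enumerate(blocks):
        out[i * step: i * step + block_size] += w * np.asarray(b, float)
        wsum[i * step: i * step + block_size] += w
    return out / wsum
```

Because the weights are normalized, a constant signal is reconstructed exactly, while discontinuities between neighboring blocks are spread across the overlap instead of landing on a hard boundary.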

Book
01 Aug 1994
TL;DR: The book develops the concept of visual computing, covering the anatomy and psychophysics of the human visual system, sensitivity to light and color, visualization and visibility analysis, computational vision, and image analysis with neural networks and their applications.
Abstract: 1. Introduction.- 1.1. The Concept of Visual Computing.- 1.2. Organization of the Book.- 2. Psychophysical Basics.- 2.1. Anatomy of the Human Visual System.- 2.1.1. Overview.- 2.1.2. Biological Neurons.- 2.1.3. Receptive Fields.- 2.1.4. The Human Retina.- 2.1.5. Organization of the Visual Cortex.- 2.2. Physics of the Human Eye.- 2.2.1. Image Projection and the Field of Vision.- 2.2.2. Accommodation.- 2.2.3. Image Quality and Diffraction Effects.- 2.3. Measuring Light.- 2.3.1. Spectral Sensitivity.- 2.3.2. Basic Measurements.- 2.3.3. Examples.- 2.4. Rendering Physically Based Light Sources.- 2.4.1. A Rendering Pipeline.- 2.4.2. Modeling Light Sources.- 2.4.3. Direct Illumination.- 2.4.4. Spectral Radiosity.- 2.4.5. Spectral Ray Tracing.- 2.4.6. Examples.- 3. Sensitivity to Light and Color.- 3.1. Visual Perception of Light and Shape.- 3.1.1. Adaptation.- 3.1.2. Spatial Sensitivity.- 3.1.3. Temporal Sensitivity.- 3.1.4. Binocular Vision.- 3.1.5. Visual Clustering, Grouping and Gestalt.- 3.2. Color Vision.- 3.2.1. Physiological Basics.- 3.2.2. Measuring Color.- 3.3. Imaging Transforms.- 4. Visualization and Visibility Analysis.- 4.1. Introduction.- 4.2. Visibility Analysis Using Graphics and Imaging.- 4.2.1. Introductory Remarks.- 4.2.2. Factors Influencing the Visibility.- 4.2.3. Mathematical Description of the Visibility.- 4.2.4. Image Generation and Image Analysis.- 4.3. Interactive Visualization and Simulation.- 4.3.1. Introductory Remarks.- 4.3.2. Modeling of Wind and Air Pollution.- 4.3.3. Shape and Color for Visualization.- 4.3.4. Examples.- 4.3.5. The Need for Advanced Imaging Methods.- 5. Computational Vision.- 5.1. Introduction: The Marr Paradigm.- 5.2. Early Visual Processing.- 5.2.1. Basics.- 5.2.2. Modeling Retinal Image Processing.- 5.2.3. Modeling Cortical Image Processing.- 5.3. Advanced Visibility Analysis for Advertising.- 5.3.1. The Psychology of Advertising.- 5.3.2. Analyzing Retinal and Cortical Images.- 5.3.3.
Modeling via Ray Casting.- 5.3.4. Examples.- 5.4. Wavelets for Graphics and Imaging.- 5.4.1. An Introduction to Wavelet Bases.- 5.4.2. General Description of the CWT.- 5.4.3. Nonorthogonal Wavelets.- 5.4.4. Wavelets for Volume Rendering.- 5.4.5. Wavelets for Texture Analysis.- 5.5. Shape from Stereo.- 5.5.1. Introductory Remarks.- 5.5.2. Automatic Stereo Matching.- 5.5.3. Formulation of a Matching Algorithm.- 5.5.4. Examples.- 5.6. Active Light Approaches: Laser Scanners.- 6. Image Analysis and Neural Networks.- 6.1. Introductory Remarks.- 6.2. Mathematical Foundations.- 6.2.1. Cluster Analysis and Vector Quantization.- 6.2.2. N-Tree Clustering.- 6.2.3. Dimensionality Reduction and Ordering.- 6.2.4. Principal Components and Subspaces.- 6.2.5. Image Coding Using the KL-Transform.- 6.2.6. Supervised Classification Methods.- 6.3. Neural Networks.- 6.3.1. An Introduction.- 6.3.2. Self-Organizing Kohonen Maps.- 6.3.3. Supervised Backpropagation Networks.- 6.3.4. Other Neural Network Models.- 7. Neural Network Applications.- 7.1. Introduction.- 7.2. Recognition of Distorted Characters.- 7.2.1. General Remarks.- 7.2.2. Matched Filtering.- 7.2.3. Error Probability for Binary Signals.- 7.2.4. Transmission and Discrimination of Characters...- 7.2.5. Results.- 7.3. Analysis and Visualization of Multidimensional Remotely Sensed Image Data Sets.- 7.3.1. Remote Sensing Techniques.- 7.3.2. Cluster Visualization and Subspace Mapping.- 7.3.3. Studies on Satellite Image Classification.- 7.4. Interactive Identification and Reconstruction of Brain Tumors in MR-Images.- 7.4.1. Segmentation of Volume Data.- 7.4.2. Magnetic Resonance Technology.- 7.4.3. Clustering Texture Feature Spaces.- 7.4.4. Some Results.- 7.5. Automatic Face Recognition.- 7.5.1. Face Recognition Methods.- 7.5.2. Eigenfaces and Neural Networks.- 7.5.3. Results.- 7.5.4. Psychophysical Evidence.- 8. The Way Ahead.- Literature.

Proceedings ArticleDOI
15 Apr 1994
TL;DR: A practical method for soft copy color reproduction that matches the hard copy image in appearance is presented; it is fundamentally based on a simple von Kries adaptation model and takes into account the human visual system's partial adaptation and contrast matching.
Abstract: CRT monitors are often used as a soft proofing device for hard copy image output. However, what the user sees on the monitor does not match the output, even if the monitor and the output device are calibrated with CIE/XYZ or CIE/Lab. This is especially obvious when the correlated color temperature (CCT) of the CRT monitor's white point differs significantly from the ambient light. In a typical office environment, one uses a computer graphics monitor having a CCT of 9300K in a room of white fluorescent light of 4150K CCT. In such a case, the human visual system is partially adapted to the CRT monitor's white point and partially to the ambient light. Visual experiments were performed on the effect of the ambient lighting. A practical method for soft copy color reproduction that matches the hard copy image in appearance is presented in this paper. The method is fundamentally based on a simple von Kries adaptation model and takes into account the human visual system's partial adaptation and contrast matching.
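The underlying model is easy to state in code. The diagonal von Kries scaling and a linear mix for partial adaptation are the standard textbook forms; the mixing coefficient and the use of cone-like (LMS) coordinates are illustrative assumptions, not the paper's calibrated values:

```python
import numpy as np

def von_kries_adapt(lms, src_white, dst_white):
    """Diagonal von Kries transform: scale each cone channel by the
    ratio of the destination to the source adapting white."""
    gain = np.asarray(dst_white, float) / np.asarray(src_white, float)
    return np.asarray(lms, float) * gain

def partial_adaptation_white(monitor_white, ambient_white, degree=0.6):
    """Effective adapting white when the eye is only partially adapted
    to the monitor (degree=1.0 means full adaptation to the monitor)."""
    m = np.asarray(monitor_white, float)
    a = np.asarray(ambient_white, float)
    return degree * m + (1 - degree) * a
```

Colors are then adapted not to the monitor's white point but to the mixed effective white, which is how the partial-adaptation observation enters the reproduction pipeline.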

Journal ArticleDOI
TL;DR: A new algorithm for stereoscopic depth perception is proposed, in which the depth map is the momentary state of a dynamic process whose structure shows analogies to the human visual system.
Abstract: We propose a new algorithm for stereoscopic depth perception, in which the depth map is the momentary state of a dynamic process. To each image point we assign a set of possible disparity values. In a dynamic process with competition and cooperation, the correct disparity value is selected for each image point. The correspondence problem is thus solved by a dynamic, self-organizing process whose structure shows analogies to the human visual system. The algorithm can be implemented in a massively parallel manner and yields good results for both artificial and natural images.
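The competition/cooperation scheme described in the abstract can be sketched as a Marr-Poggio-style relaxation: candidate disparities at neighbouring pixels support each other, while competing disparities at the same pixel inhibit each other. The update rule and constants below are illustrative assumptions, not the paper's actual dynamics.

```python
# Illustrative sketch of cooperative/competitive disparity selection.
# `match_scores[x][d]` is the initial matching support for disparity d
# at pixel x; the excite/inhibit weights are assumed constants.

def select_disparities(match_scores, iters=10, excite=0.5, inhibit=0.5):
    n, nd = len(match_scores), len(match_scores[0])
    s = [row[:] for row in match_scores]
    for _ in range(iters):
        new = [[0.0] * nd for _ in range(n)]
        for x in range(n):
            for d in range(nd):
                # Cooperation: support from the same disparity at neighbours.
                coop = sum(s[x2][d] for x2 in (x - 1, x + 1) if 0 <= x2 < n)
                # Competition: inhibition from other disparities at this pixel.
                comp = sum(s[x][d2] for d2 in range(nd) if d2 != d)
                new[x][d] = max(0.0, s[x][d] + excite * coop - inhibit * comp)
        s = new
    # Winner-take-all: the surviving disparity at each pixel.
    return [max(range(nd), key=lambda d: s[x][d]) for x in range(n)]
```

Because each pixel's update depends only on its own and its neighbours' previous states, the inner loops map directly onto the massively parallel implementation the abstract mentions.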


Journal ArticleDOI
TL;DR: A method for extracting, perceptually selecting, and coding visual details in a video sequence using morphological techniques is proposed; its application in the framework of a multiresolution segmentation-based coding algorithm yields better results than pure segmentation techniques at high compression ratios.
Abstract: In this paper, the importance of including small image features at the initial levels of a progressive second-generation video coding scheme is presented. It is shown that a number of meaningful small features, called details, should be coded even at very low bit rates, in order to match their perceptual significance to the human visual system. We propose a method for extracting, perceptually selecting, and coding visual details in a video sequence using morphological techniques. Its application in the framework of a multiresolution segmentation-based coding algorithm yields better results than pure segmentation techniques at high compression ratios, provided the selection step fits some main subjective requirements. Details are extracted and coded separately from the region structure and included in the reconstructed images in a later stage. The choice of considering the local background of a given detail for its perceptual selection breaks the concept of "partition" in the segmentation scheme. Since details are considered not as adjacent regions but as isolated features spread over the image, "detail coding" can be seen as one step towards so-called feature-based video coding techniques.
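Extracting small bright features with morphological operators, as the abstract describes, is commonly done with a top-hat transform (signal minus its morphological opening). The minimal 1-D sketch below is a stand-in for the paper's multiresolution pipeline, not a reproduction of it.

```python
# A minimal sketch of morphological detail extraction via an opening
# top-hat transform: features narrower than the structuring element
# survive as "details". 1-D flat structuring element of width `size`.

def erode(signal, size=3):
    r = size // 2
    return [min(signal[max(0, i - r):i + r + 1]) for i in range(len(signal))]

def dilate(signal, size=3):
    r = size // 2
    return [max(signal[max(0, i - r):i + r + 1]) for i in range(len(signal))]

def top_hat(signal, size=3):
    """Opening top-hat: original signal minus its opening (erosion
    followed by dilation), isolating small bright details."""
    opened = dilate(erode(signal, size), size)
    return [s - o for s, o in zip(signal, opened)]
```

An isolated spike narrower than the structuring element is returned intact as a detail, while flat regions (and features wider than the element) yield zero, which is what allows details to be coded separately from the region structure.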

Book ChapterDOI
01 Jan 1994
TL;DR: The local energy model of Morrone and Burr detects and locates both lines and edges simultaneously, by taking the Pythagorean sum of the output of pairs of matched filters to produce the all-positive local energy function.
Abstract: Edges and lines carry much information about images, and many models have been developed to explain how the human visual system may process them. One recent approach is the local energy model of Morrone and Burr. This model detects and locates both lines and edges simultaneously, by taking the Pythagorean sum of the outputs of pairs of matched filters (even- and odd-symmetric operators) to produce the all-positive local energy function. Maxima of this function signal the presence of all image features, which are then classified as lines or edges (or both) and as positive or negative, depending on the strength of response of the even- and odd-symmetric operators. If the feature is an edge, it carries with it a brightness description that extends over space to the next edge. The model successfully explains many visual illusions, such as the Craik-O'Brien effect, Mach bands, and a modified version of the Chevreul illusion. Features can structure the visual image, often creating appearances quite contrary to the physical luminance distributions. In some examples the features dictate totally the image structure, 'capturing' all other information; in others the features are seen in transparency together with an alternate image. All cases can be predicted from the rules for combination of local energy at different scales.
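The Pythagorean sum of matched even- and odd-symmetric filter outputs can be sketched in a few lines. The tiny difference kernels below are illustrative stand-ins for the psychophysically derived operators of Morrone and Burr; only the structure of the computation (quadrature pair, then energy) follows the model.

```python
# A sketch of the local energy computation: Pythagorean sum of the
# responses of a matched even/odd filter pair. The 3-tap kernels are
# assumed toy operators, not Morrone and Burr's actual filters.
import math

def convolve(signal, kernel):
    r = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - r, 0), len(signal) - 1)  # clamp at borders
            acc += w * signal[j]
        out.append(acc)
    return out

def local_energy(signal):
    even = convolve(signal, [-0.5, 1.0, -0.5])  # even-symmetric: responds to lines
    odd = convolve(signal, [-0.5, 0.0, 0.5])    # odd-symmetric: responds to edges
    # All-positive local energy: maxima mark features of either type.
    return [math.hypot(e, o) for e, o in zip(even, odd)]
```

On a step edge the odd operator dominates and energy peaks at the edge; on a thin line the even operator dominates at the same location, which is how a single maximum can be classified as line, edge, or both from the two operators' relative responses.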

Book ChapterDOI
01 Jan 1994
TL;DR: A catalogue has been provided of what filter kernels are natural to use, as well as an extensive theoretical explanation of how different kernels of different orders and at different scales can be related, which forms the basis of a theoretically well-founded modeling of visual front-end operators with a smoothing effect.
Abstract: In the previous chapter a formal justification has been given for using linear filtering as an initial step in early processing of image data (see also section 5 in this chapter). More importantly, a catalogue has been provided of what filter kernels are natural to use, as well as an extensive theoretical explanation of how different kernels of different orders and at different scales can be related. This forms the basis of a theoretically well-founded modeling of visual front-end operators with a smoothing effect.
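The smoothing front-end operators the abstract refers to are, in the canonical scale-space setting, Gaussian kernels indexed by a scale parameter. The sketch below illustrates that idea in 1-D; the truncation radius and border handling are implementation assumptions, not part of the theory.

```python
# A minimal sketch of Gaussian smoothing as a visual front-end operator:
# larger sigma suppresses finer structure, giving the scale-indexed
# family of smoothed signals that scale-space theory is built on.
import math

def gaussian_kernel(sigma, radius=None):
    radius = radius if radius is not None else int(3 * sigma)  # assumed cutoff
    k = [math.exp(-0.5 * (x / sigma) ** 2) for x in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]  # normalise so the kernel sums to 1

def smooth(signal, sigma):
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(k):
            p = min(max(i + j - r, 0), len(signal) - 1)  # replicate borders
            acc += w * signal[p]
        out.append(acc)
    return out
```

Convolving with `gaussian_kernel` at a sequence of increasing `sigma` values produces the one-parameter family of progressively smoothed signals from which the higher-order derivative operators of the catalogue are derived.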