
Showing papers on "Human visual system model published in 2000"


Journal ArticleDOI
TL;DR: A detailed computer implementation of a saliency map scheme is described, focusing on the problem of combining information across modalities, here orientation, intensity and color information, in a purely stimulus-driven manner, which is applied to common psychophysical stimuli as well as to a very demanding visual search task.

3,105 citations


Journal ArticleDOI
TL;DR: The capability of the human visual system with respect to these problems is discussed, and it is meant to serve as an ultimate goal and a guide for determining recommendations for development of an automatic facial expression analyzer.
Abstract: Humans detect and interpret faces and facial expressions in a scene with little or no effort. Still, development of an automated system that accomplishes this task is rather difficult. There are several related problems: detection of an image segment as a face, extraction of the facial expression information, and classification of the expression (e.g., in emotion categories). A system that performs these operations accurately and in real time would form a big step in achieving a human-like interaction between man and machine. The paper surveys the past work in solving these problems. The capability of the human visual system with respect to these problems is discussed, too. It is meant to serve as an ultimate goal and a guide for determining recommendations for development of an automatic facial expression analyzer.

1,872 citations


Proceedings ArticleDOI
10 Sep 2000
TL;DR: A new approach that can blindly measure blocking artifacts in images without reference to the originals is proposed, which has the flexibility to integrate human visual system features such as the luminance and the texture masking effects.
Abstract: The objective measurement of blocking artifacts plays an important role in the design, optimization, and assessment of image and video coding systems. We propose a new approach that can blindly measure blocking artifacts in images without reference to the originals. The key idea is to model the blocky image as a non-blocky image interfered with a pure blocky signal. The task of the blocking effect measurement algorithm is then to detect and evaluate the power of the blocky signal. The proposed approach has the flexibility to integrate human visual system features such as the luminance and the texture masking effects.
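The blocky-signal idea lends itself to a simple no-reference sketch (illustrative only, not the authors' exact algorithm): compare the mean luminance step across 8x8 block boundaries with the mean step elsewhere. A ratio well above 1 suggests blocking.

```python
import numpy as np

def blockiness_score(img, block=8):
    """Crude no-reference blockiness score: ratio of the mean absolute
    luminance step across block boundaries to the mean step elsewhere."""
    img = img.astype(float)
    dh = np.abs(np.diff(img, axis=1))      # horizontal neighbour differences
    boundary = dh[:, block - 1::block]     # columns straddling block edges
    mask = np.ones(dh.shape[1], bool)
    mask[block - 1::block] = False
    interior = dh[:, mask]
    return boundary.mean() / (interior.mean() + 1e-12)

# Synthetic check: a smooth ramp has no blocking; coarse quantization of the
# ramp produces step edges that (here) land on the 8-pixel grid.
ramp = np.tile(np.linspace(0, 255, 64), (64, 1))
blocky = (ramp // 32) * 32
smooth_score = blockiness_score(ramp)
blocky_score = blockiness_score(blocky)
```

On the smooth ramp every neighbour step is the same, so the score is near 1; on the quantized image all steps sit on block boundaries and the score explodes.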

473 citations


Journal ArticleDOI
01 Jun 2000
TL;DR: An image steganographic model is proposed that is based on variable-size LSB insertion to maximise the embedding capacity while maintaining image fidelity and two methods are provided to deal with the security issue when using the proposed model.
Abstract: Steganography is an ancient art of conveying messages in a secret way that only the receiver knows the existence of a message. So a fundamental requirement for a steganographic method is imperceptibility; this means that the embedded messages should not be discernible to the human eye. There are two other requirements, one is to maximise the embedding capacity, and the other is security. The least-significant bit (LSB) insertion method is the most common and easiest method for embedding messages in an image. However, how to decide on the maximal embedding capacity for each pixel is still an open issue. An image steganographic model is proposed that is based on variable-size LSB insertion to maximise the embedding capacity while maintaining image fidelity. For each pixel of a grey-scale image, at least four bits can be used for message embedding. Three components are provided to achieve the goal. First, according to contrast and luminance characteristics, the capacity evaluation is provided to estimate the maximum embedding capacity of each pixel. Then the minimum-error replacement method is adapted to find a grey scale as close to the original one as possible. Finally, the improved grey-scale compensation, which takes advantage of the peculiarities of the human visual system, is used to eliminate the false contouring effect. Two methods, pixelwise and bitwise, are provided to deal with the security issue when using the proposed model. Experimental results show effectiveness and efficiency of the proposed model.
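As a baseline for the variable-size scheme described above, here is a plain fixed-k LSB embed/extract sketch (the paper's contribution, varying k per pixel via capacity estimation and error compensation, is not reproduced):

```python
def embed_lsb(pixels, bits, k):
    """Replace the k least-significant bits of each pixel with message bits.
    `bits` must contain k * len(pixels) values of 0/1."""
    out = []
    it = iter(bits)
    for p in pixels:
        chunk = 0
        for _ in range(k):
            chunk = (chunk << 1) | next(it)
        out.append((p & ~((1 << k) - 1)) | chunk)   # clear low bits, insert chunk
    return out

def extract_lsb(pixels, k):
    """Read the k least-significant bits back out of each pixel."""
    bits = []
    for p in pixels:
        for i in range(k - 1, -1, -1):
            bits.append((p >> i) & 1)
    return bits

cover = [200, 120, 37, 91]
msg = [1, 0, 1, 1, 0, 0, 1, 0]          # 2 bits per pixel
stego = embed_lsb(cover, msg, k=2)
recovered = extract_lsb(stego, k=2)
```

With k = 2 each pixel changes by at most 3 grey levels, which is usually imperceptible; the paper's point is that k can safely grow to 4 or more in busy regions.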

408 citations


DissertationDOI
01 Jan 2000
TL;DR: A detailed computational model of basic pattern vision in humans and its modulation by top-down attention is presented, able to quantitatively account for all observations by assuming that attention strengthens the non-linear cortical interactions among visual neurons.
Abstract: When we observe our visual environment, we do not perceive all its components as being equally interesting. Some objects automatically and effortlessly “pop-out” from their surroundings, that is, they draw our visual attention, in a “ bottom-up” manner, towards them. In a first approximation, focal visual attention acts as a rapidly shiftable “spotlight,” which allows only the selected information to reach higher levels of processing and representation. Most models of the bottom-up control of attention are based on the concept of a saliency map, that is, an explicit two-dimensional map that encodes the conspicuity of objects in the visual environment. Competition among neurons in this map gives rise to a single winning location that corresponds to the next attended target. Inhibiting this location automatically allows the system to attend to the next most salient location. A first body of work in this thesis describes a detailed computer implementation of such a scheme, focusing on the problem of combining information across modalities, here orientation, intensity and color information, in a purely stimulus-driven manner. The model is applied to common psychophysical stimuli as well as to very demanding visual search tasks. Its successful performance is used to address the extent to which the primate visual system carries out visual search via one or more such saliency maps and how this can be tested. We next address the question of what happens once our attention is focused onto a restricted part of our visual field. There is mounting experimental evidence that attention is far more sophisticated than a simple feed-forward spatially-selective filtering process. Indeed, visual processing appears to be significantly different inside the attentional spotlight than outside. 
That is, in addition to its properties as a feed-forward information processing and transmission bottleneck, focal visual attention feeds back and locally modulates, in a “top-down” manner, the visual processing and representation of selected objects. The second body of work presented in this thesis is concerned with a detailed computational model of basic pattern vision in humans and its modulation by top-down attention. We start by acquiring a complete dataset of five different simple psychophysical experiments, including discriminations of contrast, orientation and spatial frequency of simple pattern stimuli by human observers. This experimental dataset places strict constraints on our model of early pattern vision. The model, however, is eventually able to reproduce the entire dataset while assuming plausible neurobiological components. The model is further applied to existing psychophysical data which demonstrates how top-down attention alters performance in these simple psychophysical discrimination experiments. Our model is able to quantitatively account for all observations by assuming that attention strengthens the non-linear cortical interactions among visual neurons.
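The winner-take-all plus inhibition-of-return loop at the heart of the saliency-map scheme can be sketched as follows (a toy stand-in for the neural implementation; the suppression radius is illustrative):

```python
import numpy as np

def attend_sequence(saliency, n_shifts, inhibit_radius=1):
    """Repeatedly pick the most salient location, then suppress a
    neighbourhood around it (inhibition of return) so attention moves on."""
    s = saliency.astype(float).copy()
    visited = []
    for _ in range(n_shifts):
        y, x = np.unravel_index(np.argmax(s), s.shape)
        visited.append((int(y), int(x)))
        y0, y1 = max(0, y - inhibit_radius), y + inhibit_radius + 1
        x0, x1 = max(0, x - inhibit_radius), x + inhibit_radius + 1
        s[y0:y1, x0:x1] = -np.inf        # inhibit the attended region
    return visited

sal = np.zeros((5, 5))
sal[1, 1], sal[3, 4], sal[0, 3] = 9.0, 7.0, 5.0
order = attend_sequence(sal, 3)          # visits peaks in decreasing saliency
```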

308 citations


BookDOI
01 Aug 2000
TL;DR: Many of the issues raised are relevant to object recognition in general, and such visual learning machines have numerous potential applications in areas such as visual surveillance, multimedia, and visually mediated interaction.
Abstract: From the Publisher: Face recognition is a task which the human visual system seems to perform almost effortlessly, yet the goal of building machines with comparable capabilities has proven difficult to realize. The task requires the ability to locate and track faces through scenes which are often complex and dynamic. Recognition is difficult because of variations in factors such as lighting conditions, viewpoint, body movement and facial expression. Although evidence from psychophysical and neurobiological experiments provides intriguing insights into how we might code and recognize faces, its bearing on computational and engineering solutions is far from clear. This book describes how to build learning machines to perform face recognition in dynamic scenes. The task at hand is that of engineering robust machine vision systems that can operate under poorly controlled and changing conditions. Many of the issues raised are relevant to object recognition in general, and such visual learning machines have numerous potential applications in areas such as visual surveillance, multimedia, and visually mediated interaction.

276 citations


Journal ArticleDOI
TL;DR: Using functional magnetic resonance imaging (fMRI), it is shown that the cerebral networks involved in efficient and inefficient search overlap almost completely; the most likely interpretation is that visual search does not require serial processing, unless one assumes a serial searchlight that operates in the extrastriate cortex but differs from the visuospatial shifts of attention involving the parietal and frontal regions.
Abstract: The human visual system is usually confronted with many different objects at a time, with only some of them reaching consciousness. Reaction-time studies have revealed two different strategies by which objects are selected for further processing: an automatic, efficient search process, and a conscious, so-called inefficient search [Treisman, A. (1991). Search, similarity, and integration of features between and within dimensions. Journal of Experimental Psychology: Human Perception and Performance, 17, 652-676; Treisman, A., & Gelade, G. (1980). A feature integration theory of attention. Cognitive Psychology, 12, 97-136; Wolfe, J. M. (1996). Visual search. In H. Pashler (Ed.), Attention. London: University College London Press]. Two different theories have been proposed to account for these search processes. Parallel theories presume that both types of search are treated by a single mechanism that is modulated by attentional and computational demands. Serial theories, in contrast, propose that parallel processing may underlie efficient search, but inefficient searching requires an additional serial mechanism, an attentional "spotlight" (Treisman, A., 1991) that successively shifts attention to different locations in the visual field. Using functional magnetic resonance imaging (fMRI), we show that the cerebral networks involved in efficient and inefficient search overlap almost completely. Only the superior frontal region, known to be involved in working memory [Courtney, S. M., Petit, L., Maisog, J. M., Ungerleider, L. G., & Haxby, J. V. (1998). An area specialized for spatial working memory in human frontal cortex. Science, 279, 1347-1351], and distinct from the frontal eye fields, that control spatial shifts of attention, was specifically involved in inefficient search. 
Activity modulations correlated with subjects' behavior best in the extrastriate cortical areas, where the amount of activity depended on the number of distracting elements in the display. Such a correlation was not observed in the parietal and frontal regions, usually assumed as being involved in spatial attention processing. These results can be interpreted in two ways: the most likely is that visual search does not require serial processing, otherwise we must assume the existence of a serial searchlight that operates in the extrastriate cortex but differs from the visuospatial shifts of attention involving the parietal and frontal regions.

173 citations


Journal ArticleDOI
TL;DR: A perceptual-based image coder, which discriminates between image components based on their perceptual relevance for achieving increased performance in terms of quality and bit rate, which is based on a locally adaptive perceptual quantization scheme for compressing the visual data.
Abstract: Most existing efforts in image and video compression have focused on developing methods to minimize not perceptual but rather mathematically tractable, easy to measure, distortion metrics. While nonperceptual distortion measures were found to be reasonably reliable for higher bit rates (high-quality applications), they do not correlate well with the perceived quality at lower bit rates and they fail to guarantee preservation of important perceptual qualities in the reconstructed images despite the potential for a good signal-to-noise ratio (SNR). This paper presents a perceptual-based image coder, which discriminates between image components based on their perceptual relevance for achieving increased performance in terms of quality and bit rate. The new coder is based on a locally adaptive perceptual quantization scheme for compressing the visual data. Our strategy is to exploit human visual masking properties by deriving visual masking thresholds in a locally adaptive fashion based on a subband decomposition. The derived masking thresholds are used in controlling the quantization stage by adapting the quantizer reconstruction levels to the local amount of masking present at the level of each subband transform coefficient. Compared to the existing non-locally adaptive perceptual quantization methods, the new locally adaptive algorithm exhibits superior performance and does not require additional side information. This is accomplished by estimating the amount of available masking from the already quantized data and linear prediction of the coefficient under consideration. By virtue of the local adaptation, the proposed quantization scheme is able to remove a large amount of perceptually redundant information. Since the algorithm does not require additional side information, it yields a low entropy representation of the image and is well suited for perceptually lossless image compression.
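A heavily simplified 1-D sketch of locally adaptive quantization without side information follows. The masking estimate here is an invented stand-in for the paper's subband masking thresholds; the key property it shares with the paper is that the step size is derived causally from already-quantized data, so a decoder can repeat the adaptation:

```python
import numpy as np

def perceptual_quantize(coeffs, base_step=4.0, alpha=0.5):
    """Quantize each coefficient with a step that grows with local activity
    (a crude masking estimate), so busy regions get coarser quantization."""
    out = np.empty_like(coeffs, dtype=float)
    for i, c in enumerate(coeffs):
        # masking estimated from previously *quantized* samples only,
        # so the decoder can repeat it without side information
        activity = np.abs(out[max(0, i - 3):i]).mean() if i else 0.0
        step = base_step * (1.0 + alpha * activity / (base_step + activity))
        out[i] = np.round(c / step) * step
    return out

flat = np.array([1.0, -1.0, 0.5, 1.5])       # quiet region: fine steps matter
busy = np.array([40.0, -35.0, 50.0, -45.0])  # active region: coarser is fine
q_flat = perceptual_quantize(flat)
q_busy = perceptual_quantize(busy)
```

In the busy region the step widens (more masking), yet the error stays bounded by half the enlarged step.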

140 citations


Proceedings ArticleDOI
01 Sep 2000
TL;DR: The investigation presented in this paper aims at an extension of the visual attention model to the scene depth component, and results of visual attention obtained from the extended model for various 3D scenes are presented.
Abstract: Visual attention is the ability to rapidly detect the interesting parts of a given scene. Inspired by biological vision, the principle of visual attention is used with a similar goal in computer vision. Several previous works deal with the computation of visual attention from images provided by standard video cameras, but little attention has been devoted so far to scene depth as a source for visual attention. The investigation presented in this paper aims at an extension of the visual attention model to the scene depth component. The first part of the paper is devoted to the integration of depth in the computational model built around conspicuity and saliency maps. The second part is devoted to experimental work in which results of visual attention, obtained from the extended model for various 3D scenes, are presented. The results speak for the usefulness of the enhanced computational model.
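One minimal way to add a depth channel to a conspicuity-map combination (a sketch of the idea, not the paper's model; the equal weights are illustrative):

```python
import numpy as np

def normalize_map(m):
    """Scale a feature map to [0, 1] so heterogeneous channels are comparable."""
    m = m.astype(float)
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng else np.zeros_like(m)

def saliency_with_depth(intensity, color, orientation, depth, w_depth=1.0):
    """Combine the classic conspicuity channels with an extra depth channel."""
    maps = [intensity, color, orientation, depth]
    weights = [1.0, 1.0, 1.0, w_depth]
    return sum(w * normalize_map(m) for w, m in zip(weights, maps)) / sum(weights)

flat = np.zeros((4, 4))
depth = np.zeros((4, 4))
depth[2, 2] = 1.0                       # an object that stands out in depth only
sal = saliency_with_depth(flat, flat, flat, depth)
```

An object that is inconspicuous in intensity, colour, and orientation but pops out in depth still wins the saliency competition.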

137 citations


Journal ArticleDOI
TL;DR: Experiments that require human observers to discriminate between pictures of slightly different faces or objects are described, providing direct empirical support for the notion that human spatial vision is optimised to the second-order statistics of the optical environment.

135 citations


Journal ArticleDOI
TL;DR: Evidence is discussed that some of these tasks are performed by predicting where an object will be at some sharply defined instant, several hundred milliseconds in the future, while other tasks are performing by utilizing the fact thatSome of their motor actions change what the authors see in ways that obey lawful relationships, and can therefore be learned.

Proceedings ArticleDOI
05 Jun 2000
TL;DR: Criteria for monochrome compressed image quality from 1974 to 1999 are reviewed; attempts to improve quality measurement include the incorporation of simple models of the human visual system (HVS) and multi-dimensional tool design.
Abstract: While lossy image compression techniques are vital in reducing bandwidth and storage requirements, they result in distortions in compressed images. A reliable quality measure is a much needed tool for determining the type and amount of image distortion. The traditional subjective criteria, which involve human observers, are inconvenient, time-consuming, and influenced by environmental conditions. Widely used pixelwise measures such as the mean square error (MSE) cannot capture artifacts like blurriness or blockiness, and do not correlate well with visual error perception. Attempts to improve quality measurement include the incorporation of simple models of the human visual system (HVS) and multi-dimensional tool design. We review the criteria for monochrome compressed image quality from 1974 to 1999.
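The MSE criticism is easy to demonstrate: two distortions with identical MSE can differ greatly in visibility. A uniform brightness shift and a structured checkerboard error get the same score:

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images."""
    return float(np.mean((a.astype(float) - b.astype(float)) ** 2))

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    m = mse(a, b)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)

img = np.full((8, 8), 128.0)
shifted = img + 4.0          # uniform brightness shift: barely visible
noisy = img.copy()
noisy[::2, ::2] += 8.0       # structured checkerboard error: clearly visible
m1, m2 = mse(img, shifted), mse(img, noisy)
p1 = psnr(img, shifted)
```

Both distortions yield MSE = 16 (PSNR about 36 dB) despite very different perceptual impact, which is exactly why HVS-based metrics were pursued.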

Proceedings ArticleDOI
Yong Rui1, Padmanabhan Anandan1
13 Jun 2000
TL;DR: This paper represents frame to frame optical-flow in terms of the coefficients of the most significant principal components computed from all the flow-fields within a given video sequence, and detects discontinuities in the temporal trajectories of these coefficients based on three different measures.
Abstract: The analysis of human action captured in video sequences has been a topic of considerable interest in computer vision. Much of the previous work has focused on the problem of action or activity recognition, but ignored the problem of detecting action boundaries in a video sequence containing unfamiliar and arbitrary visual actions. This paper presents an approach to this problem based on detecting temporal discontinuities of the spatial pattern of image motion that captures the action. We represent frame to frame optical-flow in terms of the coefficients of the most significant principal components computed from all the flow-fields within a given video sequence. We then detect the discontinuities in the temporal trajectories of these coefficients based on three different measures. We compare our segment boundaries against those detected by human observers on the same sequences in a recent independent psychological study of human perception of visual events. We show experimental results on the two sequences that were used in this study. Our experimental results are promising both from visual evaluation and when compared against the results of the psychological study.
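The coefficient-trajectory idea can be sketched on toy flow fields: PCA via SVD of the flattened flows, followed by thresholding jumps in the coefficient trajectory (a single jump threshold stands in for the paper's three measures):

```python
import numpy as np

def pca_coefficients(flows, n_components=2):
    """Project each frame's flattened flow-field onto the top principal
    components computed from the whole sequence."""
    X = flows.reshape(len(flows), -1)
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def boundary_frames(coeffs, thresh):
    """Flag frames where the coefficient trajectory jumps abruptly."""
    jumps = np.linalg.norm(np.diff(coeffs, axis=0), axis=1)
    return [i + 1 for i, j in enumerate(jumps) if j > thresh]

# Toy sequence: constant rightward flow, then an abrupt switch to upward flow.
right = np.tile([1.0, 0.0], (10,))   # 10 flow vectors per frame, flattened
up = np.tile([0.0, 1.0], (10,))
flows = np.stack([right] * 5 + [up] * 5)
coeffs = pca_coefficients(flows)
cuts = boundary_frames(coeffs, thresh=1.0)   # the action boundary at frame 5
```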

Journal ArticleDOI
TL;DR: A model and algorithm for segmentation of images with missing boundaries is presented, considering a reference point within an image as given and developing an algorithm that tries to build missing information on the basis of the given point of view and the available information as boundary data.
Abstract: We present a model and algorithm for segmentation of images with missing boundaries. In many situations, the human visual system fills in missing gaps in edges and boundaries, building and completing information that is not present. This presents a considerable challenge in computer vision, since most algorithms attempt to exploit existing data. Completion models, which postulate how to construct missing data, are popular but are often trained and specific to particular images. In this paper, we take the following perspective: We consider a reference point within an image as given and then develop an algorithm that tries to build missing information on the basis of the given point of view and the available information as boundary data to the algorithm. We test the algorithm on some standard images, including the classical triangle of Kanizsa and low signal/noise ratio medical images.

Journal ArticleDOI
TL;DR: By analyzing and modeling several visual mechanisms of the HVS with the Haar transform, a new subjective fidelity measure is developed which is more consistent with human observation experience.

Journal ArticleDOI
TL;DR: An image adaptive watermark casting method based on the wavelet transform is proposed to increase robustness and perceptual invisibility; the algorithm is combined with a quantisation model based on the human visual system to ensure robustness in high compression environments.
Abstract: An image adaptive watermark casting method based on the wavelet transform is proposed. To increase the robustness and perceptual invisibility, the algorithm is combined with a quantisation model based on the human visual system. A number of factors that affect the noise sensitivity of the human eye are taken into consideration. Experimental results demonstrate the robustness of the algorithm in high compression environments.
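Quantisation-based embedding of one watermark bit per wavelet coefficient might look like the following sketch. This is the generic quantisation-index idea, not necessarily the authors' exact model; in an HVS-driven scheme the step size would vary with local masking:

```python
def embed_bit(coeff, bit, step=8.0):
    """Snap the coefficient to the nearest even multiple of `step` for
    bit 0, odd multiple for bit 1."""
    q = round(coeff / step)
    if q % 2 != bit:
        q += 1 if coeff / step >= q else -1   # move to the nearest valid multiple
    return q * step

def extract_bit(coeff, step=8.0):
    """Recover the bit from the parity of the quantisation index."""
    return int(round(coeff / step)) % 2

marked = [embed_bit(c, b) for c, b in zip([37.2, -14.9, 3.1], [1, 0, 1])]
bits = [extract_bit(c) for c in marked]
```

Because extraction only reads parity, the bit survives any perturbation smaller than half the step, which is what gives robustness against compression noise.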

Journal ArticleDOI
TL;DR: This review considers the logic and the evidence relating to the issue of dynamic grouping in human vision, including grouping of visual information into surface descriptions and contour descriptions, and considers the hypothesis that dynamic grouping is signalled by neuronal synchrony.

Book ChapterDOI
26 Jun 2000
TL;DR: This work considers a Gaussian color model, which inherently uses the spatial and color information in an integrated model, and proposes a framework for spatial color measurement, based on the Gaussian scale-space theory.
Abstract: For grey-value images, it is well accepted that the neighborhood rather than the pixel carries the geometrical interpretation. Interestingly, the spatial configuration of the neighborhood is the basis for human perception. Common practice in color image processing, however, is to use the color information without considering the spatial structure. We aim at a physical basis for the local interpretation of color images. We propose a framework for spatial color measurement, based on the Gaussian scale-space theory. We consider a Gaussian color model, which inherently uses the spatial and color information in an integrated model. The framework is well-founded in physics as well as in measurement science. The framework delivers sound and robust spatial color invariant features. The usefulness of the proposed measurement framework is illustrated by edge detection, where edges are discriminated as shadow, highlight, or object boundary. Other applications of the framework include color invariant image retrieval and color constant edge detection.

Book ChapterDOI
01 Jan 2000
TL;DR: This chapter provides an overview of the basic principles and potentials of state of the art fuzzy image processing that can be applied to a variety of computer vision tasks.
Abstract: Publisher Summary This chapter provides an overview of the basic principles and potentials of state of the art fuzzy image processing that can be applied to a variety of computer vision tasks. The world is fuzzy, and so are images, projections of the real world onto the image sensor. Fuzziness quantifies vagueness and ambiguity, as opposed to crisp memberships. The types of uncertainty in images are manifold, ranging over the entire chain of processing levels, from pixel based grayness ambiguity over fuzziness in geometrical description up to uncertain knowledge in the highest processing level. The human visual system has been perfectly adapted to handle uncertain information in both data and knowledge. The interrelation of a few such “fuzzy” properties sufficiently characterizes the object of interest. Fuzzy image processing is an attempt to translate this ability of human reasoning into computer vision problems as it provides an intuitive tool for inference from imperfect data. Fuzzy image processing is special in terms of its relation to other computer vision techniques. It is not a solution for a special task, but rather describes a new class of image processing techniques. It provides a new methodology, augmenting classical logic, a component of any computer vision tool. A new type of image understanding and treatment has to be developed. Fuzzy image processing can be a single image processing routine or complement parts of a complex image processing chain.
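A classic example of the fuzzy approach is contrast intensification via membership functions: grey levels become memberships in [0, 1], are pushed away from the 0.5 crossover, and are mapped back (a standard textbook operator, illustrative of the chapter's theme rather than taken from it):

```python
def intensify(gray_levels, g_max=255.0):
    """Fuzzy contrast intensification: memberships below 0.5 are squashed,
    those above are boosted, stretching mid-tone contrast."""
    out = []
    for g in gray_levels:
        mu = g / g_max
        mu = 2 * mu * mu if mu <= 0.5 else 1 - 2 * (1 - mu) ** 2
        out.append(mu * g_max)
    return out

enhanced = intensify([0, 64, 128, 192, 255])
# extremes stay put; dark mid-tones get darker, bright mid-tones brighter
```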

01 Jan 2000
TL;DR: In this article, image distortion measures are reviewed; a distortion measure is a criterion that assigns a "quality number" to an image, and both purely mathematical measures and those incorporating a priori knowledge are discussed.
Abstract: Within this paper we review image distortion measures. A distortion measure is a criterion that assigns a "quality number" to an image. We distinguish between mathematical distortion measures and those distortion measures incorporating a priori knowledge about the imaging devices (e.g. satellite images), image processing algorithms or the human physiology. We consider representative examples of different kinds of distortion measures and discuss them.

Journal ArticleDOI
01 Feb 2000
TL;DR: A method to embed a secret image into a cover image using a pseudorandom mechanism based on the similarity among the grey values of consecutive image pixels as well as the human visual system's variation insensitivity from smooth to contrastive is proposed.
Abstract: A method to embed a secret image into a cover image is proposed. The method is based on the similarity among the grey values of consecutive image pixels as well as the human visual system's variation insensitivity from smooth to contrastive. A stego-image is produced by replacing the grey values of a differencing result obtained from the cover image with those of a differencing result obtained from the secret image. The process preserves the secret image with no loss and produces the stego-image with low degradation. Moreover, a pseudorandom mechanism is used to achieve cryptography. It is found from experiment that the peak signal-to-noise ratios of the method are high and that the resulting stego-images are imperceptible, even when the size of the secret image is about half that of the cover image.
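A loose 1-D sketch of the differencing idea follows: carry the secret image's pixel differences in the differences of consecutive stego pixels, anchored at the cover's first pixel. The pseudorandom mechanism and range clipping of the actual method are omitted:

```python
def embed_differences(cover, secret):
    """Build a stego row whose consecutive differences equal the secret
    row's differences, starting from the cover's first grey value."""
    d = [secret[i + 1] - secret[i] for i in range(len(secret) - 1)]
    stego = [cover[0]]
    for step in d:
        stego.append(stego[-1] + step)
    return stego

def extract_differences(stego):
    """Recover the embedded difference sequence losslessly."""
    return [stego[i + 1] - stego[i] for i in range(len(stego) - 1)]

cover = [120, 121, 119, 122]
secret = [50, 53, 53, 48]
stego = embed_differences(cover, secret)
recovered = extract_differences(stego)   # the secret's differences, exactly
```

The secret is recovered up to its first grey value, matching the "no loss" claim for the differencing result; when cover and secret are locally smooth, the stego row stays close to the cover.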

01 Jan 2000
TL;DR: An introduction to the general issue of HVS-modeling is given and the specific applications of visual quality assessment and H VS-based image compression, which are closely related, are reviewed.
Abstract: By taking into account the properties and limitations of the human visual system (HVS), images can be more efficiently compressed, colors more accurately reproduced, prints better rendered, to mention a few major advantages. To achieve these goals it is necessary to build a computational model of the HVS. In this paper we give an introduction to the general issue of HVS-modeling and review the specific applications of visual quality assessment and HVS-based image compression, which are closely related. On the one hand, these two examples demonstrate the common structure of HVS-models; on the other hand, they also show how application-specific constraints influence model design. Recent vision models from these application areas are reviewed and summarized in a table for direct comparison. Keywords: Human Visual System (HVS), Color Perception, Quality Assessment, Image Compression

Journal ArticleDOI
TL;DR: Output images taken from the models indicate that natural images do contain useful second-order structure and reveal variations in texture and features defined by such variations, suggesting that the two types of image ‘content’ may be statistically independent.
Abstract: The human visual system is sensitive to both first-order variations in luminance and second-order variations in local contrast and texture. Although there is some debate about the nature of second-order vision and its relationship to first-order processing, there is now a body of results showing that they are processed separately. However, the amount, and nature, of second-order structure present in the natural environment is unclear. This is an important question because, if natural scenes contain little second-order structure in addition to first-order signals, the notion of a separate second-order system would lack ecological validity. Two models of second-order vision were applied to a number of well-calibrated natural images. Both models consisted of a first stage of oriented spatial filters followed by a rectifying nonlinearity and then a second set of filters. The models differed in terms of the connectivity between first-stage and second-stage filters. Output images taken from the models indicate that natural images do contain useful second-order structure. Specifically, the models reveal variations in texture and features defined by such variations. Areas of high contrast (but not necessarily high luminance) are also highlighted by the models. Second-order structure--as revealed by the models--did not correlate with the first-order profile of the images, suggesting that the two types of image 'content' may be statistically independent.
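The two-stage model structure the paper describes (filter, rectify, filter) can be sketched in 1-D; the kernels below are illustrative stand-ins for the oriented spatial filters. The test signal has constant mean luminance but slowly modulated contrast, which only a second-order mechanism recovers:

```python
import numpy as np

def filter_rectify_filter(signal, k1, k2):
    """1-D filter-rectify-filter (FRF) sketch: a fine first-stage filter,
    full-wave rectification, then a coarse second-stage filter that reads
    out the contrast envelope."""
    stage1 = np.convolve(signal, k1, mode="same")
    rectified = np.abs(stage1)
    return np.convolve(rectified, k2, mode="same")

n = 256
x = np.arange(n)
carrier = np.sin(2 * np.pi * x / 8)                  # fine first-order texture
envelope = 0.5 + 0.5 * np.sin(2 * np.pi * x / 128)   # slow contrast modulation
signal = envelope * carrier                          # mean luminance ~ constant

k1 = np.array([-1.0, 2.0, -1.0])   # fine band-pass (first stage)
k2 = np.ones(32) / 32              # coarse low-pass (second stage)
response = filter_rectify_filter(signal, k1, k2)
# away from the borders, the response tracks the contrast envelope
corr = float(np.corrcoef(response[32:-32], envelope[32:-32])[0, 1])
```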

BookDOI
01 Jan 2000
TL;DR: Approximation-Based Keypoints in Colour Images - A Tool for Building and Searching Visual Databases and a Knowledge Synthesizing Approach for Classification of Visual Information.
Abstract: The Visual Information Systems International Conference series is designed to provide a forum for researchers and practitioners from diverse areas of computing including computer vision, databases, human–computer interaction, information security, image processing, information visualization and mining, as well as knowledge and information management to exchange ideas, discuss challenges, present their latest results and to advance research and development in the construction and application of visual information systems. Following previous conferences held in Melbourne (1996), San Diego (1997), Amsterdam (1999), Lyon (2000), Taiwan (2002), Miami (2003), San Francisco (2004) and Amsterdam (2005), the Ninth International Conference on Visual Information Systems, VISUAL2007, was held in Shanghai, China, June 28–29, 2007. Over the years, the visual information systems paradigm continues to evolve, and the unrelenting exponential growth in the amount of digital visual data underlines the escalating importance of how such data are effectively managed and deployed. VISUAL2007 received 117 submissions from 15 countries and regions. Submitted full papers were reviewed by more than 60 international experts in the field. This volume collects 54 selected papers presented at VISUAL2007. Topics covered in these papers include image and video retrieval, visual biometrics, intelligent visual information processing, visual data mining, ubiquitous and mobile visual information systems, visual semantics, 2D/3D graphical visual data retrieval and applications of visual information systems.

Journal ArticleDOI
TL;DR: In this article, two psychophysics experiments are described, pointing out the significant role played by stochastic resonance in recognition of capital stylized noisy letters by the human perceptive apparatus.
Abstract: Two psychophysics experiments are described, pointing out the significant role played by stochastic resonance in recognition of capital stylized noisy letters by the human perceptive apparatus. The first experiment shows that an optimal noise level exists at which the letter is recognized for a minimum threshold contrast. A simple two-parameter model that best fits the experimental data is also discussed. In the second experiment we show that a dramatically increased ability of the visual system in letter recognition occurs in an extremely narrow range of increasing noise. Possible interesting future investigations suggested by these experimental results and based on functional imaging techniques are discussed.
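The core stochastic-resonance effect is reproducible with a bare threshold detector (a toy analogue of the letter-recognition setting, not the authors' experiment): detectability of a subthreshold signal peaks at an intermediate noise level.

```python
import numpy as np

rng = np.random.default_rng(1)

def discriminability(signal, threshold, noise_sd, trials=20000):
    """How much more often a threshold detector fires with the signal
    present than absent, at a given noise level."""
    hit = np.mean(signal + rng.normal(0.0, noise_sd, trials) > threshold)
    false = np.mean(rng.normal(0.0, noise_sd, trials) > threshold)
    return float(hit - false)

# A 0.8 signal against a 1.0 threshold is invisible without noise.
levels = [0.05, 0.6, 5.0]
d = [discriminability(0.8, 1.0, sd) for sd in levels]
```

With almost no noise the signal never crosses threshold; with too much noise hits and false alarms converge; an intermediate level maximizes the difference, which is the signature of stochastic resonance.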

Journal ArticleDOI
TL;DR: In this article, correlation analysis was applied to human spatial-frequency contrast sensitivity; adults showed power-law correlational structure consistent with a β of 1.09-1.20, closely matching that of natural images.
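A power-law exponent β of this kind is typically estimated by a log-log least-squares fit to a power spectrum. A sketch on a synthetic random-phase image with a known 1/f amplitude spectrum (so power falls as 1/f², i.e. β = 2); this is an illustration of the fitting procedure, not the paper's psychophysical data:

```python
import numpy as np

def radial_power_slope(img):
    """Estimate beta in power ~ 1/f**beta by a log-log least-squares fit
    over the 2-D power spectrum of a square grayscale image."""
    n = img.shape[0]
    f = np.fft.fftfreq(n)
    fx, fy = np.meshgrid(f, f)
    r = np.hypot(fx, fy)
    power = np.abs(np.fft.fft2(img)) ** 2
    mask = (r > 0) & (r < 0.5) & (power > 0)   # drop DC, corners, empty bins
    slope, _ = np.polyfit(np.log(r[mask]), np.log(power[mask]), 1)
    return -slope                              # power falls off as f**-beta

# Synthesize random-phase noise with a 1/f amplitude spectrum and
# check that the estimator recovers beta close to 2.
rng = np.random.default_rng(1)
n = 256
f = np.fft.fftfreq(n)
fx, fy = np.meshgrid(f, f)
r = np.hypot(fx, fy)
amp = np.zeros_like(r)
amp[r > 0] = r[r > 0] ** -1.0
spectrum = amp * np.exp(2j * np.pi * rng.random((n, n)))
img = np.real(np.fft.ifft2(spectrum))
beta = radial_power_slope(img)
```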

Proceedings ArticleDOI
02 Jun 2000
TL;DR: First results are presented from an analysis and comparison of a number of implementation choices for components found in most of today's visual quality metrics that are based on a model of human vision.
Abstract: The design of reliable visual quality metrics is complicated by our limited knowledge of the human visual system and the resulting variety of pertinent vision models. We have begun to analyze and compare a number of implementation choices for some components found in most of today's visual quality metrics that are based on a model of human vision and present the first results here.
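One component shared by most vision-model-based quality metrics is weighting the error signal by a contrast sensitivity function (CSF) before pooling. A sketch using a Mannos-Sakrison-style CSF; the viewing parameters and test patterns are illustrative assumptions, not the specific implementations compared in the paper:

```python
import numpy as np

def csf(f_cpd):
    """Mannos-Sakrison-style contrast sensitivity (f in cycles/degree)."""
    return 2.6 * (0.0192 + 0.114 * f_cpd) * np.exp(-(0.114 * f_cpd) ** 1.1)

def csf_weighted_error(ref, dist, pixels_per_degree=32.0):
    """Energy of the error image after weighting its spectrum by the CSF."""
    n = ref.shape[0]
    f = np.fft.fftfreq(n) * pixels_per_degree   # cycles/pixel -> cycles/degree
    fx, fy = np.meshgrid(f, f)
    weight = csf(np.hypot(fx, fy))
    err_spectrum = np.fft.fft2(dist - ref)
    return float(np.mean(np.abs(weight * err_spectrum) ** 2))

# Two distortions of equal pixel-domain energy: the one near the CSF peak
# (around 8 cycles/degree) is more visible and should score higher.
n = 64
x = np.arange(n)
ref = np.zeros((n, n))
dist_low = np.tile(0.1 * np.sin(2 * np.pi * 2 * x / n), (n, 1))    # ~1 cpd
dist_mid = np.tile(0.1 * np.sin(2 * np.pi * 16 * x / n), (n, 1))   # ~8 cpd
score_low = csf_weighted_error(ref, ref + dist_low)
score_mid = csf_weighted_error(ref, ref + dist_mid)
```

Implementation choices of exactly this kind, such as the CSF approximation, the assumed viewing distance (`pixels_per_degree`), and the pooling norm, are what the paper sets out to compare.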

Patent
17 Oct 2000
TL;DR: In this paper, a method for extending the bit-depth of display systems is proposed, which includes the steps of measuring the static display noise of a display device, using the display noise to create pseudo-random noise, and subtracting the pseudo-random noise from a contone image.
Abstract: A method for extending bit-depth of display systems. The method includes the steps of measuring the static display noise of a display device (14), using the display noise to create pseudo-random noise (12) and subtracting the pseudo-random noise (12) from a contone image (10). After the noise-compensated image data is quantized and displayed, the noise in the display device (14) will substantially convert the noise-compensated image data back to contone image data with few or no contouring artifacts. Other embodiments include using the inherent noise of the human visual system (22) instead of the static display noise, or both. Specific adjustments can be made to the noise of the human visual system (22) for color displays.
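The subtract-then-quantize pipeline the patent describes is essentially subtractive dithering: when the noise removed before quantization is added back by the display, the quantization error becomes signal-independent grain instead of visible contours. A minimal sketch under simplifying assumptions (uniform pseudo-random noise standing in for measured display noise, an 8-level quantizer):

```python
import numpy as np

def quantize(img, levels):
    """Uniform quantizer on [0, 1] with the given number of output levels."""
    return np.round(img * (levels - 1)) / (levels - 1)

def display_with_dither(contone, display_noise, levels):
    """Subtract the known noise before quantizing; the display's own noise
    then adds it back, converting contours into signal-independent grain."""
    return quantize(contone - display_noise, levels) + display_noise

levels = 8
lsb = 1.0 / (levels - 1)
rng = np.random.default_rng(0)
contone = np.tile(np.linspace(0.1, 0.9, 512), (64, 1))   # smooth ramp
noise = rng.uniform(-0.5, 0.5, contone.shape) * lsb      # one LSB wide

plain = quantize(contone, levels)
dithered = display_with_dither(contone, noise, levels)

# Contouring = quantization error correlated with position along the ramp.
# Averaging down the rows exposes it; dithering leaves near-zero mean error.
err_plain = np.abs((plain - contone).mean(axis=0)).max()
err_dith = np.abs((dithered - contone).mean(axis=0)).max()
```

Averaged over rows, the plainly quantized ramp retains a sawtooth error of up to half a quantization step, while the dithered version converges back toward the contone values, which is the contour-suppression effect the patent relies on.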

Proceedings ArticleDOI
21 Aug 2000
TL;DR: A review of human visual system (HVS) based digital video quality metrics is presented, covering how the characteristics of the HVS have been incorporated into quality metrics, the implementation issues of the metrics, and directions for future research.
Abstract: We present a review of human visual system (HVS) based digital video quality metrics. In particular, three objective video quality metrics are discussed and analyzed in detail, because they represent the state of the art of HVS-based quality metric research and have been proposed to and verified by VQEG (the Video Quality Experts Group) as candidates for a possible ITU standard. The purpose of the paper is to provide up-to-date knowledge of HVS modeling, of how the characteristics of the HVS have been incorporated into quality metrics, and of the implementation issues of the metrics, as well as directions for future research.

ReportDOI
01 Jun 2000
TL;DR: A geometric model and a computational method for segmentation of images with missing boundaries are presented, together with an algorithm that builds missing information from a given point of view and the available information supplied as boundary data.
Abstract: We present a geometric model and a computational method for segmentation of images with missing boundaries. In many situations, the human visual system fills in missing gaps in edges and boundaries, building and completing information that is not present. Boundary completion presents a considerable challenge in computer vision, since most algorithms attempt to exploit existing data. A large body of work concerns completion models, which postulate how to construct missing data; these models are often trained and specific to particular images. In this paper, we take the following, alternative perspective: we consider a reference point within an image as given, and then develop an algorithm which tries to build missing information on the basis of the given point of view and the available information as boundary data to the algorithm. Starting from this point of view, a surface is constructed. It is then evolved with the mean curvature flow in the metric induced by the image until a piecewise constant solution is reached. We test the computational model on modal completion, amodal completion, texture, photo and medical images. We extend the geometric model and the algorithm to 3D in order to extract shapes from low signal/noise ratio medical volumes. Results in 3D echocardiography and 3D fetal echography are presented.
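The evolution ingredient of the method, mean curvature flow, can be sketched with a simple explicit level-set scheme. This is not the paper's algorithm (which evolves a surface in the metric induced by the image); it only illustrates the bare curvature motion on a uniform metric, where a circular level set shrinks at rate 1/R:

```python
import numpy as np

def curvature_flow_step(phi, dt=0.1, eps=1e-8):
    """One explicit step of mean curvature motion on a level-set function:
    phi_t = kappa * |grad phi|, with kappa = div(grad phi / |grad phi|)."""
    gy, gx = np.gradient(phi)
    norm = np.sqrt(gx ** 2 + gy ** 2) + eps
    nyy, _ = np.gradient(gy / norm)   # d/dy of the unit normal's y-component
    _, nxx = np.gradient(gx / norm)   # d/dx of the unit normal's x-component
    return phi + dt * (nxx + nyy) * norm

# Embed a circle as a signed distance function; under curvature flow it
# shrinks (dR/dt = -1/R), so the enclosed area must decrease.
n = 128
y, x = np.mgrid[0:n, 0:n]
phi = np.hypot(x - n / 2, y - n / 2) - 30.0    # zero level set: radius 30
area0 = int((phi < 0).sum())
for _ in range(100):
    phi = curvature_flow_step(phi)
area1 = int((phi < 0).sum())
```

In the paper's setting the speed is additionally modulated by the image-induced metric, so the flow halts along strong edges instead of collapsing, which is what completes the missing boundaries.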