
Showing papers on "Human visual system model" published in 2005


Journal ArticleDOI
TL;DR: This paper proposes a novel information fidelity criterion that is based on natural scene statistics and derives a novel QA algorithm that provides clear advantages over the traditional approaches and outperforms current methods in testing.
Abstract: Measurement of visual quality is of fundamental importance to numerous image and video processing applications. The goal of quality assessment (QA) research is to design algorithms that can automatically assess the quality of images or videos in a perceptually consistent manner. Traditionally, image QA algorithms interpret image quality as fidelity or similarity with a "reference" or "perfect" image in some perceptual space. Such "full-reference" QA methods attempt to achieve consistency in quality prediction by modeling salient physiological and psychovisual features of the human visual system (HVS), or by arbitrary signal fidelity criteria. In this paper, we approach the problem of image QA by proposing a novel information fidelity criterion that is based on natural scene statistics. QA systems are invariably involved with judging the visual quality of images and videos that are meant for "human consumption". Researchers have developed sophisticated models to capture the statistics of natural signals, that is, pictures and videos of the visual environment. Using these statistical models in an information-theoretic setting, we derive a novel QA algorithm that provides clear advantages over the traditional approaches. In particular, it is parameterless and outperforms current methods in our testing. We validate the performance of our algorithm with an extensive subjective study involving 779 images. We also show that, although our approach distinctly departs from traditional HVS-based methods, it is functionally similar to them under certain conditions, yet it outperforms them due to improved modeling. The code and the data from the subjective study are available at [1].
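The criterion lends itself to a compact implementation. Below is a minimal, single-scale, pixel-domain sketch of the information-fidelity idea, not the authors' exact algorithm; the visual-noise variance is an illustrative free parameter here, whereas the published criterion is parameterless.

```python
import numpy as np

def info_fidelity(ref, dist, block=8, sigma_n2=2.0):
    """Simplified single-scale information-fidelity ratio.

    ref, dist: 2-D float arrays (grayscale reference / distorted images).
    sigma_n2:  assumed visual-noise variance (illustrative free parameter).
    """
    num = den = 0.0
    h, w = ref.shape
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            r = ref[i:i + block, j:j + block].ravel()
            d = dist[i:i + block, j:j + block].ravel()
            s2 = r.var()                                    # local signal variance
            if s2 < 1e-10:
                continue
            crd = ((r - r.mean()) * (d - d.mean())).mean()  # local covariance
            g = crd / s2                                    # distortion-channel gain
            sv2 = max(d.var() - g * crd, 1e-10)             # additive-noise variance
            num += np.log2(1.0 + g * g * s2 / (sv2 + sigma_n2))
            den += np.log2(1.0 + s2 / sigma_n2)
    return num / den if den > 0 else 1.0
```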

1,334 citations


Journal ArticleDOI
01 Jul 2005
TL;DR: This paper defines mesh saliency in a scale-dependent manner using a center-surround operator on Gaussian-weighted mean curvatures to capture what most would classify as visually interesting regions on a mesh.
Abstract: Research over the last decade has built a solid mathematical foundation for representation and analysis of 3D meshes in graphics and geometric modeling. Much of this work, however, does not explicitly incorporate models of low-level human visual attention. In this paper we introduce the idea of mesh saliency as a measure of regional importance for graphics meshes. Our notion of saliency is inspired by low-level human visual system cues. We define mesh saliency in a scale-dependent manner using a center-surround operator on Gaussian-weighted mean curvatures. We observe that such a definition of mesh saliency is able to capture what most would classify as visually interesting regions on a mesh. The human-perception-inspired importance measure computed by our mesh saliency operator yields more visually pleasing results in the processing and viewing of 3D meshes, compared to using a purely geometric measure of shape, such as curvature. We discuss how mesh saliency can be incorporated in graphics applications such as mesh simplification and viewpoint selection and present examples that show visually appealing results from using mesh saliency.
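The scale-dependent center-surround definition translates directly into code. A minimal sketch, assuming per-vertex mean curvatures are already available; the 2.5-sigma neighborhood cutoff is an illustrative choice:

```python
import numpy as np
from scipy.spatial import cKDTree

def mesh_saliency(verts, curv, sigma):
    """Center-surround saliency from Gaussian-weighted mean curvature.

    verts: (N, 3) vertex positions; curv: (N,) mean curvature per vertex;
    sigma: base scale (e.g. a small fraction of the bounding-box diagonal).
    Returns per-vertex saliency |G(curv, sigma) - G(curv, 2*sigma)|.
    """
    tree = cKDTree(verts)

    def gaussian_avg(s):
        out = np.empty(len(verts))
        for i, p in enumerate(verts):
            idx = tree.query_ball_point(p, 2.5 * s)     # truncated neighborhood
            d2 = np.sum((verts[idx] - p) ** 2, axis=1)
            w = np.exp(-d2 / (2.0 * s * s))             # Gaussian weights
            out[i] = np.dot(w, curv[idx]) / w.sum()
        return out

    return np.abs(gaussian_avg(sigma) - gaussian_avg(2.0 * sigma))
```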

703 citations


Book
21 Nov 2005
TL;DR: This landmark book is the first to describe HDRI technology in its entirety and covers a wide range of topics, from capture devices to tone reproduction and image-based lighting, leading to an unparalleled visual experience.
Abstract: This landmark book is the first to describe HDRI technology in its entirety and covers a wide range of topics, from capture devices to tone reproduction and image-based lighting. The techniques described enable you to produce images that have a dynamic range much closer to that found in the real world, leading to an unparalleled visual experience. As both an introduction to the field and an authoritative technical reference, it is essential to anyone working with images, whether in computer graphics, film, video, photography, or lighting design. New material includes chapters on High Dynamic Range Video Encoding, High Dynamic Range Image Encoding, and High Dynamic Range Display Devices. Written by the inventors and initial implementors of High Dynamic Range Imaging, the book covers the basic concepts (including just enough about human vision to explain why HDR images are necessary), image capture, image encoding, file formats, display techniques, tone mapping for lower dynamic range display, and the use of HDR images and calculations in 3D rendering. The range and depth of coverage suit the knowledgeable researcher as well as those who are just starting to learn about High Dynamic Range Imaging. Table of Contents: Introduction; Light and Color; HDR Image Encodings; HDR Video Encodings; HDR Image and Video Capture; Display Devices; The Human Visual System and HDR Tone Mapping; Spatial Tone Reproduction; Frequency Domain and Gradient Domain Tone Reproduction; Inverse Tone Reproduction; Visible Difference Predictors; Image-Based Lighting.

417 citations


Journal ArticleDOI
TL;DR: The result is a fast and practical algorithm for general use with intuitive user parameters that control intensity, contrast, and level of chromatic adaptation, respectively.
Abstract: A common task in computer graphics is the mapping of digital high dynamic range images to low dynamic range display devices such as monitors and printers. This task is similar to the adaptation processes which occur in the human visual system. Physiological evidence suggests that adaptation already occurs in the photoreceptors, leading to a straightforward model that can be easily adapted for tone reproduction. The result is a fast and practical algorithm for general use with intuitive user parameters that control intensity, contrast, and level of chromatic adaptation, respectively.
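The photoreceptor model reduces to a Naka-Rushton-style saturation. A global, luminance-only sketch follows; the published operator derives the exponent from the image key and supports per-channel chromatic adaptation, both omitted here:

```python
import numpy as np

def photoreceptor_tonemap(lum, m=0.6, f=1.0):
    """Global sketch of photoreceptor-style adaptation: V = L / (L + sigma),
    with semi-saturation sigma = (f * L_adapt)^m.

    lum: HDR luminance array (positive values). m controls contrast and
    f overall intensity; both are illustrative defaults.
    """
    l_adapt = np.exp(np.mean(np.log(lum + 1e-8)))  # log-average (adaptation) luminance
    sigma = (f * l_adapt) ** m                     # semi-saturation constant
    return lum / (lum + sigma)                     # display values in [0, 1)
```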

401 citations


Journal ArticleDOI
TL;DR: This model shows that visual artifacts after demosaicing are due to aliasing between luminance and chrominance and could be solved using a preprocessing filter, and gives new insights for the representation of single-color per spatial location images.
Abstract: There is an analogy between single-chip color cameras and the human visual system in that these two systems acquire only one limited wavelength sensitivity band per spatial location. We have exploited this analogy, defining a model that characterizes a one-color per spatial position image as a coding into luminance and chrominance of the corresponding three colors per spatial position image. Luminance is defined with full spatial resolution while chrominance contains subsampled opponent colors. Moreover, luminance and chrominance follow a particular arrangement in the Fourier domain, allowing for demosaicing by spatial frequency filtering. This model shows that visual artifacts after demosaicing are due to aliasing between luminance and chrominance and could be solved using a preprocessing filter. This approach also gives new insights for the representation of single-color per spatial location images and enables formal and controllable procedures to design demosaicing algorithms that perform well compared to concurrent approaches, as demonstrated by experiments.
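The luminance/chrominance decomposition suggests a simple demosaicing pipeline: low-pass the CFA image to estimate luminance, demultiplex the residual chrominance per channel, and interpolate it. A sketch for an RGGB Bayer pattern, using an illustrative separable luminance filter rather than the paper's optimized one:

```python
import numpy as np
from scipy.ndimage import convolve

def demosaic_freq(cfa):
    """Sketch of luminance/chrominance demosaicing for an RGGB Bayer CFA."""
    h = np.array([1.0, 2.0, 3.0, 2.0, 1.0])
    h /= h.sum()                                   # illustrative low-pass filter
    lum = convolve(convolve(cfa, h[None, :]), h[:, None])   # luminance estimate
    chroma = cfa - lum                             # multiplexed chrominance
    H, W = cfa.shape
    masks = np.zeros((H, W, 3), bool)
    masks[0::2, 0::2, 0] = True                    # R samples
    masks[0::2, 1::2, 1] = True                    # G samples
    masks[1::2, 0::2, 1] = True
    masks[1::2, 1::2, 2] = True                    # B samples
    k = np.array([[1.0, 2.0, 1.0], [2.0, 4.0, 2.0], [1.0, 2.0, 1.0]])
    out = np.zeros((H, W, 3))
    for c in range(3):                             # normalized-convolution interpolation
        samp = np.where(masks[..., c], chroma, 0.0)
        wts = masks[..., c].astype(float)
        out[..., c] = lum + convolve(samp, k) / np.maximum(convolve(wts, k), 1e-8)
    return out
```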

346 citations


Journal ArticleDOI
TL;DR: A new formulation of the regularized image up-sampling problem that incorporates models of the image acquisition and display processes is presented, giving a new analytic perspective that justifies the use of total-variation regularization from a signal processing perspective.
Abstract: This paper presents a new formulation of the regularized image up-sampling problem that incorporates models of the image acquisition and display processes. We give a new analytic perspective that justifies the use of total-variation regularization from a signal processing perspective, based on an analysis that specifies the requirements of edge-directed filtering. This approach leads to a new data fidelity term that is coupled with a total-variation regularizer to yield our objective function. This objective function is minimized using the level-set method, with two types of motion that interact simultaneously. A new choice of these motions leads to a stable solution scheme that has a unique minimum. One aspect of the human visual system, perceptual uniformity, is treated in accordance with the linear nature of the data fidelity term. The method was implemented and has been verified to provide improved results, yielding crisp edges without introducing ringing or other artifacts.
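For intuition, here is a plain gradient-descent sketch of TV-regularized upsampling, with a smoothed TV term and a Gaussian blur standing in for the acquisition model; the paper's level-set minimization and display model are not reproduced:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def tv_upsample(low, s=2, lam=0.05, step=0.2, iters=200, eps=1e-3):
    """Minimize ||D(H u) - low||^2 + lam * TV(u) by gradient descent,
    where H is a Gaussian blur and D is decimation by factor s."""
    u = np.kron(low, np.ones((s, s)))                # nearest-neighbor initialization
    for _ in range(iters):
        r = gaussian_filter(u, sigma=0.5 * s)[::s, ::s] - low   # data residual
        rt = np.zeros_like(u)
        rt[::s, ::s] = r                             # D^T r (zero-fill upsample)
        grad_data = gaussian_filter(rt, sigma=0.5 * s)          # H^T D^T r
        ux = np.gradient(u, axis=1)
        uy = np.gradient(u, axis=0)
        mag = np.sqrt(ux ** 2 + uy ** 2 + eps ** 2)  # smoothed gradient magnitude
        div = np.gradient(ux / mag, axis=1) + np.gradient(uy / mag, axis=0)
        u -= step * (grad_data - lam * div)          # descend the objective
    return u
```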

340 citations


Reference BookDOI
TL;DR: Video Quality Experts Group.
Abstract: Table of Contents.
PICTURE CODING AND HUMAN VISUAL SYSTEM FUNDAMENTALS
- Digital Picture Compression and Coding Structure: Introduction to Digital Picture Coding; Characteristics of Picture Data; Compression and Coding Techniques; Picture Quantization; Rate-Distortion Theory; Human Visual Systems; Digital Picture Coding Standards and Systems; Summary
- Fundamentals of Human Vision and Vision Modeling: Introduction; A Brief Overview of the Visual System; Color Vision; Luminance and the Perception of Light Intensity; Spatial Vision and Contrast Sensitivity; Temporal Vision and Motion; Visual Modeling; Conclusions
- Coding Artifacts and Visual Distortions: Introduction; Blocking Effect; Basis Image Effect; Blurring; Color Bleeding; Staircase Effect; Ringing; Mosaic Patterns; False Contouring; False Edges; MC Mismatch; Mosquito Effect; Stationary Area Fluctuations; Chrominance Mismatch; Video Scaling and Field Rate Conversion; Deinterlacing; Summary
PICTURE QUALITY ASSESSMENT AND METRICS
- Video Quality Testing: Introduction; Subjective Assessment Methodologies; Selection of Test Materials; Selection of Participants-Subjects; Experimental Design; International Test Methods; Objective Assessment Methods; Summary
- Perceptual Video Quality Metrics-A Review: Introduction; Quality Factors; Metric Classification; Pixel-Based Metrics; The Psychophysical Approach; The Engineering Approach; Metric Comparisons; Conclusions and Perspectives
- Philosophy of Picture Quality Scale: Objective Picture Quality Scale for Image Coding; Application of PQS to a Variety of Electronic Images; Various Categories of Image Systems; Study at ITU; Conclusion
- Structural Similarity Based Image Quality Assessment: Structural Similarity Based Image Quality; The Structural SIMilarity (SSIM) Index; Image Quality Assessment Based on the SSIM Index; Discussions
- Vision Model Based Digital Video Impairment Metrics: Introduction; Vision Modeling for Impairment Measurement; Perceptual Blocking Distortion Metric; Perceptual Ringing Distortion Measure; Conclusion
- Computational Models for Just-Noticeable Difference: Introduction; JND with DCT Subbands; JND with Pixels; JND Model Evaluation; Conclusions
- No-Reference Quality Metric for Degraded and Enhanced Video: Introduction; State-of-the-Art for No-Reference Metrics; Quality Metric Components and Design; No-Reference Overall Quality Metric; Performance of the Quality Metric; Conclusions and Future Research
- Video Quality Experts Group: Formation; Goals; Phase I; Phase II; Continuing Work and Directions; Summary
PERCEPTUAL CODING AND PROCESSING OF DIGITAL PICTURES
- HVS Based Perceptual Video Encoders: Introduction; Noise Visibility and Visual Masking; Architectures for Perceptual Based Coding; Standards-Specific Features; Salience/Maskability Pre-Processing; Application to Multi-Channel Encoding
- Perceptual Image Coding: Introduction; A Perceptual Distortion Metric Based Image Coder; Model Calibration; Performance Evaluation; Perceptual Lossless Coder; Summary
- Foveated Image and Video Coding: Foveated Human Vision and Foveated Image Processing; Foveation Methods; Scalable Foveated Image and Video Coding; Discussions
- Artifact Reduction by Post-Processing in Image Compression: Introduction; Image Compression and Coding Artifacts; Reduction of Blocking Artifacts; Reduction of Ringing Artifacts; Summary
- Reduction of Color Bleeding in DCT Block-Coded Video: Introduction; Detailed Analysis of the Color Bleeding Phenomenon; Description of the Post-Processor; Experimental Results-Concluding Remarks
- Error Resilience for Video Coding Service: Introduction to Error Resilient Coding Techniques; Error Resilient Coding Methods Compatible with MPEG-2; Methods for Concealment of Cell Loss; Experimental Procedure; Experimental Results; Conclusions
- Critical Issues and Challenges: Picture Coding Structures; Vision Modeling Issues; Spatio-Temporal Masking in Video Coding; Picture Quality Assessment; Challenges in Perceptual Coder Design; Codec System Design Optimization; Summary
- Appendix: VQM Performance Metrics: Metrics Relating to Model Prediction Accuracy; Metrics Relating to Prediction Monotonicity of a Model; Metrics Relating to Prediction Consistency; MATLAB(R) Source Code; Supplementary Analyses
- Index
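The SSIM index covered in the quality-assessment part has a compact closed form. A simplified sketch with uniform local windows (the original formulation uses an 11x11 Gaussian window):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def ssim_map(x, y, L=255.0, k1=0.01, k2=0.03, win=8):
    """Per-pixel SSIM between images x and y with dynamic range L."""
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2          # stabilizing constants
    mx, my = uniform_filter(x, win), uniform_filter(y, win)   # local means
    sxx = uniform_filter(x * x, win) - mx * mx     # local variances / covariance
    syy = uniform_filter(y * y, win) - my * my
    sxy = uniform_filter(x * y, win) - mx * my
    return ((2 * mx * my + c1) * (2 * sxy + c2)) / \
           ((mx * mx + my * my + c1) * (sxx + syy + c2))
```

Averaging the returned map, e.g. ssim_map(x, y).mean(), gives the scalar index.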

333 citations


Journal ArticleDOI
TL;DR: A new perceptually-adaptive video coding (PVC) scheme for hybrid video compression is explored, aiming at better perceptual coding quality and operational efficiency; spatial masking factors are integrated with the nonlinear additivity model for masking (NAMM).
Abstract: We explore a new perceptually-adaptive video coding (PVC) scheme for hybrid video compression, in order to achieve better perceptual coding quality and operational efficiency. A new just noticeable distortion (JND) estimator for color video is first devised in the image domain. How to efficiently integrate masking effects together is a key issue of JND modelling. We integrate spatial masking factors with the nonlinear additivity model for masking (NAMM). The JND estimator applies to all color components and accounts for the compound impact of luminance masking, texture masking and temporal masking. Extensive subjective viewing confirms that it is capable of determining a more accurate visibility threshold that is close to the actual JND bound in human eyes. Secondly, the image-domain JND profile is incorporated into hybrid video encoding via the JND-adaptive motion estimation and residue filtering process. The scheme works with any prevalent video coding standards and various motion estimation strategies. To demonstrate the effectiveness of the proposed scheme, it has been implemented in the MPEG-2 TM5 coder and demonstrated to achieve an average improvement of over 18% in motion estimation efficiency, 0.6 dB in average peak signal-to-perceptual-noise ratio (PSPNR) and, most remarkably, 0.17 dB in the objective coding quality measure (PSNR) on average. A theoretical explanation is presented for the improvement in the objective coding quality measure. With the JND-based motion estimation and residue filtering process, hybrid video encoding can be more efficient and the use of bits is optimized for visual quality.
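The NAMM combination at the heart of the JND estimator is easy to sketch. A spatial-domain illustration using a common Chou-Li-style luminance adaptation curve and a gradient-based texture term; the constants are illustrative, not the paper's calibrated values:

```python
import numpy as np
from scipy.ndimage import sobel, uniform_filter

def jnd_namm(img, c_lt=0.3, w_t=0.12):
    """Spatial JND profile via the nonlinear additivity model for masking:
    T = T_lum + T_tex - c_lt * min(T_lum, T_tex)."""
    img = img.astype(float)
    bg = uniform_filter(img, 5)                    # local background luminance
    t_lum = np.where(bg <= 127,                    # luminance adaptation threshold
                     17.0 * (1.0 - np.sqrt(bg / 127.0)) + 3.0,
                     3.0 / 128.0 * (bg - 127.0) + 3.0)
    grad = np.hypot(sobel(img, 0), sobel(img, 1))  # local activity
    t_tex = w_t * grad                             # texture masking threshold
    return t_lum + t_tex - c_lt * np.minimum(t_lum, t_tex)
```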

305 citations


Journal ArticleDOI
TL;DR: A wavelet-based logo-watermarking scheme for copyright protection of digital images is presented, using a visually meaningful gray-scale logo as the watermark; the scheme is robust to a wide variety of attacks.
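As a rough illustration of wavelet-domain logo watermarking, here is an additive rule on the approximation subband using PyWavelets; the paper's exact embedding rule and HVS weighting are not reproduced:

```python
import numpy as np
import pywt

def embed_logo(host, logo, alpha=0.05, wavelet="haar"):
    """Add the gray-scale logo, scaled by alpha, to the approximation
    subband of a one-level 2-D DWT of the host image. The logo must match
    the approximation subband size (half the host size for one level)."""
    cA, (cH, cV, cD) = pywt.dwt2(host.astype(float), wavelet)
    return pywt.idwt2((cA + alpha * logo, (cH, cV, cD)), wavelet)

def extract_logo(marked, host, alpha=0.05, wavelet="haar"):
    """Non-blind extraction: difference of approximation subbands over alpha."""
    cA_m, _ = pywt.dwt2(marked.astype(float), wavelet)
    cA_h, _ = pywt.dwt2(host.astype(float), wavelet)
    return (cA_m - cA_h) / alpha
```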

247 citations


Proceedings ArticleDOI
31 Jul 2005
TL;DR: This work presents a method for automatically replacing one material with another, completely different material, starting with only a single high dynamic range image as input, exploiting the fact that human vision is surprisingly tolerant of certain physical inaccuracies, while being sensitive to others.
Abstract: Photo editing software allows digital images to be blurred, warped or re-colored at the touch of a button. However, it is not currently possible to change the material appearance of an object except by painstakingly painting over the appropriate pixels. Here we present a method for automatically replacing one material with another, completely different material, starting with only a single high dynamic range image as input. Our approach exploits the fact that human vision is surprisingly tolerant of certain (sometimes enormous) physical inaccuracies, while being sensitive to others. By adjusting our simulations to be careful about those aspects to which the human visual system is sensitive, we are for the first time able to demonstrate significant material changes on the basis of a single photograph as input.

220 citations


Proceedings ArticleDOI
20 Jun 2005
TL;DR: This work presents a stable and robust algorithm which grasps dynamic audio-visual events with high spatial resolution, and derives a unique solution based on canonical correlation analysis (CCA), which effectively detects pixels that are associated with the sound, while filtering out other dynamic pixels.
Abstract: People and animals fuse auditory and visual information to obtain robust perception. A particular benefit of such cross-modal analysis is the ability to localize visual events associated with sound sources. We aim to achieve this using computer vision aided by a single microphone. Past efforts encountered problems stemming from the huge gap between the dimensions involved and the available data. This has led to solutions suffering from low spatio-temporal resolutions. We present a rigorous analysis of the fundamental problems associated with this task. Then, we present a stable and robust algorithm which overcomes past deficiencies. It grasps dynamic audio-visual events with high spatial resolution, and derives a unique solution. The algorithm effectively detects pixels that are associated with the sound, while filtering out other dynamic pixels. It is based on canonical correlation analysis (CCA), where we remove inherent ill-posedness by exploiting the typical spatial sparsity of audio-visual events. The algorithm is simple and efficient thanks to its reliance on linear programming and is free of user-defined parameters. To quantitatively assess the performance, we devise a localization criterion. The algorithm capabilities were demonstrated in experiments, where it overcame substantial visual distractions and audio noise.
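A toy version of the audio-visual association step can be built with plain CCA; note that the paper instead solves a sparsity-constrained CCA via linear programming precisely to remove the ill-posedness this naive variant suffers from:

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def audio_visual_map(frames, audio_feats, n_comp=1):
    """Crude localization map from plain CCA.

    frames: (T, H, W) video; audio_feats: (T, d) per-frame audio features
    (e.g. band energies). Returns per-pixel |CCA weight|, normalized.
    """
    T, H, W = frames.shape
    X = frames.reshape(T, H * W).astype(float)
    X -= X.mean(0)                                 # keep only dynamic pixels
    Y = audio_feats - audio_feats.mean(0)
    cca = CCA(n_components=n_comp).fit(X, Y)
    wmap = np.abs(cca.x_weights_[:, 0]).reshape(H, W)
    return wmap / (wmap.max() + 1e-12)
```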

Journal ArticleDOI
TL;DR: A new numerical measure for visual attention's modulatory aftereffects, the perceptual quality significance map (PQSM), is proposed and demonstrates the performance improvement on two PQSM-modulated visual sensitivity models and two PQSM-based visual quality metrics.
Abstract: With the fast development of visual noise-shaping related applications (visual compression, error resilience, watermarking, encryption, and display), there is an increasingly significant demand on incorporating perceptual characteristics into these applications for improved performance. In this paper, a very important mechanism of the human brain, visual attention, is introduced for visual sensitivity and visual quality evaluation. Based upon the analysis, a new numerical measure for visual attention's modulatory aftereffects, perceptual quality significance map (PQSM), is proposed. To a certain extent, the PQSM reflects the processing ability of the human brain on local visual contents statistically. The PQSM is generated with the integration of local perceptual stimuli from color contrast, texture contrast, motion, as well as cognitive features (skin color and face in this study). Experimental results with subjective viewing demonstrate the performance improvement on two PQSM-modulated visual sensitivity models and two PQSM-based visual quality metrics.
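The modulatory role of the PQSM can be illustrated with weighted pooling of a distortion map; how the PQSM itself is built from color, texture, motion, and cognitive stimuli is not reproduced here:

```python
import numpy as np

def pqsm_weighted_quality(dist_map, pqsm):
    """Pool a per-pixel distortion map with PQSM weights, so errors in
    perceptually significant regions dominate the final score."""
    w = pqsm / (pqsm.sum() + 1e-12)                # normalize significance weights
    return float((w * dist_map).sum())
```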

Journal ArticleDOI
TL;DR: A new JND estimator for color video is devised in the image domain with the nonlinear additivity model for masking and is incorporated into a motion-compensated residue signal preprocessor for variance reduction toward coding quality enhancement; both perceptual quality and objective quality are enhanced in coded video at a given bit rate.
Abstract: We present a motion-compensated residue signal preprocessing scheme for video coding based on the just-noticeable-distortion (JND) profile. Human eyes cannot sense any changes below the JND threshold around a pixel due to their underlying spatial/temporal masking properties. An appropriate (even imperfect) JND model can significantly help to improve the performance of video coding algorithms. From the viewpoint of signal compression, a smaller variance of the signal results in less objective distortion of the reconstructed signal for a given bit rate. In this paper, a new JND estimator for color video is devised in the image domain with the nonlinear additivity model for masking (NAMM) and is incorporated into a motion-compensated residue signal preprocessor for variance reduction toward coding quality enhancement. As a result, both perceptual quality and objective quality are enhanced in coded video at a given bit rate. A solution for adaptively determining the parameter of the residue preprocessor is also proposed. The devised technique can be applied to any standardized video coding scheme based on motion-compensated prediction. It provides an extra design option for quality control, besides quantization, in contrast with most of the existing perceptually adaptive schemes, which have so far focused on determination of proper quantization steps. As an example for demonstration, the proposed scheme has been implemented in the MPEG-2 TM5 coder, and achieved an average peak signal-to-noise ratio (PSNR) increment of 0.505 dB over the twenty video sequences tested. The perceptual quality improvement has been confirmed by the subjective viewing tests conducted.
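The residue preprocessing step admits a compact sketch: residues below the JND threshold are dropped and larger ones are shrunk toward it, reducing variance with no visible loss. The paper additionally adapts a parameter of this mapping; a fixed subtractive rule is shown here:

```python
import numpy as np

def jnd_filter_residue(residue, jnd):
    """Zero out sub-JND motion-compensated residues and shrink the rest
    by the per-pixel JND threshold (soft-thresholding form)."""
    return np.sign(residue) * np.maximum(np.abs(residue) - jnd, 0.0)
```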

Journal ArticleDOI
TL;DR: An improved scheme for estimating just-noticeable distortion (JND) is proposed in this paper and shown to outperform the DCTune model; the major contributions are a new formula for luminance adaptation adjustment and the incorporation of block classification for contrast masking.

Journal ArticleDOI
03 Mar 2005-Neuron
TL;DR: Viewpoint aftereffects were found within, but not across, the categories of objects tested, supporting the existence of object-selective neurons tuned to specific viewing angles in the human visual system.


Journal ArticleDOI
TL;DR: Results suggest that the visual system does not verify the global consistency of locally derived estimates of illumination direction, leaving it remarkably insensitive to illumination inconsistencies, both in experimental stimuli and in altered images of real scenes.
Abstract: The human visual system is adept at detecting and encoding statistical regularities in its spatiotemporal environment. Here, we report an unexpected failure of this ability in the context of perceiving inconsistencies in illumination distributions across a scene. Prior work with arrays of objects all having uniform reflectance has shown that one inconsistently illuminated target can 'pop out' among a field of consistently illuminated objects (e.g., Enns and Rensink, 1990, Science 247, 721-723; Sun and Perona, 1997, Perception 26, 519-529). In these studies, the luminance pattern of the odd target could be interpreted as arising from either an inconsistent illumination or inconsistent pigmentation of the target. Either cue might explain the rapid detection. In contrast, we find that once the geometrical regularity of the previous displays is removed, the visual system is remarkably insensitive to illumination inconsistencies, both in experimental stimuli and in altered images of real scenes. Whether the target is interpreted as oddly illuminated or oddly pigmented, it is very difficult to find if the only cue is deviation from the regularity of illumination or reflectance. Our results allow us to draw inferences about how the visual system encodes illumination distributions across scenes. Specifically, they suggest that the visual system does not verify the global consistency of locally derived estimates of illumination direction.

Journal ArticleDOI
TL;DR: It is argued that no valid comparison between visual representations can arise unless provision is made for three critical properties: their direction of fit, their direction of causation, and the level of their conceptual content.

Journal ArticleDOI
TL;DR: Attentional cueing influenced the perceived order of lateralized visual events but not the timing of event-related potentials in visual cortex, which shows that attention-induced shifts in visual time-order perception can arise from modulations of signal strength rather than processing speed in the early visual-cortical pathways.
Abstract: Attended objects are perceived to occur before unattended objects even when the two objects are presented simultaneously. This finding has led to the widespread view that attention modulates the speed of neural transmission in the various perceptual pathways. We recorded event-related potentials during a time-order judgment task to determine whether a reflexive shift of attention to a sudden sound modulates the speed of sensory processing in the human visual system. Attentional cueing influenced the perceived order of lateralized visual events but not the timing of event-related potentials in visual cortex. Attentional cueing did, however, enhance the amplitude of neural activity in visual cortex, which shows that attention-induced shifts in visual time-order perception can arise from modulations of signal strength rather than processing speed in the early visual-cortical pathways.

Journal ArticleDOI
TL;DR: An in-depth analysis of the saliency-based model of visual attention by assessing the contribution of different cues to visual attention as modeled by different versions of the computer model is presented.

Journal ArticleDOI
TL;DR: The proposed metric is particularly effective for visual signals with blurring and luminance fluctuations as the major artifacts, and brings about a fundamental improvement when sharpened image edges are involved.
Abstract: This paper presents a method to discriminate pixel differences according to their impact toward perceived visual quality. Noticeable local contrast changes are formulated first, since contrast is the basic sensory feature in human visual system (HVS) perception. The analysis aims at quantifying the actual impact of such changes (further divided into increases and decreases on edges) in different signal contexts. An associated full-reference distortion metric proposed next provides a better match with HVS viewing. Experiments have used two independent visual data sets and the related subjective viewing results, and demonstrated the performance improvement of the proposed metric over the relevant existing ones with various videos/images and under diversified test conditions. The proposed metric is particularly effective for visual signals with blurring and luminance fluctuations as the major artifacts, and brings about a fundamental improvement when sharpened image edges are involved.

Proceedings ArticleDOI
07 Nov 2005
TL;DR: An Eclipse plug-in is presented that generates Java code for visual modeling plug-ins which can be directly executed in the Eclipse Runtime-Workbench; the visual language definition is itself given in a visual manner and is precise enough to completely generate the visual environment.
Abstract: Visual Languages (VLs) play an important role in software system development. Especially when looking at well-defined domains, a broad variety of domain-specific visual languages are used for the development of new applications. These languages are typically developed specifically for a certain domain in a way that domain concepts occur as primitives in the language alphabet. Visual modeling environments are needed to support rapid development of domain-specific solutions. In this contribution we present a general approach for defining visual languages and for generating language-specific tool environments. The visual language definition is again given in a visual manner and is precise enough to completely generate the visual environment. The underlying technology is Eclipse with its plug-in capabilities on the one hand, and formal graph transformation techniques on the other hand. More precisely, we present an Eclipse plug-in generating Java code for visual modeling plug-ins which can be directly executed in the Eclipse Runtime-Workbench.

Journal ArticleDOI
TL;DR: This paper proposes a novel method to dramatically reduce the number of extra subpixels needed to construct aspect-ratio-invariant visual secret sharing (VSS) schemes.

Proceedings ArticleDOI
17 Oct 2005
TL;DR: This work introduces the mixture of dynamic textures, which models a collection of videos consisting of different visual processes as samples from a set of dynamic textures, and derives the EM algorithm for learning a mixture of dynamic textures.
Abstract: A dynamic texture is a linear dynamical system used to model a single video as a sample from a spatio-temporal stochastic process. In this work, we introduce the mixture of dynamic textures, which models a collection of videos consisting of different visual processes as samples from a set of dynamic textures. We derive the EM algorithm for learning a mixture of dynamic textures, and relate the learning algorithm and the dynamic texture mixture model to previous works. Finally, we demonstrate the applicability of the proposed model to problems that have traditionally been challenging for computer vision.
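The base component of the mixture is a single linear dynamical system. Below is a sketch of the standard SVD-based fit for one dynamic texture (the suboptimal method of Doretto et al.); the paper's contribution, the EM algorithm for a mixture of such models, builds on this base model and is not reproduced here:

```python
import numpy as np

def learn_dynamic_texture(Y, n_states=10):
    """Fit x_{t+1} = A x_t + v_t, y_t = C x_t + w_t from data.

    Y: (T, d) matrix of vectorized frames. Returns the state transition A,
    observation matrix C, state trajectory X, and the mean frame.
    """
    Ym = Y.mean(0)
    U, S, Vt = np.linalg.svd((Y - Ym).T, full_matrices=False)  # PCA of frames
    C = U[:, :n_states]                            # observation matrix
    X = np.diag(S[:n_states]) @ Vt[:n_states]      # state trajectory, (n, T)
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])       # least-squares transition
    return A, C, X, Ym
```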

Journal ArticleDOI
TL;DR: A novel objective no-reference metric is proposed for video quality assessment of digitally coded videos containing natural scenes and experiments indicate that the objective scores obtained by the proposed metric agree well with the subjective assessment scores.
Abstract: A novel objective no-reference metric is proposed for video quality assessment of digitally coded videos containing natural scenes. Taking account of the temporal dependency between adjacent images of the videos and characteristics of the human visual system, the spatial distortion of an image is predicted using the differences between the corresponding translational regions of high spatial complexity in two adjacent images, which are weighted according to temporal activities of the video. The overall video quality is measured by pooling the spatial distortions of all images in the video. Experiments using reconstructed video sequences indicate that the objective scores obtained by the proposed metric agree well with the subjective assessment scores.

Journal ArticleDOI
TL;DR: A novel deblocking algorithm is proposed, based on three filtering modes selected according to the activity across block boundaries; it outperforms existing MPEG-4 deblocking methods with respect to peak signal-to-noise ratio and computational complexity.
Abstract: Increasing the bandwidth or bit rate in real-time video applications to improve the quality of images is typically impossible or too expensive. Postprocessing appears to be the most feasible solution because it does not require any existing standards to be changed. Markedly reducing blocking effects can increase compression ratios for a particular image quality or improve the quality with respect to the specific bit rate of compression. This paper proposes a novel deblocking algorithm based on three filtering modes selected according to the activity across block boundaries. By properly considering the masking effect of the human visual system, an adaptive filtering decision is integrated into the deblocking process. With three deblocking modes appropriate for local regions with different characteristics, the perceptual and objective quality are improved without over-smoothing the image details or insufficiently reducing the strong blocking effect in flat regions. According to the simulation results, the proposed method outperforms existing MPEG-4 deblocking methods with respect to peak signal-to-noise ratio and computational complexity.
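The three-mode decision can be sketched at a single block boundary; the thresholds below are illustrative, and the paper additionally weights the decision by an HVS masking measure:

```python
import numpy as np

def deblock_boundary(a, b, t_flat=2.0, t_edge=16.0):
    """Three-mode deblocking at one vertical block boundary.

    a, b: 1-D pixel rows on the left/right of the boundary (a[-1] and b[0]
    are the boundary pixels). Low activity -> strong smoothing (flat region);
    a large boundary step -> no filtering (real edge); otherwise a mild
    default filter.
    """
    a, b = a.astype(float).copy(), b.astype(float).copy()
    seam = np.concatenate([a[-2:], b[:2]])
    activity = np.abs(np.diff(seam)).mean()        # local activity across boundary
    step = abs(float(a[-1]) - float(b[0]))         # boundary discontinuity
    if step > t_edge:                              # mode 3: keep real edge
        return a, b
    if activity < t_flat:                          # mode 1: strong smoothing
        m = 0.5 * (a[-1] + b[0])
        a[-1] = 0.5 * (a[-1] + m)
        b[0] = 0.5 * (b[0] + m)
    else:                                          # mode 2: mild default filter
        d = (b[0] - a[-1]) / 4.0
        a[-1] += d
        b[0] -= d
    return a, b
```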

Journal ArticleDOI
TL;DR: This work investigates the contribution of luminance and contrast information to global form detection, a stage between the extraction of local orientation and the recognition of objects, and finds that signals in the On-, Off- and second-order pathways are segregated at both stages of processing.

Journal ArticleDOI
TL;DR: A novel support vector regression based color image watermarking scheme is proposed that outperforms Kutter's method and Yu's method against different attacks, including noise addition, shearing, luminance and contrast enhancement, and distortion.

Book ChapterDOI
TL;DR: Making visual queries a central concept opens the door to a theory of how we think visually with interactive displays, a process thought of as constructing and executing queries on displays.
Abstract: There is no visual model of the world in our heads. Over the past few years the phenomena of change blindness and inattentional blindness as well as studies of the capacity of visual working memory all point to the fact that we do not retain much about the world from one fixation to the next. The impression we have of a detailed visual environment comes from our ability to make rapid eye movements and sample the environment at will. What we see at any given instant in time is determined by what we are trying to accomplish. We see what we need to see. If we need to find a path through a crowd we see the openings. If we are trying to find a friend we see the faces. We can think of this process of seeing as the execution of a continuous stream of visual queries on the environment. Depending on the task at hand the brain constructs a visual query and we execute a visual search to satisfy that query. Making visual queries a central concept opens the door to a theory of how we think visually with interactive displays. The process can be thought of as constructing and executing queries on displays. Problem components are formulated into questions (or hypotheses) that can be answered (or tested) by means of pattern discovery. These are formulated into visual queries having the form of search patterns. Visual eye-movement scanning strategies are used to search the display. Within each fixation, active attention determines which patterns are pulled from visual cortex subsystems that do pattern analysis. Patterns and objects are formed as transitory object files from a proto-pattern space. Elementary visual queries can be executed at a rate of 40 msec per simple pattern. Links to non-visual propositional information are activated by icons or familiar patterns, bringing visual information simultaneously into verbal working memory.

Proceedings ArticleDOI
26 Aug 2005
TL;DR: In this article, the authors investigate the influence of sound effects on the perception of motion smoothness in an animation (i.e. on the perceived delivered frame rate) and find that participants who watched audiovisual walkthroughs gave more erroneous answers while performing their task compared to the subjects in the "No Sound" group, regardless of their familiarity with animated CG.
Abstract: The developers and users of interactive computer graphics (CG), such as 3D games and virtual reality, are demanding ever more realistic computer generated imagery delivered at high frame rates, to enable a greater perceptual experience for the user. As more computational power and/or transmission bandwidth are not always available, special techniques are applied that trade off fidelity in order to reduce computational complexity, while trying to minimise the perceptibility of the resulting visual defects. Research on human visual perception has promoted the development of perception driven CG techniques, where knowledge of the human visual system and its weaknesses are exploited when rendering/displaying 3D graphics. It is well known in the human perception community that many factors, including audio stimuli, may influence the amount of cognitive resources available to perform a visual task. In this paper we investigate the influence sound effects have on the perceptibility of motion smoothness in an animation (i.e. on the perception of delivered frame rate). Forty participants viewed pairs of computer-generated walkthrough animations (with the same visual content within the pair) displayed at five different frame rates, in all possible combinations. Both walkthroughs in each test pair were either silent or accompanied by sound effects and the participant had to detect the one that had a smoother motion (i.e. was delivered at higher frame rate). A significant effect of sound effects on the perceived smoothness was revealed. The participants who watched the audiovisual walkthroughs gave more erroneous answers while performing their task compared to the subjects in the "No Sound" group, regardless of their familiarity with animated CG. Especially the unfamiliar participants failed to notice motion smoothness variations which were apparent to them in the absence of sound. The effect of the type of camera movement in the scene (translation or rotation) on the viewers' perception of the motion smoothness/jerkiness was also investigated, but no significant association between them was found. Our results should lead to new insights in 3D graphics regarding the requirements for the delivered frame rate in a wide range of applications.