
Showing papers on "Human visual system model published in 2008"


Journal ArticleDOI
TL;DR: For certain classes that are particularly prevalent in the dataset, such as people, this work is able to demonstrate a recognition performance comparable to class-specific Viola-Jones style detectors.
Abstract: With the advent of the Internet, billions of images are now freely available online and constitute a dense sampling of the visual world. Using a variety of non-parametric methods, we explore this world with the aid of a large dataset of 79,302,017 images collected from the Internet. Motivated by psychophysical results showing the remarkable tolerance of the human visual system to degradations in image resolution, the images in the dataset are stored as 32 x 32 color images. Each image is loosely labeled with one of the 75,062 non-abstract nouns in English, as listed in the WordNet lexical database. Hence the image database gives a comprehensive coverage of all object categories and scenes. The semantic information from WordNet can be used in conjunction with nearest-neighbor methods to perform object classification over a range of semantic levels, minimizing the effects of labeling noise. For certain classes that are particularly prevalent in the dataset, such as people, we are able to demonstrate a recognition performance comparable to class-specific Viola-Jones style detectors.
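
A minimal sketch of the nearest-neighbour labelling idea described above, assuming the tiny-image collection is available as a NumPy array of 32x32x3 images with one noun label per image (array and function names here are hypothetical); the actual system also exploits WordNet semantics and warped matching, which are omitted:

```python
import numpy as np

def knn_label(query, dataset, labels, k=50):
    """Label a 32x32x3 query image by majority vote over its k nearest
    neighbours (plain sum-of-squared-differences in pixel space).
    dataset: (N, 32, 32, 3) array; labels: length-N list of nouns."""
    q = query.astype(np.float32).ravel()
    d = dataset.astype(np.float32).reshape(len(dataset), -1)
    dists = ((d - q) ** 2).sum(axis=1)      # SSD to every stored tiny image
    nearest = np.argsort(dists)[:k]         # indices of the k closest images
    votes = {}
    for i in nearest:
        votes[labels[i]] = votes.get(labels[i], 0) + 1
    return max(votes, key=votes.get)        # most frequent noun wins
```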

1,871 citations


Proceedings Article
08 Dec 2008
TL;DR: A dynamic visual attention model based on the rarity of features is proposed and the Incremental Coding Length (ICL) is introduced to measure the perspective entropy gain of each feature to maximize the entropy of the sampled visual features.
Abstract: A visual attention system should respond placidly when common stimuli are presented, while at the same time keep alert to anomalous visual inputs. In this paper, a dynamic visual attention model based on the rarity of features is proposed. We introduce the Incremental Coding Length (ICL) to measure the perspective entropy gain of each feature. The objective of our model is to maximize the entropy of the sampled visual features. In order to optimize energy consumption, the limited amount of energy of the system is re-distributed amongst features according to their Incremental Coding Length. By selecting features with large coding length increments, the computational system can achieve attention selectivity in both static and dynamic scenes. We demonstrate that the proposed model achieves superior accuracy in comparison to mainstream approaches in static saliency map generation. Moreover, we also show that our model captures several less-reported dynamic visual search behaviors, such as attentional swing and inhibition of return.
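
A toy illustration of the rarity principle behind this model: features whose activity probability is low receive a long coding length and therefore dominate the saliency map. The small filter bank below is a hypothetical stand-in for the sparse-coding basis used in the paper:

```python
import numpy as np
from scipy.ndimage import convolve

def rarity_saliency(gray, kernels):
    """Rarity-weighted saliency sketch: each filter's activity ratio p_i
    across the image sets its weight ~ -log p_i (rare features are salient)."""
    responses = [np.abs(convolve(gray.astype(np.float64), k)) for k in kernels]
    energy = np.array([r.sum() for r in responses])
    p = energy / energy.sum()                  # activity probability of each feature
    weights = -np.log(p + 1e-12)               # longer code length for rarer features
    sal = sum(w * r for w, r in zip(weights, responses))
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-12)

# tiny illustrative filter bank (the paper learns sparse-coding bases instead)
kernels = [np.array([[1.0, -1.0]]), np.array([[1.0], [-1.0]]),
           np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)]
# usage: sal_map = rarity_saliency(gray_image, kernels)
```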

528 citations


Journal ArticleDOI
01 Aug 2008
TL;DR: This work proposes a tone mapping operator that can minimize visible contrast distortions for a range of output devices, ranging from e-paper to HDR displays, and shows that the problem can be solved very efficiently by employing higher order image statistics and quadratic programming.
Abstract: We propose a tone mapping operator that can minimize visible contrast distortions for a range of output devices, ranging from e-paper to HDR displays. The operator weights contrast distortions according to their visibility predicted by the model of the human visual system. The distortions are minimized given a display model that enforces constraints on the solution. We show that the problem can be solved very efficiently by employing higher order image statistics and quadratic programming. Our tone mapping technique can adjust image or video content for optimum contrast visibility taking into account ambient illumination and display characteristics. We discuss the differences between our method and previous approaches to the tone mapping problem.
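
A much-simplified sketch of the underlying idea, not the authors' operator: choose piecewise-linear tone-curve slopes that stay close to one where most image content lies, subject to the curve fitting into the display's (log) dynamic range, and solve the resulting small quadratic program numerically. In the paper the distortion weighting comes from an HVS visibility model and the display model is more detailed; the variable names and the use of SciPy's SLSQP solver here are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def display_adaptive_curve(log_lum, display_range, n_bins=32):
    """Find non-negative slopes s_k of a piecewise-linear tone curve over the
    image's log-luminance histogram, minimizing the content-weighted contrast
    distortion sum p_k*(s_k-1)^2 while fitting the display's log range."""
    hist, edges = np.histogram(log_lum, bins=n_bins)
    p = hist / hist.sum()                        # fraction of content per bin
    widths = np.diff(edges)

    obj = lambda s: np.sum(p * (s - 1.0) ** 2)   # penalize deviation from unit slope
    cons = ({'type': 'ineq',                     # curve must fit on the display
             'fun': lambda s: display_range - np.sum(s * widths)},)
    res = minimize(obj, x0=np.ones(n_bins), bounds=[(0, None)] * n_bins,
                   constraints=cons, method='SLSQP')
    slopes = res.x
    curve = np.concatenate([[0.0], np.cumsum(slopes * widths)])  # curve values at edges
    return edges, curve
```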

410 citations


Journal ArticleDOI
01 Feb 2008
TL;DR: Two novel image enhancement algorithms are introduced: edge-preserving contrast enhancement, which is able to better preserve edge details while enhancing contrast in images with varying illumination, and a novel multihistogram equalization method which utilizes the human visual system to segment the image, allowing a fast and efficient correction of nonuniform illumination.
Abstract: Varying scene illumination poses many challenging problems for machine vision systems. One such issue is developing global enhancement methods that work effectively across the varying illumination. In this paper, we introduce two novel image enhancement algorithms: edge-preserving contrast enhancement, which is able to better preserve edge details while enhancing contrast in images with varying illumination, and a novel multihistogram equalization method which utilizes the human visual system (HVS) to segment the image, allowing a fast and efficient correction of nonuniform illumination. We then extend this HVS-based multihistogram equalization approach to create a general enhancement method that can utilize any combination of enhancement algorithms for an improved performance. Additionally, we propose new quantitative measures of image enhancement, called the logarithmic Michelson contrast measure (AME) and the logarithmic AME by entropy. Many image enhancement methods require selection of operating parameters, which are typically chosen using subjective methods, but these new measures allow for automated selection. We present experimental results for these methods and make a comparison against other leading algorithms.
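
A rough sketch of the multi-histogram idea under the simplifying assumption that the HVS-based segmentation can be replaced by fixed luminance thresholds: each luminance band of a uint8 image is equalized independently within its own range.

```python
import numpy as np

def multi_histogram_equalize(gray, thresholds=(64, 192)):
    """Simplified multi-histogram equalization for a uint8 grayscale image:
    segment into luminance bands (a crude stand-in for the HVS-based
    segmentation in the paper) and equalize each band within its range."""
    out = gray.astype(np.float64).copy()
    bounds = [0, *thresholds, 256]
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        mask = (gray >= lo) & (gray < hi)
        if not mask.any():
            continue
        vals = gray[mask]
        hist, _ = np.histogram(vals, bins=hi - lo, range=(lo, hi))
        cdf = np.cumsum(hist) / hist.sum()
        out[mask] = lo + cdf[vals - lo] * (hi - lo - 1)   # equalize within [lo, hi)
    return out.astype(np.uint8)
```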

270 citations


Journal ArticleDOI
01 Aug 2008
TL;DR: A new image quality assessment metric is proposed whose central idea is the detection and classification of visible changes in image structure, as predicted by a model of the human visual system, allowing images of radically different dynamic ranges to be compared.
Abstract: The diversity of display technologies and introduction of high dynamic range imagery introduces the necessity of comparing images of radically different dynamic ranges. Current quality assessment metrics are not suitable for this task, as they assume that both reference and test images have the same dynamic range. Image fidelity measures employed by a majority of current metrics, based on the difference of pixel intensity or contrast values between test and reference images, result in meaningless predictions if this assumption does not hold. We present a novel image quality metric capable of operating on an image pair where both images have arbitrary dynamic ranges. Our metric utilizes a model of the human visual system, and its central idea is a new definition of visible distortion based on the detection and classification of visible changes in the image structure. Our metric is carefully calibrated and its performance is validated through perceptual experiments. We demonstrate possible applications of our metric to the evaluation of direct and inverse tone mapping operators as well as the analysis of the image appearance on displays with various characteristics.
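
A crude sketch of what "dynamic range independent" comparison means in practice: rather than differencing pixel values, compare where local contrast is visible in the reference versus the test image and flag loss or amplification of visible structure. The fixed visibility threshold below is only a stand-in for the calibrated HVS model used in the paper.

```python
import numpy as np
from scipy.ndimage import sobel

def structure_change_maps(ref, test, vis_thresh=0.05):
    """Flag per-pixel loss and amplification of *visible* contrast between a
    reference and a test image of possibly different dynamic ranges."""
    def visible_contrast(img):
        img = img.astype(np.float64)
        g = np.hypot(sobel(img, 0), sobel(img, 1))
        g /= (img.mean() + 1e-12)              # rough luminance normalisation
        return g > vis_thresh                  # True where contrast is visible
    vref, vtest = visible_contrast(ref), visible_contrast(test)
    loss = vref & ~vtest            # structure visible in reference, lost in test
    amplification = ~vref & vtest   # structure introduced or amplified in test
    return loss, amplification
```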

234 citations


Proceedings ArticleDOI
TL;DR: The design and implementation of a new stereoscopic image quality metric is described and it is suggested that it is a better predictor of human image quality preference than PSNR and could be used to predict a threshold compression level for stereoscopic image pairs.
Abstract: We are interested in metrics for automatically predicting the compression settings for stereoscopic images so that we can minimize file size, but still maintain an acceptable level of image quality. Initially we investigate how Peak Signal to Noise Ratio (PSNR) measures the quality of varyingly coded stereoscopic image pairs. Our results suggest that symmetric, as opposed to asymmetric stereo image compression, will produce significantly better results. However, PSNR measures of image quality are widely criticized for correlating poorly with perceived visual quality. We therefore consider computational models of the Human Visual System (HVS) and describe the design and implementation of a new stereoscopic image quality metric. This metric point-matches regions of high spatial frequency between the left and right views of the stereo pair and accounts for HVS sensitivity to contrast and luminance changes at regions of high spatial frequency, using Michelson's formula and Peli's band-limited contrast algorithm. To establish a baseline for comparing our new metric with PSNR we ran a trial measuring stereoscopic image encoding quality with human subjects, using the Double Stimulus Continuous Quality Scale (DSCQS) from the ITU-R BT.500-11 recommendation. The results suggest that our new metric is a better predictor of human image quality preference than PSNR and could be used to predict a threshold compression level for stereoscopic image pairs.
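
For illustration, a per-block Michelson contrast computation of the kind such a metric might compare between matched regions of the left and right views (a simplified sketch; the actual metric also uses Peli's band-limited contrast and explicit point matching):

```python
import numpy as np

def block_michelson(img, block=16):
    """Per-block Michelson contrast, C = (Lmax - Lmin) / (Lmax + Lmin)."""
    img = img.astype(np.float64)
    h, w = img.shape
    out = np.zeros((h // block, w // block))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            b = img[i*block:(i+1)*block, j*block:(j+1)*block]
            lmax, lmin = b.max(), b.min()
            out[i, j] = (lmax - lmin) / (lmax + lmin + 1e-12)
    return out

# e.g. compare block_michelson(left_view) against block_michelson(right_view)
```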

167 citations


Journal ArticleDOI
TL;DR: This work evaluates SSIM metrics and proposes a perceptually weighted multiscale variant of SSIM, which introduces a viewing distance dependence and provides a natural way to unify the structural similarity approach with the traditional JND-based perceptual approaches.
Abstract: Perceptual image quality metrics have explicitly accounted for human visual system (HVS) sensitivity to subband noise by estimating just noticeable distortion (JND) thresholds. A recently proposed class of quality metrics, known as structural similarity metrics (SSIM), models perception implicitly by taking into account the fact that the HVS is adapted for extracting structural information from images. We evaluate SSIM metrics and compare their performance to traditional approaches in the context of realistic distortions that arise from compression and error concealment in video compression/transmission applications. In order to better explore this space of distortions, we propose models for simulating typical distortions encountered in such applications. We compare specific SSIM implementations both in the image space and the wavelet domain; these include the complex wavelet SSIM (CWSSIM), a translation-insensitive SSIM implementation. We also propose a perceptually weighted multiscale variant of CWSSIM, which introduces a viewing distance dependence and provides a natural way to unify the structural similarity approach with the traditional JND-based perceptual approaches.
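
A hedged sketch of a weighted multiscale structure term: the SSIM cross-correlation component is computed at several dyadic scales and pooled with per-scale weights. In the paper the weights derive from HVS sensitivity at a given viewing distance; the values below are placeholders, and the inputs are assumed to be float grayscale images in [0, 1].

```python
import numpy as np
from scipy.ndimage import uniform_filter, zoom

def structure_term(x, y, win=7, C=1e-3):
    """SSIM cross-correlation (structure) component sigma_xy/(sigma_x*sigma_y)."""
    mx, my = uniform_filter(x, win), uniform_filter(y, win)
    sxx = uniform_filter(x * x, win) - mx * mx
    syy = uniform_filter(y * y, win) - my * my
    sxy = uniform_filter(x * y, win) - mx * my
    return (sxy + C) / (np.sqrt(np.clip(sxx, 0, None) * np.clip(syy, 0, None)) + C)

def weighted_multiscale_structure(x, y, weights=(0.1, 0.3, 0.4, 0.2)):
    """Pool the structure term over dyadic scales with per-scale weights
    (placeholders for the viewing-distance-dependent weights in the paper)."""
    score = 0.0
    for w in weights:
        score += w * structure_term(x, y).mean()
        x, y = zoom(x, 0.5), zoom(y, 0.5)      # move to the next coarser scale
    return score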

157 citations


Journal ArticleDOI
TL;DR: The results demonstrate that successful acquisition of a perceptual skill can produce long-lasting changes for initial sensory inputs in the adult human visual system.

147 citations


Journal ArticleDOI
TL;DR: A new adaptive digital image watermarking method that is built according to image features such as brightness, edges, and region activity, and is extended to the DCT domain by searching for the extreme value of a quadratic function subject to bounds on the variables.

137 citations


Journal ArticleDOI
TL;DR: An image quality criterion is proposed, called C4, which is fully generic and based on a rather elaborate model of the human visual system and shows a high correlation between produced objective quality scores and subjective ones, even for images that have been distorted through several different distortion processes.
Abstract: When an image is supposed to have been transformed by a process like image enhancement or lossy image compression for storing or transmission, it is often necessary to measure the quality of the distorted image. This can be achieved using an image processing method called a "quality criterion". Such a process must produce objective quality scores in close relationship with subjective quality scores given by human observers during subjective quality assessment tests. In this paper, an image quality criterion is proposed. This criterion, called C4, is fully generic (i.e., not designed for predefined distortion types or for particular image types) and based on a rather elaborate model of the human visual system (HVS). This model describes the organization and operation of many stages of vision, from the eye to the ventral and dorsal pathways in the visual cortex. The novelty of this quality criterion relies on the extraction, from an image represented in a perceptual space, of visual features that can be compared to those used by the HVS. Then a similarity metric computes the objective quality score of a distorted image by comparing the features extracted from this image to features extracted from its reference image (i.e., not distorted). Results show a high correlation between produced objective quality scores and subjective ones, even for images that have been distorted through several different distortion processes. To illustrate these performances, they have been computed using three different databases that employed different contents, distortion types, displays, viewing conditions and subjective protocols. The features extracted from the reference image constitute a reduced reference which, in a transmission context with data compression, can be computed at the sender side and transmitted in addition to the compressed image data so that the quality of the decompressed image can be objectively assessed at the receiver side. Moreover, the size of the reduced reference is flexible. This work has been integrated into freely available applications in order to provide a practical alternative to the PSNR criterion, which is still too often used despite its low correlation with human judgments. These applications also enable quality assessment for image transmission purposes.

135 citations


Journal ArticleDOI
TL;DR: It is shown that salience in fact drives vision only during the short time interval immediately following the onset of a visual scene.
Abstract: A salient event in the visual field tends to attract attention and the eyes. To account for the effects of salience on visual selection, models generally assume that the human visual system continuously holds information concerning the relative salience of objects in the visual field. Here we show that salience in fact drives vision only during the short time interval immediately following the onset of a visual scene. In a saccadic target-selection task, human performance in making an eye movement to the most salient element in a display was accurate when response latencies were short, but was at chance when response latencies were long. In a manual discrimination task, performance in making a judgment of salience was more accurate with brief than with long display durations. These results suggest that salience is represented in the visual system only briefly after a visual image enters the brain.

01 Jan 2008
TL;DR: A new database of distorted test images, TID2008, is exploited for verification of full-reference image visual quality metrics, both on the full set of distorted images and on particular subsets of TID2008 that include distortions most important for digital image processing applications.
Abstract: In this paper, we exploit a new database of distorted test images, TID2008, for verification of full-reference metrics of image visual quality. A comparative analysis of TID2008 and its nearest analog, the LIVE Database, is presented. For a wide variety of known metrics, their correspondence to the human visual system is evaluated. The values of the Spearman and Kendall rank correlations between the considered metrics and the Mean Opinion Score (MOS) obtained by exploiting TID2008 in experiments are presented. The metrics are verified both for the full set of distorted test images in TID2008 (1700 distorted images, 17 types of distortions) and for particular subsets of TID2008 that include distortions most important for digital image processing applications.
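
The verification step described above reduces to computing rank correlations between each metric's scores and the MOS values; a minimal sketch with placeholder data (the real values come from the TID2008 experiments):

```python
import numpy as np
from scipy.stats import spearmanr, kendalltau

# placeholder values for illustration; in practice these are the objective
# metric predictions and the MOS for the 1700 distorted TID2008 images
metric_scores = np.random.rand(1700)
mos = np.random.rand(1700)

srocc, _ = spearmanr(metric_scores, mos)    # Spearman rank-order correlation
krocc, _ = kendalltau(metric_scores, mos)   # Kendall rank correlation
print(f"SROCC = {srocc:.3f}, KROCC = {krocc:.3f}")
```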

Journal ArticleDOI
TL;DR: fMRI selectivity for global forms in the human visual pathways is examined using sensitive multivariate analysis methods that take advantage of information across brain activation patterns to suggest that the human visual system uses a code of increasing efficiency across stages of analysis that is critical for the successful detection and recognition of objects in complex environments.
Abstract: Extensive psychophysical and computational work proposes that the perception of coherent and meaningful structures in natural images relies on neural processes that convert information about local edges in primary visual cortex to complex object features represented in the temporal cortex. However, the neural basis of these mid-level vision mechanisms in the human brain remains largely unknown. Here, we examine functional MRI (fMRI) selectivity for global forms in the human visual pathways using sensitive multivariate analysis methods that take advantage of information across brain activation patterns. We use Glass patterns, parametrically varying the perceived global form (concentric, radial, translational) while ensuring that the local statistics remain similar. Our findings show a continuum of integration processes that convert selectivity for local signals (orientation, position) in early visual areas to selectivity for global form structure in higher occipitotemporal areas. Interestingly, higher occipitotemporal areas discern differences in global form structure rather than low-level stimulus properties with higher accuracy than early visual areas while relying on information from smaller but more selective neural populations (smaller voxel pattern size), consistent with global pooling mechanisms of local orientation signals. These findings suggest that the human visual system uses a code of increasing efficiency across stages of analysis that is critical for the successful detection and recognition of objects in complex environments.

Journal ArticleDOI
TL;DR: A proposed scheme for estimating the JND (just-noticeable difference) with an explicit formulation for image pixels, obtained by summing the effects of the visual thresholds in sub-bands, demonstrates favorable results in noise shaping and as a perceptual visual distortion gauge for different images, in comparison with the relevant existing JND estimators.

Journal ArticleDOI
TL;DR: Experiments show that the proposed fusion method can improve spatial resolution and keep spectral information simultaneously, and that there are improvements both in visual effects and quantitative analysis compared with the traditional principal component analysis (PCA) method.
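
For reference, a compact sketch of the traditional PCA pan-sharpening baseline this work compares against, assuming the multispectral bands have already been resampled to the panchromatic grid:

```python
import numpy as np

def pca_pansharpen(ms, pan):
    """PCA fusion baseline: project the multispectral bands onto their
    principal components, replace the first PC with the (mean/std-matched)
    panchromatic image, and project back.
    ms: (H, W, B) multispectral image; pan: (H, W) panchromatic image."""
    H, W, B = ms.shape
    X = ms.reshape(-1, B).astype(np.float64)
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    vecs = vecs[:, np.argsort(vals)[::-1]]        # sort PCs by variance
    pcs = Xc @ vecs
    p = pan.reshape(-1).astype(np.float64)
    p = (p - p.mean()) / (p.std() + 1e-12) * pcs[:, 0].std() + pcs[:, 0].mean()
    pcs[:, 0] = p                                  # substitute the first PC
    fused = pcs @ vecs.T + mean
    return fused.reshape(H, W, B)
```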

Proceedings ArticleDOI
12 May 2008
TL;DR: A two-stage framework for saliency detection is proposed: an existing spectral residual model is extended to better locate visual pop-outs, and coherence-based propagation is used to further refine the results of the first step.
Abstract: Research in psychology, perception and related fields shows that there may be a two-stage process involved in human vision. In this paper, we propose an approach that follows a two-stage framework for saliency detection. In the first stage, we extend an existing spectral residual model for better locating visual pop-outs, while in the second stage we make use of coherence-based propagation for further refinement of the results from the first step. For evaluation of the proposed approach, 300 images with diverse contents were manually and accurately labeled. Experiments show that our approach achieves much better performance than the existing state of the art.
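
The first stage builds on the spectral residual model; a minimal sketch of that base model follows (the coherence-based propagation of the second stage is omitted):

```python
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter

def spectral_residual_saliency(gray):
    """Spectral residual saliency: the residual between the log amplitude
    spectrum and its local average highlights 'unexpected' spectral content,
    which back-projects to salient image regions."""
    f = np.fft.fft2(gray.astype(np.float64))
    log_amp = np.log(np.abs(f) + 1e-12)
    phase = np.angle(f)
    residual = log_amp - uniform_filter(log_amp, size=3)     # spectral residual
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    return gaussian_filter(sal, sigma=3)                      # smooth the map
```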

Proceedings ArticleDOI
12 Dec 2008
TL;DR: A no-reference perceptual sharpness quality metric, inspired by visual attention information, is presented for a better simulation of the Human Visual System response to blur distortions.
Abstract: A no-reference perceptual sharpness quality metric, inspired by visual attention information, is presented for a better simulation of the Human Visual System (HVS) response to blur distortions. Saliency information about a scene is used to accentuate blur distortions around edges present in conspicuous areas and attenuate those distortions present in the rest of the image. Simulation results are presented to illustrate the performance of the proposed metric.
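
A toy sketch of the pooling idea: high-frequency energy is weighted by a saliency map so that blur around conspicuous edges is penalized more than blur elsewhere. The saliency map is assumed to be given, and the paper's actual sharpness estimator differs from the Laplacian proxy used here.

```python
import numpy as np
from scipy.ndimage import laplace

def saliency_weighted_sharpness(gray, saliency):
    """No-reference sharpness proxy: local high-frequency energy (Laplacian),
    pooled with more weight inside salient regions."""
    hf = np.abs(laplace(gray.astype(np.float64)))
    w = saliency / (saliency.sum() + 1e-12)     # normalise saliency to weights
    return float((w * hf).sum())
```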

Journal ArticleDOI
TL;DR: This work shows how a cortical-like hierarchy obtains recognition and localization of objects and parts at multiple levels nearly simultaneously by a single feed-forward sweep from low to high levels of the hierarchy, followed by a feedback sweep from high- to low-level areas.
Abstract: The human visual system recognizes objects and their constituent parts rapidly and with high accuracy. Standard models of recognition by the visual cortex use feed-forward processing, in which an object's parts are detected before the complete object. However, parts are often ambiguous on their own and require the prior detection and localization of the entire object. We show how a cortical-like hierarchy obtains recognition and localization of objects and parts at multiple levels nearly simultaneously by a single feed-forward sweep from low to high levels of the hierarchy, followed by a feedback sweep from high- to low-level areas.

Journal ArticleDOI
TL;DR: The proposed method utilizes the temporal contrast thresholds of the HVS to determine the maximum strength of the watermark that still gives imperceptible distortion after watermark insertion, while giving much better robustness against common video distortions, such as additive Gaussian noise, video coding, frame rate conversions, and temporal shifts.
Abstract: Imperceptibility requirement in video watermarking is more challenging compared with its image counterpart due to the additional dimension existing in video. The embedding system should not only yield spatially invisible watermarks for each frame of the video, but it should also take the temporal dimension into account in order to avoid any flicker distortion between frames. While some of the methods in the literature approach this problem by only allowing arbitrarily small modifications within frames in different transform domains, some others simply use implicit spatial properties of the human visual system (HVS), such as luminance masking, spatial masking, and contrast masking. In addition, some approaches exploit explicitly the spatial thresholds of HVS to determine the location and strength of the watermark. However, none of the former approaches have focused on guaranteeing temporal invisibility and achieving maximum watermark strength along the temporal direction. In this paper, temporal dimension is exploited for video watermarking by means of utilizing temporal sensitivity of the HVS. The proposed method utilizes the temporal contrast thresholds of HVS to determine the maximum strength of watermark, which still gives imperceptible distortion after watermark insertion. Compared with some recognized methods in the literature, the proposed method avoids the typical visual degradations in the watermarked video, while still giving much better robustness against common video distortions, such as additive Gaussian noise, video coding, frame rate conversions, and temporal shifts, in terms of bit error rate.

Proceedings ArticleDOI
14 Feb 2008
TL;DR: This work investigates the first step toward an objective visual information evaluation: predicting the recognition threshold of different image representations, and advocates a multi-scale image structure analysis for a rudimentary evaluation of visual information.
Abstract: Natural images are meaningful to humans: the physical world exhibits statistical regularities that permit the human visual system (HVS) to infer useful interpretations. These regularities communicate the visual structure of the physical world and govern the statistics of images (image structure). A signal processing framework is sought to analyze image characteristics for a relationship with human interpretation. This work investigates the first step toward an objective visual information evaluation: predicting the recognition threshold of different image representations. Given an image sequence whose images begin as unrecognizable and are gradually refined to include more information according to some measure, the recognition threshold corresponds to the first image in the sequence in which an observer accurately identifies the content. Sequences are produced using two types of image representations: signal-based and visual structure preserving. Signal-based representations add information as dictated by conventional mathematical characterizations of images based on models of low-level HVS processing and use basis functions as the basic image components. Visual structure preserving representations add information to images attributed to visual structure and attempt to mimic higher-level HVS processing by considering the scene's objects as the basic image components. An experiment is conducted to identify the recognition threshold image. Several full-reference perceptual quality assessment algorithms are evaluated in terms of their ability to predict the recognition threshold of different image representations. The cross-correlation component of a modified version of the multi-scale structural similarity (MS-SSIM) metric, denoted MS-SSIM*, exhibits a better overall correlation with the signal-based and visual structure preserving representations' average recognition thresholds than the standard MS-SSIM cross-correlation component. These findings underscore the significance of visual structure in recognition and advocate a multi-scale image structure analysis for a rudimentary evaluation of visual information.

Journal ArticleDOI
TL;DR: This paper proposes a low-complexity algorithm that executes at the resource-limited user end to quantitatively and perceptually assess video quality under different spatial, temporal and SNR combinations, and an efficient adaptation algorithm that dynamically adapts scalable video to a suitable combination of the three dimensions.
Abstract: For wireless video streaming, the three dimensional scalabilities (spatial, temporal and SNR) provided by the advanced scalable video coding (SVC) technique can be directly utilized to adapt video streams to dynamic wireless network conditions and heterogeneous wireless devices. However, the question is how to optimally trade off among the three dimensional scalabilities so as to maximize the perceived video quality, given the available resources. In this paper, we propose a low-complexity algorithm that executes at the resource-limited user end to quantitatively and perceptually assess video quality under different spatial, temporal and SNR combinations. Based on the video quality measures, we further propose an efficient adaptation algorithm, which dynamically adapts scalable video to a suitable combination of the three dimensions. Experimental results demonstrate the effectiveness of our proposed perceptual video adaptation framework.
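
A hedged sketch of the adaptation step: enumerate the available (spatial, temporal, SNR) operating points, estimate bitrate and perceived quality with placeholder models, and keep the highest-quality combination that fits the bandwidth budget. All layer values and model formulas below are illustrative assumptions, not the paper's fitted models.

```python
from itertools import product

# candidate operating points of the scalable stream (illustrative values)
layers = {
    "spatial":  [(176, 144), (352, 288)],   # frame resolutions
    "temporal": [7.5, 15.0, 30.0],          # frame rates (fps)
    "snr":      [28.0, 32.0, 36.0],         # quality-layer proxies
}

def bitrate(res, fps, snr):
    """Toy bitrate model (kbit/s) for an operating point."""
    return res[0] * res[1] * fps * (snr / 30.0) * 1e-3

def perceived_quality(res, fps, snr):
    """Toy perceptual quality model; the paper derives this from assessment data."""
    return 0.4 * (res[0] * res[1]) ** 0.3 + 0.3 * fps ** 0.5 + 0.3 * snr

def adapt(budget_kbps):
    """Pick the highest-quality (spatial, temporal, SNR) point within budget."""
    feasible = [(perceived_quality(r, f, s), (r, f, s))
                for r, f, s in product(*layers.values())
                if bitrate(r, f, s) <= budget_kbps]
    return max(feasible)[1] if feasible else None

print(adapt(2000.0))   # selected (resolution, fps, snr) combination
```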

Journal ArticleDOI
29 Dec 2008-PLOS ONE
TL;DR: Information diagnostic for face detection and individuation is roughly separable; the human visual system is independently sensitive to both types of information; neural responses differ according to the type of task-relevant information considered.
Abstract: Background: The variety of ways in which faces are categorized makes face recognition challenging for both synthetic and biological vision systems. Here we focus on two face processing tasks, detection and individuation, and explore whether differences in task demands lead to differences both in the features most effective for automatic recognition and in the featural codes recruited by neural processing. Methodology/Principal Findings: Our study appeals to a computational framework characterizing the features representing object categories as sets of overlapping image fragments. Within this framework, we assess the extent to which task-relevant information differs across image fragments. Based on objective differences we find among task-specific representations, we test the sensitivity of the human visual system to these different face descriptions independently of one another. Both behavior and functional magnetic resonance imaging reveal effects elicited by objective task-specific levels of information. Behaviorally, recognition performance with image fragments improves with increasing task-specific information carried by different face fragments. Neurally, this sensitivity to the two tasks manifests as differential localization of neural responses across the ventral visual pathway. Fragments diagnostic for detection evoke larger neural responses than non-diagnostic ones in the right posterior fusiform gyrus and bilaterally in the inferior occipital gyrus. In contrast, fragments diagnostic for individuation evoke larger responses than non-diagnostic ones in the anterior inferior temporal gyrus. Finally, for individuation only, pattern analysis reveals sensitivity to task-specific information within the right "fusiform face area". Conclusions/Significance: Our results demonstrate: 1) information diagnostic for face detection and individuation is roughly separable; 2) the human visual system is independently sensitive to both types of information; 3) neural responses differ according to the type of task-relevant information considered. More generally, these findings provide evidence for the computational utility and the neural validity of fragment-based visual representation and recognition.

Journal ArticleDOI
TL;DR: A new computational framework for modelling visual-object-based attention and attention-driven eye movements within an integrated system in a biologically inspired approach is presented, resulting in sophisticated performance in complicated natural scenes.

Journal ArticleDOI
TL;DR: The results obtained indicate that the proposed algorithm exhibits at least comparable results in contrast modification tasks to the other algorithms, in significantly reduced execution times.
Abstract: A new algorithm for fast contrast modification of standard dynamic range (SDR) images (8 bits/channel) is presented. Its thrust is to enhance the contrast in the under-/over-exposed regions of SDR images, caused by the low dynamic range of the capturing device. It is motivated by the attributes of the shunting centre-surround cells of the human visual system. The main advantage of the proposed algorithm is its O(N) complexity, which results in very fast execution, even when executed on a conventional personal computer (0.2 s/frame for a 640 × 480 pixel resolution on a 3 GHz Pentium 4). Thus, it moderately increases the computational burden if it is used as a pre-processing stage for other image processing algorithms. The proposed method is compared with other established algorithms, which can enhance the contrast in the under-/over-exposed regions of SDR images: the multi-scale Retinex with colour rendition, the McCann Retinex (McCann99), the rational mapping function and the automatic colour equalisation. The results obtained by this comparison indicate that the proposed algorithm exhibits at least comparable results in contrast modification tasks to the other algorithms, in significantly reduced execution times.
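
One common shunting-inspired, surround-adaptive mapping, given as a sketch and not the exact formula from the paper: the local surround mean steers a per-pixel tone curve so that pixels in dark neighbourhoods are lifted more strongly than pixels in bright ones.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def center_surround_enhance(gray, surround_size=31):
    """Surround-adaptive enhancement sketch for a uint8 grayscale image:
    the darker the local surround, the stronger the lift applied to a pixel."""
    x = gray.astype(np.float64)
    s = uniform_filter(x, surround_size)          # local surround estimate
    y = (255.0 + s) * x / (x + s + 1e-6)          # smaller surround -> stronger lift
    return np.clip(y, 0, 255).astype(np.uint8)
```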

Journal ArticleDOI
TL;DR: This work approximates the spatially variant properties of the human visual system with multiple low-cost off-the-shelf imaging sensors and maximizes the information throughput and bandwidth savings of the foveated system.
Abstract: Conventional imaging techniques adopt a rectilinear sampling approach, where a finite number of pixels are spread evenly across an entire field of view (FOV). Consequently, their imaging capabilities are limited by an inherent trade-off between the FOV and the resolving power. In contrast, a foveation technique allocates the limited resources (e.g., a finite number of pixels or transmission bandwidth) as a function of foveal eccentricities, which can significantly simplify the optical and electronic designs and reduce the data throughput, while the observer's ability to see fine details is maintained over the whole FOV. We explore an approach to a foveated imaging system design. Our approach approximates the spatially variant properties (i.e., resolution, contrast, and color sensitivities) of the human visual system with multiple low-cost off-the-shelf imaging sensors and maximizes the information throughput and bandwidth savings of the foveated system. We further validate our approach with the design of a compact dual-sensor foveated imaging system. A proof-of-concept bench prototype and experimental results are demonstrated.

Journal ArticleDOI
TL;DR: A novel two-stage noise removal algorithm is proposed to deal with impulse noise; fuzzy decision rules inspired by the human visual system classify the image pixels into perception-sensitive and nonsensitive classes and compensate for the edge blur and destruction caused by the median filter.
Abstract: In this paper, a novel two-stage noise removal algorithm to deal with impulse noise is proposed. In the first stage, an adaptive two-level feedforward neural network (NN) with a backpropagation training algorithm is applied to remove the noise cleanly while keeping the uncorrupted information intact. In the second stage, fuzzy decision rules inspired by the human visual system (HVS) are proposed to classify the image pixels into a human-perception-sensitive class and a nonsensitive class, and to compensate for the edge blur and destruction caused by the median filter. An NN is proposed to enhance the sensitive regions with higher visual quality. According to the experimental results, the proposed method is superior to conventional methods in perceptual image quality as well as in the clarity and smoothness of edge regions.
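
A much-simplified sketch of the two-stage idea, with the neural network and fuzzy components replaced by simple rules: detect likely impulses as pixels that deviate strongly from their local median, correct only those, and mark high-contrast (perception-sensitive) regions where a real system would apply additional edge-restoring enhancement.

```python
import numpy as np
from scipy.ndimage import median_filter

def remove_impulse_noise(gray, detect_thresh=40, edge_thresh=15):
    """Two-stage sketch: (1) detect and replace impulse pixels only;
    (2) return a crude mask of perception-sensitive (edge/texture) regions
    where further enhancement would be applied."""
    x = gray.astype(np.float64)
    med = median_filter(x, size=3)
    impulses = np.abs(x - med) > detect_thresh      # stage 1: impulse detection
    cleaned = np.where(impulses, med, x)            # replace only corrupted pixels

    smooth = median_filter(cleaned, size=3)         # stage 2 (placeholder):
    sensitive = np.abs(cleaned - smooth) > edge_thresh   # crude edge/texture mask
    return cleaned.astype(np.uint8), sensitive
```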

Journal ArticleDOI
TL;DR: Visualizations leverage the human visual system to support the process of sensemaking, in which information is collected, organized, and analyzed to generate knowledge and inform action.
Abstract: Visualizations leverage the human visual system to support the process of sensemaking, in which information is collected, organized, and analyzed to generate knowledge and inform action. Although m...

Journal ArticleDOI
TL;DR: In this paper, an image is decomposed into multiscale coefficients with a dyadic number of wedges constructed from a variety of neighboring scales to support the feasibility of using the curvelet transform for multipurpose watermarking.
Abstract: Multipurpose watermarking for content authentication and copyright verification is accomplished by using the multiscale curvelet transform. A curvelet transform gains a better and sparser representation than most traditional multiscale transforms. In this paper, an image is decomposed into multiscale coefficients with a dyadic number of wedges constructed from a variety of neighboring scales. An image hash is designed to extract image features from an approximate scale. The image features, represented in the form of bit sequences, are then embedded onto the wedges by a quantization based on human visual system behavior. The implementation strategy achieves content authentication through fragile watermarking and copyright verification through robust watermarking. The experiments demonstrate good results to support the feasibility of using this method in multipurpose applications.
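
A small sketch of quantization-based embedding of a hash bit sequence into transform coefficients (the paper embeds into curvelet wedges; any coefficient vector serves for illustration): each selected coefficient is re-quantized to an even or odd multiple of the step depending on the bit it carries.

```python
import numpy as np

def qim_embed(coeffs, bits, step=8.0):
    """Embed bits by forcing the parity of each coefficient's quantizer index."""
    c = np.asarray(coeffs, dtype=np.float64).copy()
    for i, b in enumerate(bits):
        q = np.round(c[i] / step)
        if int(q) % 2 != b:                 # wrong parity: move to the nearest
            q += 1 if c[i] >= q * step else -1   # neighbouring quantizer index
        c[i] = q * step
    return c

def qim_extract(coeffs, n_bits, step=8.0):
    """Recover bits from the parity of the quantizer indices."""
    q = np.round(np.asarray(coeffs[:n_bits], dtype=np.float64) / step)
    return (q.astype(int) % 2).tolist()
```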

Journal ArticleDOI
TL;DR: Two different methods are proposed for computing an importance map that indicates the masking potential of the visual patterns on the surface, based on the Sarnoff visual discrimination metric and on the visual masking tool available in the current JPEG2000 standard.
Abstract: The properties of the human visual system are taken into account, along with the geometric aspects of an object, in a new surface remeshing algorithm and a new mesh simplification algorithm. Both algorithms have a preprocessing step and are followed by the remeshing or mesh simplification steps. The preprocessing step computes an importance map that indicates the visual masking potential of the visual patterns on the surface. The importance map is then used to guide the remeshing or mesh simplification algorithms. Two different methods are proposed for computing an importance map that indicates the masking potential of the visual patterns on the surface. The first one is based on the Sarnoff visual discrimination metric, and the second one is inspired by the visual masking tool available in the current JPEG2000 standard. Given an importance map, the surface remeshing algorithm automatically distributes few samples to surface regions with strong visual masking properties due to surface texturing, lighting variations, bump mapping, surface reflectance and inter-reflections. Similarly, the mesh simplification algorithm simplifies more aggressively where the light field of an object can hide more geometric artifacts.

Book
20 Oct 2008
TL;DR: This fully revised second edition concentrates on describing and analyzing the underlying concepts of image processing, and imparts a good conceptual understanding of the topic, and is suitable both as a textbook and a professional reference.
Abstract: Image processing is concerned with the analysis and manipulation of images by computer. Providing a thorough treatment of image processing, with an emphasis on those aspects most used in computer graphics and vision, this fully revised second edition concentrates on describing and analyzing the underlying concepts of this subject. As befits a modern introduction to this topic, a good balance is struck between discussing the underlying mathematics and the main topics of signal processing, data discretization, the theory of color and different color systems, operations in images, dithering and half-toning, warping and morphing, and image processing. Significantly expanded and revised, this easy-to-follow text/reference reflects recent trends in science and technology that exploit image processing in computer graphics and vision applications. Stochastic image models and statistical methods for image processing are covered, as is probability theory for image processing, with a focus on applications in image analysis and computer vision. Features: includes 5 new chapters and major changes throughout; adopts a conceptual approach with emphasis on the mathematical concepts and their applications; introduces an abstraction paradigm that relates mathematical models with image processing techniques and implementation methods, used throughout to help understanding of the mathematical theory and its practical use; motivates through an elementary presentation, opting for an intuitive description where needed; contains innovative formulations whenever necessary for clarity of exposition; provides numerous examples and illustrations as an aid to understanding; focuses on the aspects of image processing that have importance in computer graphics and vision applications; offers a comprehensive introductory chapter for instructors. This comprehensive text imparts a good conceptual understanding of the topic as a basis for further study, and is suitable both as a textbook and a professional reference. The current extended edition is a must-have resource and guide for all studying or interested in this field.